Categories
| Category | Focus |
|---|---|
| Application security | Classic injection/code attack patterns in the request. |
| Data loss prevention | PII and secret detection + masking. |
| Content security | Jailbreaks, toxicity, moderation, document/URL analysis. |
| Behavioral security | Abusive-actor and anomaly detection across requests. |
| Agent & MCP security | Tool/function definitions and tool-call validation. |
The catalog
Each detector is identified by a stable slug. The Sides column lists which directions it supports.
Mutable detectors can rewrite the payload (so they support redact).
Application security
| Detector (slug) | Detects | Sides | Protocols | Mutable |
|---|---|---|---|---|
| code_sanitation | Dangerous code-injection patterns by language (JS, Python, PHP, SQL, shell, HTML) + custom patterns. | input | all | — |
| injection_protection | Classic injection patterns (SQL, NoSQL, command, path traversal, XSS, LDAP, XPath, header, file inclusion) in chosen request scopes. | input | all | — |
Data loss prevention
| Detector (slug) | Detects | Sides | Protocols | Mutable |
|---|---|---|---|---|
| data_loss_prevention | 60+ PII entities and secrets (passwords, API keys, tokens, JWTs); masks them in flight. | input, output | all | ✅ |
Content security
| Detector (slug) | Detects | Sides | Protocols | Mutable |
|---|---|---|---|---|
| prompt_guard | Jailbreaks / prompt injections, scored by the NeuralTrust Firewall. | input, output | all | — |
| multiturn_guard | Multi-turn jailbreaks that build up across a session_id. | input | all | — |
| toxicity | Toxic content, via NeuralTrust Firewall or OpenAI. | input, output | all | — |
| toxicity_openai | Toxicity via the OpenAI Moderation API. | input, output | all | — |
| toxicity_azure | Toxicity via Azure AI Content Safety. | input, output | all | — |
| prompt_moderation | Off-topic / disallowed content via keyword+regex and/or topic probability. | input, output | all | — |
| url_analyzer | Fetches URLs in the content (SSRF-guarded) and screens them for jailbreaks and PII. | input | llm, mcp | — |
| doc_analyzer | Extracts text from uploaded documents (incl. OCR) and screens for PII and jailbreaks. | input | llm | — |
| bedrock_guardrail | Applies an AWS Bedrock guardrail (topic / content / sensitive-info policies). | input, output | all | — |
Behavioral security
| Detector (slug) | Detects | Sides | Protocols | Mutable |
|---|---|---|---|---|
| anomaly_detector | Abusive actors keyed on consumer_id: bot-like timing, repeated payloads, escalation, cross-collector abuse. | input | all | — |
Agent & MCP security
| Detector (slug) | Detects | Sides | Protocols | Mutable |
|---|---|---|---|---|
| tool_guard | Jailbreaks/injections planted in the agent’s own system prompt and tool descriptions. | input | mcp | — |
| tool_permission | Tools requested in an MCP call against an allow/deny list. | input | mcp | — |
| tool_selection | Tool calls the model emitted, against a known-tool catalog (hallucinated tools, bad arguments). | output | mcp | — |
Hidden detectors. injection_protection, tool_permission, and tool_selection are functional but currently not shown in the catalog picker. Contact NeuralTrust if you need them enabled for your team.
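
To make the Sides, Protocols, Mutable, and hidden columns easier to reason about, the catalog can be written out as plain data and filtered. The following is a minimal sketch, not NeuralTrust code: `CatalogEntry`, `entry`, `supports`, and `candidates` are hypothetical names, and it assumes that a Protocols value of `all` matches any protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    slug: str
    sides: frozenset       # subset of {"input", "output"}
    protocols: frozenset   # {"all"} or specific protocols, e.g. {"llm", "mcp"}
    mutable: bool = False  # can rewrite the payload
    hidden: bool = False   # functional, but not shown in the catalog picker


def entry(slug, sides, protocols="all", mutable=False, hidden=False):
    """Hypothetical helper: build an entry from comma-separated strings."""
    return CatalogEntry(slug, frozenset(sides.split(",")),
                        frozenset(protocols.split(",")), mutable, hidden)


# Values transcribed from the catalog tables above.
CATALOG = [
    entry("code_sanitation", "input"),
    entry("injection_protection", "input", hidden=True),
    entry("data_loss_prevention", "input,output", mutable=True),
    entry("prompt_guard", "input,output"),
    entry("multiturn_guard", "input"),
    entry("toxicity", "input,output"),
    entry("toxicity_openai", "input,output"),
    entry("toxicity_azure", "input,output"),
    entry("prompt_moderation", "input,output"),
    entry("url_analyzer", "input", "llm,mcp"),
    entry("doc_analyzer", "input", "llm"),
    entry("bedrock_guardrail", "input,output"),
    entry("anomaly_detector", "input"),
    entry("tool_guard", "input", "mcp"),
    entry("tool_permission", "input", "mcp", hidden=True),
    entry("tool_selection", "output", "mcp", hidden=True),
]


def supports(e, side, protocol):
    """True if the entry covers this side and protocol (assumes 'all' matches any protocol)."""
    return side in e.sides and ("all" in e.protocols or protocol in e.protocols)


def candidates(side, protocol, include_hidden=False):
    """Slugs usable for a side/protocol pair, optionally including hidden detectors."""
    return [e.slug for e in CATALOG
            if supports(e, side, protocol) and (include_hidden or not e.hidden)]


print(candidates("output", "mcp"))
print([e.slug for e in CATALOG if e.mutable])
```

In this sketch, `candidates("output", "mcp", include_hidden=True)` additionally returns `tool_selection`, which mirrors the note that hidden detectors work but are absent from the picker.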
How detectors and modes fit together
- A catalog detector is a fixed capability — you can’t change its code, only its settings.
- You create a named, reusable instance of it with a mode, protocol, direction, and settings, then attach it to collectors. See Detectors.
- Only data_loss_prevention is mutable: it is the only detector where redact is valid and the only one that can populate transformed_payload.
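
The rules in this list can be sketched as a consistency check on a detector instance. This is an illustration under stated assumptions, not the product's actual validation: `DetectorInstance`, `CAPABILITIES`, and `validate` are hypothetical names, the `direction` values are assumed to mirror the Sides column (`input` / `output`), and `block` is a made-up mode name used only to contrast with `redact`.

```python
from dataclasses import dataclass, field

# Capabilities for three detectors, transcribed from the catalog tables.
CAPABILITIES = {
    "data_loss_prevention": {"sides": {"input", "output"}, "protocols": {"all"}, "mutable": True},
    "prompt_guard": {"sides": {"input", "output"}, "protocols": {"all"}, "mutable": False},
    "tool_guard": {"sides": {"input"}, "protocols": {"mcp"}, "mutable": False},
}


@dataclass
class DetectorInstance:
    """Hypothetical shape of a named, reusable detector instance."""
    name: str
    slug: str
    mode: str        # "redact" comes from the source; other mode names are assumptions
    protocol: str
    direction: str   # assumed to be "input" or "output"
    settings: dict = field(default_factory=dict)


def validate(instance):
    """Return a list of problems; an empty list means the instance fits the catalog."""
    cap = CAPABILITIES.get(instance.slug)
    if cap is None:
        return [f"unknown detector slug: {instance.slug}"]
    problems = []
    if instance.direction not in cap["sides"]:
        problems.append(f"{instance.slug} does not support direction {instance.direction!r}")
    if "all" not in cap["protocols"] and instance.protocol not in cap["protocols"]:
        problems.append(f"{instance.slug} does not support protocol {instance.protocol!r}")
    if instance.mode == "redact" and not cap["mutable"]:
        problems.append(f"{instance.slug} is not mutable, so redact is not valid")
    return problems


ok = DetectorInstance("pii-mask", "data_loss_prevention", "redact", "llm", "output")
bad = DetectorInstance("guard-redact", "prompt_guard", "redact", "llm", "input")
print(validate(ok))
print(validate(bad))
```

The only settings-independent rule the catalog states about modes is the redact restriction, so that is the only mode check in the sketch; everything else about modes is left open.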