Skip to main content
Agentic systems add an attack surface that prompt-only defenses miss: the agent’s own tool and function definitions, the tools a caller is allowed to use, and the tool calls the model emits. These detectors are scoped to the mcp protocol.
DetectorSlugSidesProtocolsBackend
Tool Guardtool_guardinputmcpNeuralTrust Firewall
Tool Permissiontool_permissioninputmcpin-process list
Tool Selectiontool_selectionoutputmcpoptional NeuralTrust/OpenAI
tool_permission and tool_selection are functional but currently not shown in the catalog picker. Contact NeuralTrust if you need them enabled for your team.

Tool Guard — tool_guard

Scans the agent’s own definition — system prompt and tool/function descriptions — for jailbreaks and prompt injections planted there (e.g. a poisoned MCP tool description). Uses the NeuralTrust Firewall jailbreak detector.
FieldTypeRequiredNotes
jailbreak.thresholdnumberScore in [0,1].
credentials.*objectOverride global firewall creds.
{ "type": "tool_guard", "mode": "block", "protocol": "mcp", "direction": "input",
  "settings": { "jailbreak": { "threshold": 0.6 } } }

Tool Permission — tool_permission

Checks the tools declared in an MCP request against an allow/deny list. In-process, no external calls.
FieldTypeDefaultNotes
allowed_toolsarray<string>Empty = permit any non-denied tool.
denied_toolsarray<string>Checked first; always wins.
tools_fieldstring"tools"JSON path to the tools list in the body.
{ "type": "tool_permission", "mode": "block", "protocol": "mcp", "direction": "input",
  "settings": { "allowed_tools": ["search", "calculator"], "denied_tools": ["shell_exec"] } }

Tool Selection — tool_selection

Validates the tool calls a model emits against a catalog of known tools and their argument schemas — catching hallucinated tools and malformed arguments. Runs on output.
FieldTypeRequiredNotes
known_toolsarray<{ name, parameters: { type, required[] } }>The legitimate tool catalog.
providerenumneuraltrust (default) or openai, used only if semantic check is on.
semantic_check.thresholdnumberrequired if blockingScore in [0,1] for the semantic match.
credentials.*objectProvider creds.
{ "type": "tool_selection", "mode": "observe", "protocol": "mcp", "direction": "output",
  "settings": {
    "known_tools": [
      { "name": "search", "parameters": { "type": "object", "required": ["query"] } }
    ]
  } }

When to use

  • tool_guard (input) whenever you load third-party or user-supplied MCP servers / tool descriptions — it catches injections hidden in tool metadata.
  • tool_permission (input) to enforce least-privilege: which tools a given collector may invoke.
  • tool_selection (output) to catch a compromised or hallucinating model calling tools that don’t exist or with bad arguments.