NeuralTrust | Platform for Agent Security.

TrustGuard is NeuralTrust’s runtime security engine for generative-AI and agentic traffic. It inspects prompts, model output, documents, URLs, and agent/tool activity in flight and returns a structured security verdict for each request. TrustGuard is deliberately decision-only: it tells you what it found and what it would do, but it never blocks or drops traffic itself. The component that calls TrustGuard (the collector) decides whether to allow, mask, or block based on the verdict. This keeps TrustGuard safe to deploy inline anywhere — a failure or timeout can never take your application down.

TrustGuard is one of four NeuralTrust products. It pairs naturally with TrustGate (the AI gateway, the most common collector), TrustLens (AI security posture), and TrustTest (red teaming).

What it does

Prompt & content security

Jailbreak and prompt-injection detection, topic moderation, toxicity scoring, and document/URL analysis on both input and output.

Data loss prevention

Detect and mask 43 PII entity types and secrets (API keys, tokens, JWTs) in flight, returning a transformed payload.

URL & document analysis

Fetch and screen URLs, and extract text from uploaded documents (including OCR) to check for jailbreaks and PII before the content reaches the model.

Secrets in flight

Catch leaked credentials and sensitive tokens in prompts and model outputs alongside PII masking.

The model in one minute

TrustGuard has four building blocks. You compose them once, then send traffic.

Concept	What it is
Detector	A reusable, named detection you create from the catalog: a catalog detector (e.g. `prompt_guard`, `data_loss_prevention`) plus its `settings` (thresholds, entity lists, …). Detection-only: it reports what it finds, not what to do about it.
Policy	Where enforcement lives: gates (match request attributes before detection) + detector rules (run detectors with a Monitor / Block / Transform action), plus a Report / Enforce switch.
Collector	A traffic tap (gateway, SDK, browser, WAF) authenticated by an API key. It routes each request to a policy.
Finding	What a gate or detector reported — its `source`, `signal`, `outcome`, and `evidence`.

        POST /v1/evaluate  (Authorization: Bearer <collector api key>)
                              │
 ┌──────────┐  resolves  ┌────▼─────┐  routes to  ┌──────────────────────────────┐
 │ API key  │───────────▶│Collector │────────────▶│ Policy                       │
 └──────────┘  collector └──────────┘   a policy   │  gates → detector rules      │
                                                    └──────────────┬───────────────┘
                                                                   ▼
                    { status, findings[], transformed_payload, trace_id }

A request names no detectors: TrustGuard resolves the collector’s policy and runs its gates and the detector rules that match the request’s direction and conditions. See How it works.
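
As a rough sketch of what a collector-side call might look like, the Python below builds a POST /v1/evaluate request and reads the four response fields shown in the diagram. The base URL, the shape of the request body, and the helper names (`build_evaluate_request`, `parse_verdict`, `Verdict`) are assumptions made for illustration only; the actual contract is described in the Evaluate API reference.

```python
import json
import urllib.request
from dataclasses import dataclass, field
from typing import Any, List, Optional

# Hypothetical base URL; replace it with your deployment's endpoint.
BASE_URL = "https://trustguard.example.com"


@dataclass
class Finding:
    """One gate or detector report, using the four fields named in the table."""
    source: str
    signal: str
    outcome: str
    evidence: Any = None


@dataclass
class Verdict:
    """The response envelope shown in the diagram."""
    status: str
    findings: List[Finding] = field(default_factory=list)
    transformed_payload: Optional[Any] = None
    trace_id: Optional[str] = None


def build_evaluate_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/evaluate request.

    No detectors are named here: the collector API key selects the policy.
    The body schema is an assumption; check the Evaluate API reference.
    """
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/evaluate",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_verdict(raw: bytes) -> Verdict:
    """Decode a response body into a Verdict, tolerating missing optional fields."""
    doc = json.loads(raw)
    findings = [
        Finding(
            source=item.get("source", ""),
            signal=item.get("signal", ""),
            outcome=item.get("outcome", ""),
            evidence=item.get("evidence"),
        )
        for item in (doc.get("findings") or [])
    ]
    return Verdict(
        status=doc["status"],
        findings=findings,
        transformed_payload=doc.get("transformed_payload"),
        trace_id=doc.get("trace_id"),
    )
```

To send the request, pass the object to `urllib.request.urlopen` with an explicit `timeout`, then hand the response bytes to `parse_verdict`. Keeping the build and parse steps separate from the network call makes both easy to unit test.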

What makes it safe to run inline

Traffic with no matching policy is unguarded: TrustGuard returns status: "allow" with no inspection. Attach a default (or per-consumer) policy on the collector after setup.

Never drops traffic. A detection is always HTTP 200 with the finding in the response body. Non-2xx is reserved for auth (401), bad requests (400), and system failures (500).
Fail-open or fail-closed — your choice. If a detector errors (e.g. an upstream model API is down), TrustGuard either drops that detector’s result and continues (fail-open) or returns 500 so you can hold traffic (fail-closed). The mode is configured per deployment (ask NeuralTrust if unsure). A structured block decision always applies.
Enforcement is the caller’s choice. The response status (allow / report / transform / block) is advisory. Your collector decides what to do with it.
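
The three points above can be sketched as a small collector-side decision function. This is a minimal illustration, not NeuralTrust's implementation: the `Action` names, the `decide` helper, the collector's own `fail_open` flag for timeouts and 500s, and the choice to forward traffic on `report` are all assumptions.

```python
from enum import Enum
from typing import Any, Optional, Tuple


class Action(Enum):
    FORWARD = "forward"          # pass the original payload through
    FORWARD_MASKED = "masked"    # pass the transformed payload instead
    REJECT = "reject"            # the collector refuses the request


def _on_failure(original: Any, fail_open: bool) -> Tuple[Action, Any]:
    """Apply the collector's own preference when no usable verdict exists."""
    return (Action.FORWARD, original) if fail_open else (Action.REJECT, None)


def decide(
    http_status: Optional[int],
    verdict: Optional[dict],
    original: Any,
    fail_open: bool = True,
) -> Tuple[Action, Any]:
    """Map a TrustGuard response to a collector action.

    http_status is None when the call timed out or the connection failed.
    """
    # 400 and 401 point to a malformed request or a bad API key, not a detection.
    if http_status in (400, 401):
        raise RuntimeError(f"TrustGuard call misconfigured: HTTP {http_status}")

    # Timeouts, 500s, and anything else without a verdict are system failures.
    if http_status != 200 or verdict is None:
        return _on_failure(original, fail_open)

    status = verdict.get("status")
    if status == "block":
        return Action.REJECT, None
    if status == "transform":
        masked = verdict.get("transformed_payload")
        # Masking was requested; never forward unmasked data by accident.
        if masked is None:
            return Action.REJECT, None
        return Action.FORWARD_MASKED, masked
    if status in ("allow", "report"):
        # "report" is treated as allow-and-log in this sketch.
        return Action.FORWARD, original

    # Unknown status values are handled like a failure.
    return _on_failure(original, fail_open)
```

Because the status is advisory, a collector could map it differently, for example rejecting on `report` in a high-sensitivity environment. The point of the sketch is that the mapping lives in the caller.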

Get started

Create a collector (TrustGate, SDK/REST, or another collector).
Create detectors from the catalog.
Create a policy that chains Input and Output detectors, and configure rules if needed.
Attach the policy to the collector you created (as the default and/or a per-consumer policy).
Test in the playground.
Verify traffic in Activity.

Where to go next

How it works

The guard pipeline, gates, and the findings model.

Core concepts

Detectors, policies, and collectors.

Policies

Gates, detector rules, and the Report / Enforce switch.

Detector catalog

Every built-in detection and its settings.

Evaluate API

The POST /v1/evaluate request and response contract.

Telemetry Alerts

Turn guard findings into prioritized alerts — prompt-injection spikes, leaked PII/secrets, toxicity bursts.
