The NeuralTrust Firewall provides in-cluster prompt and response safety for TrustGate and the Control Plane. It deploys as a lightweight gateway plus a pool of specialized workers. The same chart and image versions support both CPU and GPU inference.
The firewall subchart is off by default (neuraltrust-firewall.firewall.enabled: false). Enable it when you need TrustGate or the Control Plane to call into local safety classifiers instead of (or in addition to) NeuralTrust-hosted endpoints.
Architecture
| Worker | Purpose |
|---|---|
| toxicity | Toxicity detection on prompts and responses |
| toolguard | Tool-use guardrails for agent calls |
| prompt-jailbreak | Prompt-side jailbreak detection |
| prompt-moderation | Prompt moderation classifier |
| response-jailbreak | Response-side jailbreak detection |
CPU vs GPU
Two images share the same version tag (e.g. v2.6.0):
| Image | Use case |
|---|---|
| firewall-cpu | Default for both gateway and workers. No GPU scheduling required. |
| firewall-gpu | Workers with GPU inference. Requires nvidia.com/gpu, nodeSelector, tolerations, and hostIPC: true. |
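To make the GPU requirements in the table concrete, here is a minimal values sketch. It assumes that workerDefaults sits under neuraltrust-firewall.firewall and that the image is set through repository and tag sub-keys; neither detail is confirmed here, so check the chart's values.yaml. The node label and taint key are hypothetical placeholders for your own GPU node pool.

```yaml
# Sketch only: key nesting and image sub-keys are assumptions, not confirmed chart structure.
neuraltrust-firewall:
  firewall:
    enabled: true
    workerDefaults:
      image:
        repository: firewall-gpu      # assumed sub-key layout (repository/tag)
        tag: v2.6.0                   # same version tag as the CPU image
      resources:
        limits:
          nvidia.com/gpu: 1           # request one GPU per worker
      nodeSelector:
        example.com/gpu-pool: "true"  # hypothetical label; use your GPU node label
      tolerations:
        - key: nvidia.com/gpu         # hypothetical taint key; match your GPU node taint
          operator: Exists
          effect: NoSchedule
      hostIPC: true
    # gateway image is intentionally left at the chart default (firewall-cpu)
```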
Enable the Firewall
- CPU (default)
- GPU workers
Minimal config: workers run on CPU using the chart defaults. Use this when GPUs aren’t available or for lower-volume workloads. Latency is higher than on GPU, but no specialized node pool is needed.
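A minimal CPU configuration can be sketched as follows. It simply flips the neuraltrust-firewall.firewall.enabled value mentioned above and leaves everything else at the chart defaults.

```yaml
# Enable the firewall subchart; workers and gateway stay on the default firewall-cpu image.
neuraltrust-firewall:
  firewall:
    enabled: true
```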
The gateway uses firewall-cpu even when workers are on GPU; don’t override gateway.image to a GPU image.
TrustGate integration
TrustGate calls the firewall via two values stored in trustgate-secrets.
With global.autoGenerateSecrets: true (the default), NEURAL_TRUST_FIREWALL_SECRET_KEY is automatically aligned with the firewall’s JWT. With pre-generated secrets, you must set both to matching values yourself.
After changing firewall secrets, restart TrustGate to pick up the new values.
Per-worker overrides
workerDefaults applies to every worker. To override a single worker (e.g. give toxicity more memory), set values under workers.<name>.
The keys under workers.* mirror those available under workerDefaults, including image, resources, nodeSelector, tolerations, and hostIPC.
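As an illustration of the override pattern described above, the sketch below gives the toxicity worker more memory than the shared default. The nesting under neuraltrust-firewall.firewall is an assumption, and the memory figures are illustrative rather than recommended sizes.

```yaml
# Sketch only: nesting is assumed; memory values are illustrative.
neuraltrust-firewall:
  firewall:
    workerDefaults:
      resources:
        requests:
          memory: 2Gi        # applies to every worker
    workers:
      toxicity:
        resources:
          requests:
            memory: 4Gi      # overrides the default for this worker only
          limits:
            memory: 8Gi
```

Workers not listed under workers keep the workerDefaults settings.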
OpenShift notes
GPU workers on OpenShift typically need a permissive SCC for hostIPC: true and a node pool taint that matches your MachineSet.
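One possible shape for such an SCC is sketched below. It allows host IPC and keeps the other host-level permissions off. The SCC name, namespace, and service account are hypothetical, and the remaining strategy fields should be adapted to your cluster's security policy.

```yaml
# Sketch only: names are hypothetical; adjust strategies and volumes to your policy.
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: firewall-gpu-hostipc            # hypothetical SCC name
allowHostIPC: true                      # needed for hostIPC: true on GPU workers
allowHostDirVolumePlugin: false
allowHostNetwork: false
allowHostPID: false
allowHostPorts: false
allowPrivilegedContainer: false
allowPrivilegeEscalation: false
readOnlyRootFilesystem: false
priority: null
allowedCapabilities: []
defaultAddCapabilities: []
requiredDropCapabilities:
  - ALL
runAsUser:
  type: MustRunAsRange
seLinuxContext:
  type: MustRunAs
fsGroup:
  type: MustRunAs
supplementalGroups:
  type: RunAsAny
volumes:
  - configMap
  - emptyDir
  - projected
  - secret
  - persistentVolumeClaim
users:
  # hypothetical namespace and service account for the firewall workers
  - system:serviceaccount:neuraltrust:firewall-worker
```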
Verification
Troubleshooting
Gateway can’t reach a worker
Confirm that the worker's resources exist and are running (firewall-worker-<name>).
GPU worker stuck pending
- No node has the requested nodeSelector label: check kubectl get nodes --show-labels.
- Required toleration missing: confirm the GPU taint key matches.
- Cluster is out of GPU capacity: scale up the GPU node pool.
CUDA MPS errors
If you see CUDA MPS errors at startup, ensure both cudaMpsActiveThreadPercentage and cudaMpsPinnedDeviceMemLimit are set, or remove both. Setting only one causes the worker to start without a usable MPS configuration.
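A paired MPS configuration might look like the sketch below. Placing the two keys under workerDefaults is an assumption, and both values are illustrative. The memory-limit format is assumed to follow NVIDIA's CUDA_MPS_PINNED_DEVICE_MEM_LIMIT convention (device index, then limit), which this chart may or may not pass through unchanged.

```yaml
# Sketch only: key location and value formats are assumptions; set both keys or neither.
neuraltrust-firewall:
  firewall:
    workerDefaults:
      cudaMpsActiveThreadPercentage: 50       # illustrative share of GPU threads per worker
      cudaMpsPinnedDeviceMemLimit: "0=4G"     # illustrative; assumed "<device>=<limit>" format
```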
TrustGate not calling the firewall
Set the firewall values under trustgate.global.env.* and run helm upgrade, then restart the TrustGate deployments.
Related guides
- Install on Kubernetes: base install workflow
- Secrets management: trustgate-secrets and firewall-secrets lifecycle
- OpenShift: SCC and node-pool considerations for GPU workers
- Configuration scenarios: values-dataplane-gpu.yaml.example and other ready-made topologies