This guide walks you end-to-end through a hybrid deployment on GKE: the Data Plane, TrustGate, and Firewall run in your cluster, while the Control Plane UI, API, and Scheduler run on NeuralTrust SaaS. If you want everything (including the UI) in your own cluster, see GCP self-hosted instead.
What you’ll end up with
| Component | Location | Replicas (default) |
|---|---|---|
| Data Plane API | Your GKE cluster | 2 |
| Data Plane worker | Your GKE cluster | 1 |
| Kafka Connect | Your GKE cluster | 1 |
| TrustGate admin / gateway / actions | Your GKE cluster | 2 each |
| Firewall gateway + 5 workers | Your GKE cluster | 2 + 5 |
| ClickHouse, Kafka, PostgreSQL, Redis | Your GKE cluster (or external) | 1 each |
| Control Plane API, UI, Scheduler | NeuralTrust SaaS | — |
Prerequisites
| Resource | Recommended |
|---|---|
| GKE version | 1.28+ |
| CPU pool machine type | e2-standard-8 (8 vCPU / 32 GiB) |
| Min CPU nodes | ≥ 4 across 3 zones (regional cluster). Drop to 3 if Firewall workers run on GPU nodes. |
| GPU pool (optional, for GPU Firewall) | n1-standard-4 + 1 × T4 — 5 nodes (one per default Firewall worker) |
| Storage | pd-balanced (default) or pd-ssd for ClickHouse |
| Sizing baseline | ~20.5 vCPU / 58.5 GiB requests / 80 GiB PVC (defaults with CPU Firewall) |
| DNS | Control over a base domain (e.g. platform.example.com) |
| Image pull | gcr-keys.json from NeuralTrust |
| NeuralTrust tenant | A SaaS Control Plane tenant (request one from NeuralTrust if you don’t have one) |
Step 1 — Provision the GKE cluster
--num-nodes 2 is per zone in a regional cluster, so you get 6 worker nodes across 3 zones. This is sufficient for the hybrid CPU pool with HA. To reduce cost, move Firewall workers to a GPU pool and drop the CPU pool to --num-nodes 1 (3 nodes) plus --enable-autoscaling --min-nodes 1 --max-nodes 3.
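The sizing above can be sketched as a cluster-creation command. This is a minimal sketch, not necessarily the exact command this guide intends: the cluster name and region are placeholders chosen for illustration, and only standard `gcloud container clusters create` flags are used. The commands are wrapped in functions so that nothing runs until you call them.

```shell
# Placeholders (assumptions for illustration): adjust to your project.
CLUSTER_NAME="neuraltrust-hybrid"
REGION="europe-west1"

# In a regional cluster --num-nodes applies per zone, so the pool size is
# nodes-per-zone times the number of zones (3 unless a second argument is given).
total_nodes() {
  echo $(( $1 * ${2:-3} ))
}

# Regional cluster with 2 x e2-standard-8 per zone (6 nodes across 3 zones).
create_cluster() {
  gcloud container clusters create "$CLUSTER_NAME" \
    --region "$REGION" \
    --machine-type e2-standard-8 \
    --num-nodes 2
}

# Lower-cost variant from the text: 1 node per zone with autoscaling 1 to 3.
create_cluster_small() {
  gcloud container clusters create "$CLUSTER_NAME" \
    --region "$REGION" \
    --machine-type e2-standard-8 \
    --num-nodes 1 \
    --enable-autoscaling --min-nodes 1 --max-nodes 3
}
```

`total_nodes 2` prints 6, matching the node count stated above. Call `create_cluster` (or `create_cluster_small`) once `gcloud` is authenticated against your project.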
Step 2 — Create the namespace and image pull secret
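A minimal sketch of this step, assuming the JSON key authenticates against a `gcr.io` registry and that the workloads live in a namespace called `neuraltrust`. Both values are assumptions; use the registry host and namespace that NeuralTrust gives you. The secret name `gcr-secret` and the file name `gcr-keys.json` come from this guide.

```shell
# Assumptions for illustration: namespace name and registry host.
NAMESPACE="neuraltrust"
REGISTRY_SERVER="gcr.io"

# Creates the namespace, then a docker-registry secret from the JSON key.
# "_json_key" is the fixed username Google registries expect for
# service-account key authentication.
create_pull_secret() {
  kubectl create namespace "$NAMESPACE"
  kubectl create secret docker-registry gcr-secret \
    --namespace "$NAMESPACE" \
    --docker-server="$REGISTRY_SERVER" \
    --docker-username=_json_key \
    --docker-password="$(cat gcr-keys.json)"
}
```

Run `create_pull_secret` from the directory that contains `gcr-keys.json`.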
Step 3 — Write your values overlay
Save as my-values.yaml:
External infrastructure (optional)
To use Cloud SQL, ClickHouse Cloud, or Confluent Cloud instead of the in-cluster services, swap in:
Step 4 — Install
Replace <VERSION> with a chart version from the release list.
Initial install takes 3–5 minutes. Wait for all pods to be ready:
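A hedged sketch of the install and the readiness wait. The chart reference is not given on this page, so `CHART` and `VERSION` are left for you to set from the NeuralTrust release list; the release name and namespace are assumptions. Only standard `helm` and `kubectl` flags are used.

```shell
# Assumptions for illustration: namespace and Helm release name.
NAMESPACE="neuraltrust"
RELEASE="neuraltrust"

# Installs the chart, or upgrades it if the release already exists.
# CHART and VERSION are deliberately not defaulted: the function aborts
# with a message if either is unset.
install_chart() {
  : "${CHART:?set CHART to the chart reference provided by NeuralTrust}"
  : "${VERSION:?set VERSION to a chart version from the release list}"
  helm upgrade --install "$RELEASE" "$CHART" \
    --version "$VERSION" \
    --namespace "$NAMESPACE" \
    --values my-values.yaml
}

# Blocks until every pod in the namespace reports Ready, or 10 minutes pass.
# Pods left behind by completed Jobs never become Ready, so this can time
# out even on a healthy install; check `kubectl get pods` in that case.
wait_for_pods() {
  kubectl wait --for=condition=Ready pods --all \
    --namespace "$NAMESPACE" --timeout=600s
}
```

Because `install_chart` uses `helm upgrade --install`, the same call also applies later changes to `my-values.yaml`.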
Step 5 — Wire up DNS and certificates
The chart creates GCE Ingresses for each component. Get the assigned IPs, then point a DNS record for each host at its Ingress IP:
| Host | Component |
|---|---|
| data-plane-api.platform.example.com | Data Plane API (needed for SaaS Control Plane to reach your Data Plane) |
| admin.platform.example.com | TrustGate admin |
| gateway.platform.example.com | TrustGate proxy |
| actions.platform.example.com | TrustGate actions |
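The Ingress IPs mentioned above can be listed with a `kubectl` JSONPath query. This is a sketch that assumes the release was installed into a namespace called `neuraltrust`; the Ingress names it prints depend on the chart.

```shell
# Assumption for illustration: namespace name.
NAMESPACE="neuraltrust"

# Prints one "<ingress-name><TAB><ip>" line per Ingress in the namespace.
# The IP column stays empty until GCP has finished provisioning the
# load balancer, which can take a few minutes.
list_ingress_ips() {
  kubectl get ingress --namespace "$NAMESPACE" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.loadBalancer.ingress[0].ip}{"\n"}{end}'
}
```

Call `list_ingress_ips`, then create an A record for each host in the table pointing at the IP of the matching Ingress.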
Add Managed Certificates
Once DNS resolves, add a ManagedCertificate and a FrontendConfig, then reference them from the ingress in my-values.yaml.
Run helm upgrade … to apply.
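A minimal sketch of the two GKE resources named above, written to a local manifest file. The resource names and the namespace are hypothetical, and only the Data Plane API host is shown; repeat the ManagedCertificate for each host that has its own Ingress.

```shell
# Writes a manifest with one ManagedCertificate and one FrontendConfig.
# Names and namespace are placeholders chosen for illustration.
cat > managed-cert.yaml <<'EOF'
apiVersion: networking.gke.io/v1
kind: ManagedCertificate
metadata:
  name: data-plane-api-cert
  namespace: neuraltrust
spec:
  domains:
    - data-plane-api.platform.example.com
---
apiVersion: networking.gke.io/v1beta1
kind: FrontendConfig
metadata:
  name: https-redirect
  namespace: neuraltrust
spec:
  redirectToHttps:
    enabled: true
EOF
```

Apply the file with `kubectl apply -f managed-cert.yaml`. A GKE Ingress picks these resources up through the annotations `networking.gke.io/managed-certificates: data-plane-api-cert` and `networking.gke.io/v1beta1.FrontendConfig: https-redirect`. Where those annotations go in `my-values.yaml` depends on the chart's values schema, which is not shown on this page.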
Step 6 — Enroll the Data Plane with NeuralTrust SaaS
This is the hybrid-specific step that connects your in-cluster Data Plane to the NeuralTrust SaaS Control Plane.
Get the Data Plane JWT secret
The chart auto-generated this secret on first install. Save it; you’ll paste it into the portal.
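One way to read an auto-generated value out of a Kubernetes Secret is sketched below. The Secret name and key that hold the Data Plane JWT are not given on this page, so the helper takes both as arguments; the namespace is an assumption.

```shell
# Assumption for illustration: namespace name.
NAMESPACE="neuraltrust"

# Prints the decoded value of one key in a Secret.
#   $1 = Secret name, $2 = key under .data
# Keys containing dots need escaping in JSONPath (for example "a\.b").
read_secret_key() {
  kubectl get secret "$1" --namespace "$NAMESPACE" \
    -o jsonpath="{.data.$2}" | base64 -d
}
```

Use `kubectl get secrets --namespace neuraltrust` to find the Secret the chart created, then call `read_secret_key <SECRET_NAME> <KEY>` with the names from your cluster.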
Open the NeuralTrust portal
Log in to your tenant at the URL provided by NeuralTrust (typically https://app.neuraltrust.ai).
Connect the Data Plane
Navigate to Team Settings → Advanced → Connect Data Plane (see Platform › Advanced for the full UI walkthrough).
Provide:
| Field | Value |
|---|---|
| Data Plane API URL | https://data-plane-api.platform.example.com |
| Data Plane JWT | (the secret you copied above) |
| Region | Match the region of your GKE cluster |
Step 7 — Send traffic through TrustGate
Point your AI applications at the TrustGate gateway URL (https://gateway.platform.example.com). All telemetry will flow through the local Data Plane, into ClickHouse, and surface in the NeuralTrust SaaS dashboards.
For TrustGate gateway / route / plugin configuration, see TrustGate › Getting started.
Verification
- Data Plane status: Connected
- TrustGate: receiving traffic, dashboards populating
- Firewall: classifying prompts (if enabled and routed)
Upgrading
Migration to self-hosted
To bring the Control Plane in-house later, flip one flag and upgrade. The self-hosted Control Plane then needs DNS records for api.platform.example.com, app.platform.example.com, and scheduler.platform.example.com. The rest of the stack keeps running unchanged; see Self-hosted on GKE for the full picture.
Troubleshooting
| Symptom | Likely cause | Fix |
|---|---|---|
| Portal says “Data Plane unreachable” | DNS not resolved, certificate provisioning incomplete, or firewall rule blocking the SaaS Control Plane | Confirm curl https://data-plane-api.<domain>/health succeeds from outside your VPC |
| ManagedCertificate stays Provisioning | DNS not yet pointing at the load balancer | Confirm dig <host> returns the LB IP, then wait up to 60 minutes |
| ImagePullBackOff | Missing or wrong gcr-secret | Recreate the secret with the JSON key from NeuralTrust |
| Data Plane API restarts | ClickHouse not reachable, Kafka not reachable | Check kubectl logs on the API; verify dependency pods are Running |
| TrustGate can’t reach the Firewall | Service name mismatch | Default is http://firewall:80 — verify NEURAL_TRUST_FIREWALL_URL in trustgate-secrets |
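The first two checks in the table can be scripted as follows. This is a sketch that assumes the example base domain and a namespace called `neuraltrust`; the `/health` path is the one quoted in the table.

```shell
# Assumptions for illustration: base domain and namespace name.
DOMAIN="platform.example.com"
NAMESPACE="neuraltrust"

# Prints the IP the Data Plane API host resolves to, then the HTTP status
# of its health endpoint. Run it from outside your VPC so it takes the
# same path as the SaaS Control Plane. curl -f exits non-zero on HTTP errors.
check_data_plane() {
  host="data-plane-api.$DOMAIN"
  dig +short "$host"
  curl -fsS -o /dev/null -w '%{http_code}\n' "https://$host/health"
}

# Lists ManagedCertificate resources with their provisioning status.
cert_status() {
  kubectl get managedcertificate --namespace "$NAMESPACE"
}
```

If `dig` prints nothing or an IP other than the load balancer's, fix DNS first; certificate provisioning cannot finish until the host resolves to the load balancer.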
Related guides
- Self-hosted deployment on GKE — Control Plane in your cluster
- GCP overview — cluster prerequisites and GCP-specific defaults
- Deployment models — hybrid vs self-hosted comparison
- Image catalog — what runs in hybrid mode
- Secrets management — auto-generation, External Secrets Operator
- Firewall deployment — GPU workers on GKE