What this covers

Agents built and published in Microsoft Foundry Agent Service. The primary, production-grade pattern is to put a Gateway in front of a published Agent Application, so every invocation of your agent — including tool-call arguments and outputs that appear in the response — flows through TrustGate before hitting Foundry.
  • Surface: Gateway.
  • Who is this for: applications invoking a Foundry Agent Application over the Responses API protocol (OpenAI-compatible). Works from Python, TypeScript, .NET, and any HTTP client.
Three related patterns are covered further down:
  • Pattern A — Protect an Agent Application (recommended, production).
  • Pattern B — Protect the Foundry project endpoint for pre-publish and orchestrator-owned agents.
  • Pattern C — Protect agent tool calls (OpenAPI tools, self-hosted MCP servers, Azure Functions tools).

Foundry concepts, quickly

An agent is authored inside a Foundry project and is immutable per agent version. To expose it to production consumers, you publish the agent version, which creates an Agent Application — a separate Azure Resource Manager resource with:
  • A stable endpoint URL (keeps working across new agent versions).
  • Its own Entra agent identity (distinct from the project’s identity).
  • Its own Azure RBAC scope — invokers need the Azure AI User role (or custom role with applications/invoke/action) on the application resource.
  • One active deployment that routes 100% of traffic to a specific agent version.
An Agent Application exposes the Responses protocol by default. Publishing to Microsoft 365 Copilot or Teams switches it to the Activity Protocol used by Azure Bot Service. Only one protocol is active at a time.
Foundry’s publishing model is evolving. This guide follows the Agent Applications experience, which is the current way to expose a stable, production endpoint. The URL shape may change once the newer publishing model reaches general availability; the TrustGate integration shape (Gateway on the Foundry endpoint, Entra upstream auth) stays the same.
The Agent Application exposes an OpenAI-compatible Responses endpoint at:
https://<foundry-resource>.services.ai.azure.com/api/projects/<project>/applications/<app>/protocols/openai
You create a Gateway that routes to this base URL, then repoint your client at the Gateway. Your client keeps using the standard openai SDK — the only changes are base_url, api_key, and that the Gateway, not the client, handles Entra authentication to Foundry.
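For reference, the base URL can be assembled from the three names you note at publish time. A minimal sketch (the example resource, project, and application names are illustrative, not real):

```python
def foundry_app_base_url(resource: str, project: str, app: str) -> str:
    """Build the Agent Application's OpenAI-compatible Responses base URL."""
    return (
        f"https://{resource}.services.ai.azure.com"
        f"/api/projects/{project}/applications/{app}/protocols/openai"
    )

# Hypothetical names, for illustration only:
base_url = foundry_app_base_url("contoso-ai", "support", "faq-agent")
```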

Architecture

your app ──► TrustGate Gateway ──► Foundry Agent Application
            (TrustGate API key)    (Entra bearer acquired by
             inspects prompt,       the Gateway's managed
             tool calls, response)  identity or app credential)

Step-by-step setup

Foundry only accepts Microsoft Entra bearer tokens — the Gateway handles this by acquiring and caching tokens on the Foundry provider Integration, where you configure the Entra identity once. The client keeps using a standard TrustGate Gateway API key; the Entra swap happens entirely server-side.

Step 1: Publish the Foundry Agent Application

In Microsoft Foundry, publish the agent version to an Agent Application. Note the resource’s account name, project name, and application name — you’ll wire them into the Foundry Integration in Step 3.

Step 2: Set up an Entra identity for the Gateway

Pick one identity type for the Gateway to authenticate to Foundry with:
  • User-assigned managed identity — simplest if the TrustGate data plane runs in Azure.
  • Entra application registration + client secret / certificate — works everywhere.
  • Workload identity federation — for data planes running outside Azure (AWS, GCP, on-prem); no long-lived secret.
Grant that identity the Azure AI User role (or a custom role with Microsoft.CognitiveServices/accounts/projects/applications/invoke/action) on the Agent Application resource only — not the whole project or subscription.

Step 3: Register Azure AI Foundry as an Integration

Integrations → Add Integration → Azure AI Foundry:
  • Upstream base URL: https://<foundry-resource>.services.ai.azure.com/api/projects/<project>/applications/<app>/protocols/openai.
  • Upstream auth: Azure Entra ID, scope https://ai.azure.com/.default, using the identity from Step 2. The Gateway acquires, caches, and refreshes tokens transparently.
This is where all the Entra and Foundry-addressing details live — the Route just references this Integration.

Step 4: Create a Gateway Integration

Integrations → Add Integration → Gateway. Pick Serverless or Dedicated, name it, save, and copy the Endpoint from Gateway → Overview. A default Route for the Foundry Integration from Step 3 is created automatically, exposing the OpenAI Responses path that maps to the Foundry upstream. Add Use Cases or Tags under Gateway → Routes for policy scoping if you want; no manual Route creation is needed.

Step 5: Issue a Gateway API key

API Keys tab on the Gateway Integration. This is the only credential the client ever sees — no Entra on the client.

Step 6: Swap base URL and api_key in your client

See the snippet below. The client goes from DefaultAzureCredential + Entra bearer to a standard TrustGate API key.

Step 7: Verify

Call the Gateway with a simple prompt and confirm a 200. In the Azure portal, open the Agent Application’s activity log and confirm the Gateway’s Entra identity is the caller — not individual end users. In Runtime → Explorer, confirm the request, detectors, tokens used, and any tool-call items appear.

Client code

Before — direct call to Foundry (from the Azure docs):
from openai import OpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

BASE_URL = (
    "https://<foundry-resource>.services.ai.azure.com"
    "/api/projects/<project>/applications/<app>/protocols/openai"
)

client = OpenAI(
    api_key=get_bearer_token_provider(
        DefaultAzureCredential(), "https://ai.azure.com/.default"
    ),
    base_url=BASE_URL,
    default_query={"api-version": "2025-11-15-preview"},
)

response = client.responses.create(input="Write a haiku")
After — call through TrustGate. No Entra credentials on the client; TrustGate acquires them upstream.
from openai import OpenAI

client = OpenAI(
    api_key="<trustgate-gateway-api-key>",
    base_url="https://<gateway-host>.neuraltrust.ai/<route-prefix>",
    default_query={"api-version": "2025-11-15-preview"},
)

response = client.responses.create(input="Write a haiku")
TypeScript is analogous — replace the openai client’s baseURL and apiKey.

What TrustGate inspects

  • The user input (input field on POST /responses).
  • The assistant’s final output text.
  • Tool-call requests and tool outputs that appear as items on the response object — including OpenAPI-tool and MCP-tool invocations the agent performs server-side.
  • Usage telemetry and response metadata used by analytics and audit.
TrustGate does not see the Foundry model’s internal reasoning or any agent steps that don’t appear in the Responses payload.
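To picture the surface TrustGate inspects, here is a minimal sketch that filters tool-call items out of a Responses `output` array. The item type names (`function_call`, `function_call_output`, `mcp_call`) follow the OpenAI Responses item taxonomy; treat them as assumptions and check them against the payloads your agent actually emits:

```python
def extract_tool_items(output_items: list) -> list:
    """Return only the tool-call items from a Responses `output` array (sketch)."""
    tool_types = {"function_call", "function_call_output", "mcp_call"}
    return [item for item in output_items if item.get("type") in tool_types]

# Illustrative payload shapes, not captured from a real response:
items = [
    {"type": "function_call", "name": "lookup_order", "arguments": '{"id": 42}'},
    {"type": "message", "content": "Your order has shipped."},
]
tool_items = extract_tool_items(items)  # only the function_call item survives
```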

Stateless Responses caveat

Agent Applications currently support only the stateless POST /responses call; the /conversations API is not available. Your client is responsible for storing conversation history and replaying it in the input array on every turn. This is a Foundry constraint, not a TrustGate one; the Gateway transparently forwards whatever your client sends.
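The client-side bookkeeping this implies can be sketched as follows. The role/content dict shape follows the Responses API's message input items; the helper names are our own:

```python
def next_input(history: list, user_message: str) -> list:
    """Replay the full prior transcript plus the new user turn as the stateless `input` array."""
    return history + [{"role": "user", "content": user_message}]

def record_turn(history: list, user_message: str, assistant_text: str) -> None:
    """Persist both sides of a completed turn so the next call can replay them."""
    history.append({"role": "user", "content": user_message})
    history.append({"role": "assistant", "content": assistant_text})

history = []
payload = next_input(history, "Write a haiku")            # turn 1: just the new message
record_turn(history, "Write a haiku", "<assistant reply>")
payload2 = next_input(history, "Now in French")           # turn 2: replays turn 1 as well
```

Each turn then becomes `client.responses.create(input=next_input(history, user_message))`, so the whole transcript travels with every request.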

Pattern B — Protect the Foundry project endpoint

Use this when you’re still developing the agent (no Agent Application yet) or when your application owns the orchestration and calls the model directly via the project endpoint — for example, with FoundryChatClient, FoundryAgent, or raw Responses calls with agent_reference. Upstream URL for the Gateway:
https://<foundry-resource>.services.ai.azure.com/api/projects/<project>
Everything else is identical to Pattern A: Entra upstream auth with scope https://ai.azure.com/.default, TrustGate API key on the client, stateless or stateful Responses calls as supported by the project endpoint.
Project-endpoint calls are more powerful than Agent Application calls — /conversations, /files, and /vector_stores are available. The tradeoff is that every caller with project access can see every other caller’s conversations, so treat this as a pre-production or single-tenant integration path. For external consumers, prefer Pattern A.

Pattern C — Protect agent tool calls

Foundry agents can call tools you host: OpenAPI endpoints, self-hosted MCP servers, or Azure Functions. You can protect each of those inbound tool invocations by putting a Gateway or API engine in front of the tool, independent of how the agent itself is invoked.
  • OpenAPI tools — the agent issues plain HTTPS calls to the endpoint you registered in the tool catalog. Point the Foundry tool at https://<gateway-host>.neuraltrust.ai/<route> instead of the direct backend. Standard Gateway route, no special handling needed.
  • Self-hosted MCP servers — put the MCP server behind a Gateway route. Foundry accepts any HTTPS URL as server_url, so swapping it for a TrustGate URL works; confirm the MCP streaming (SSE) profile is enabled on the route.
  • Azure Functions tools — secure the Function’s outbound calls (to LLM providers, internal APIs, etc.) the same way you would any other service, by routing those calls through a Gateway or Actions API check from inside the Function.
Pattern C composes with Pattern A — you can protect both the agent’s inbound invocation and its outbound tool calls independently, which is useful when the tools talk to sensitive internal systems.

Authentication deep-dive

Foundry’s Responses endpoints only accept Microsoft Entra bearer tokens. API key authentication is not supported for Agent Applications or the project endpoint. This is the key difference versus OpenAI, Anthropic, or Bedrock integrations, where TrustGate can forward the client’s provider API key. The TrustGate Gateway handles this by swapping authentication at the edge:
  1. Client → Gateway: the client sends a TrustGate Gateway API key in the Authorization header. This key is scoped to your route and governs which TrustGate policies apply.
  2. Gateway → Foundry: the Gateway’s configured Entra identity acquires a bearer token for the scope https://ai.azure.com/.default, caches it, and attaches it to the upstream request. The Gateway refreshes tokens transparently before they expire.
Pick the Entra identity type that fits how the data plane runs:
  • User-assigned managed identity — simplest when the data plane runs in Azure (AKS with workload identity, Container Apps, App Service). No secrets to rotate.
  • App registration with client credentials — works everywhere; store the client secret in the TrustGate secret store. Rotate on your own schedule.
  • Workload identity federation — for data planes running outside Azure (AWS, GCP, on-prem). Federate a trust relationship with Entra so no long-lived secret is needed.
Whichever identity you pick, grant it the Azure AI User role (or a minimal custom role with Microsoft.CognitiveServices/accounts/projects/applications/invoke/action) on the Agent Application resource only — not the whole Foundry account or subscription.

Policies to apply

Foundry’s Responses payload surfaces the user input, the assistant’s final output, and every tool-call request and tool output the agent issued server-side — so Gateway policies can act on all of them on a single hop. Read Policies & Enforcement for the Where / When / Then authoring model and precedence. Scope with Gateways = <your-gateway> and, if you front multiple Agent Applications from the same Gateway, narrow further with Routes.

Block prompt injection on user input

  • Where: Gateway + filter Gateways = <your-gateway>
  • When: Input · Triggers · Prompt Injection, Jailbreak
  • Then: Block

Mask PII on input and final response

  • Where: Gateway + filter Gateways = <your-gateway>
  • When: Input or Output · Triggers · Email Address, Phone Number, Credit Card, Social Security Number
  • Then: Mask
Agent Applications often ground on internal Foundry data (vector stores, file search, OpenAPI tools). Masking on the response prevents that data from leaking to the caller.

Tool-call argument inspection

  • Where: Gateway + filter Gateways = <your-gateway>
  • When: Tool Call · Triggers · Suspicious Arguments, Prompt Injection
  • Then: Block
Foundry agents dispatch tools server-side, but the request to dispatch them appears in the Responses payload. This policy blocks unsafe tool invocations before Foundry executes them.

Block credential leakage in the response

  • Where: Gateway + filter Gateways = <your-gateway>
  • When: Output · Triggers · API Key / Secret
  • Then: Block

Moderate the final response

  • Where: Gateway + filter Routes = <customer-facing-routes>
  • When: Output · Triggers · Toxicity, Harmful Content
  • Then: Block

Keyword / topic block for scope

  • Where: Gateway + filter Gateways = <your-gateway>
  • When: Input · Triggers · Keyword Match = <your-list>
  • Then: Block
Useful on domain-specific Agent Applications (for example, an HR-only agent that must never discuss compensation specifics). Remember the pattern: start each policy in Log, review hits in Runtime → Logs, then promote to Mask / Block once tuned.

Limitations

  • Activity Protocol (Teams / M365 Copilot) is not interceptable.
    Why: When an Agent Application is published to M365 Copilot or Teams, the wire protocol switches to the Azure Bot Service Activity Protocol, and user traffic flows Teams ↔ Microsoft cloud ↔ Bot Service ↔ Foundry, entirely inside Microsoft.
    What you can do: Use policies inside the agent's instructions, Microsoft Purview, or Content Safety for that channel. TrustGate cannot sit on the wire.
  • Microsoft-hosted tools (Agent 365, Microsoft Learn MCP, Azure Logic Apps).
    Why: These run inside Microsoft's network, and the agent calls them over internal routes.
    What you can do: Not covered by TrustGate; covered by Microsoft's own controls.
  • Stateless Responses only on Agent Applications.
    Why: Foundry doesn't yet enforce end-user isolation on managed conversations for published agents.
    What you can do: Store conversation history on your client. Not a TrustGate limitation; it lifts automatically once Foundry supports managed conversations.
  • Streaming responses require stream-aware routes.
    Why: Foundry's Responses streaming uses SSE; the Gateway needs the streaming route profile enabled to inspect chunks in order.
    What you can do: Enable the streaming profile on the route. Non-streaming calls work out of the box.
  • The Agent Application publishing model is evolving.
    Why: The URL path shape may change as Foundry migrates to its newer publishing experience.
    What you can do: Keep upstream URLs in configuration, not code. The Gateway's upstream URL and auth model stay stable.