Functional testing evaluates whether your AI model produces correct, relevant, and high-quality responses. Unlike threat detection, which focuses on security vulnerabilities, functional testing ensures your model performs its intended tasks accurately.

What is Functional Testing?

Functional testing validates that your AI model:
  • Answers questions correctly based on provided context or knowledge
  • Maintains consistency across similar queries
  • Provides relevant responses that address user intent
  • Meets quality standards for your specific use case

Test Generation Methods

trusttest can generate functional tests in three ways:
  • From RAG: Generate tests directly from your knowledge base, so questions reflect your actual content
  • From Dataset: Use a curated set of question-and-answer pairs you provide
  • From Prompt: Generate tests dynamically from a prompt describing your use case

When to Use Functional Testing

Recommended approach by use case:
  • RAG applications: From RAG - tests against your actual knowledge base
  • Customer support bots: From Dataset - curated Q&A pairs
  • General assistants: From Prompt - dynamic test generation
  • Domain-specific models: Combination of all approaches

Evaluation Methods

Functional tests can be evaluated using:
  • LLM-as-Judge: Use an LLM to assess response quality
  • Heuristics: Use BLEU scores, exact matching, or regex patterns
  • Custom evaluators: Define your own evaluation logic (see the sketch below)
Learn more about evaluation →
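
trusttest's own evaluator interface is covered in the evaluation guide linked above. As a generic illustration of the heuristic approach, here is a minimal evaluator in plain Python; evaluate_response, HeuristicResult, and the sample pattern are illustrative names, not trusttest APIs:

import re
from dataclasses import dataclass
from typing import Optional

@dataclass
class HeuristicResult:
    passed: bool
    reason: str

def evaluate_response(
    response: str,
    expected: Optional[str] = None,
    pattern: Optional[str] = None,
) -> HeuristicResult:
    """Pass if the response exactly matches the expected answer
    or matches a regex pattern, whichever is provided."""
    if expected is not None and response.strip() == expected.strip():
        return HeuristicResult(True, "exact match")
    if pattern is not None and re.search(pattern, response):
        return HeuristicResult(True, f"matched pattern {pattern!r}")
    return HeuristicResult(False, "no heuristic matched")

# Example: require that a refund answer mentions a 30-day window
print(evaluate_response(
    "Refunds are accepted within 30 days of purchase.",
    pattern=r"\b30[- ]day",
))

A BLEU-based heuristic would have the same shape: score the response against a reference answer and pass above a chosen threshold.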

Quick Example

from trusttest.catalog import FunctionalScenario
from trusttest.targets.http import HttpTarget, PayloadConfig

target = HttpTarget(
    url="https://your-model-endpoint.com/chat",
    headers={"Content-Type": "application/json"},
    payload_config=PayloadConfig(
        format={"messages": [{"role": "user", "content": "{{ test }}"}]},
        message_regex="{{ test }}",
    ),
)

# Generate functional tests from your knowledge base
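# ("your_knowledge_base" is a placeholder for a knowledge base object you have already created)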
scenario = FunctionalScenario(
    target=target,
    knowledge_base=your_knowledge_base,
    num_tests=50,
)

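# Build the test set, run the evaluation, and display a summary of the results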
test_set = scenario.probe.get_test_set()
results = scenario.eval.evaluate(test_set)
results.display_summary()
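
As a rough illustration of what the HttpTarget configuration above amounts to, and assuming the {{ test }} substitution works the way the field names suggest, each generated test would be posted to the endpoint roughly like this (the requests call below is illustrative; trusttest performs this internally):

import requests

# Illustrative only: approximately the request trusttest would send
# for one generated test prompt, after "{{ test }}" substitution.
response = requests.post(
    "https://your-model-endpoint.com/chat",
    headers={"Content-Type": "application/json"},
    json={"messages": [{"role": "user", "content": "What is your refund policy?"}]},
)

The model's reply to each such request is what the configured evaluator scores.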