Functional testing evaluates whether your AI model produces correct, relevant, and high-quality responses. Unlike threat detection, which focuses on security vulnerabilities, functional testing ensures your model performs its intended tasks accurately.
What is Functional Testing?
Functional testing validates that your AI model:
- Answers questions correctly based on provided context or knowledge
- Maintains consistency across similar queries
- Provides relevant responses that address user intent
- Meets quality standards for your specific use case
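As one illustration of the consistency criterion, the sketch below sends several paraphrases of the same question to a model and reports how many answers agree with the most common one. It is a minimal example under stated assumptions: `consistency_rate`, `normalize`, and `fake_model` are hypothetical names invented here, and the model is assumed to be any callable that maps a question string to a response string.

```python
from collections import Counter
from typing import Callable, Iterable


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def consistency_rate(model: Callable[[str], str], paraphrases: Iterable[str]) -> float:
    """Fraction of paraphrased queries whose answer equals the most common answer."""
    answers = [normalize(model(q)) for q in paraphrases]
    if not answers:
        raise ValueError("at least one query is required")
    _, top_count = Counter(answers).most_common(1)[0]
    return top_count / len(answers)


# Hypothetical stand-in for a real model call.
def fake_model(question: str) -> str:
    return "Paris" if "capital" in question.lower() else "I don't know"


queries = [
    "What is the capital of France?",
    "Which city is France's capital?",
    "Name the French capital.",
]
rate = consistency_rate(fake_model, queries)
```

Exact agreement after normalization is a strict measure. For free-form answers, a semantic comparison may be more appropriate than string equality.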
Test Generation Methods
- From RAG: Generate tests from your knowledge base
- From Dataset: Use existing Q&A datasets
- From Prompt: Generate tests dynamically with LLMs
When to Use Functional Testing
| Use Case | Recommended Approach |
|---|---|
| RAG applications | From RAG - tests against your actual knowledge base |
| Customer support bots | From Dataset - curated Q&A pairs |
| General assistants | From Prompt - dynamic test generation |
| Domain-specific models | Combination of all approaches |
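To make the dataset-driven row concrete, here is a rough sketch of running curated Q&A pairs against a support bot and computing a pass rate. Everything in it is an assumption for illustration: `QAPair`, `run_dataset_tests`, and `support_bot` are hypothetical names, the bot is stubbed with canned answers, and the pass criterion (the expected phrase appears in the response) is just one simple choice.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class QAPair:
    question: str
    expected: str  # phrase that a correct response should contain


def run_dataset_tests(model: Callable[[str], str], dataset: List[QAPair]) -> float:
    """Return the fraction of pairs whose expected phrase appears in the response."""
    if not dataset:
        raise ValueError("dataset must not be empty")
    passed = sum(
        1
        for pair in dataset
        if pair.expected.lower() in model(pair.question).lower()
    )
    return passed / len(dataset)


# Hypothetical support bot, stubbed with canned answers.
def support_bot(question: str) -> str:
    canned = {
        "How do I reset my password?": "Use the 'Forgot password' link on the login page.",
        "What are your support hours?": "We are available 9am to 5pm on weekdays.",
    }
    return canned.get(question, "Sorry, I am not sure.")


dataset = [
    QAPair("How do I reset my password?", "Forgot password"),
    QAPair("What are your support hours?", "9am to 5pm"),
    QAPair("Do you offer refunds?", "30 days"),
]
pass_rate = run_dataset_tests(support_bot, dataset)
```

In this stubbed run the bot has no answer for the refund question, so two of the three pairs pass.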
Evaluation Methods
Functional tests can be evaluated using:
- LLM-as-Judge: Use an LLM to assess response quality
- Heuristics: Use BLEU, exact match, regex patterns
- Custom evaluators: Define your own evaluation logic
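The heuristic and custom options can be sketched in plain Python. The example below is an assumption-laden illustration, not the platform's API: the `Evaluator` type and the `exact_match`, `regex_match`, and `max_words` helpers are hypothetical names. BLEU is left out because it is usually computed with a dedicated library, and an LLM-as-Judge evaluator would follow the same callable shape while delegating the decision to a model.

```python
import re
from typing import Callable

# An evaluator takes (response, reference) and returns True if the test passes.
Evaluator = Callable[[str, str], bool]


def exact_match(response: str, reference: str) -> bool:
    """Heuristic: pass when the texts are equal, ignoring case and outer whitespace."""
    return response.strip().lower() == reference.strip().lower()


def regex_match(pattern: str) -> Evaluator:
    """Heuristic: build an evaluator that passes when the pattern occurs in the response."""
    compiled = re.compile(pattern, re.IGNORECASE)

    def evaluate(response: str, reference: str) -> bool:
        # The reference is unused; the pattern itself encodes the expectation.
        return compiled.search(response) is not None

    return evaluate


def max_words(limit: int) -> Evaluator:
    """Custom evaluator: pass when the response has at most `limit` words."""

    def evaluate(response: str, reference: str) -> bool:
        return len(response.split()) <= limit

    return evaluate


response = "Your order ships within 3 business days."
results = {
    "exact": exact_match(response, "your order ships within 3 business days."),
    "regex": regex_match(r"\b\d+ business days\b")(response, ""),
    "brevity": max_words(5)(response, ""),
}
```

Keeping every evaluator behind the same signature makes it easy to mix heuristics, custom checks, and a judge model in a single test run.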