Create tests
Overview
In TrustTest, test sets are collections of input-output pairs used to evaluate the performance and reliability of AI models. These test sets serve as benchmarks to systematically assess how well models handle various scenarios and requirements.
Key Areas
Predefined Test Sets
TrustTest comes with a variety of built-in test sets designed for common evaluation scenarios:
Off-topic use cases.
Data leakage.
Responsibility.
Automatic Test Generation
TrustTest can generate test sets automatically in two ways:
From a knowledge base.
Through LLM-assisted test creation.
Custom Test Sets
Users can create their own test sets by:
Defining specific input-output pairs.
Importing existing datasets.
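As an illustration of the two custom routes above, here is a minimal sketch in plain Python. It does not use the TrustTest API: `ExamplePair`, `load_test_set`, `pass_rate`, `toy_model`, the CSV column names, and the substring-match check are all hypothetical names and choices made for this example only.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class ExamplePair:
    """One input-output pair: a question and the text we expect in the answer."""
    question: str
    expected: str


def load_test_set(csv_text: str) -> list[ExamplePair]:
    """Import an existing dataset with 'question' and 'expected' columns (assumed layout)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        ExamplePair(row["question"].strip(), row["expected"].strip())
        for row in reader
    ]


def pass_rate(test_set, model) -> float:
    """Fraction of pairs whose expected text appears in the model's answer."""
    if not test_set:
        return 0.0
    passed = sum(
        pair.expected.lower() in model(pair.question).lower()
        for pair in test_set
    )
    return passed / len(test_set)


def toy_model(question: str) -> str:
    """Stand-in for the application under test (hypothetical)."""
    if "France" in question:
        return "The capital of France is Paris."
    return "I don't know."


# Route 1: define specific input-output pairs by hand.
manual = [ExamplePair("What is the capital of France?", "Paris")]

# Route 2: import an existing dataset (inlined here as CSV text).
imported = load_test_set("question,expected\nWhat is 2 + 2?,4\n")

test_set = manual + imported
print(pass_rate(test_set, toy_model))  # prints 0.5
```

The substring check stands in for a real evaluator. In practice the comparison would more likely be done by a heuristic or an LLM judge.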
Why It Matters
Comprehensive Evaluation
Test sets provide a structured way to evaluate models across different scenarios and requirements.
Reproducible Testing
Reusing the same predefined or automatically generated test set keeps evaluation consistent across runs and environments.
Efficient Testing
Automated test generation saves time and resources while maintaining test quality and coverage.
Customizable Assessment
The ability to create and modify test sets allows for tailored evaluation based on specific needs and requirements.
Quality Assurance
Well-designed test sets help identify model weaknesses and areas for improvement, leading to better overall performance.