Overview
In TrustTest, evaluators are specialized components designed to assess whether an AI model's response complies with a specific set of criteria. They provide a systematic way to measure various aspects of model outputs against predefined criteria, ensuring reliable and consistent evaluation across different use cases.
Key Areas
Heuristic Evaluators
These evaluators use rule-based approaches and predefined metrics to assess responses (see the sketch after this list). They include:
- Language-based evaluation (checks whether the response is in the correct language)
- Exact matching and pattern recognition
- BLEU score for text similarity
- Regular expression pattern matching
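As a concrete illustration of the heuristic approach, here is a minimal sketch of a regex-based evaluator that scores a response on whether it matches a pattern and returns an explanation alongside the score. The names used (RegexEvaluator, EvaluationResult, evaluate) are hypothetical stand-ins for illustration, not TrustTest's actual API.

```python
import re
from dataclasses import dataclass


@dataclass
class EvaluationResult:
    """Hypothetical result container: a numeric score plus a human-readable explanation."""
    score: float
    explanation: str


class RegexEvaluator:
    """Illustrative heuristic evaluator: passes if the response matches a given pattern."""

    def __init__(self, pattern: str):
        self.pattern = re.compile(pattern)

    def evaluate(self, response: str) -> EvaluationResult:
        matched = self.pattern.search(response) is not None
        return EvaluationResult(
            score=1.0 if matched else 0.0,
            explanation=f"Pattern {self.pattern.pattern!r} "
                        f"{'found' if matched else 'not found'} in response.",
        )


# Usage: check that a response contains an ISO-formatted date.
result = RegexEvaluator(r"\d{4}-\d{2}-\d{2}").evaluate("The release is planned for 2024-06-01.")
print(result.score, result.explanation)
```

Exact matching, BLEU scoring, and language checks follow the same shape: a deterministic rule produces a score and an explanation, with no model call involved.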
LLM-based Evaluators
These evaluators leverage language models to perform more nuanced assessments (a minimal sketch follows this list):
- Response correctness
- Response completeness
- Tone and style analysis
- URL correctness validation
- Custom evaluation criteria
- True/false assessment
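The sketch below illustrates the LLM-as-judge pattern behind these evaluators: a judge model is prompted with the question, the answer, and the evaluation criteria, and returns a score with a justification. All names here (LLMJudgeEvaluator, JUDGE_PROMPT, the llm callable) are hypothetical and chosen only to convey the idea; they are not TrustTest's actual interface.

```python
from typing import Callable

JUDGE_PROMPT = """You are grading an AI assistant's answer.
Question: {question}
Answer: {answer}
Criteria: {criteria}
Reply with a score from 0 to 1 and a one-sentence justification, separated by '|'."""


class LLMJudgeEvaluator:
    """Illustrative LLM-based evaluator: delegates scoring to a judge model."""

    def __init__(self, llm: Callable[[str], str], criteria: str):
        self.llm = llm          # any function that takes a prompt string and returns text
        self.criteria = criteria

    def evaluate(self, question: str, answer: str) -> tuple[float, str]:
        prompt = JUDGE_PROMPT.format(question=question, answer=answer, criteria=self.criteria)
        raw = self.llm(prompt)
        score_text, _, explanation = raw.partition("|")
        return float(score_text.strip()), explanation.strip()


# Usage with a stubbed judge model standing in for a real LLM client.
fake_llm = lambda prompt: "1.0 | The answer fully addresses the question."
evaluator = LLMJudgeEvaluator(fake_llm, criteria="The answer must be factually correct and complete.")
print(evaluator.evaluate("What is the capital of France?", "Paris is the capital of France."))
```

Correctness, completeness, tone, and custom-criteria evaluations all fit this pattern; only the prompt and the expected output format change.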
Why It Matters
- Quality Assurance: Evaluators provide objective metrics to ensure AI responses meet quality standards and requirements.
- Consistent Assessment: By standardizing evaluation criteria, evaluators enable reproducible and comparable results across different models and use cases.
- Flexible Evaluation: The modular design allows custom evaluators to be created for specific needs while maintaining a consistent interface.
- Comprehensive Analysis: Different types of evaluators can be combined to provide a holistic assessment of model performance across multiple dimensions.
- Trust and Reliability: Systematic evaluation helps build confidence in AI systems by providing clear metrics and explanations for assessment results.