In TrustTest, test sets are collections of input-output pairs used to evaluate the performance and reliability of AI models. These test sets serve as benchmarks to systematically assess how well models handle various scenarios and requirements.
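To make the idea concrete, a test set can be modeled as a collection of input-output pairs together with a scoring loop. The sketch below is a generic illustration of that concept, not TrustTest's actual data model or API:

```python
from dataclasses import dataclass

@dataclass
class TestCase:
    """One input-output pair: a prompt and the response we expect."""
    input: str
    expected_output: str

# A tiny test set: a collection of such pairs.
test_set = [
    TestCase("What is the capital of France?", "Paris"),
    TestCase("What is 2 + 2?", "4"),
]

def evaluate(model, test_set):
    """Return the fraction of test cases the model answers exactly."""
    passed = sum(1 for case in test_set if model(case.input) == case.expected_output)
    return passed / len(test_set)

# A stand-in "model" for demonstration: it gets one of the two answers wrong.
answers = {"What is the capital of France?": "Paris", "What is 2 + 2?": "5"}
score = evaluate(lambda prompt: answers.get(prompt, ""), test_set)
print(score)  # 0.5
```

Real evaluators typically use fuzzier comparisons than exact string equality, but the structure — pairs in, a score out — is the same.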


Key Areas

Predefined Test Sets

TrustTest comes with a variety of built-in test sets designed for common evaluation scenarios:

  • Off-topic use cases.
  • Data leakage.
  • Responsibility.

Automatic Test Generation

TrustTest provides tools to automatically generate test sets through:

  • Generation from a knowledge base.
  • LLM-assisted test creation.
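In practice, generators like these use an LLM to phrase questions and answers. As a simplified, self-contained illustration of knowledge-base-driven generation, the sketch below derives question-answer pairs from structured facts with a fixed template (the names here are illustrative, not TrustTest's API):

```python
# Simplified illustration of knowledge-base-driven test generation.
# Real tooling typically prompts an LLM; a fixed template stands in
# here so the example runs without external dependencies.
knowledge_base = {
    "TrustTest": "a framework for evaluating AI models",
    "a test set": "a collection of input-output pairs",
}

def generate_test_set(kb):
    """Turn each (term, definition) fact into a question-answer test case."""
    return [
        {"input": f"What is {term}?", "expected_output": definition}
        for term, definition in kb.items()
    ]

generated = generate_test_set(knowledge_base)
print(len(generated))  # 2: one test case per fact
```

The payoff is coverage at low cost: every new fact added to the knowledge base yields a test case automatically, with no hand-written pairs.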

Custom Test Sets

Users can create their own test sets by:

  • Defining specific input-output pairs.
  • Importing existing datasets.
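For example, importing an existing dataset can be as simple as mapping rows of a CSV file to input-output pairs. This is a generic sketch of that idea; TrustTest's own import helpers may look different:

```python
import csv
import io

# An existing dataset with one test case per row. In practice this
# would come from a file on disk rather than an inline string.
csv_data = """input,expected_output
What is the capital of France?,Paris
What is 2 + 2?,4
"""

def import_test_set(file_obj):
    """Read rows with input/expected_output columns into a test set."""
    return [
        {"input": row["input"], "expected_output": row["expected_output"]}
        for row in csv.DictReader(file_obj)
    ]

test_set = import_test_set(io.StringIO(csv_data))
print(len(test_set))  # 2
```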

Why It Matters

  • Comprehensive Evaluation: Test sets provide a structured way to evaluate models across different scenarios and requirements.

  • Reproducible Testing: Predefined and automatically generated test sets ensure consistent evaluation across runs and environments.

  • Efficient Testing: Automated test generation saves time and resources while maintaining test quality and coverage.

  • Customizable Assessment: The ability to create and modify test sets allows evaluation tailored to specific needs and requirements.

  • Quality Assurance: Well-designed test sets help identify model weaknesses and areas for improvement, leading to better overall performance.