The Equals Evaluator performs exact string matching between a response and an expected output. The evaluation is binary: the response passes only if it matches the expected output exactly.

Purpose

The Equals Evaluator is particularly useful when you need to:

  • Verify exact string matches
  • Ensure responses match predefined templates
  • Validate fixed-format outputs
  • Check for precise command or code outputs
  • Test exact response requirements

How It Works

The evaluator uses a simple binary scoring system:

  • Score 1: The response exactly matches the expected output
  • Score 0: The response does not match the expected output

The evaluation is strict and case-sensitive, requiring an exact character-by-character match between the response and the expected output.
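
The strictness described above can be illustrated with plain Python string comparison. This is a minimal sketch of the matching semantics only, not trusttest's actual implementation; `exact_match_score` is a hypothetical helper introduced here for illustration, and it assumes a literal character-by-character comparison with no normalization.

```python
def exact_match_score(response: str, expected: str) -> int:
    """Hypothetical helper: return 1 only if the strings are identical, else 0."""
    return 1 if response == expected else 0


# Identical strings pass.
assert exact_match_score("Hello, World!", "Hello, World!") == 1
# A difference in letter case fails.
assert exact_match_score("hello, world!", "Hello, World!") == 0
# Under a strict character-by-character reading, extra whitespace fails too.
assert exact_match_score("Hello, World!\n", "Hello, World!") == 0
```

If responses may carry incidental whitespace or casing differences that should not count as failures, it is probably safer to normalize both strings before evaluation than to rely on the evaluator to do so.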

Usage Example

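import asyncio

from trusttest.evaluation_contexts import ExpectedResponseContext
from trusttest.evaluators import EqualsEvaluator


async def evaluate():
    evaluator = EqualsEvaluator()
    # The expected output is supplied through the evaluation context.
    result = await evaluator.evaluate(
        response="Hello, World!",
        context=ExpectedResponseContext(
            expected_response="Hello, World!"
        )
    )
    # The response and expected output are identical, so this should pass.
    print(result)


if __name__ == "__main__":
    # evaluate() is a coroutine, so it must run inside an event loop.
    asyncio.run(evaluate())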

The evaluator returns a tuple containing:

  • A binary score (0 or 1) indicating exact match status
  • A list of explanations containing:
    • A success message if the response matched
    • A failure message showing both the expected and the received response if it did not match
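
The shape of that return value can be sketched as follows. `equals_result` is a hypothetical stand-in, not part of trusttest, and the message wording is invented for illustration; only the (score, explanations) structure is taken from the description above.

```python
from typing import List, Tuple


def equals_result(response: str, expected: str) -> Tuple[int, List[str]]:
    """Hypothetical stand-in that mimics the (score, explanations) shape."""
    if response == expected:
        return 1, ["Response matches the expected output."]
    # On failure, report both strings so the mismatch is easy to spot.
    return 0, [
        "Response does not match the expected output.",
        f"Expected: {expected!r}",
        f"Received: {response!r}",
    ]


# Unpack the tuple into its two parts.
score, explanations = equals_result("Hello, world!", "Hello, World!")
print(score)
for line in explanations:
    print(line)
```

Unpacking the tuple this way keeps the pass/fail decision (the score) separate from the human-readable diagnostics, which is convenient when the score feeds an assertion and the explanations go to a log.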

When to Use

Use the Equals Evaluator when you need to:

  • Verify exact command outputs
  • Validate fixed-format responses
  • Check for precise string matches
  • Test template-based responses
  • Ensure exact compliance with specifications
  • Validate code snippets or commands