NeuralTrust | The leading security platform for generative AI

The Equals Evaluator performs exact string matching between a response and an expected output. The evaluation is binary: the response passes only if it matches the expected output exactly.

Purpose

The Equals Evaluator is particularly useful when you need to:

Verify exact string matches
Ensure responses match predefined templates
Validate fixed-format outputs
Check for precise command or code outputs
Test exact response requirements

How It Works

The evaluator uses a simple binary scoring system:

Score 1: The response exactly matches the expected output
Score 0: The response does not match the expected output

The evaluation is strict and case-sensitive, requiring an exact character-by-character match between the response and the expected output.
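To make the strictness concrete, here is a minimal sketch of the comparison described above, written in plain Python. The function name `exact_match_score` is hypothetical and is not part of trusttest; it only mirrors the documented behavior (a score of 1 for identical strings, 0 otherwise) and is not the library's implementation.

```python
def exact_match_score(response: str, expected: str) -> int:
    """Hypothetical helper: return 1 if the strings are identical, else 0."""
    # Python's == on str compares character by character and is case-sensitive.
    return 1 if response == expected else 0


print(exact_match_score("Hello, World!", "Hello, World!"))   # identical strings
print(exact_match_score("hello, world!", "Hello, World!"))   # differs only in case
print(exact_match_score("Hello, World! ", "Hello, World!"))  # extra trailing space
```

Under a character-by-character comparison, only the first call scores 1. Differences in case or whitespace are enough to fail, so normalize both strings beforehand if such differences should not matter.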

Usage Example

import asyncio

from trusttest.evaluation_contexts import ExpectedResponseContext
from trusttest.evaluators import EqualsEvaluator


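async def evaluate():
    evaluator = EqualsEvaluator()
    # Compare the response against the expected output; both strings
    # are identical here, so the evaluation should pass.
    result = await evaluator.evaluate(
        response="Hello, World!",
        context=ExpectedResponseContext(
            expected_response="Hello, World!"
        )
    )
    # Print the evaluation result (score and explanations).
    print(result)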

if __name__ == "__main__":
    asyncio.run(evaluate())

The evaluator returns a tuple containing:

A binary score (0 or 1) indicating exact match status
A list of explanations including:
- Success message if matched
- Failure message with both expected and received responses if not matched
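The return shape described above can be illustrated with a plain-Python stand-in. This is a sketch under stated assumptions: `equals_result` is a hypothetical function, and the message wording is invented for illustration, so the actual text produced by `EqualsEvaluator` may differ.

```python
from typing import List, Tuple


def equals_result(response: str, expected: str) -> Tuple[int, List[str]]:
    """Hypothetical stand-in that mimics the (score, explanations) tuple."""
    if response == expected:
        # Matched: score 1 with a success message (wording is assumed).
        return 1, ["Response matches the expected output."]
    # Not matched: score 0 with both strings in the message (wording is assumed).
    return 0, [f"Expected {expected!r} but received {response!r}."]


# Unpack the tuple into its two parts and report failures.
score, explanations = equals_result("Hello", "Hello, World!")
if score == 0:
    for line in explanations:
        print(line)
```

Unpacking the tuple this way lets a test harness branch on the score and log the explanations only when a comparison fails.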

When to Use

Use the Equals Evaluator when you need to:

Verify exact command outputs
Validate fixed-format responses
Check for precise string matches
Test template-based responses
Ensure exact compliance with specifications
Validate code snippets or commands
