The Completeness Evaluator is a specialized tool that assesses how well a response captures the relevant information from an expected (ground truth) response. It uses an LLM (Large Language Model) as a judge to rate the extent to which an actual response covers the critical aspects of the expected response.

Purpose

The Completeness Evaluator is particularly useful when you need to:

  • Verify that all essential information is included in responses
  • Ensure no critical components are missing from answers
  • Evaluate the coverage of key points in responses
  • Assess the thoroughness of information provided

How It Works

The evaluator uses a 5-point scale to rate responses:

  • Score: 1 (No Information): The actual response does not contain any information from the expected response
  • Score: 2 (Very Little Information): The actual response contains very little information from the expected response
  • Score: 3 (Missing Key Information): The actual response lacks some key information and also adds extra information
  • Score: 4 (Most Key Information): The actual response contains most of the key information and adds extra information
  • Score: 5 (All Key Information): The actual response contains all the key information from the expected response
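
In practice, the numeric score can be used to gate automated checks. Here is a minimal sketch, assuming the score is available as a plain integer; the passing threshold of 4 is an arbitrary choice for illustration, not part of the library:

def is_complete_enough(score: int, threshold: int = 4) -> bool:
    """Treat 4 (Most Key Information) and 5 (All Key Information)
    as passing, and anything lower as failing."""
    return score >= threshold

assert is_complete_enough(5)
assert not is_complete_enough(3)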

Usage Example

import asyncio

from trusttest.evaluation_contexts import ExpectedResponseContext
from trusttest.evaluators import CompletenessEvaluator


async def evaluate():
    evaluator = CompletenessEvaluator()

    # Judge the actual response against the expected (ground truth) response.
    result = await evaluator.evaluate(
        response="The capital of Osona is Vic, which is located in Catalonia.",
        context=ExpectedResponseContext(
            expected_response="The capital of Osona is Vic."
        ),
    )
    print(result)


if __name__ == "__main__":
    asyncio.run(evaluate())

The evaluator returns a tuple containing:

  • A score (1-5) indicating the level of completeness
  • A list of explanations for the given score
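
Assuming the result is the (score, explanations) tuple described above, it can be unpacked and reported directly; a minimal sketch:

score, explanations = result  # unpack the (score, explanations) tuple

print(f"Completeness score: {score}/5")
for explanation in explanations:
    print(f"- {explanation}")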

When to Use

Use the Completeness Evaluator when you need to:

  • Verify comprehensive coverage of topics in responses
  • Ensure no critical information is omitted
  • Check the thoroughness of AI-generated content
  • Evaluate the completeness of automated responses
  • Assess the coverage of key points in information retrieval systems
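
For example, to assess the completeness of a batch of automated responses, the same evaluator can be reused across a list of cases. This is a minimal sketch, assuming the evaluate signature shown earlier; the TEST_CASES pairs are hypothetical illustration data:

import asyncio

from trusttest.evaluation_contexts import ExpectedResponseContext
from trusttest.evaluators import CompletenessEvaluator

# Hypothetical (actual, expected) response pairs for illustration.
TEST_CASES = [
    ("The capital of Osona is Vic.", "The capital of Osona is Vic."),
    ("Osona is a comarca in Catalonia.", "The capital of Osona is Vic."),
]


async def evaluate_all():
    evaluator = CompletenessEvaluator()
    for actual, expected in TEST_CASES:
        result = await evaluator.evaluate(
            response=actual,
            context=ExpectedResponseContext(expected_response=expected),
        )
        print(result)


if __name__ == "__main__":
    asyncio.run(evaluate_all())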