The Completeness Evaluator is a specialized tool that assesses how well a response captures the relevant information in an expected (ground-truth) response. It uses an LLM (Large Language Model) as a judge to determine how thoroughly the actual response covers the critical aspects of the expected one.
```python
import asyncio

from trusttest.evaluation_contexts import ExpectedResponseContext
from trusttest.evaluators import CompletenessEvaluator


async def evaluate():
    evaluator = CompletenessEvaluator()
    result = await evaluator.evaluate(
        response="The capital of Osona is Vic, which is located in Catalonia.",
        context=ExpectedResponseContext(
            expected_response="The capital of Osona is Vic."
        ),
    )
    print(result)


if __name__ == "__main__":
    asyncio.run(evaluate())
```
The evaluator returns a tuple containing:
- A score (1-5) indicating the level of completeness
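As a rough sketch of consuming that result, the snippet below reads the score and applies a pass/fail check. Note the assumptions: that the score is the first element of the returned tuple, and that a threshold of 4 is an arbitrary choice for this illustration.

```python
# Minimal sketch: act on the completeness score from the example above.
# Assumption: the score is the first element of the returned tuple;
# the threshold of 4 is arbitrary, not part of the evaluator itself.
score = result[0]

if score >= 4:
    print(f"Response covers the expected information well (score={score}/5).")
else:
    print(f"Response misses critical aspects (score={score}/5).")
```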