The Language Evaluators are specialized tools designed to validate the language of text responses. There are two types of language evaluators:

  1. Expected Language Evaluator: Checks if the response is in a specific expected language
  2. Equal Language Evaluator: Checks if the response is in the same language as the question

Purpose

The Language Evaluators are particularly useful when you need to:

  • Ensure responses are in the correct language
  • Verify language consistency between questions and answers
  • Validate multilingual content
  • Check language requirements compliance
  • Monitor language-specific responses

How It Works

Both evaluators use a binary scoring system based on language detection:

Expected Language Evaluator

  • Score 1: The response is in the expected language
  • Score 0: The response is not in the expected language

Equal Language Evaluator

  • Score 1: The response is in the same language as the question
  • Score 0: The response is in a different language than the question

The evaluation uses the langdetect library to detect the language of the text, with special handling for Spanish and Portuguese in the Equal Language Evaluator.
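As a rough illustration, the binary scoring could be sketched as below. Everything here is a hypothetical stand-in, not the actual trusttest implementation: the `detect` stub replaces `langdetect.detect`, and treating Spanish and Portuguese as equivalent is one plausible reading of the "special handling" (the two languages are easily confused by statistical detectors).

```python
# Sketch only: `detect`, `SIMILAR_LANGUAGES`, and both scoring helpers are
# illustrative assumptions, not the trusttest API (which uses langdetect).

# Assumed special case: treat Spanish and Portuguese as interchangeable.
SIMILAR_LANGUAGES = {frozenset({"es", "pt"})}


def detect(text: str) -> str:
    """Toy stand-in for langdetect.detect; returns an ISO 639-1 code."""
    return "es" if "¿" in text else "en"


def expected_language_score(response: str, expected: str) -> int:
    """Score 1 if the detected language matches the expected one, else 0."""
    return 1 if detect(response) == expected else 0


def equal_language_score(response: str, question: str) -> int:
    """Score 1 if response and question share a language; Spanish and
    Portuguese count as a match, mirroring the assumed special handling."""
    resp_lang, q_lang = detect(response), detect(question)
    if resp_lang == q_lang or frozenset({resp_lang, q_lang}) in SIMILAR_LANGUAGES:
        return 1
    return 0
```

With the toy detector, `expected_language_score("Hola, ¿cómo estás?", "es")` returns 1, while an English response against `expected="es"` returns 0.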

Usage Examples

Expected Language Evaluator

import asyncio

from trusttest.evaluation_contexts import Context
from trusttest.evaluators import ExpectedLanguageEvaluator


async def evaluate():
    evaluator = ExpectedLanguageEvaluator(
        expected_language="es"
    )
    result = await evaluator.evaluate(
        response="Hola, ¿cómo estás?",
        context=Context()
    )
    print(result)

if __name__ == "__main__":
    asyncio.run(evaluate())

Equal Language Evaluator

import asyncio

from trusttest.evaluation_contexts import QuestionContext
from trusttest.evaluators import EqualLanguageEvaluator


async def evaluate():
    evaluator = EqualLanguageEvaluator()

    result = await evaluator.evaluate(
        response="Hola, ¿cómo estás?",
        context=QuestionContext(
            question="¿Qué tal estás?"
        )
    )
    print(result)

if __name__ == "__main__":
    asyncio.run(evaluate())

The evaluators return a tuple containing:

  • A binary score (0 or 1) indicating language match status
  • A list of explanations including:
    • Success message with the detected language if matched
    • Failure message with both detected and expected languages if not matched
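Assuming the documented `(score, explanations)` tuple shape, a result could be consumed as follows. The `summarize` helper and the explanation strings are illustrative, not part of the trusttest API:

```python
def summarize(result: tuple[int, list[str]]) -> str:
    """Turn a (score, explanations) tuple into a one-line summary.
    Hypothetical helper; the tuple shape follows the description above."""
    score, explanations = result
    status = "PASS" if score == 1 else "FAIL"
    return f"{status}: " + "; ".join(explanations)


# Hand-written result matching the documented shape:
print(summarize((1, ["Detected language 'es' matches expected language 'es'"])))
# PASS: Detected language 'es' matches expected language 'es'
```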

When to Use

Use the Language Evaluators when you need to:

  • Ensure responses are in the correct language
  • Verify language consistency in conversations
  • Check compliance with language requirements
  • Ensure proper language handling in chatbots
  • Validate multilingual or language-specific content generation