Heuristic
Overview
Heuristic evaluators use mathematical and logical formulas to approximate whether a response is correct or incorrect.
Why Heuristic Evaluators are Important
Heuristic evaluators are valuable because they:
- Consistency: Provide consistent evaluations across different runs and scenarios
- Speed: Execute quickly without requiring additional API calls
- Cost-Effective: Don’t require additional LLM API calls, making them more economical
However, there are some limitations:
- Rigidity: May miss nuanced or context-dependent aspects of responses
- Limited Scope: Can only evaluate what has been explicitly defined in the rules
- Maintenance: Require regular updates to handle new patterns or edge cases
- Complexity: May become unwieldy when trying to capture complex evaluation criteria
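The rigidity limitation is easy to reproduce. The sketch below uses a hypothetical `exact_match` helper (plain Python, not part of the TrustTest API) and an invented example sentence. It shows a rule-based check rejecting an answer that a human reader would accept, because the rule only compares normalized strings.

```python
def exact_match(response: str, expected: str) -> bool:
    """Hypothetical rule: case-insensitive, whitespace-trimmed string equality."""
    return response.strip().lower() == expected.strip().lower()


expected = "The capital of France is Paris."
verbatim = "the capital of france is paris."
paraphrase = "Paris is the capital of France."

# Same words in the same order: the rule passes.
print(exact_match(verbatim, expected))    # True

# Same meaning, different word order: the rule fails.
print(exact_match(paraphrase, expected))  # False
```

Handling the paraphrase would require another explicit rule, which is how rule sets grow and become harder to maintain.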
Current TrustTest Heuristic Evaluators
TrustTest provides several specialized heuristic evaluators:
- Regex Evaluator: Uses regular expressions to validate response patterns
- Equals Evaluator: Checks if responses exactly match expected values
- BLEU Evaluator: Measures the similarity between responses using the BLEU score metric
- Expected Language Evaluator: Verifies if responses are in the expected language
- Equal Language Evaluator: Compares the language of responses to ensure consistency
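To make the first three evaluators more concrete, here is a minimal sketch of the underlying checks in plain Python. It does not use the TrustTest API: the function names `regex_check`, `equals_check`, and `unigram_precision` are hypothetical, and the last one computes only clipped unigram precision, one ingredient of BLEU, rather than the full metric (which also combines higher-order n-grams and a brevity penalty).

```python
import re
from collections import Counter


def regex_check(response: str, pattern: str) -> bool:
    """Hypothetical regex check: pass if the pattern occurs anywhere in the response."""
    return re.search(pattern, response) is not None


def equals_check(response: str, expected: str) -> bool:
    """Hypothetical equality check: pass only on an exact match,
    ignoring leading and trailing whitespace."""
    return response.strip() == expected.strip()


def unigram_precision(response: str, reference: str) -> float:
    """Fraction of response tokens that also appear in the reference.

    Each token's matches are clipped to its count in the reference,
    so repeating a correct word does not inflate the score.
    This is a simplification, not the full BLEU score.
    """
    resp_tokens = response.lower().split()
    if not resp_tokens:
        return 0.0
    resp_counts = Counter(resp_tokens)
    ref_counts = Counter(reference.lower().split())
    matched = sum(min(count, ref_counts[token]) for token, count in resp_counts.items())
    return matched / len(resp_tokens)


if __name__ == "__main__":
    print(regex_check("Order #12345 confirmed", r"#\d{5}"))       # True
    print(equals_check("Paris ", "Paris"))                        # True
    print(unigram_precision("the cat sat", "the cat sat on the mat"))  # 1.0
```

Each function is deterministic and makes no network calls, which is where the consistency, speed, and cost properties described above come from.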
While heuristic evaluators are fast and consistent, we recommend using LLM as a Judge evaluators when possible, because they can better understand semantic relationships and reason about content in a more human-like way.