Purpose
The BLEU Evaluator is particularly useful when you need to:
- Measure the similarity between generated and reference text
- Evaluate machine translation quality
- Assess text generation quality
- Compare different text generation models
- Set quality thresholds for text generation
How It Works
The evaluator calculates a BLEU score between 0 and 1 (or 0-100 when converted to a percentage), where:
- Score 0: The generated text shares no n-grams with the reference
- Score 1: The generated text perfectly matches the reference
The evaluator supports the following configuration options:
- N-gram precision (default: 1-gram)
- Smoothing method (default: method1)
- Customizable weights for different n-gram orders
- Configurable threshold (default: 0.7)
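To make the scoring concrete, here is a minimal sketch of 1-gram BLEU (the evaluator's default): clipped unigram precision multiplied by a brevity penalty. The function name `unigram_bleu` is illustrative, not the evaluator's API; full BLEU also averages higher-order n-gram precisions (libraries such as NLTK's `sentence_bleu` handle that, including the `method1` smoothing referenced above).

```python
import math
from collections import Counter

def unigram_bleu(candidate: str, reference: str) -> float:
    """Sketch of 1-gram BLEU: clipped unigram precision times a
    brevity penalty. Hypothetical helper, not the evaluator's API."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0
    cand_counts = Counter(cand)
    ref_counts = Counter(ref)
    # Clipped matches: each candidate token counts at most as many
    # times as it appears in the reference.
    matches = sum(min(n, ref_counts[tok]) for tok, n in cand_counts.items())
    precision = matches / len(cand)
    # Brevity penalty discourages candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * precision

print(unigram_bleu("the cat sat on the mat", "the cat sat on the mat"))  # 1.0
```

An identical candidate and reference yields 1.0; a candidate with no shared tokens yields 0.0, matching the score range described above.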
Usage Example
The evaluator returns:
- A score (0-100) indicating the BLEU score as a percentage
- A list of explanations including the BLEU score, n-gram configuration, and threshold comparison
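The output shape described above can be sketched as follows. This is a hypothetical helper showing how a raw BLEU score (0-1) maps to the 0-100 percentage, the explanation list, and the default 0.7 threshold comparison; the function name and message wording are illustrative assumptions, not the evaluator's actual API.

```python
def summarize_bleu(raw_score: float, ngram: int = 1, threshold: float = 0.7):
    """Hypothetical helper: turn a raw BLEU score (0-1) into the
    percentage score and explanations described in the docs above."""
    percentage = round(raw_score * 100, 2)
    passed = raw_score >= threshold
    explanations = [
        f"BLEU score: {raw_score:.4f} ({percentage}%)",
        f"N-gram configuration: {ngram}-gram",
        f"Threshold {threshold}: {'passed' if passed else 'failed'}",
    ]
    return percentage, explanations

pct, notes = summarize_bleu(0.85)
print(pct)       # 85.0
print(notes[2])  # Threshold 0.7: passed
```

A score of 0.85 clears the default 0.7 threshold; a score of 0.5 would report "failed" in the final explanation.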
When to Use
Use the BLEU Evaluator when you need to:
- Evaluate machine translation systems
- Assess text generation quality
- Compare different text generation models
- Set quality thresholds for automated text generation
- Measure similarity between generated and reference text
- Evaluate the performance of language models