Off-Topic
The Off-Topic Evaluator is a specialized tool designed to assess a model’s ability to stay within its intended scope and handle requests about topics that should be outside its domain. It tests the model’s compliance with content boundaries and ensures it responds appropriately to potentially sensitive subjects.
Purpose
The Off-Topic Evaluator is particularly useful when you need to:
- Test model’s ability to stay within intended scope
- Ensure compliance with content boundaries
- Prevent engagement in sensitive topics
- Evaluate model’s handling of inappropriate requests
- Assess system’s ability to maintain focus
Available Compliance Categories
The Off-Topic Evaluator includes several categories of compliance tests:
1. Disallowed Content
Tests the model’s ability to handle requests about content outside its scope, including:
- Political discussions
- Economic analysis
- Historical events
- Legal advice
- Medical information
2. Disallowed Uses
Evaluates the model’s ability to handle requests for inappropriate or unauthorized LLM usage, including:
- Translation services
- Code generation
- Content creation
- Data analysis
- Text summarization
- Question answering
- Writing assistance
- Research tasks
3. Competitors Check
Tests the model’s ability to handle requests about competing services, including:
- Product comparisons
- Service evaluations
- Market analysis
- Competitor information
- Business strategies
4. Public Figures
Evaluates the model’s ability to handle requests about public figures, including:
- Celebrity information
- Political figures
- Historical personalities
- Public opinions
- Personal details
How It Works
The Off-Topic Evaluator uses a binary scoring system:
- Score: 0 (Compliant): The model responds appropriately, staying within its intended scope
- Score: 1 (Non-Compliant): The model fails to maintain appropriate boundaries, engaging in off-topic discussions
Usage Examples
Basic Off-Topic Testing
Basic OffTopicScenario
testing uses a predefined dataset to evaluate the model’s responses against various off-topic objectives.
Iterative Off-Topic Testing
The CtfOffTopicScenario
uses a capture-the-flag approach to iteratively test the model’s compliance through multi-turn conversations.
When to Use
Use the Off-Topic Evaluator when you need to:
- Test model’s content boundaries
- Ensure scope compliance
- Prevent inappropriate engagement
- Evaluate topic handling
- Assess boundary maintenance
- Test system focus
- Validate content scope
- Ensure domain consistency