This probe requires a configured LLM client to generate the test set.
Purpose
The Capture the Flag Probe is particularly useful when you need to:
- Generate test cases for specific model behaviors
- Create multi-turn conversation scenarios with custom objectives
- Test model responses to particular types of questions
- Generate iterative test sets for specific use cases
- Create custom boundary testing scenarios
How It Works
The Capture the Flag Probe generates test sets through an iterative approach:
- Objective Definition: Define custom objectives with specific questions and expected outcomes
- Multi-turn Conversation Generation: The probe creates conversation flows that attempt to achieve the objective
- Adaptive Prompting: Each turn is designed to adapt based on potential model responses
- Test Set Creation: The probe outputs a structured test set for evaluation
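The four steps above can be sketched in plain Python. This is a minimal illustration only, not the probe's actual API: `Objective`, `TestCase`, `template_generator`, and `generate_test_set` are hypothetical names, and the LLM client is replaced by a deterministic stand-in function so the example runs without any external service. Adaptation is approximated here by passing the turns generated so far into each new prompt.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# NOTE: every name in this block is hypothetical and only illustrates the
# four steps; none of it is taken from the probe's real interface.


@dataclass
class Objective:
    question: str          # what the conversation tries to get the model to do
    expected_outcome: str  # what a successful "capture" would look like


@dataclass
class TestCase:
    objective: Objective
    turns: List[str] = field(default_factory=list)


# Stand-in for an LLM client: maps (objective, turns so far) to the next prompt.
PromptGenerator = Callable[[Objective, List[str]], str]


def template_generator(objective: Objective, previous_turns: List[str]) -> str:
    """Deterministic placeholder for a real LLM call."""
    if not previous_turns:
        return f"Opening question: {objective.question}"
    return (
        f"Follow-up {len(previous_turns)}: "
        f"rephrase to reach '{objective.expected_outcome}'"
    )


def generate_test_set(
    objectives: List[Objective],
    generate: PromptGenerator,
    max_turns: int = 3,
) -> List[TestCase]:
    """Build one multi-turn test case per objective."""
    test_set = []
    for objective in objectives:
        case = TestCase(objective=objective)
        for _ in range(max_turns):
            # Each new turn is conditioned on the turns generated so far.
            case.turns.append(generate(objective, case.turns))
        test_set.append(case)
    return test_set
```

In a real setup the stand-in generator would be replaced by a call to the configured LLM client, and the resulting list of test cases would be handed to whatever evaluation step consumes the test set.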
Usage Examples
Basic Custom Objective Test Set Generation
BasicCaptureTheFlagScenario with a custom objective:
When to Use
Use the Capture the Flag Probe when you need to:
- Generate test sets for specific model behaviors
- Create custom conversation scenarios
- Test particular types of responses
- Generate iterative test cases for specific use cases
- Create boundary testing scenarios for custom objectives