Purpose
The Dataset Probe is particularly useful when you need to:
- Create test cases from existing datasets
- Create your own test cases by hand
How It Works
The probe works with four main data formats:
- JSON Format: A list of objects containing questions and contexts
- Parquet Format: A columnar storage format with ‘question’ and ‘context’ columns
- YAML Format: A human-readable format for storing questions and contexts
- Python List of dictionaries: A list of dictionaries containing questions and contexts

Given a dataset in one of these formats, the probe follows these steps:
- Load the dataset from the specified format
- For each item in the dataset, query the model with the question
- Create test cases that record each question and the model's response to it
- Allow the results to be saved back to JSON, Parquet, or YAML format
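
The load, query, and save steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the probe's actual API: the function names (`load_json_dataset`, `build_test_cases`, `save_json`), the `answer` field, and the `echo_model` stand-in are all hypothetical, and only the JSON format is covered so the example stays within the standard library.

```python
import json
import tempfile
from pathlib import Path


def load_json_dataset(path):
    """Load a JSON file holding a list of objects with 'question' and 'context' keys."""
    with open(path, encoding="utf-8") as f:
        items = json.load(f)
    if not isinstance(items, list):
        raise ValueError("expected a JSON list of objects")
    return items


def build_test_cases(items, model):
    """Query the model once per item and record the interaction as a test case.

    `model` is any callable taking a question string and returning an answer string
    (an assumption made for this sketch).
    """
    cases = []
    for item in items:
        answer = model(item["question"])
        cases.append(
            {
                "question": item["question"],
                "context": item.get("context", ""),
                "answer": answer,  # hypothetical field name
            }
        )
    return cases


def save_json(cases, path):
    """Write the test cases back to disk as a JSON list."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(cases, f, indent=2, ensure_ascii=False)


def echo_model(question):
    """Stand-in for a real model call; returns a canned answer."""
    return f"stub answer to: {question}"


if __name__ == "__main__":
    dataset = [{"question": "What is the capital of France?", "context": "Geography"}]
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "dataset.json"
        save_json(dataset, source)
        cases = build_test_cases(load_json_dataset(source), echo_model)
        save_json(cases, Path(tmp) / "test_cases.json")
        print(cases)
```

Passing the model in as a plain callable keeps the sketch independent of any particular client library. A Python list of dictionaries can be handed to `build_test_cases` directly, skipping the load step.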
When to Use
Use the Dataset Probe when you need to:
- Convert existing datasets into test cases
- Work with large datasets efficiently (using Parquet format)
- Use human-readable formats (using YAML)
- Automate the process of creating test cases from structured data
- Maintain consistency in test case generation
- Store and share test cases in a standardized format