The Dataset Probe is a specialized tool designed to create test cases from structured datasets. It allows you to load questions and contexts from various data formats (JSON, Parquet, or YAML) and automatically generate test cases by querying a model with these questions.
from trusttest.dataset_builder import Dataset, DatasetItem
from trusttest.evaluation_contexts import ExpectedResponseContext
from trusttest.targets.testing import DummyTarget
from trusttest.probes import DatasetProbe

target = DummyTarget()

# Create a Dataset with test cases
dataset = Dataset(
    [
        [
            # this test case represents a conversation
            DatasetItem(
                question="What is Python?",
                context=ExpectedResponseContext(
                    expected_response="Python is a high-level, interpreted programming language."
                ),
            ),
            DatasetItem(
                question="What is JavaScript?",
                context=ExpectedResponseContext(
                    expected_response="JavaScript is a programming language used primarily for web development."
                ),
            ),
        ],
        [
            # this test case represents a single question
            DatasetItem(
                question="What is Python?",
                context=ExpectedResponseContext(
                    expected_response="Python is a high-level, interpreted programming language."
                ),
            )
        ],
    ]
)

# Create the probe with the dataset
probe = DatasetProbe(target=target, dataset=dataset)
You can load a dataset from a JSON file and save it back to JSON:
from trusttest.dataset_builder import Dataset
from trusttest.probes import DatasetProbe
from trusttest.targets.testing import DummyTarget

target = DummyTarget()

# Load dataset from JSON
dataset = Dataset.from_json("path/to/your/dataset.json")

# Save dataset to JSON
dataset.to_json("path/to/save/dataset.json")

# Create probe with the dataset
probe = DatasetProbe(target=target, dataset=dataset)

# Get test set
test_set = probe.get_test_set()
To handle large datasets efficiently, you can use the Parquet format:
from trusttest.dataset_builder import Dataset
from trusttest.probes import DatasetProbe
from trusttest.targets.testing import DummyTarget

target = DummyTarget()

# Load dataset from Parquet
dataset = Dataset.from_parquet("path/to/your/dataset.parquet")

# Save dataset to Parquet
dataset.to_parquet("path/to/save/dataset.parquet")

# Create probe with the dataset
probe = DatasetProbe(target=target, dataset=dataset)

# Get test set
test_set = probe.get_test_set()
For a human-readable dataset file, you can use the YAML format:
from trusttest.dataset_builder import Dataset
from trusttest.probes import DatasetProbe
from trusttest.targets.testing import DummyTarget

target = DummyTarget()

# Load dataset from YAML
dataset = Dataset.from_yaml("path/to/your/dataset.yaml")

# Save dataset to YAML
dataset.to_yaml("path/to/save/dataset.yaml")

# Create probe with the dataset
probe = DatasetProbe(target=target, dataset=dataset)

# Get test set
test_set = probe.get_test_set()
Each of these methods provides flexibility in how you create and manage your test cases, allowing you to choose the most appropriate format for your specific use case.
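If a project mixes several of these formats, the choice of loader can be derived from the file extension. The following is a minimal sketch, not part of trusttest: `detect_format` and `load_dataset` are hypothetical helper names, and the sketch assumes only that the class passed in exposes the `from_json`, `from_parquet`, and `from_yaml` methods shown above. Treating `.yml` as YAML is also an assumption.

```python
from pathlib import Path

# Hypothetical mapping, not part of trusttest: file suffix -> format name
# used in the from_<format> / to_<format> method names.
_SUFFIX_TO_FORMAT = {
    ".json": "json",
    ".parquet": "parquet",
    ".yaml": "yaml",
    ".yml": "yaml",  # assumption: .yml files are handled like .yaml
}


def detect_format(path: str) -> str:
    """Return "json", "parquet", or "yaml" based on the file suffix."""
    suffix = Path(path).suffix.lower()
    try:
        return _SUFFIX_TO_FORMAT[suffix]
    except KeyError:
        raise ValueError(f"Unsupported dataset file type: {suffix!r}") from None


def load_dataset(path: str, dataset_cls):
    """Call dataset_cls.from_<format>(path), with the format chosen by suffix."""
    loader = getattr(dataset_cls, f"from_{detect_format(path)}")
    return loader(path)
```

With trusttest installed, this would be called as `load_dataset("path/to/your/dataset.yaml", Dataset)`, where `Dataset` is imported from `trusttest.dataset_builder`. Passing the class in as an argument keeps the helper independent of the library, so it can be exercised with a stand-in class.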