Run Responsibility Evaluation
This guide shows how to configure and run a Compliance Scenario to evaluate the responsibility and safety of your LLM outputs.
Compliance scenarios are essential for ensuring your LLM behaves responsibly and safely across different categories like toxicity, bias, and other ethical considerations.
Configure the Compliance Scenario
For this example, we’ll create a basic compliance scenario that checks for toxicity in LLM responses.
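The exact API depends on the evaluation framework you use; the sketch below defines an illustrative `ComplianceScenario` stand-in (not a real library class) so the configuration fields can be shown in context, with a dummy model endpoint for demonstration.

```python
# Minimal sketch of a compliance scenario configuration.
# `ComplianceScenario` and its fields are illustrative stand-ins,
# not the exact class or field names of any specific framework.
from dataclasses import dataclass


@dataclass
class ComplianceScenario:
    model: str                            # endpoint or identifier of the LLM under test
    categories: set[str]                  # compliance categories to evaluate
    max_objectives_per_category: int = 5  # cap on generated test objectives per category
    use_jailbreaks: bool = False          # whether to include jailbreak attempts


# A basic scenario that checks for toxicity, pointed at a dummy endpoint.
scenario = ComplianceScenario(
    model="http://localhost:8000/dummy-llm",
    categories={"toxicity"},
    max_objectives_per_category=5,
    use_jailbreaks=False,
)
```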
Let’s break down the configuration parameters:
- `model`: The LLM model you want to evaluate (in this case, a dummy endpoint for demonstration)
- `categories`: The set of categories to evaluate
- `max_objectives_per_category`: The maximum number of test objectives to generate per category
- `use_jailbreaks`: Whether to include jailbreak attempts in the evaluation
Run the Evaluation
Once you have configured your compliance scenario, you can run the evaluation with these simple steps:
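A minimal sketch of the run step is shown below. The `generate_test_set` and `evaluate` helpers are stubs standing in for your framework's real calls, and the snippet assumes the `scenario` object from the configuration step above.

```python
import random


# Stub helpers standing in for your framework's generation and
# evaluation steps; swap these for the real calls in your setup.
def generate_test_set(scenario: ComplianceScenario) -> list[dict]:
    # Generate test objectives for every configured category.
    return [
        {"category": cat, "objective": f"{cat} probe #{i}"}
        for cat in scenario.categories
        for i in range(scenario.max_objectives_per_category)
    ]


def evaluate(model: str, test_set: list[dict]) -> list[dict]:
    # Placeholder verdicts; a real evaluator would query `model` here.
    return [{**test, "verdict": random.choice(["pass", "fail"])} for test in test_set]


# Run the evaluation end to end.
test_set = generate_test_set(scenario)
results = evaluate(scenario.model, test_set)

# Display detailed per-test results.
for r in results:
    print(f"[{r['category']}] {r['objective']}: {r['verdict']}")

# Show a summary of the evaluation.
passed = sum(1 for r in results if r["verdict"] == "pass")
print(f"Summary: {passed}/{len(results)} tests passed")
```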
This will:
- Generate a test set based on your configuration
- Run the evaluation
- Display detailed results
- Show a summary of the evaluation
Complete Example
Here’s a complete example that puts everything together:
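The following script combines the configuration and run steps into one file. As before, the `ComplianceScenario` class and the helper functions are illustrative stand-ins; only the overall flow is intended to carry over to your framework.

```python
"""Complete (illustrative) responsibility evaluation script."""
import random
from dataclasses import dataclass


@dataclass
class ComplianceScenario:
    model: str
    categories: set[str]
    max_objectives_per_category: int = 5
    use_jailbreaks: bool = False


def generate_test_set(scenario: ComplianceScenario) -> list[dict]:
    """Generate test objectives for every configured category."""
    return [
        {"category": cat, "objective": f"{cat} probe #{i}"}
        for cat in scenario.categories
        for i in range(scenario.max_objectives_per_category)
    ]


def evaluate(model: str, test_set: list[dict]) -> list[dict]:
    """Placeholder evaluation: a real evaluator would query `model` here."""
    return [{**test, "verdict": random.choice(["pass", "fail"])} for test in test_set]


def main() -> None:
    # Configure a basic toxicity check against a dummy endpoint.
    scenario = ComplianceScenario(
        model="http://localhost:8000/dummy-llm",
        categories={"toxicity"},
        max_objectives_per_category=5,
        use_jailbreaks=False,
    )

    # Generate the test set and run the evaluation.
    test_set = generate_test_set(scenario)
    results = evaluate(scenario.model, test_set)

    # Display detailed results, then a summary.
    for r in results:
        print(f"[{r['category']}] {r['objective']}: {r['verdict']}")

    passed = sum(1 for r in results if r["verdict"] == "pass")
    print(f"Summary: {passed}/{len(results)} tests passed")


if __name__ == "__main__":
    main()
```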