This guide shows how to configure and run a Compliance Scenario to evaluate how responsibly and safely your LLM responds.

Compliance scenarios are essential for ensuring your LLM behaves responsibly and safely across different categories like toxicity, bias, and other ethical considerations.

Configure the Compliance Scenario

For this example, we’ll create a basic compliance scenario that checks for toxicity in LLM responses.

from dotenv import load_dotenv
from trusttest.catalog import ComplianceScenario
from trusttest.models.testing import DummyEndpoint

# Load environment variables (such as API keys) from a .env file
load_dotenv()

scenario = ComplianceScenario(
    model=DummyEndpoint(),
    categories={"toxicity"},
    max_objectives_per_category=1,
    use_jailbreaks=False,
)

Let’s break down the configuration parameters:

  • model: The LLM endpoint you want to evaluate (here, a dummy endpoint for demonstration)
  • categories: The set of compliance categories to evaluate (for example, "toxicity")
  • max_objectives_per_category: Maximum number of test objectives to generate per category
  • use_jailbreaks: Whether to include jailbreak attempts in the evaluation
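
If you want to cover more than one category or stress-test the model with jailbreak-style prompts, the same parameters can be adjusted. The snippet below is only a sketch: it assumes "bias" is an available category name in your catalog and that your endpoint can handle the larger test set.

from trusttest.catalog import ComplianceScenario
from trusttest.models.testing import DummyEndpoint

# Broader sketch: two categories, more objectives per category, jailbreaks enabled.
# "bias" is assumed to be a valid category name in your installation.
broader_scenario = ComplianceScenario(
    model=DummyEndpoint(),
    categories={"toxicity", "bias"},
    max_objectives_per_category=3,
    use_jailbreaks=True,
)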

Run the Evaluation

Once you have configured your compliance scenario, you can run the evaluation with these simple steps:

test_set = scenario.probe.get_test_set()
results = scenario.eval.evaluate(test_set)
results.display()
results.display_summary()

This will:

  1. Generate a test set based on your configuration
  2. Run the evaluation
  3. Display detailed results
  4. Show a summary of the evaluation
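
If you evaluate several scenarios, it can be convenient to wrap these two steps in a small helper. This is only a convenience sketch built on the calls shown above, not part of the trusttest API:

def run_scenario(scenario):
    """Generate the scenario's test set, evaluate it, and return the results."""
    test_set = scenario.probe.get_test_set()
    return scenario.eval.evaluate(test_set)

# For example:
# results = run_scenario(scenario)
# results.display_summary()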

Complete Example

Here’s a complete example that puts everything together:

from dotenv import load_dotenv
from trusttest.catalog import ComplianceScenario
from trusttest.models.testing import DummyEndpoint

load_dotenv()

# Configure the compliance scenario
scenario = ComplianceScenario(
    model=DummyEndpoint(),
    categories={"toxicity"},
    max_objectives_per_category=1,
    use_jailbreaks=False,
)

# Run the evaluation
test_set = scenario.probe.get_test_set()
results = scenario.eval.evaluate(test_set)
results.display()
results.display_summary()