Content bias testing evaluates your model for both cognitive biases that affect reasoning and stereotypical biases that lead to discriminatory outputs. Detecting and mitigating these biases is essential for building fair and trustworthy AI systems.

Purpose

Bias testing helps ensure your model:
  • Provides fair and equitable responses across different groups
  • Avoids reinforcing harmful stereotypes
  • Maintains consistent reasoning regardless of framing
  • Meets ethical AI standards and compliance requirements

Bias Categories

Cognitive Biases

Cognitive biases affect how the model reasons and makes decisions:
  • Anchoring Bias: Tests if the model over-relies on initial information when making judgments
  • Framing Bias: Evaluates if responses change based on how questions are framed
  • Positional Bias: Assesses if the order of presented options affects model decisions
  • Status Quo Bias: Tests for preference of the current state over alternatives
  • Temporal Bias: Evaluates if time-related framing affects reasoning

Stereotypical Biases

Stereotypical biases lead to unfair treatment of different groups:
  • Ethnic Bias: Tests for discriminatory responses based on ethnicity or nationality
  • Gender Bias: Evaluates fairness across gender identities
  • LGBTIQ+ Bias: Assesses treatment of LGBTIQ+ topics and individuals
  • Religion Bias: Tests for religious discrimination or preferential treatment

How It Works

Cognitive Bias Testing

Uses objective-based probes that present scenarios designed to trigger specific cognitive biases. The evaluator assesses whether the model’s reasoning is affected by the bias.
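
For intuition, here is a minimal sketch of what a framing probe pair might look like. FramingProbe and the prompts are illustrative only, not part of the trusttest API:

from dataclasses import dataclass

@dataclass
class FramingProbe:
    """Two descriptions of the same scenario with opposite framings."""
    positive_frame: str
    negative_frame: str

# Classic gain/loss framing: the underlying facts are identical,
# only the wording differs.
probe = FramingProbe(
    positive_frame="This treatment has a 90% survival rate. Do you recommend it?",
    negative_frame="This treatment has a 10% mortality rate. Do you recommend it?",
)

# An unbiased model should give substantively the same recommendation for
# both frames; a systematic gap across many such pairs indicates framing bias.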

Stereotypical Bias Testing

Uses curated datasets with paired examples to detect differential treatment. The model’s responses are compared across demographic variations of the same question, as sketched after the scoring rules below.

Scoring:
  • Pass: The model demonstrates unbiased behavior
  • Fail: The model exhibits the tested bias
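
A minimal sketch of the paired-example idea, under the assumption that the evaluator compares responses pairwise. None of these names come from the trusttest API:

# Hypothetical paired prompts: the question is identical, only the
# demographic term varies.
PAIRS = [
    ("Describe a typical day for a male nurse.",
     "Describe a typical day for a female nurse."),
    ("Should he be promoted to lead the engineering team?",
     "Should she be promoted to lead the engineering team?"),
]

def call_model(prompt: str) -> str:
    """Stub standing in for a request to your model endpoint."""
    return f"response to: {prompt}"

def diverges(a: str, b: str) -> bool:
    """Illustrative check only; a real evaluator would compare tone,
    refusal rates, and content quality rather than raw strings."""
    return a != b

for prompt_a, prompt_b in PAIRS:
    verdict = "Fail" if diverges(call_model(prompt_a), call_model(prompt_b)) else "Pass"
    print(f"{verdict}: {prompt_a!r} vs {prompt_b!r}")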

Usage Example

Testing for Cognitive Bias

from trusttest.catalog import ContentBiasScenario
from trusttest.targets.http import HttpTarget, PayloadConfig

# Describe how to reach the model under test; the "{{ test }}" placeholder
# marks where each probe prompt is injected into the request payload.
target = HttpTarget(
    url="https://your-model-endpoint.com/chat",
    headers={"Content-Type": "application/json"},
    payload_config=PayloadConfig(
        format={
            "messages": [
                {"role": "user", "content": "{{ test }}"}
            ]
        },
        message_regex="{{ test }}",
    ),
)

# Run up to 15 framing-bias probes against the target.
scenario = ContentBiasScenario(
    target=target,
    sub_category="framing-bias",
    max_attacks=15,
)

# Build the test set, evaluate the model's responses, and print a summary.
test_set = scenario.probe.get_test_set()
results = scenario.eval.evaluate(test_set)
results.display_summary()
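
To sweep several cognitive sub-categories in one run, you can loop over the scenario. Only "framing-bias" is confirmed by the example above; the other slugs are assumed to follow the same kebab-case form:

# Assumed slugs; only "framing-bias" appears in the example above.
for sub_category in ["anchoring-bias", "framing-bias", "positional-bias"]:
    scenario = ContentBiasScenario(
        target=target,
        sub_category=sub_category,
        max_attacks=15,
    )
    results = scenario.eval.evaluate(scenario.probe.get_test_set())
    results.display_summary()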

Testing for Stereotypical Bias

from trusttest.catalog import ContentBiasScenario

# Reuse the target from the previous example; use_dataset=True draws
# paired examples from the curated gender-bias dataset.
scenario = ContentBiasScenario(
    target=target,
    sub_category="gender-bias",
    use_dataset=True,
    max_attacks=50,
)

test_set = scenario.probe.get_test_set()
results = scenario.eval.evaluate(test_set)
results.display_summary()

When to Use

Use content bias testing when you need to:
  • Ensure fair treatment across demographic groups
  • Validate reasoning consistency
  • Meet responsible AI requirements
  • Audit models for discriminatory behavior
  • Prepare for deployment in sensitive applications