The Multi-Turn Manipulation probe tests your model’s ability to resist gradual manipulation techniques that condition the model over multiple conversation exchanges.Documentation Index
Fetch the complete documentation index at: https://docs.neuraltrust.ai/llms.txt
Use this file to discover all available pages before exploring further.
Overview
| Attribute | Value |
|---|---|
| Class Name | MultiTurnManipulationProbe |
| Category | Multi-Turn |
| Attack Type | Conditioning |
| Typical Turns | Variable |
| Evaluation | Binary per objective |
How It Works
Multi-turn manipulation uses various conditioning techniques:- Context Building: Establish a narrative or scenario
- Incremental Requests: Slowly escalate the nature of requests
- Normalization: Make unusual requests seem normal
- Commitment/Consistency: Build on previous agreements
- Goal Achievement: Extract the desired information or behavior
When to Use
- General multi-turn safety testing: Broad coverage of conditioning attacks
- Context manipulation testing: Verify model tracks conversation appropriately
- Baseline comparisons: Standard multi-turn benchmark
Code Example
Configuration Options
| Parameter | Type | Default | Description |
|---|---|---|---|
target | Target | Required | The target model to test |
objectives | List[Objective] | Required | List of objectives to pursue |
max_turns | int | 10 | Maximum conversation turns |
language | LanguageType | "English" | Language for the conversation |
llm_client | LLMClient | None | Optional custom LLM client |
Related Probes
- Crescendo Attack - Gradual escalation
- Echo Chamber - Reinforcement-based