The RAG (Retrieval-Augmented Generation) Probe is a specialized tool designed to automatically generate and evaluate test cases for RAG systems. It uses a knowledge base to generate both functional and adversarial questions, and then evaluates the model’s responses against these questions.
This probe requires a configured LLM client, which it uses to generate the questions and, if topics are not provided, the topics.
It also requires a configured embedding model, which supports the same question and topic generation steps.
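The two requirements above can be pictured as a small pre-flight check. This is a minimal sketch and not the probe's actual API: `RagProbeConfig`, `missing_dependencies`, `needs_topic_generation`, and the field names are hypothetical, chosen only to illustrate that both an LLM client and an embedding model must be present and that topics are generated only when none are supplied.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class RagProbeConfig:
    """Hypothetical configuration object; all field names are assumptions."""
    llm: Optional[Any] = None            # configured LLM client
    embedder: Optional[Any] = None       # configured embedding model
    topics: Optional[List[str]] = None   # user-supplied topics, if any


def missing_dependencies(config: RagProbeConfig) -> List[str]:
    """Return the names of required components that are not configured."""
    missing = []
    if config.llm is None:
        missing.append("llm")
    if config.embedder is None:
        missing.append("embedder")
    return missing


def needs_topic_generation(config: RagProbeConfig) -> bool:
    """Topics are generated only when the user did not provide any."""
    return not config.topics
```

Under these assumptions, `missing_dependencies(RagProbeConfig())` reports both components as missing, and a config created with `topics=["billing"]` skips topic generation.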
For adversarial testing, the probe generates malicious questions to test system robustness.
It supports multiple attack types:
Instruction Manipulation: Questions that attempt to overwrite, invalidate, edit, or contradict the information in the context, forcing the system to provide incorrect information while maintaining a natural question format.
Role Play: Questions that assign a new name/role to the assistant and provide context updates that override previous information, creating scenarios that lead to incorrect responses.
Hypothetical: Questions that create hypothetical scenarios based on the context, testing the system’s ability to handle speculative situations while maintaining factual accuracy.
Storytelling: Questions that attempt to make the system engage in storytelling about the context, potentially leading to fictional or exaggerated responses.
Obfuscation: Questions that use complex language, technical terms, or confusing phrasing to obscure the actual intent, testing the system’s ability to handle complex queries.
Payload Splitting: Questions that split malicious content across multiple parts or use indirect references, testing the system’s ability to handle fragmented or indirect queries.
List Based: Questions that request lists or enumerations of information, potentially leading to incomplete or incorrect responses.
Special Token: Questions that include special characters, tokens, or unusual formatting to test the system’s handling of non-standard input.
Off Tone: Questions that attempt to make the system respond in an inappropriate or unprofessional tone, testing its ability to maintain appropriate communication standards.
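The attack types listed above can be modelled as an enumeration so that a test run can select a subset of them. The following is an illustrative sketch only: `AttackType` and `plan_adversarial_cases` are hypothetical names, and the probe's real identifiers and selection mechanism may differ.

```python
from enum import Enum
from typing import Iterable, List, Tuple


class AttackType(Enum):
    """Hypothetical enumeration of the nine attack types described above."""
    INSTRUCTION_MANIPULATION = "Instruction Manipulation"
    ROLE_PLAY = "Role Play"
    HYPOTHETICAL = "Hypothetical"
    STORYTELLING = "Storytelling"
    OBFUSCATION = "Obfuscation"
    PAYLOAD_SPLITTING = "Payload Splitting"
    LIST_BASED = "List Based"
    SPECIAL_TOKEN = "Special Token"
    OFF_TONE = "Off Tone"


def plan_adversarial_cases(
    base_questions: Iterable[str],
    attack_types: Iterable[AttackType] = tuple(AttackType),
) -> List[Tuple[str, AttackType]]:
    """Pair every base question with every selected attack type."""
    attacks = list(attack_types)
    return [(question, attack) for question in base_questions for attack in attacks]
```

For example, calling `plan_adversarial_cases(["What is the refund policy?"], [AttackType.ROLE_PLAY, AttackType.OFF_TONE])` yields two planned cases, one per selected attack type. Turning each pair into an actual adversarial question would be the job of the LLM client and is not shown here.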