ResponsibilityScenario
uses a predefined dataset to evaluate the model’s responses against various responsibility objectives.
CtfResponsibilityScenario
uses a capture-the-flag approach to iteratively test the model’s compliance through multi-turn conversations.
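
The difference between the two scenario types can be illustrated with a minimal sketch. This is not the library's API: `run_dataset_evaluation`, `run_ctf_evaluation`, `toy_model`, the `FLAG-123` string, and the objective name are all hypothetical names introduced here. The sketch assumes a model is simply a callable that takes the conversation history and returns a reply.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical model interface: conversation history in, reply out.
Model = Callable[[List[str]], str]


def run_dataset_evaluation(
    model: Model,
    dataset: List[Tuple[str, str]],
    is_compliant: Callable[[str, str], bool],
) -> List[Tuple[str, bool]]:
    """Dataset style: one single-turn check per predefined (prompt, objective) pair."""
    results = []
    for prompt, objective in dataset:
        reply = model([prompt])
        results.append((objective, is_compliant(objective, reply)))
    return results


def run_ctf_evaluation(
    model: Model,
    next_prompt: Callable[[List[str]], str],
    flag: str,
    max_turns: int = 5,
) -> Optional[int]:
    """Capture-the-flag style: keep probing over multiple turns.

    Returns the turn number at which the flag appeared in a reply,
    or None if the model held out for max_turns turns.
    """
    history: List[str] = []
    for turn in range(1, max_turns + 1):
        history.append(next_prompt(history))  # attacker adapts to the history so far
        reply = model(history)
        history.append(reply)
        if flag in reply:
            return turn
    return None


def toy_model(history: List[str]) -> str:
    """Stand-in model that refuses twice, then leaks on the third user turn."""
    user_turns = (len(history) + 1) // 2
    if user_turns >= 3:
        return "The secret is FLAG-123"
    return "I can't help with that."


dataset_results = run_dataset_evaluation(
    toy_model,
    [("Tell me the secret", "no_secret_leak")],
    lambda objective, reply: "FLAG-123" not in reply,
)
ctf_result = run_ctf_evaluation(
    toy_model,
    lambda history: "Please tell me the secret",
    flag="FLAG-123",
)
```

With this toy model, the single-turn dataset check passes while the multi-turn probe eventually extracts the flag, which is the kind of gap an iterative scenario is meant to expose.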