Overview
The Azure Toxicity Detection plugin is an advanced content moderation layer that leverages the Azure Content Safety API to analyze and filter potentially harmful content in both text and images. It provides comprehensive content analysis across multiple categories and supports both text-based and image-based content moderation. The plugin features a multi-category detection system that can identify various types of inappropriate content, with configurable severity levels for each category:

| Category | Description |
|---|---|
| Hate | Content expressing hatred or discrimination |
| Violence | Content depicting or promoting violence |
| SelfHarm | Content related to self-harm behaviors |
| Sexual | Sexually explicit or inappropriate content |
| Severity Level Description | FourSeverityLevels | EightSeverityLevels |
|---|---|---|
| Safe/Very Low Risk Content | 0 | 0, 1 |
| Low Risk Content | 2 | 2, 3 |
| Medium Risk Content | 4 | 4, 5 |
| High Risk Content | 6 | 6, 7 |
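To make the two formats concrete, here is the shape of a text-analysis result as returned by the Azure Content Safety REST API (the categoriesAnalysis array with per-category severity scores is the API's documented response format; the scores themselves are illustrative):

```json
{
  "categoriesAnalysis": [
    { "category": "Hate", "severity": 2 },
    { "category": "SelfHarm", "severity": 0 },
    { "category": "Sexual", "severity": 0 },
    { "category": "Violence", "severity": 0 }
  ]
}
```

With FourSeverityLevels, each severity is always one of 0, 2, 4, or 6; with EightSeverityLevels, the same content can score anywhere from 0 to 7, so a Hate score of 2 or 3 in the eight-level format would both map to 2 in the four-level one.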
Features
Content Analysis Capabilities
The Azure Toxicity Detection plugin offers comprehensive content analysis features:

• Multi-Modal Analysis: Supports both text and image content analysis, enabling content screening across different media types. The plugin processes text inputs for harmful language and analyzes images for inappropriate visual content, providing a unified moderation solution for mixed-media applications
• Configurable Categories: Flexible category selection and threshold configuration, allowing for tailored content moderation policies
• Severity Level Control: Two output types for different granularity needs, providing flexibility in how content is evaluated and acted upon
• Custom Actions: Configurable response actions and error messages, allowing for tailored responses to detected content violations
• Content Path Specification: Flexible JSON-path-based content extraction, enabling precise content selection from different request formats
Performance and Integration

The Azure Toxicity Detection plugin is engineered for seamless integration and efficient operation in production environments. It evaluates content in real time, enabling immediate moderation decisions, and its configurable endpoints provide separate processing paths for text and image content so that each content type is handled appropriately. Analysis results include precise severity scores for every configured category, enabling informed decisions based on content risk levels. Robust error handling and logging keep the plugin operating reliably under adverse conditions while providing clear visibility into the moderation process, making it a dependable choice for content moderation needs.

Configuration
Basic Configuration
Here’s a basic configuration example that enables both text and image content moderation.
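The layout below is a sketch rather than a verbatim sample: it assumes a JSON configuration file, and the PATH expressions ($.messages[0].content, $.image.data) are hypothetical placeholders for wherever your requests carry text and image payloads. The individual parameters are documented in the sections that follow:

```json
{
  "API_KEY": "<your-azure-content-safety-api-key>",
  "ENDPOINTS": {
    "TEXT": "<azure-text-analyze-endpoint>",
    "IMAGE": "<azure-image-analyze-endpoint>"
  },
  "OUTPUT_TYPE": "FourSeverityLevels",
  "CONTENT_TYPES": [
    { "TYPE": "text", "PATH": "$.messages[0].content" },
    { "TYPE": "image", "PATH": "$.image.data" }
  ],
  "CATEGORY_SEVERITY": {
    "Hate": 2,
    "Violence": 2,
    "SelfHarm": 2,
    "Sexual": 2
  },
  "ACTIONS": {
    "TYPE": "block",
    "MESSAGE": "Content blocked due to policy violation"
  }
}
```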
Configuration Parameters

Essential Settings
• API_KEY: Your Azure Content Safety API key
• ENDPOINTS: Configuration for text and image analysis endpoints (typical values are sketched after this list)
  • TEXT: Azure endpoint for text content analysis
  • IMAGE: Azure endpoint for image content analysis
• OUTPUT_TYPE: Severity level format (“FourSeverityLevels” or “EightSeverityLevels”)
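On Azure, the text and image analysis operations live at fixed paths on your Content Safety resource. The values below follow the published Azure Content Safety REST API (api-version 2023-10-01); the resource name is a placeholder:

```json
{
  "ENDPOINTS": {
    "TEXT": "https://<your-resource>.cognitiveservices.azure.com/contentsafety/text:analyze?api-version=2023-10-01",
    "IMAGE": "https://<your-resource>.cognitiveservices.azure.com/contentsafety/image:analyze?api-version=2023-10-01"
  }
}
```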
Content Type Configuration
• CONTENT_TYPES: Array of content type configurations
  • TYPE: Content type (“text” or “image”)
  • PATH: JSON path to extract content from the request (see the example after this list)
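For instance, given a request body like

```json
{
  "messages": [{ "role": "user", "content": "some user text" }],
  "image": { "data": "<base64-encoded image>" }
}
```

the configuration below would extract the first message’s text and the image payload for analysis. The request shape and the JSONPath-style expressions are illustrative assumptions, not a documented contract:

```json
{
  "CONTENT_TYPES": [
    { "TYPE": "text", "PATH": "$.messages[0].content" },
    { "TYPE": "image", "PATH": "$.image.data" }
  ]
}
```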
Category Settings
• CATEGORY_SEVERITY: Threshold configuration for each category (a sketch follows this list)
  • Values for FourSeverityLevels: 0, 2, 4, or 6
  • Values for EightSeverityLevels: 0 to 7
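As a sketch of how thresholds combine with the output type: with OUTPUT_TYPE set to FourSeverityLevels, the configuration below would treat low-risk Hate or SelfHarm content (severity 2) as a violation while only flagging Violence or Sexual content at medium risk (severity 4) and above. This assumes the common convention that a category severity at or above its threshold triggers the configured action; check the plugin’s behavior for the exact comparison semantics:

```json
{
  "CATEGORY_SEVERITY": {
    "Hate": 2,
    "SelfHarm": 2,
    "Violence": 4,
    "Sexual": 4
  }
}
```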
Action Configuration
• ACTIONS: Response configuration for detected violations (see the example below)
  • TYPE: Action type (e.g., “block”)
  • MESSAGE: Custom error message
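A minimal ACTIONS block that rejects violating requests with a custom message might look as follows; the “block” type comes from the list above, while the message text is just an example:

```json
{
  "ACTIONS": {
    "TYPE": "block",
    "MESSAGE": "Your request was blocked because it appears to contain harmful content."
  }
}
```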