Why Data Masking Matters
- Privacy Protection: Obfuscating personal information such as names, addresses, or payment details helps safeguard users from identity theft and targeted attacks.
- Regulatory Compliance: Many data protection regulations and standards (e.g., GDPR, HIPAA, PCI-DSS) require that sensitive data be handled securely. Masking helps meet these requirements without hindering normal business processes.
- Secure Logging and Monitoring: Because masked data does not reveal the underlying sensitive values, it can be included in logs, debug traces, and monitoring systems with far less risk of accidental leaks.
- Reduced Exposure: In the event of unauthorized access or a breach, masked data is significantly less useful to attackers, which limits the potential harm.
Plugin Highlights
- Automatic PII Detection: The plugin can identify a variety of PII entities (e.g., emails, phone numbers) and replace them with sanitized placeholders.
- Custom Rules: Define your own patterns or keywords to mask, giving you fine-grained control over exactly which data gets obfuscated.
- Flexible Application: Apply masking rules to requests, responses, or both, depending on your security and auditing requirements.
- Reversible Hashing: Temporarily mask sensitive data during processing while preserving the ability to restore original values in the final response.
- Logging-Friendly: Logs remain useful for debugging while unauthorized disclosure of sensitive information is prevented.
Technical Overview
The Data Masking plugin uses pattern matching and text processing to identify and mask sensitive information in both requests and responses.
Configuration Parameters
Parameter | Type | Description | Default |
---|---|---|---|
apply_all | boolean | Enable all predefined entities | false |
similarity_threshold | float | Threshold for fuzzy matching (0.0-1.0) | 0.8 |
max_edit_distance | integer | Maximum allowed character differences | 1 |
predefined_entities | array | List of predefined entities to enable | [] |
rules | array | Custom masking rules | [] |
reversible_hashing | object | Configuration for reversible hashing | {enabled: false} |
Predefined Entity Options
Parameter | Type | Description | Required |
---|---|---|---|
entity | string | Entity type from predefined list | Yes |
enabled | boolean | Enable/disable this entity | Yes |
mask_with | string | Custom mask to apply | No |
Custom Rule Options
Parameter | Type | Description | Required |
---|---|---|---|
type | string | "regex" or "keyword" | Yes |
pattern | string | Pattern to match | Yes |
mask_with | string | Mask to apply | No |
Reversible Hashing Options
Parameter | Type | Description | Required |
---|---|---|---|
enabled | boolean | Enable/disable reversible hashing | Yes |
secret | string | Secret key used for generating secure hash values | Yes (if enabled) |
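To make the tables above concrete, here is a minimal sketch of a settings object built from the documented parameters, with the documented defaults filled in and the "Required" columns checked. It is illustrative only: `build_settings` is a hypothetical helper, not part of the plugin, the entity name `email` is a placeholder for a name from the PII Entities list, and the way settings are actually submitted to the plugin is not shown here.

```python
from copy import deepcopy

# Defaults as listed in the Configuration Parameters table.
DEFAULTS = {
    "apply_all": False,
    "similarity_threshold": 0.8,
    "max_edit_distance": 1,
    "predefined_entities": [],
    "rules": [],
    "reversible_hashing": {"enabled": False},
}


def build_settings(overrides):
    """Hypothetical helper: merge overrides onto defaults and check required fields."""
    settings = deepcopy(DEFAULTS)
    settings.update(deepcopy(overrides))

    if not 0.0 <= settings["similarity_threshold"] <= 1.0:
        raise ValueError("similarity_threshold must be between 0.0 and 1.0")
    for entity in settings["predefined_entities"]:
        if "entity" not in entity or "enabled" not in entity:
            raise ValueError("predefined entities require 'entity' and 'enabled'")
    for rule in settings["rules"]:
        if rule.get("type") not in ("regex", "keyword") or "pattern" not in rule:
            raise ValueError("rules require a 'type' of regex/keyword and a 'pattern'")
    hashing = settings["reversible_hashing"]
    if hashing.get("enabled") and not hashing.get("secret"):
        raise ValueError("reversible_hashing requires 'secret' when enabled")
    return settings


settings = build_settings({
    "predefined_entities": [
        # "email" is a placeholder identifier; use a name from the PII Entities list.
        {"entity": "email", "enabled": True, "mask_with": "[EMAIL]"},
    ],
    "rules": [
        {"type": "keyword", "pattern": "internal-project", "mask_with": "[REDACTED]"},
        {"type": "regex", "pattern": r"\bACC-\d{6}\b", "mask_with": "[ACCOUNT]"},
    ],
})
```

Parameters that are not overridden, such as `max_edit_distance`, keep the defaults from the table.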
Core Components
- Pattern Detection Engine
  - Pre-compiled regex patterns
  - Keyword matching system
  - Fuzzy matching support
  - Pattern variant generation
- Content Processor
  - JSON data handling
  - Plain text processing
  - Streaming support
  - Character encoding handling
- Masking System
  - Configurable mask patterns
  - Context-aware masking
  - Format preservation
  - Custom mask rules
  - Reversible hashing capability
Reversible Hashing
The Data Masking plugin supports Reversible Hashing, a feature that allows sensitive data to be temporarily masked during processing but restored in the final response.
How It Works
- Pre-Request Stage:
  - When enabled, instead of replacing sensitive data with mask placeholders, the plugin generates HMAC-SHA256 hashes of the sensitive values, keyed with the configured secret
  - These hashes are used as replacements in the processed data
  - The original values and their corresponding hashes are stored in a temporary in-memory cache. HMAC-SHA256 is itself one-way, so this cache is what makes restoration possible
- Processing Stage:
  - All operations work with the hashed values, keeping sensitive data protected during processing
- Post-Response Stage:
  - Before returning the response, the plugin looks up each hash in the cache and replaces it with the original sensitive value
  - The final output therefore contains the original values, even though they were protected during processing
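The three stages can be sketched in a few lines of Python. This is a simplified illustration of the idea, not the plugin's code: the `ReversibleMasker` class, the email pattern, and the use of the raw hex digest as the replacement token are all assumptions made for the example.

```python
import hashlib
import hmac
import re


class ReversibleMasker:
    """Hypothetical sketch: HMAC-SHA256 tokens plus an in-memory lookup table."""

    def __init__(self, secret: str):
        self._secret = secret.encode("utf-8")
        self._originals = {}  # token -> original value, held in memory only

    def mask(self, text: str, pattern: re.Pattern) -> str:
        """Pre-request stage: swap each match for its keyed hash and remember it."""
        def replace(match):
            value = match.group(0)
            token = hmac.new(self._secret, value.encode("utf-8"),
                             hashlib.sha256).hexdigest()
            self._originals[token] = value
            return token
        return pattern.sub(replace, text)

    def restore(self, text: str) -> str:
        """Post-response stage: look up each token and put the original back."""
        for token, value in self._originals.items():
            text = text.replace(token, value)
        return text

    def clear(self) -> None:
        """Drop the cached originals once the response has been produced."""
        self._originals.clear()


# Illustrative pattern only; the plugin's own email detection may differ.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

masker = ReversibleMasker("example-secret")
original = "Contact alice@example.com today"
masked = masker.mask(original, EMAIL)    # email replaced by a 64-character hex token
restored = masker.restore(masked)        # token swapped back for the email
masker.clear()
```

Because the same value and secret always produce the same token, repeated occurrences of a value map to one cache entry, and intermediate processing can still tell that two masked fields were equal.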
Use Cases
- API Proxying: Mask sensitive data while it passes through your API gateway, but deliver the original data to the end client
- Logging & Analytics: Process and analyze data in a secure form, but maintain the ability to restore original values when needed
- Temporary Protection: Provide an additional layer of security during multi-step processing without permanently altering the data
Security Considerations
- The secret key used for hashing should be strong and kept secure
- Hashed values are only stored in memory temporarily and are automatically cleared after processing
- This feature should be used with caution, as it does ultimately reveal the original sensitive data in the final response
Implementation Details
Pattern Matching System
The plugin uses multiple matching strategies:
- Exact Pattern Matching
  - Regular expression based
  - Pre-compiled patterns
  - Optimized execution
  - Cache-friendly design
- Fuzzy Matching
  - Similarity threshold control
  - Edit distance calculation
  - Character substitution handling
  - Pattern variants support
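To illustrate how `similarity_threshold` and `max_edit_distance` could interact during fuzzy matching, the sketch below computes a Levenshtein edit distance and a normalized similarity score. The exact similarity formula the plugin uses and the way it combines the two parameters are not documented here, so requiring both conditions is an assumption of this example, and `fuzzy_match` is a hypothetical name.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using two rolling rows of the DP table."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """1.0 for identical strings, falling towards 0.0 as the distance grows."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def fuzzy_match(word: str, keyword: str,
                similarity_threshold: float = 0.8,
                max_edit_distance: int = 1) -> bool:
    """Assumed rule: accept only if both limits are satisfied."""
    word, keyword = word.lower(), keyword.lower()
    return (edit_distance(word, keyword) <= max_edit_distance
            and similarity(word, keyword) >= similarity_threshold)
```

With the default values, `fuzzy_match("passw0rd", "password")` is accepted (one substitution, similarity 0.875), while `fuzzy_match("cat", "car")` is rejected because a single edit on a three-letter word drops the similarity below 0.8. Short keywords are therefore effectively matched exactly.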
Content Processing Pipeline
- Input Processing
  - Content type detection
  - Encoding validation
  - Size verification
  - Format parsing
- Pattern Application
  - Priority-based scanning
  - Multi-pattern matching
  - Context preservation
  - Performance optimization
- Masking Execution
  - Format-specific masking
  - Character preservation
  - Length maintenance
  - Context awareness
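The pipeline above can be pictured with a small sketch that detects JSON by content type, walks the parsed structure, and applies a format-preserving mask that keeps separators and length. Everything named here (`process`, `mask_node`, `preserve_format`, the card-number pattern, and the choice to keep the last four characters) is a hypothetical illustration rather than the plugin's actual behavior.

```python
import json
import re

# Illustrative pattern: four groups of four digits separated by "-" or " ".
CARD = re.compile(r"\b\d{4}[- ]\d{4}[- ]\d{4}[- ]\d{4}\b")


def preserve_format(value: str, keep_last: int = 4) -> str:
    """Mask letters and digits, keeping separators, length, and the last few characters."""
    total = sum(ch.isalnum() for ch in value)
    seen = 0
    out = []
    for ch in value:
        if ch.isalnum():
            seen += 1
            out.append(ch if seen > total - keep_last else "*")
        else:
            out.append(ch)  # separators are preserved as-is
    return "".join(out)


def mask_text(text: str) -> str:
    """Pattern application and masking for a single string."""
    return CARD.sub(lambda m: preserve_format(m.group(0)), text)


def mask_node(node):
    """Recursively mask string values inside parsed JSON; other types pass through."""
    if isinstance(node, str):
        return mask_text(node)
    if isinstance(node, list):
        return [mask_node(item) for item in node]
    if isinstance(node, dict):
        return {key: mask_node(value) for key, value in node.items()}
    return node


def process(body: str, content_type: str) -> str:
    """Input processing: parse JSON when declared, otherwise treat the body as plain text."""
    if "json" in content_type.lower():
        try:
            return json.dumps(mask_node(json.loads(body)))
        except json.JSONDecodeError:
            pass  # malformed JSON falls back to plain-text masking
    return mask_text(body)


masked_json = process('{"card": "4111-1111-1111-1234", "amount": 10}', "application/json")
masked_text = process("pay 4111 1111 1111 1234 now", "text/plain")
```

Masking values after parsing, rather than running patterns over the raw JSON string, keeps the document structure intact and means non-string fields such as `amount` are left untouched.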
Performance Optimization
Pattern Compilation
- Pre-compiled regex patterns
- Pattern caching
- Optimized matching order
- Early termination
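As a rough illustration of pre-compilation and caching, the sketch below compiles each rule once, reuses the compiled pattern on later calls, and returns early for empty input. The function names and the fallback mask `*****` are assumptions for the example; the plugin's real default mask and rule ordering are not specified in this section.

```python
import re
from functools import lru_cache


@lru_cache(maxsize=256)
def compile_rule(rule_type: str, pattern: str) -> re.Pattern:
    """Compile a rule once; repeated calls with the same arguments hit the cache."""
    if rule_type == "keyword":
        # Keywords are matched literally, ignoring case (an assumption of this sketch).
        return re.compile(re.escape(pattern), re.IGNORECASE)
    return re.compile(pattern)


def apply_rules(text: str, rules: list) -> str:
    """Apply rules in the order given, skipping all work for empty input."""
    if not text:
        return text  # early termination: nothing to scan
    for rule in rules:
        compiled = compile_rule(rule["type"], rule["pattern"])
        mask = rule.get("mask_with", "*****")  # hypothetical fallback mask
        # A callable replacement avoids backslash interpretation in the mask text.
        text = compiled.sub(lambda _match: mask, text)
    return text


rules = [
    {"type": "regex", "pattern": r"\bACC-\d{6}\b", "mask_with": "[ACCOUNT]"},
    {"type": "keyword", "pattern": "project-x", "mask_with": "[REDACTED]"},
]
masked = apply_rules("Order ACC-123456 for Project-X", rules)
```

Listing the most specific rules first is one simple way to get a sensible matching order, since an earlier rule's mask replaces text before later rules see it.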
Content Processing
- Streaming processing
- Chunked analysis
- Buffer management
- Memory efficiency
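Chunked analysis has one subtle requirement: a sensitive value can straddle two chunks, so some text must be held back until the next chunk arrives. The sketch below shows one way to do this with a small carry-over buffer. It is an assumption-laden illustration, not the plugin's algorithm: `mask_stream` is a hypothetical name, it assumes no match is longer than `overlap` characters, and it ignores context-sensitive anchors such as `\b` at the exact point where the buffer is cut.

```python
import re
from typing import Iterable, Iterator


def mask_stream(chunks: Iterable[str], pattern: re.Pattern,
                mask: str, overlap: int = 64) -> Iterator[str]:
    """Mask matches in a stream of text chunks, holding back a tail of `overlap` chars."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        cut = max(len(buffer) - overlap, 0)  # text before `cut` is safe to emit
        out = []
        position = 0
        for match in pattern.finditer(buffer):
            if match.end() > cut:
                # The match reaches into the held-back tail: keep it whole for later.
                cut = min(cut, match.start())
                break
            out.append(buffer[position:match.start()])
            out.append(mask)
            position = match.end()
        out.append(buffer[position:cut])
        buffer = buffer[cut:]  # carry the tail into the next iteration
        emitted = "".join(out)
        if emitted:
            yield emitted
    if buffer:
        # End of stream: no more data can arrive, so mask whatever is left.
        yield pattern.sub(lambda _match: mask, buffer)


# Illustrative pattern only; the plugin's own email detection may differ.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
text = "Contact alice@example.com or bob@example.org for the quarterly report, thanks."
chunks = [text[i:i + 10] for i in range(0, len(text), 10)]
streamed = "".join(mask_stream(chunks, EMAIL, "[EMAIL]", overlap=24))
```

Memory use stays bounded by roughly one chunk plus the overlap, at the cost of delaying output by up to `overlap` characters.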
Security Considerations
- Mask Selection
  - Use meaningful masks
  - Maintain data format
  - Consider data context
  - Regular security review
- Pattern Testing
  - Comprehensive test cases
  - Edge case validation
  - Performance testing
  - Security validation
Additional Topics
- PII Entities: A breakdown of built-in entity types the plugin can detect.
- Custom Rules: How to create and apply your own masking rules.
- Data Masking Examples: Real-world scenarios and configuration examples to demonstrate the plugin’s capabilities.