Data Masking

Overview

The Data Masking plugin detects sensitive data in requests and responses and replaces each match with a mask. It ships with the predefined entity types listed below and also accepts custom regex and keyword rules.

Predefined Entities

Financial Information

| Entity Type | Description | Default Mask | Example Pattern |
| --- | --- | --- | --- |
| credit_card | Credit card numbers | [MASKED_CC] | 4111-1111-1111-1111 |
| cvv | Card verification values | [MASKED_CVV] | CVV: 123 |
| bank_account | Bank account numbers | [MASKED_ACCOUNT] | 12345678901234 |
| iban | International Bank Account Numbers | [MASKED_IBAN] | GB29NWBK60161331926819 |
| swift_bic | SWIFT/BIC codes | [MASKED_BIC] | BOFAUS3N |
| routing_number | Bank routing numbers | [MASKED_ROUTING] | 021000021 |
| stripe_key | Stripe API keys | [MASKED_API_KEY] | sk_test_1234567890 |

Personal Identification

| Entity Type | Description | Default Mask | Example Pattern |
| --- | --- | --- | --- |
| ssn | Social Security Numbers | [MASKED_SSN] | 123-45-6789 |
| drivers_license | Driver's license numbers | [MASKED_LICENSE] | D123-4567-8901 |
| passport | Passport numbers | [MASKED_PASSPORT] | A12345678 |
| tax_id | Tax identification numbers | [MASKED_TAX_ID] | 12-3456789 |

Contact Information

| Entity Type | Description | Default Mask | Example Pattern |
| --- | --- | --- | --- |
| email | Email addresses | [MASKED_EMAIL] | user@example.com |
| phone_number | Phone numbers | [MASKED_PHONE] | +1-234-567-8900 |
| address | Physical addresses | [MASKED_ADDRESS] | 123 Main St |
| zip_code | Postal codes | [MASKED_ZIP] | 12345-6789 |

Technical Identifiers

| Entity Type | Description | Default Mask | Example Pattern |
| --- | --- | --- | --- |
| ip_address | IPv4 addresses | [MASKED_IP] | 192.168.1.1 |
| ip6_address | IPv6 addresses | [MASKED_IP6] | 2001:db8::1 |
| mac_address | MAC addresses | [MASKED_MAC] | 00:1A:2B:3C:4D:5E |
| uuid | Universally Unique Identifiers | [MASKED_UUID] | 550e8400-e29b-41d4-a716-446655440000 |
| device_imei | Device IMEI numbers | [MASKED_IMEI] | 123456789012345 |
| vehicle_vin | Vehicle identification numbers | [MASKED_VIN] | 1HGCM82633A123456 |

Authentication & Security

| Entity Type | Description | Default Mask | Example Pattern |
| --- | --- | --- | --- |
| password | Password fields | [MASKED_PASSWORD] | password=secret123 |
| api_key | API keys | [MASKED_API_KEY] | api_key=abcd1234 |
| access_token | Access tokens | [MASKED_TOKEN] | Bearer xyz789 |
| jwt_token | JWT tokens | [MASKED_JWT_TOKEN] | eyJhbGc... |

Cryptocurrency

| Entity Type | Description | Default Mask | Example Pattern |
| --- | --- | --- | --- |
| crypto_wallet | Cryptocurrency wallet addresses | [MASKED_WALLET] | 0x71C7656EC7ab88b098defB751B7401B5f6d8976F |

International Identifiers

European

| Entity Type | Description | Default Mask | Example Pattern |
| --- | --- | --- | --- |
| spanish_dni | Spanish national ID | [MASKED_DNI] | 12345678A |
| spanish_nie | Spanish foreigner ID | [MASKED_NIE] | X1234567L |
| spanish_cif | Spanish company tax ID | [MASKED_CIF] | B12345678 |
| spanish_nss | Spanish social security | [MASKED_NSS] | 12 34567890 12 |
| spanish_iban | Spanish IBAN | [MASKED_ES_IBAN] | ES91 2100 0418 4502 0005 1332 |
| french_nir | French social security | [MASKED_FR_NIR] | 1 84 12 76 451 089 46 |
| italian_cf | Italian fiscal code | [MASKED_IT_CF] | RSSMRA85T10A562S |
| german_id | German ID | [MASKED_DE_ID] | L01X00T47H |

Latin American

| Entity Type | Description | Default Mask | Example Pattern |
| --- | --- | --- | --- |
| brazilian_cpf | Brazilian individual taxpayer | [MASKED_BR_CPF] | 123.456.789-09 |
| brazilian_cnpj | Brazilian company registry | [MASKED_BR_CNPJ] | 12.345.678/0001-95 |
| mexican_curp | Mexican personal ID | [MASKED_MX_CURP] | BADD110313HCMLNS09 |
| mexican_rfc | Mexican tax ID | [MASKED_MX_RFC] | VECJ880326XXX |
| argentine_dni | Argentine national ID | [MASKED_DNI] | 12345678 |
| chilean_rut | Chilean tax ID | [MASKED_RUT] | 12.345.678-9 |
| colombian_cc | Colombian citizen ID | [MASKED_CC] | 12345678 |
| peruvian_dni | Peruvian national ID | [MASKED_DNI] | 12345678 |

Other

| Entity Type | Description | Default Mask | Example Pattern |
| --- | --- | --- | --- |
| us_medicare | US Medicare ID | [MASKED_MEDICARE] | 1234-567-890A |
| isin | International Securities ID | [MASKED_ISIN] | US0378331005 |
| date | Date formats | [MASKED_DATE] | 2024-03-14 |

Configuration Examples

Basic Configuration

    {
      "name": "data_masking",
      "enabled": true,
      "settings": {
        "apply_all": true,
        "similarity_threshold": 0.8,
        "max_edit_distance": 1
      }
    }

Selective Entity Configuration

    {
      "name": "data_masking",
      "enabled": true,
      "settings": {
        "apply_all": false,
        "predefined_entities": [
          {
            "entity": "credit_card",
            "enabled": true,
            "mask_with": "[MASKED_CC]"
          },
          {
            "entity": "email",
            "enabled": true,
            "mask_with": "[MASKED_EMAIL]"
          },
          {
            "entity": "ssn",
            "enabled": true
          }
        ]
      }
    }

Custom Rules Configuration

    {
      "name": "data_masking",
      "enabled": true,
      "settings": {
        "apply_all": false,
        "rules": [
          {
            "type": "regex",
            "pattern": "\\b\\d{6}\\b",
            "mask_with": "[MASKED_PIN]"
          },
          {
            "type": "keyword",
            "pattern": "internal-secret",
            "mask_with": "[MASKED_SECRET]"
          }
        ]
      }
    }

Combined Configuration

    {
      "name": "data_masking",
      "enabled": true,
      "settings": {
        "apply_all": false,
        "predefined_entities": [
          {
            "entity": "credit_card",
            "enabled": true
          },
          {
            "entity": "ssn",
            "enabled": true
          }
        ],
        "rules": [
          {
            "type": "regex",
            "pattern": "custom-[A-Z]{3}\\d{4}",
            "mask_with": "[MASKED_CUSTOM]"
          }
        ],
        "similarity_threshold": 0.8,
        "max_edit_distance": 1
      }
    }

Configuration Parameters

| Parameter | Type | Description | Default |
| --- | --- | --- | --- |
| apply_all | boolean | Enable all predefined entities | false |
| similarity_threshold | float | Threshold for fuzzy matching (0.0-1.0) | 0.8 |
| max_edit_distance | integer | Maximum allowed character differences | 1 |
| predefined_entities | array | List of predefined entities to enable | [] |
| rules | array | Custom masking rules | [] |

Predefined Entity Options

| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| entity | string | Entity type from predefined list | Yes |
| enabled | boolean | Enable/disable this entity | Yes |
| mask_with | string | Custom mask to apply | No |
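
To make the interaction between `apply_all`, `enabled`, and `mask_with` concrete, here is a minimal Python sketch of how a settings block could be resolved into a set of active masks. The function name `resolve_masks` and the precedence rules are assumptions for illustration; the plugin's actual resolution logic is not documented here. The default masks are a small subset copied from the entity tables.

```python
# Default masks copied from the entity tables (subset, for illustration).
DEFAULT_MASKS = {
    "credit_card": "[MASKED_CC]",
    "email": "[MASKED_EMAIL]",
    "ssn": "[MASKED_SSN]",
}


def resolve_masks(settings: dict) -> dict:
    """Return {entity: mask} for the entities a settings block enables.

    Assumed semantics (hypothetical): apply_all enables every known entity,
    then per-entity entries override the mask or disable the entity.
    """
    active = dict(DEFAULT_MASKS) if settings.get("apply_all", False) else {}
    for entry in settings.get("predefined_entities", []):
        name = entry["entity"]
        if not entry.get("enabled", False):
            active.pop(name, None)  # explicitly disabled
            continue
        # mask_with is optional; fall back to the entity's default mask.
        active[name] = entry.get("mask_with") or DEFAULT_MASKS[name]
    return active


selective = {
    "apply_all": False,
    "predefined_entities": [
        {"entity": "credit_card", "enabled": True, "mask_with": "[CARD]"},
        {"entity": "ssn", "enabled": True},
        {"entity": "email", "enabled": False},
    ],
}
print(resolve_masks(selective))
```

Under these assumptions, an entry without `mask_with` (like `ssn` above) picks up the default mask from the tables.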

Custom Rule Options

| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| type | string | "regex" or "keyword" | Yes |
| pattern | string | Pattern to match | Yes |
| mask_with | string | Mask to apply | No |
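
The two rule types can be illustrated with a short Python sketch that applies the rules from the Custom Rules Configuration example to a string. This is not the plugin's implementation: `apply_rules` is a hypothetical helper, the `[MASKED]` fallback for a missing `mask_with` is an assumption, and keyword rules are treated here as exact, case-sensitive substring matches.

```python
import re
from typing import List


def apply_rules(text: str, rules: List[dict]) -> str:
    """Apply regex and keyword rules in order and return the masked text."""
    for rule in rules:
        # Fallback mask is an assumption; the docs only say mask_with is optional.
        mask = rule.get("mask_with", "[MASKED]")
        if rule["type"] == "regex":
            # A function replacement keeps the mask literal (no backreference parsing).
            text = re.sub(rule["pattern"], lambda _m, mask=mask: mask, text)
        elif rule["type"] == "keyword":
            text = text.replace(rule["pattern"], mask)
        else:
            raise ValueError(f"unknown rule type: {rule['type']!r}")
    return text


rules = [
    {"type": "regex", "pattern": r"\b\d{6}\b", "mask_with": "[MASKED_PIN]"},
    {"type": "keyword", "pattern": "internal-secret", "mask_with": "[MASKED_SECRET]"},
]
print(apply_rules("PIN 123456 and internal-secret token", rules))
# prints: PIN [MASKED_PIN] and [MASKED_SECRET] token
```

Note that in JSON the regex is written with doubled backslashes (`"\\b\\d{6}\\b"`); after parsing, it is the same pattern as the raw string above.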

Technical Overview

The Data Masking plugin combines regex, keyword, and fuzzy matching with content-aware processing to identify and mask sensitive information in both requests and responses.

Core Components

  1. Pattern Detection Engine

    • Pre-compiled regex patterns
    • Keyword matching system
    • Fuzzy matching support
    • Pattern variant generation
  2. Content Processor

    • JSON data handling
    • Plain text processing
    • Streaming support
    • Character encoding handling
  3. Masking System

    • Configurable mask patterns
    • Context-aware masking
    • Format preservation
    • Custom mask rules
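
As an illustration of the JSON data handling listed under the Content Processor, the following Python sketch walks an already-parsed JSON value and masks every string it finds. The `mask_json` helper and the simplified email regex are assumptions made for this example, not the plugin's actual patterns.

```python
import re

# Simplified email pattern, for illustration only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask_json(value, pattern=EMAIL, mask="[MASKED_EMAIL]"):
    """Recursively mask pattern matches in every string of a parsed JSON value."""
    if isinstance(value, str):
        return pattern.sub(lambda _m: mask, value)
    if isinstance(value, list):
        return [mask_json(item, pattern, mask) for item in value]
    if isinstance(value, dict):
        # Keys are left untouched; only values are scanned.
        return {key: mask_json(item, pattern, mask) for key, item in value.items()}
    return value  # numbers, booleans, and null pass through unchanged


payload = {
    "user": {"email": "user@example.com", "age": 30},
    "notes": ["mail user@example.com now"],
}
print(mask_json(payload))
```

Working on the parsed structure rather than the raw text keeps the output valid JSON, since a mask can never break quoting or structure. Plain text bodies could instead be passed straight to `pattern.sub`.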

Implementation Details

Pattern Matching System

The plugin uses multiple matching strategies:

  1. Exact Pattern Matching

    • Regular expression based
    • Pre-compiled patterns
    • Optimized execution
    • Cache-friendly design
  2. Fuzzy Matching

    • Similarity threshold control
    • Edit distance calculation
    • Character substitution handling
    • Pattern variants support
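
The fuzzy matching strategy can be sketched in Python with a Levenshtein edit distance and a normalized similarity score. How the plugin actually combines `similarity_threshold` and `max_edit_distance` is not specified here; the sketch assumes both limits must hold, and the similarity formula (1 minus distance over the longer length) is also an assumption.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def is_fuzzy_match(candidate: str, keyword: str,
                   similarity_threshold: float = 0.8,
                   max_edit_distance: int = 1) -> bool:
    """Assumed rule: within the edit budget AND at least as similar as the threshold."""
    distance = edit_distance(candidate.lower(), keyword.lower())
    longest = max(len(candidate), len(keyword)) or 1
    similarity = 1.0 - distance / longest
    return distance <= max_edit_distance and similarity >= similarity_threshold


print(is_fuzzy_match("pasword", "password"))                # True: 1 edit, similarity 0.875
print(is_fuzzy_match("pin", "pan"))                         # False: 1 edit, similarity ~0.67
print(is_fuzzy_match("internal-secert", "internal-secret")) # False: 2 edits
```

With the defaults, short keywords rarely match fuzzily because a single edit already drops their similarity below 0.8, which limits false positives on short tokens.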

Content Processing Pipeline

  1. Input Processing

    • Content type detection
    • Encoding validation
    • Size verification
    • Format parsing
  2. Pattern Application

    • Priority-based scanning
    • Multi-pattern matching
    • Context preservation
    • Performance optimization
  3. Masking Execution

    • Format-specific masking
    • Character preservation
    • Length maintenance
    • Context awareness
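
The character preservation and length maintenance steps could look like the following Python sketch, which masks alphanumeric characters while keeping separators and a short trailing suffix. The `mask_preserving_format` helper and its `keep_last` and `mask_char` parameters are hypothetical; the configuration shown in this document only exposes fixed masks via `mask_with`.

```python
def mask_preserving_format(value: str, keep_last: int = 4, mask_char: str = "*") -> str:
    """Mask alphanumerics except the last `keep_last`; keep separators and length."""
    total = sum(ch.isalnum() for ch in value)
    to_mask = max(total - keep_last, 0)
    out = []
    for ch in value:
        if ch.isalnum() and to_mask > 0:
            out.append(mask_char)
            to_mask -= 1
        else:
            out.append(ch)  # separators and the preserved suffix pass through
    return "".join(out)


print(mask_preserving_format("4111-1111-1111-1111"))  # ****-****-****-1111
print(mask_preserving_format("123-45-6789"))          # ***-**-6789
```

The output has the same length and separator layout as the input, which can matter when downstream systems validate field formats.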

Performance Optimization

Pattern Compilation

  • Pre-compiled regex patterns
  • Pattern caching
  • Optimized matching order
  • Early termination
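
A minimal Python sketch of these ideas, assuming patterns arrive as strings in priority order: compiled patterns are cached with `functools.lru_cache`, and scanning stops at the first hit. The helper names `compiled` and `first_match` are made up for this example.

```python
import re
from functools import lru_cache
from typing import List, Optional, Tuple


@lru_cache(maxsize=256)
def compiled(pattern: str) -> "re.Pattern":
    """Compile each distinct pattern string once and reuse the result."""
    return re.compile(pattern)


def first_match(text: str, patterns: List[str]) -> Optional[Tuple[str, str]]:
    """Try patterns in the given order and stop at the first one that matches."""
    for pattern in patterns:
        match = compiled(pattern).search(text)
        if match:
            return pattern, match.group(0)  # early termination
    return None


ordered = [r"\b\d{6}\b", r"\b\d{3}-\d{2}-\d{4}\b"]
print(first_match("SSN 123-45-6789", ordered))
```

Early termination like this suits a yes/no check for whether content needs masking at all; masking every occurrence still requires running each enabled pattern.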

Content Processing

  • Streaming processing
  • Chunked analysis
  • Buffer management
  • Memory efficiency
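
One difficulty with chunked analysis is that a sensitive value can straddle a chunk boundary. The Python sketch below shows one possible way to handle this by holding back a short tail of each chunk. It assumes a pattern whose matches are at most `max_match_len` characters long and that uses no lookaround or boundary assertions; `mask_stream` and its parameters are hypothetical names, not the plugin's API.

```python
import re
from typing import Iterable, Iterator


def mask_stream(chunks: Iterable[str], pattern: "re.Pattern", mask: str,
                max_match_len: int) -> Iterator[str]:
    """Mask matches in a chunked stream, including matches split across chunks."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # Text from safe_end onward could still be the start of an unfinished match.
        safe_end = max(len(buffer) - (max_match_len - 1), 0)
        out, pos = [], 0
        for match in pattern.finditer(buffer):
            if match.start() >= safe_end:
                break
            out.append(buffer[pos:match.start()])
            out.append(mask)
            pos = match.end()
        emit_to = max(pos, safe_end)
        out.append(buffer[pos:emit_to])
        buffer = buffer[emit_to:]  # keep only the held-back tail in memory
        piece = "".join(out)
        if piece:
            yield piece
    tail = pattern.sub(lambda _m: mask, buffer)
    if tail:
        yield tail


ssn = re.compile(r"\d{3}-\d{2}-\d{4}")
chunks = ["ssn: 123-", "45-6789 and 987-65", "-4321."]
print("".join(mask_stream(chunks, ssn, "[MASKED_SSN]", max_match_len=11)))
# prints: ssn: [MASKED_SSN] and [MASKED_SSN].
```

Memory use stays bounded by the chunk size plus `max_match_len`, at the cost of delaying the last few characters of each chunk until the next one arrives.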

Security Considerations

  1. Mask Selection

    • Use meaningful masks
    • Maintain data format
    • Consider data context
    • Regular security review
  2. Pattern Testing

    • Comprehensive test cases
    • Edge case validation
    • Performance testing
    • Security validation
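
As a starting point for pattern testing, here is a small Python sketch of table-driven test cases for the six-digit regex from the Custom Rules Configuration example. The case list and the `failing_cases` helper are illustrative assumptions; a real suite would cover every enabled entity and rule.

```python
import re

PIN = re.compile(r"\b\d{6}\b")
MASK = "[MASKED_PIN]"

# (input, expected output) pairs, including edge cases around word boundaries.
CASES = [
    ("pin 123456", "pin [MASKED_PIN]"),
    ("code:654321.", "code:[MASKED_PIN]."),
    ("1234567", "1234567"),      # seven digits: too long, left alone
    ("12345", "12345"),          # five digits: too short, left alone
    ("id_123456", "id_123456"),  # underscore is a word character, so \b does not match
    ("", ""),
]


def failing_cases():
    """Return (input, actual, expected) for every case that does not match."""
    failures = []
    for text, expected in CASES:
        actual = PIN.sub(MASK, text)
        if actual != expected:
            failures.append((text, actual, expected))
    return failures


print(failing_cases())  # an empty list means every case passed
```

The `id_123456` case is worth noting: the value is not masked, which is exactly the kind of gap that edge case validation is meant to surface before a rule goes into production.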