AI Detection Results Vary
Most PII detection tools rely on AI/ML models that produce probabilistic results. Run the same document again after a model update, or through a non-deterministic pipeline, and you can get different answers. Try explaining that to an auditor.
When regulators ask "How did you identify this data as personal information?", you need a clear, repeatable answer. Not "the model thought so."
✓ Deterministic (Regex)
- Same input = same output, always
- Fully auditable pattern rules
- No model drift over time
- Explainable to regulators
- 100% reproducible results
✗ Probabilistic (AI/ML)
- Results vary between runs
- Black box decision making
- Model drift over updates
- Hard to explain to auditors
- Confidence scores, not certainty
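To make the audit argument concrete, here is a minimal Python sketch of a deterministic detector that records which rule produced each finding. The rule names and regex patterns are illustrative assumptions, not the recognizers cloak.business actually ships.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    rule_id: str   # which rule fired (the answer an auditor asks for)
    pattern: str   # the exact regex that matched
    start: int
    end: int
    text: str


# Hypothetical rule set for illustration only.
RULES = {
    "EMAIL_ADDRESS": r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}",
    "US_SSN": r"\b\d{3}-\d{2}-\d{4}\b",
}


def detect(text: str) -> list[Finding]:
    """Apply every rule and return findings sorted by position."""
    findings = []
    for rule_id, pattern in RULES.items():
        for m in re.finditer(pattern, text):
            findings.append(Finding(rule_id, pattern, m.start(), m.end(), m.group()))
    return sorted(findings, key=lambda f: (f.start, f.rule_id))


doc = "Contact jane@example.com, SSN 123-45-6789."
first_run = detect(doc)
second_run = detect(doc)
```

Because each finding carries its `rule_id` and `pattern`, the question "how did you identify this?" can be answered by pointing at a specific rule, and two runs over the same input compare equal.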
317 Pattern Recognizers
cloak.business uses 317 deterministic regex patterns for structured data such as IDs, tax numbers, credit cards, IBANs, and email addresses. NLP models supplement these patterns for names and locations.
Built on Microsoft Presidio with custom recognizers optimized for global PII formats. ISO 27001:2022 certified servers in Germany. Data never leaves EU jurisdiction.
Benefits for Compliance Teams
Regex + NLP Hybrid Approach
Structured data (emails, SSNs, credit cards, IBANs) uses deterministic regex patterns. 100% reproducible. Perfect for compliance.
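As a rough illustration of regex-based detection for structured data, the following Python sketch pairs a pattern with a checksum (Luhn for card numbers, ISO 13616 mod-97 for IBANs) so that look-alike digit strings are rejected. The patterns, entity labels, and function names are assumptions made for this example, not the product's actual rules.

```python
import re

# Simplified candidate patterns (assumed for illustration).
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")
IBAN_RE = re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b")


def luhn_ok(number: str) -> bool:
    """Luhn checksum: double every second digit from the right."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def iban_ok(iban: str) -> bool:
    """Mod-97 check: move first four chars to the end, map A-Z to 10-35."""
    rearranged = iban[4:] + iban[:4]
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1


def find_structured(text: str) -> list[tuple[str, str]]:
    """Return (entity, value) pairs whose pattern AND checksum both pass."""
    hits = []
    for m in CARD_RE.finditer(text):
        if luhn_ok(m.group()):
            hits.append(("CREDIT_CARD", m.group()))
    for m in IBAN_RE.finditer(text):
        if iban_ok(m.group()):
            hits.append(("IBAN_CODE", m.group()))
    return hits


sample = "Card 4111 1111 1111 1111, IBAN DE89370400440532013000, bad 4111 1111 1111 1112"
hits = find_structured(sample)
```

The pattern and the checksum are both pure functions of the input, so the same text always yields the same hits. The last card number in `sample` matches the pattern but fails the Luhn check and is dropped.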
Unstructured data (names, organizations, locations) uses NLP models (spaCy, Stanza, XLM-RoBERTa) with confidence scores. All processing runs on German servers, with no third-party AI services.
Five anonymization methods: Replace, Redact, Mask, Hash (SHA-256), or Encrypt (AES-256-GCM).
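The sketch below shows, under simplifying assumptions, what four of these methods might look like as plain Python functions. The function names, the placeholder format, and the masking defaults are hypothetical. Encrypt is left out because AES-256-GCM requires a third-party cryptography library, which would make the example no longer self-contained.

```python
import hashlib


def replace(value: str, entity: str) -> str:
    """Replace: swap the value for a placeholder naming the entity type."""
    return f"<{entity}>"


def redact(value: str) -> str:
    """Redact: remove the value entirely."""
    return ""


def mask(value: str, visible: int = 4, char: str = "*") -> str:
    """Mask: hide all but the last `visible` characters."""
    keep = value[-visible:] if visible > 0 else ""
    return char * (len(value) - len(keep)) + keep


def hash_sha256(value: str) -> str:
    """Hash: one-way SHA-256 digest, hex encoded."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


email = "jane@example.com"
card = "4111111111111111"

replaced = replace(email, "EMAIL_ADDRESS")
redacted = redact(email)
masked = mask(card)
hashed = hash_sha256(email)
```

Replace, Redact, Mask, and Hash are irreversible as written, while encryption is the option to reach for when authorized users must recover the original value later. An unsalted hash of a low-entropy value such as a phone number can be guessed by brute force, so a keyed or salted variant may be preferable in practice.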
Try the PII Website Scanner
Scan any website for exposed personal information. Free tier includes 200 tokens monthly.
Further Reading
Why 317 Pattern Recognizers Beat 30
How custom recognizers with checksum validation achieve 82% higher accuracy than generic ML models.
How to Detect PII in Documents
Complete guide covering regex patterns, NLP models, and hybrid approaches for GDPR compliance.
ISO 27001 Annex A Compliance Mapping
How deterministic detection maps to 14 control domains across access, cryptography, and incident management.
When SaaS-Only Isn't Enough
Air-gapped networks, offline requirements, and why desktop apps still matter for sensitive environments.