A lightweight, modular Python library for anonymizing structured, semi-structured, and unstructured data while preserving its statistical utility. Built for GDPR, HIPAA, and modern privacy compliance.
Every masking decision trades privacy against analytical utility. MaskMe makes this trade-off explicit, letting you choose the right balance for each field in your dataset.
Six complementary approaches. Apply them selectively to preserve analytical value where it matters.
Permanently delete the field. Maximum privacy for that field, since no trace of the value remains in the output.
One-way cryptographic hash with secret salt. Enables consistent joins without exposing originals.
Replace portions of a string. Preserves format and partial visibility for validation.
Add calibrated Gaussian or Laplacian noise with optional differential privacy guarantees.
Group into bins or categories. Reduces granularity while maintaining analytical structure.
Retain unchanged. For analytical payloads requiring full precision downstream.
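To make four of these approaches concrete, here is a minimal standard-library sketch of salted hashing, partial masking, Laplace noise, and generalization. It is not MaskMe's implementation: the function names, the `SECRET_SALT` constant, and the parameter defaults are all hypothetical, chosen only to illustrate the ideas described above.

```python
import hashlib
import hmac
import random
from datetime import date

# Hypothetical secret; in practice load it from a secure store, never from source.
SECRET_SALT = b"replace-with-a-secret-key"


def hash_value(value: str, salt: bytes = SECRET_SALT) -> str:
    """Keyed one-way hash: equal inputs give equal tokens, so joins still work."""
    return hmac.new(salt, value.encode("utf-8"), hashlib.sha256).hexdigest()


def mask_value(value: str, visible: int = 4, fill: str = "*") -> str:
    """Hide everything except the last `visible` characters."""
    if visible <= 0:
        return fill * len(value)
    hidden = max(len(value) - visible, 0)
    return fill * hidden + value[hidden:]


def laplace_noise(value: float, sensitivity: float, epsilon: float,
                  rng: random.Random) -> float:
    """Laplace mechanism: the noise scale is sensitivity / epsilon."""
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponentials is Laplace-distributed.
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return value + noise


def generalize_birth_date(iso_date: str) -> str:
    """Reduce a full ISO birth date to its decade."""
    year = date.fromisoformat(iso_date).year
    return f"{year // 10 * 10}s"
```

Under these assumptions, `mask_value("PAT-2026-0045")` keeps only the trailing `0045`, and `generalize_birth_date("1998-05-12")` returns `"1990s"`. A smaller `epsilon` gives a larger noise scale, which is the privacy versus utility trade-off in its most direct form.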
Two interfaces, one engine. Use the Python library for programmatic control or the CLI for bulk processing.
Python library · Full programmatic control
A five-step workflow: import, prepare the data, define rules, anonymize, and collect the output.
# 1. Import MaskMe
from maskme import MaskMe
# 2. Prepare sensitive data
data = {
    "patient": {
        "id": "PAT-2026-0045",
        "full_name": "Lucien Kiemde",
        "birth_date": "1998-05-12",
        "address": "01 BP 548, Ouagadougou"
    },
    "medical_history": ["Asthma", "Allergies"],
    "diagnosis": "Acute Respiratory Distress",
    "prescription": "Salbutamol 100mcg - 2 puffs every 4h"
}
# 3. Define anonymization rules
rules = {
    "patient.id": "hash",
    "patient.full_name": "redact",
    "patient.birth_date": "generalize",
    "patient.address": "redact",
    "diagnosis": "keep",
    "prescription": "keep"
}
# 4. Apply anonymization
engine = MaskMe(rules)
result = engine.mask(data)
# 5. Collect output
print(result)
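The rule keys above use dotted paths such as `patient.id` to reach nested fields. How MaskMe resolves them internally is not shown here, so the following is only a sketch of one plausible approach. The `apply_rules` helper, the `TRANSFORMS` table, and the `_KEY` constant are hypothetical stand-ins, not part of the library.

```python
import hashlib
import hmac
from copy import deepcopy
from typing import Any, Callable, Dict

_KEY = b"replace-with-a-secret-key"  # hypothetical secret salt

# Hypothetical stand-ins for value-transforming strategies.
TRANSFORMS: Dict[str, Callable[[Any], Any]] = {
    "hash": lambda v: hmac.new(_KEY, str(v).encode("utf-8"), hashlib.sha256).hexdigest(),
    "generalize": lambda v: str(v)[:3] + "0s",  # ISO date string -> decade label
}


def apply_rules(record: dict, rules: Dict[str, str]) -> dict:
    """Return a copy of `record` with each dotted-path rule applied."""
    result = deepcopy(record)  # never mutate the caller's data
    for path, strategy in rules.items():
        *parents, leaf = path.split(".")
        node: Any = result
        for key in parents:
            # Walk down one level; give up if the path leaves dict territory.
            node = node.get(key) if isinstance(node, dict) else None
        if not isinstance(node, dict) or leaf not in node:
            continue  # path absent from this record, nothing to do
        if strategy == "redact":
            del node[leaf]  # remove the field entirely
        elif strategy != "keep":
            # Unknown strategy names raise KeyError here.
            node[leaf] = TRANSFORMS[strategy](node[leaf])
    return result
```

In this sketch, fields with no rule (such as `medical_history` above) pass through unchanged; whether MaskMe itself keeps or drops unlisted fields is not stated here and should be checked before relying on it.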
Command line · Bulk file processing
Stream or batch process files from the terminal.
# Process a CSV file
maskme --input data.csv \
    --rules rules.json \
    --format csv \
    --output clean.csv
# Streaming with pipes
cat data.jsonl | maskme \
    --rules rules.json \
    --format jsonl \
    > clean.jsonl
# Validate utility
maskme validate \
    --original data.csv \
    --masked clean.csv \
    --column salary
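The metrics that `maskme validate` reports are not described here. As an illustration of what a utility check on a numeric column might compare, the sketch below computes the mean and standard deviation before and after masking. The `column_summary` and `utility_report` helpers are hypothetical and assume rows shaped like the output of `csv.DictReader`, with at least two rows per file.

```python
import statistics
from typing import Dict, Iterable, Mapping


def column_summary(rows: Iterable[Mapping[str, str]], column: str) -> Dict[str, float]:
    """Mean and sample standard deviation of one numeric column."""
    values = [float(row[column]) for row in rows]
    return {"mean": statistics.fmean(values), "stdev": statistics.stdev(values)}


def utility_report(original: Iterable[Mapping[str, str]],
                   masked: Iterable[Mapping[str, str]],
                   column: str) -> Dict[str, Dict[str, float]]:
    """Compare summary statistics of a column before and after masking."""
    before = column_summary(original, column)
    after = column_summary(masked, column)
    return {
        name: {
            "original": before[name],
            "masked": after[name],
            "abs_diff": abs(after[name] - before[name]),
        }
        for name in before
    }
```

Small differences suggest the masked column still supports aggregate analysis. Matching means and deviations say nothing about privacy, so a check like this complements the choice of masking strategy rather than replacing it.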
If you use MaskMe in your research, please cite it using one of the formats below.
MaskMe builds upon established principles in data privacy, differential privacy, and statistical disclosure control.