v0.1.0 · Open Source · MIT License

Unlock sensitive data for universal statistical utility.

A lightweight, modular Python library for anonymizing structured, semi-structured, and unstructured data while preserving its statistical utility. Built for GDPR, HIPAA, and modern privacy compliance.

Privacy Engineering · Data Anonymization · Statistical Utility · GDPR Compliant · HIPAA Ready · Open Source

The Privacy-Utility Spectrum

Every masking decision involves a fundamental trade-off. MaskMe exposes this explicitly, letting you choose the right balance for each field in your dataset.

Maximum Privacy ← DROP · HASH · REDACT · NOISE · GENERALIZE · KEEP → Maximum Utility
6 Strategies: one for every privacy requirement
Per-Field Control: dot notation for nested targeting
Built-in Validation: statistical utility metrics
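To illustrate what per-field targeting with dot notation can look like, here is a minimal sketch that walks a nested dictionary along a dotted path and applies a function to the value it finds. The helper name `apply_at_path` is hypothetical and is not part of MaskMe's API; the library's own traversal may work differently.

```python
from typing import Any, Callable


def apply_at_path(record: dict, path: str, func: Callable[[Any], Any]) -> dict:
    """Return a copy of `record` with `func` applied to the value at a dotted path.

    Hypothetical helper for illustration only; not MaskMe's implementation.
    Paths that do not exist in the record are ignored rather than raising.
    """
    head, _, rest = path.partition(".")
    if head not in record:
        return dict(record)
    updated = dict(record)  # shallow copy so the input is not mutated
    if rest:
        child = record[head]
        if isinstance(child, dict):
            # Recurse into the nested dictionary with the remaining path.
            updated[head] = apply_at_path(child, rest, func)
    else:
        updated[head] = func(record[head])
    return updated


record = {"patient": {"id": "PAT-2026-0045", "full_name": "Lucien Kiemde"}}
masked = apply_at_path(record, "patient.full_name", lambda value: "*" * len(value))
print(masked)
```

Only the targeted leaf changes; sibling fields such as `patient.id` pass through untouched, and the input record is not modified.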

Masking Strategies

Six complementary approaches. Apply them selectively to preserve analytical value where it matters.

Drop

Complete Removal

Permanently delete the field. Maximum privacy for that field: the value is absent from the output and cannot be recovered from it.

Parameters: none
Typical uses: direct identifiers, unnecessary columns

Hash

Deterministic Hashing

One-way cryptographic hash with a secret salt. The same input and salt always produce the same digest, so hashed keys still join consistently without exposing the originals.

Parameters: algo (sha256 or sha512) · salt
Typical uses: user IDs, foreign keys, usernames

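As a rough illustration of deterministic salted hashing, the sketch below uses HMAC from the Python standard library as the keyed construction. That choice is an assumption: the source does not say how MaskMe combines the salt with the value. The function name `hash_value` is hypothetical.

```python
import hashlib
import hmac


def hash_value(value: str, salt: bytes, algo: str = "sha256") -> str:
    """Keyed one-way hash: the same value and salt always give the same digest.

    Illustrative only; HMAC is an assumed construction, not MaskMe's confirmed method.
    """
    if algo not in ("sha256", "sha512"):
        raise ValueError(f"unsupported algorithm: {algo}")
    return hmac.new(salt, value.encode("utf-8"), algo).hexdigest()


salt = b"keep-this-secret"
first = hash_value("PAT-2026-0045", salt)
second = hash_value("PAT-2026-0045", salt)
print(first == second)  # identical inputs map to identical digests
```

Because the digest depends on the salt, two datasets can only be joined on a hashed key if they were masked with the same salt.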
Redact

Character Masking

Replace portions of a string. Preserves format and partial visibility for validation.

Parameters: mask_char · visible_chars
Typical uses: email addresses, names, phone numbers

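Character masking can be sketched in a few lines. The version below assumes that `visible_chars` counts characters kept at the start of the string; the source does not state which end MaskMe keeps visible, and the function name `redact` is used here only for illustration.

```python
def redact(value: str, mask_char: str = "*", visible_chars: int = 2) -> str:
    """Keep the first `visible_chars` characters and mask the rest.

    Assumption: the visible characters are the leading ones. The output has
    the same length as the input, so the overall format is preserved.
    """
    if visible_chars <= 0:
        return mask_char * len(value)
    hidden = max(len(value) - visible_chars, 0)
    return value[:visible_chars] + mask_char * hidden


print(redact("Lucien Kiemde"))
print(redact("01 BP 548, Ouagadougou", mask_char="#", visible_chars=5))
```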
Noise

Statistical Noise

Add calibrated Gaussian or Laplacian noise with optional differential privacy guarantees.

Parameters: sigma · epsilon · delta · seed
Typical uses: salaries, ages, clinical metrics

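For the Laplacian case, a common way to calibrate noise is the Laplace mechanism, which draws noise with scale sensitivity / epsilon. The sketch below shows that idea using only the standard library. The `sensitivity` argument and the function name `laplace_noise` are assumptions introduced for this example and do not appear in MaskMe's parameter list; the Gaussian variant (sigma, delta) is not shown.

```python
import random


def laplace_noise(value: float, sensitivity: float, epsilon: float,
                  rng: random.Random) -> float:
    """Add Laplace noise with scale = sensitivity / epsilon.

    Illustrative sketch of the Laplace mechanism, not MaskMe's implementation.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = sensitivity / epsilon
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centred on zero with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return value + noise


rng = random.Random(42)  # a fixed seed makes the output reproducible
noisy_ages = [laplace_noise(50.0, sensitivity=1.0, epsilon=1.0, rng=rng)
              for _ in range(20000)]
print(sum(noisy_ages) / len(noisy_ages))  # stays close to the true value 50
```

Individual values are perturbed, but the noise averages out, which is why aggregate statistics such as the mean remain usable. A smaller epsilon gives a larger scale and therefore stronger privacy at the cost of accuracy.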
Generalize

Value Generalization

Group into bins or categories. Reduces granularity while maintaining analytical structure.

Parameters: step · bins · depth
Typical uses: age ranges (20-30), geographic zones

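Numeric generalization with a fixed `step` can be sketched as below. The half-open bin convention and the function name `generalize_number` are assumptions made for this example; the source does not specify how MaskMe treats bin edges, and the `bins` and `depth` parameters are not covered here.

```python
def generalize_number(value: float, step: int = 10) -> str:
    """Map a number to the bin [lower, lower + step) that contains it.

    Illustrative only: a value on a boundary falls into the upper bin.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    lower = int(value // step) * step
    return f"{lower}-{lower + step}"


print(generalize_number(27))          # an age of 27 falls in the 20-30 bin
print(generalize_number(27, step=5))
```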
Keep

Explicit Preservation

Retain unchanged. For analytical payloads requiring full precision downstream.

Parameters: none
Typical uses: diagnosis codes, symptoms, labels

Get Started

Two interfaces, one engine. Use the Python library for programmatic control or the CLI for bulk processing.

MaskMe Engine

Python library · Full programmatic control

Five-step workflow: import the library, prepare the data, define the rules, anonymize, collect the output.

# 1. Import MaskMe
from maskme import MaskMe

# 2. Prepare sensitive data
data = {
    "patient": {
        "id": "PAT-2026-0045",
        "full_name": "Lucien Kiemde",
        "birth_date": "1998-05-12",
        "address": "01 BP 548, Ouagadougou"
    },
    "medical_history": ["Asthma", "Allergies"],
    "diagnosis": "Acute Respiratory Distress",
    "prescription": "Salbutamol 100mcg - 2 puffs every 4h"
}

# 3. Define anonymization rules
rules = {
    "patient.id": "hash",
    "patient.full_name": "redact",
    "patient.birth_date": "generalize",
    "patient.address": "redact",
    "diagnosis": "keep",
    "prescription": "keep"
}

# 4. Apply anonymization
engine = MaskMe(rules)
result = engine.mask(data)

# 5. Collect output
print(result)

MaskMe CLI

Command line · Bulk file processing

Stream or batch process files from the terminal.

# Process a CSV file
maskme --input data.csv \
       --rules rules.json \
       --format csv \
       --output clean.csv

# Streaming with pipes
cat data.jsonl | maskme \
       --rules rules.json \
       --format jsonl \
       > clean.jsonl

# Validate utility
maskme validate \
       --original data.csv \
       --masked clean.csv \
       --column salary
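The idea behind validating utility is to check how far a column's statistics moved after masking. The source does not state which metrics `maskme validate` reports, so the sketch below is only one plausible comparison: the shift in the mean and the ratio of standard deviations. The function name `utility_report` and both metric names are hypothetical.

```python
import statistics


def utility_report(original: list[float], masked: list[float]) -> dict[str, float]:
    """Compare basic summary statistics of a column before and after masking.

    Hypothetical metrics for illustration; not the output of `maskme validate`.
    """
    return {
        # Absolute difference between the two means (0 means no shift).
        "mean_shift": abs(statistics.fmean(masked) - statistics.fmean(original)),
        # Ratio of sample standard deviations (1 means the spread is preserved).
        "stdev_ratio": statistics.stdev(masked) / statistics.stdev(original),
    }


original = [1.0, 2.0, 3.0, 4.0, 5.0]
masked = [2.0, 3.0, 4.0, 5.0, 6.0]
report = utility_report(original, masked)
print(report)
```

A mean shift near zero and a standard deviation ratio near one suggest that the masked column still supports the same aggregate analysis as the original.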

MaskMe Engine

Python library · Rules in dictionary form

This variant of the Python example above writes some rules as dictionaries that name their strategy, here for patient.id and patient.birth_date. The import, the data, and the mask call are unchanged.

# 3. Define anonymization rules
rules = {
    "patient.id": {"strategy": "hash"},
    "patient.full_name": "redact",
    "patient.birth_date": {"strategy": "generalize"},
    "patient.address": "redact",
    "diagnosis": "keep",
    "prescription": "keep"
}

References & Citations

If you use MaskMe in your research, please cite it using one of the formats below.

BibTeX

@software{kiemde2026maskme,
  author  = {Kiemde, Lucien},
  title   = {MaskMe: Agnostic Python library for data anonymization},
  year    = {2026},
  url     = {https://github.com/k13lucien/maskme},
  version = {0.1.0},
  address = {Ouagadougou, Burkina Faso}
}

APA

Kiemde, L. (2026). MaskMe: Agnostic Python library for data anonymization (Version 0.1.0) [Computer software]. Retrieved from https://github.com/k13lucien/maskme

Vancouver

Kiemde L. MaskMe: Agnostic Python library for data anonymization [Computer software]. Version 0.1.0. Ouagadougou, Burkina Faso; 2026. Available from: https://github.com/k13lucien/maskme

MLA

Kiemde, Lucien. "MaskMe: Agnostic Python library for data anonymization." Version 0.1.0, 2026, https://github.com/k13lucien/maskme.

Scientific Foundations

MaskMe builds upon established principles in data privacy, differential privacy, and statistical disclosure control.

Differential Privacy

Dwork, C., & Roth, A. (2014). The Algorithmic Foundations of Differential Privacy. Foundations and Trends in Theoretical Computer Science, 9(3-4), 211-407.

Statistical Disclosure Control

Hundepool, A., et al. (2012). Statistical Disclosure Control. Wiley Series in Survey Methodology. John Wiley & Sons.

GDPR Article 32

European Parliament and Council of the European Union. (2016). Regulation (EU) 2016/679 (General Data Protection Regulation). Article 32: Security of processing. Official Journal of the European Union, L 119, 1-88.

k-Anonymity

Sweeney, L. (2002). k-Anonymity: A Model for Protecting Privacy. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 10(5), 557-570.