Local PII mask · before cloud AI

Ask AI without
exposing people.

Mask Aadhaar, PAN, UPI, names and more on-device. Only safe text hits the cloud.

See it mask a prompt Hugging Face

108.9Mparameters

89.37%real-world F1

17PII types

Maskara · BERT-NER^™

Wrap a prompt, call any LLM, restore the reply. Raw values never leave the device.

How it works Eval report

Apache-2.0 EN + Hinglish

Watch Maskara mask
before the cloud sees it.

Detect sensitive spans, swap them for safe twins, restore after the model replies. Hover the demo to pause.

Detect

The NER model finds every sensitive span in the prompt, on-device.

Replace

Raw values swap for safe twins. Only masked text leaves the device.

Restore

Cloud reply comes back; originals are stitched in locally.

Hinglish support note Aadhaar · Phone · Person name

Your prompt Raw

Cloud sees Waiting

Cloud reply · restored on device

BERT-grade,
India-first.

A BERT token-classification checkpoint tuned for Indian PII formats and bilingual Hinglish text. Public on Hugging Face, Apache-2.0 licensed, and built to drop in front of any LLM call.

108.9Mparameters · BERT token classifier

89.37%F1 on real-world eval

17PII entity types

35 + OBIO labels in config

Two ways to run it

Detect tokens straight from Transformers, or wrap the prompt path with the masking SDK.

1 · ProtectDetect & mask PII locally.

2 · CallSend only safe context.

3 · RestorePut originals back locally.

from maskara import Maskara

maskara = Maskara()                  # loads the 108.9M-param NER model
safe, ctx = maskara.protect(prompt)  # PII swapped for safe twins
reply = llm.complete(safe)           # only masked text hits the cloud
final = maskara.restore(reply, ctx)  # originals re-inserted locally

from transformers import (AutoTokenizer,
    AutoModelForTokenClassification, pipeline)

name = "somukandula/maskara"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForTokenClassification.from_pretrained(name)

ner = pipeline("token-classification", model=model, tokenizer=tok)
text = "Mera naam Rohit Kumar hai, Aadhaar 4220-4122-1200 aur email rohit.k@gmail.com."
print(ner(text))

Hugging Face

somukandula/maskara ↗

BERT token classification · Safetensors · F32 · Apache-2.0 license.

Built for code-mixed text

Trained on Hinglish and English-mixed prompts, so it catches PII even when sentences switch languages mid-stream.

Drop-in middleware

Wrap a prompt, call your provider, restore the response. No new app architecture, no data sent out to detect.

17 entity types

Five Indian-specific formats added in this checkpoint, on top of twelve global identifiers.

AADHAARIN PAN_CARDIN PASSPORTIN UPI_IDIN VEHICLE_REGIN PERSON_NAME ADDRESS PHONE EMAIL DATE_OF_BIRTH CREDIT_CARD PASSWORD API_KEY IP_ADDRESS USERNAME SSN DRIVER_LICENSE

Indian-specific (new) Global identifier

500K synthetic base · targeted retraining 89.30% template-disjoint F1 89.37% real-world F1 84.42% precision · 94.95% recall

Ask AI withoutexposing people.

Watch Maskara maskbefore the cloud sees it.

BERT-grade,India-first.