Indian Address Parser
Parse unstructured Indian addresses into structured components using mBERT-CRF (Multilingual BERT with Conditional Random Field).
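As a rough illustration of what the CRF layer contributes on top of the encoder, the sketch below runs Viterbi decoding over per-token tag scores and a tag-to-tag transition matrix. It is a generic, minimal sketch and not this project's implementation: the function name `viterbi_decode`, the three-tag scheme (0 = O, 1 = B, 2 = I), and all scores are assumptions made up for the example.

```python
from typing import List, Sequence


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> List[int]:
    """Return the highest-scoring sequence of tag indices.

    emissions[t][j]   : score of tag j at token t (e.g. from the encoder).
    transitions[i][j] : score of moving from tag i to tag j (learned by the CRF).
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    score = list(emissions[0])  # best score of any path ending in each tag
    backpointers: List[List[int]] = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(num_tags):
            # Best previous tag for arriving at tag j on token t.
            best_prev = max(range(num_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_prev] + transitions[best_prev][j] + emissions[t][j])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Walk the backpointers from the best final tag to recover the path.
    path = [max(range(num_tags), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical scores: tags are 0 = O, 1 = B, 2 = I.
# The transition O -> I is heavily penalised because it is an invalid BIO move.
example_transitions = [[0.0, 0.0, -10.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
example_emissions = [[1.0, 0.8, 0.0], [0.0, 0.0, 1.0]]
decoded = viterbi_decode(example_emissions, example_transitions)  # [1, 2], i.e. B then I
```

A per-token argmax over `example_emissions` would pick O followed by I, which is not a valid BIO sequence; the transition penalty makes the decoder prefer B followed by I instead.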
Features
- Supports Hindi + English (Devanagari and Latin scripts)
- 15 entity types: House Number, Floor, Block, Gali, Colony, Area, Khasra, Pincode, etc.
- Delhi-specific locality gazetteer for improved accuracy
- Inference time under 30 ms
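To make the idea of structured components concrete, here is a minimal sketch of how token-level BIO tags could be grouped into a field dictionary. The helper `group_entities`, the label names (`HOUSE_NUMBER`, `GALI`, `AREA`, `PINCODE`), and the sample address are hypothetical; the project's actual label set and output schema may differ.

```python
from typing import Dict, List, Optional


def group_entities(tokens: List[str], tags: List[str]) -> Dict[str, str]:
    """Merge BIO-tagged tokens into {label: text}.

    If a label occurs in more than one span, the first span is kept.
    """
    entities: Dict[str, str] = {}
    current_label: Optional[str] = None
    current_tokens: List[str] = []

    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")  # "O" yields prefix "O", label ""
        if prefix == "I" and label == current_label:
            current_tokens.append(token)  # continue the open span
            continue
        if current_label is not None:
            entities.setdefault(current_label, " ".join(current_tokens))
        if prefix in ("B", "I") and label:
            current_label, current_tokens = label, [token]  # open a new span
        else:
            current_label, current_tokens = None, []  # outside any entity

    if current_label is not None:  # flush the span still open at the end
        entities.setdefault(current_label, " ".join(current_tokens))
    return entities


# Hypothetical tokens and tags, as a tagger might emit them.
sample_tokens = ["H.No.", "42", ",", "Gali", "No.", "5", "Rohini", "110085"]
sample_tags = [
    "B-HOUSE_NUMBER", "I-HOUSE_NUMBER", "O",
    "B-GALI", "I-GALI", "I-GALI",
    "B-AREA", "B-PINCODE",
]
parsed = group_entities(sample_tokens, sample_tags)
```

Keeping only the first span per label is a simplification chosen for the sketch; a real parser might instead collect repeated labels into a list.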
Model: IndicBERTv2-SS + CRF (ai4bharat/IndicBERTv2-SS + CRF layer) | Training Data: 600+ annotated Delhi addresses | GitHub: indian-address-parser