Email Security Backend
Production email threat detection system that automatically quarantines phishing and malware before they reach the inbox.
Overview
A multi-signal threat-scoring API that monitors IMAP inboxes. It combines SPF/DKIM/DMARC header analysis, URL reputation checks (VirusTotal, Google Safe Browsing, URLhaus), attachment hash scanning, and a calibrated NLP phishing classifier.
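As a rough illustration of the header-analysis signal, the sketch below reads SPF/DKIM/DMARC verdicts from the `Authentication-Results` header using only the Python standard library. The function names `auth_results` and `header_risk`, and the "fraction of mechanisms that did not pass" scoring rule, are assumptions made for this example, not necessarily how this project computes the signal.

```python
import re
from email import message_from_string

# Matches tokens such as "spf=pass", "dkim=fail", "dmarc=none".
_RESULT_RE = re.compile(r"\b(spf|dkim|dmarc)=([a-z]+)", re.IGNORECASE)


def auth_results(raw_email: str) -> dict[str, str]:
    """Return the SPF/DKIM/DMARC verdicts found in Authentication-Results.

    Mechanisms that are not mentioned default to "none". If several
    headers are present, later ones overwrite earlier ones.
    """
    msg = message_from_string(raw_email)
    verdicts = {"spf": "none", "dkim": "none", "dmarc": "none"}
    for header in msg.get_all("Authentication-Results", []):
        for mechanism, result in _RESULT_RE.findall(header):
            verdicts[mechanism.lower()] = result.lower()
    return verdicts


def header_risk(verdicts: dict[str, str]) -> float:
    """Hypothetical scoring rule: share of mechanisms that did not pass."""
    failed = sum(1 for result in verdicts.values() if result != "pass")
    return failed / len(verdicts)
```

In practice only the `Authentication-Results` header added by your own receiving server should be trusted, since a sender can forge additional copies of it.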
Key Features
- NLP phishing classifier: TF-IDF + LinearSVC with sigmoid (Platt) calibration, trained on labelled email datasets
- Weighted threat score → auto-tag / quarantine / delete pipeline
- VirusTotal + Google Safe Browsing + URLhaus URL reputation checks
- Token-bucket rate limiter for free-tier API compliance
- REST API with a Prometheus metrics endpoint, deployed on Render with PostgreSQL
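For readers unfamiliar with the token-bucket technique named in the list above, here is a minimal single-threaded sketch. The class name `TokenBucket`, its parameters, and the injectable `clock` argument are assumptions for illustration; the project's actual limiter may differ, and real quota numbers should come from each provider's current terms.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` calls, refilled at `rate` tokens/second."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock              # injectable so the limiter can be tested
        self._tokens = float(capacity)   # start with a full bucket
        self._last = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last check, capped at capacity.
        now = self._clock()
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = now

    def try_acquire(self, cost: float = 1.0) -> bool:
        """Consume `cost` tokens if available; return False instead of blocking."""
        self._refill()
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

A caller would check `try_acquire()` before each outbound reputation lookup and defer or skip the lookup when it returns `False`. This version is not thread-safe; a lock around `try_acquire` would be needed if several workers share one bucket.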
Architecture
FastAPI REST API + IMAP polling loop + PostgreSQL + multi-signal threat scoring pipeline
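The scoring stage of this pipeline can be pictured as a weighted average of per-signal risks followed by threshold-based actions. The sketch below is illustrative only: the signal names, the weights in `WEIGHTS`, and the three thresholds are hypothetical values chosen for the example, not the ones used in this project.

```python
# Hypothetical weights per signal; each signal is a risk value in [0, 1].
WEIGHTS = {"headers": 0.2, "urls": 0.3, "attachments": 0.2, "nlp": 0.3}

# Hypothetical action thresholds on the combined score.
TAG_AT, QUARANTINE_AT, DELETE_AT = 0.4, 0.7, 0.9


def threat_score(signals: dict[str, float]) -> float:
    """Weighted average of the signals that are present.

    Unknown signal names are ignored, and the weights of the remaining
    signals are renormalised so a missing signal does not lower the score.
    """
    present = {name: risk for name, risk in signals.items() if name in WEIGHTS}
    total_weight = sum(WEIGHTS[name] for name in present)
    if total_weight == 0:
        return 0.0
    weighted = sum(WEIGHTS[name] * risk for name, risk in present.items())
    return weighted / total_weight


def action_for(score: float) -> str:
    """Map a combined score to the most severe action whose threshold it meets."""
    if score >= DELETE_AT:
        return "delete"
    if score >= QUARANTINE_AT:
        return "quarantine"
    if score >= TAG_AT:
        return "tag"
    return "deliver"
```

Renormalising over the available signals is one possible design choice: it keeps the score comparable when, for example, an email has no attachments or a reputation lookup was skipped.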
What I Learned
“Threat scoring is inherently probabilistic, and the cost of a false positive (legitimate email deleted) is different from the cost of a false negative (phishing email delivered). Tuning the classifier meant thinking carefully about asymmetric error costs — not just accuracy.”
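One way to make the asymmetric-cost idea concrete is to pick the decision threshold that minimises total misclassification cost on a labelled validation set, rather than the one that maximises accuracy. The sketch below assumes calibrated phishing probabilities are available; the function names and the cost values passed in are hypothetical.

```python
def total_cost(labels, probs, threshold, fp_cost, fn_cost):
    """Sum the cost of every error made when flagging emails with prob >= threshold.

    labels: 1 for phishing, 0 for legitimate.
    """
    cost = 0.0
    for label, prob in zip(labels, probs):
        flagged = prob >= threshold
        if flagged and label == 0:
            cost += fp_cost      # legitimate email was acted on
        elif not flagged and label == 1:
            cost += fn_cost      # phishing email was delivered
    return cost


def best_threshold(labels, probs, fp_cost, fn_cost, candidates=None):
    """Return the candidate threshold with the lowest total cost.

    Ties are resolved in favour of the lowest threshold.
    """
    candidates = candidates or [i / 100 for i in range(1, 100)]
    return min(
        candidates,
        key=lambda t: total_cost(labels, probs, t, fp_cost, fn_cost),
    )
```

When false positives are made expensive (for instance, for the delete action), the chosen threshold moves up; when false negatives are made expensive, it moves down. The same routine could be run once per action with different cost pairs.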