Oct – Dec 2025
Computer Systems
Security
PE Malware Scanner
Trained an XGBoost classifier on 3.2 million PE file samples from the EMBER 2024 dataset to detect malware through static analysis alone, with no code execution required. Built a FastAPI backend and React frontend as a team of four.
98.4%
test accuracy
90%
fresh malware caught
29%
fresh malware caught by the 2018 model (vs. 90%): the recency gap
- Extracts 2,568 static features (imports, byte entropy, PE headers, sections)
- Zero false positives on independently collected December 2024 samples
- 60+ percentage-point improvement on fresh malware over the EMBER 2018 model (concept drift)
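To illustrate the kind of static feature listed above, here is a minimal sketch of a byte-entropy computation in Python. It is not the project's extractor or EMBER's feature code. The function names `byte_entropy` and `windowed_entropy` and the window and step sizes are assumptions chosen for illustration. The idea is that packed or encrypted regions of a file tend to have entropy close to 8 bits per byte, which can be measured without ever running the file.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # H = sum over byte values of p * log2(1 / p), with p = count / total
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def windowed_entropy(data: bytes, window: int = 1024, step: int = 512) -> list[float]:
    """Entropy of each sliding window; window and step sizes are illustrative."""
    if len(data) <= window:
        return [byte_entropy(data)]
    return [
        byte_entropy(data[i:i + window])
        for i in range(0, len(data) - window + 1, step)
    ]


if __name__ == "__main__":
    # Hypothetical input: 1 KiB of zero padding followed by 1 KiB of uniformly spread bytes.
    sample = bytes(1024) + bytes(range(256)) * 4
    print(byte_entropy(sample))
    print(windowed_entropy(sample))
```

In a real pipeline, values like these would typically be binned into a fixed-length histogram so that every file yields a feature vector of the same size, whatever its length.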
Technologies