MS Business Analytics @ Northeastern · Scaled a product from $1.2M to $11M with ML
Data Scientist with 2+ years of experience shipping ML models, data pipelines, and analytics solutions with measurable business impact. At Paisabazaar (India's largest credit marketplace), I built XGBoost and CatBoost models on 500K+ samples that increased customer acquisition by 28% and reduced loan defaults by 32%, and helped scale a financial product from $1.2M to $11M in 3 months. I've processed 30M+ daily records, built PySpark data marts, and deployed end-to-end ML systems on AWS ECS Fargate. MS in Business Analytics from Northeastern University (GPA 3.8).
Data manipulation, scripting, and database querying
Machine learning, deep learning, and AI frameworks
Interactive dashboards, KPI tracking, and data storytelling
Version control, IDEs, and development environments
Cloud infrastructure, data warehousing, and ETL pipelines
For more information, see my curriculum vitae.
End-to-end ML pipeline for Medicare fraud detection across 5,400+ providers. Engineered 44 predictive features, tuned hyperparameters with Optuna, built a FastAPI + Gradio UI, and set up CI/CD to AWS ECS Fargate.
Ingested 20M+ taxi trip records into GCS and BigQuery. Provisioned GCP infrastructure with Terraform, built 10+ Kestra workflow orchestrations, and developed dbt models for analytics-ready schemas.
LightGBM classification model on 380K customer records with SMOTE for class imbalance. Applied SHAP for model interpretation, identifying key predictors of purchase intent.
Scalable ETL pipeline using Selenium + Pandas, storing 1,000+ daily records in AWS S3. Automated with Airflow DAGs on EC2, with error handling and retry logic.
Linear programming model using PuLP to optimize shipping across a 3-warehouse network. Interactive Streamlit dashboard with what-if analysis for scenario planning.
Automated resume customization using the Gemini API with keyword extraction and ATS optimization. End-to-end pipeline from job description (JD) parsing to PDF generation via LaTeX.
Content-based recommendation engine using KNN with cosine similarity on sparse feature vectors from 3,000+ movies. Dockerized Streamlit app with CI/CD via GitHub Actions and TMDB API integration.
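To illustrate the idea behind this project, here is a minimal sketch of nearest-neighbor lookup by cosine similarity over sparse feature vectors. It is not the project's code: the `catalog` data, the feature names, and the helper names `cosine_similarity` and `recommend` are hypothetical, and sparse vectors are represented as plain dictionaries rather than a sparse matrix library.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as {feature: weight} dicts."""
    # Iterate over the smaller vector; features missing from the other contribute 0.
    if len(a) > len(b):
        a, b = b, a
    dot = sum(w * b.get(f, 0.0) for f, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def recommend(title, catalog, k=2):
    """Return the k titles most similar to `title` (brute-force KNN)."""
    query = catalog[title]
    scored = [
        (cosine_similarity(query, vec), other)
        for other, vec in catalog.items()
        if other != title
    ]
    # Highest similarity first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:k]]

# Hypothetical toy catalog: each movie is a sparse bag of genre/keyword features.
catalog = {
    "Movie A": {"sci-fi": 1.0, "space": 1.0},
    "Movie B": {"sci-fi": 1.0, "space": 1.0, "drama": 1.0},
    "Movie C": {"romance": 1.0, "drama": 1.0},
    "Movie D": {"sci-fi": 1.0, "action": 1.0},
}

print(recommend("Movie A", catalog, k=2))
```

A brute-force scan like this is fine for a few thousand items; a larger catalog would likely call for a sparse matrix representation or an approximate nearest-neighbor index.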
12-week purchase order forecasting model for retail using ensemble SARIMA and Monte Carlo simulation. Risk-based reorder points across 508 ASINs with statistical validation.
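As a rough illustration of how a Monte Carlo simulation can turn a forecast into a risk-based reorder point, the sketch below samples lead-time demand and takes a service-level quantile. It is a simplified assumption, not the project's method: the function name `simulate_reorder_point`, the normal demand model, and all numbers are hypothetical, and in practice the weekly mean and spread would come from the forecasting model.

```python
import random

def simulate_reorder_point(weekly_mean, weekly_std, lead_time_weeks,
                           service_level=0.95, n_sims=10_000, seed=0):
    """Estimate a reorder point as a quantile of simulated lead-time demand.

    Assumes independent, normally distributed weekly demand (truncated at 0).
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    totals = []
    for _ in range(n_sims):
        total = sum(
            max(0.0, rng.gauss(weekly_mean, weekly_std))
            for _ in range(lead_time_weeks)
        )
        totals.append(total)
    totals.sort()
    # Empirical quantile: stock that covers demand in `service_level` of runs.
    index = min(n_sims - 1, int(service_level * n_sims))
    return totals[index]

# Hypothetical item: 100 units/week on average, std 20, 2-week lead time.
rp_95 = simulate_reorder_point(100, 20, lead_time_weeks=2, service_level=0.95)
rp_50 = simulate_reorder_point(100, 20, lead_time_weeks=2, service_level=0.50)
print(f"95% service level: {rp_95:.1f} units, 50%: {rp_50:.1f} units")
```

Raising the service level moves the reorder point further into the upper tail of simulated demand, which is the trade-off between stockout risk and inventory held.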
MS in Business Analytics from Northeastern (GPA 3.8) with 2+ years at Paisabazaar building ML models on 500K+ samples and helping scale a financial product from $1.2M to $11M. Actively seeking Data Scientist, ML Engineer, and Business Analyst roles. Based in Boston, MA; open to relocation.