Building intelligent and scalable AI/ML systems - from research to production.
I'm Raghav Gali, an AI/ML Engineer based in Boston with a Master's in Data Science from Northeastern University. Most recently, I was an ML Intern at Transformly AI, where I led SKU rationalization across ~50K SKUs and built churn models deployed behind FastAPI services.
I work at the intersection of applied research and engineering - fine-tuning vision-language models, building multi-agent RAG platforms, and shipping end-to-end MLOps pipelines with real evaluation gates rather than hand-waving.
I care about making machine learning reliable, observable, and fast - from Dask-scale feature pipelines to Cloud Run deployments with model registries, rollback workflows, and meaningful quality metrics.
Built LangGraph-based agent routers with provider fallbacks, hybrid retrieval (vector + BM25), and RAGAS / LLM-judge evaluation gates wired into CI/CD.
Deep expertise in fine-tuning, RAG architectures, and deploying large language models in enterprise environments.
Cloud Run deployments, 20+ GitHub Actions workflows, model registries with rollback, W&B tracing, and Dask-based feature pipelines at SKU scale.
Primary contributor on a SKU rationalization POC for Stonewall Kitchen. Built Dask-based metadata workflows across 50K+ SKUs and ~5M scoped transaction records, cutting runtime from 120 minutes to seconds. Engineered 10+ behavioral features, ran 20+ UMAP + K-Means variations to produce 10 SKU value tiers, and identified $1M+ in annual margin opportunity. Also built a CatBoost 30-day churn model and shipped FastAPI inference services with async handling, retries, and timeout controls.
Owned end-to-end analytics for B2B SKU demand and market drivers across ~2,000 SKUs and 3 sales channels. Engineered 15+ demand features (velocity, seasonality, promotion dependency) from ERP, sales, and inventory data, built PCA + K-Means segmentation into 5 revenue tiers, and delivered an XGBoost demand forecaster. Replaced manual reporting with automated cross-channel views and data-quality checks, reducing reporting latency from hours to minutes.
Applied ML and deep learning to text classification and visual inspection tasks for a technology consulting firm. Built an NLP classification pipeline using TF-IDF and SVM, improving text classification accuracy by 20% and boosting downstream marketing personalization by 25%. Applied ResNet-based CNN models for automated visual inspection, improving validation accuracy by 15%.
Cloud-deployed HR assistant with LangGraph agent routing (PTO, HR ticketing, RAG, website extraction) across 19 tenants. 7-stage PDF-to-vector ingestion pipeline, LLM provider fallback chains, RAG evaluation with quality gates, and 20+ GitHub Actions workflows on Cloud Run.
Top 3 in the Hacknation World2Data track (1500+ builders, 60 countries). AI/ML architect for a 3-model parallel vision pipeline (YOLO, SAM3, LFM2.5-VL) on Modal GPUs, with a temporal signal compiler and human-in-the-loop calibration for robot navigation ground truth.
Bank account opening fraud detector built on the BAF Suite dataset. Engineered 30+ behavioral features (velocity windows, device signals, risk scores), trained XGBoost with SMOTE and custom threshold tuning for ~78% recall at FPR < 5%. Full MLOps stack: MLflow tracking, S3 artifacts, FastAPI inference, and Docker.
Fine-tuned GLM-4.6V-Flash with LoRA + 8-bit quantization on RunPod multi-GPU for vehicle damage assessment. Implemented PyTorch DDP via torchrun with 92.6% scaling efficiency (1.85× speedup) on 2 GPUs, with metadata-leakage cleanup and BLEU/METEOR/ROUGE evaluation on a corrected test split.
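For readers unfamiliar with the scaling-efficiency figure above, here is a minimal sketch of how such a number is typically derived: speedup is single-GPU time divided by multi-GPU time, and efficiency is speedup divided by GPU count. The function name and the timings are hypothetical and are not taken from the project.

```python
from typing import Tuple


def scaling_efficiency(
    single_gpu_seconds: float, multi_gpu_seconds: float, num_gpus: int
) -> Tuple[float, float]:
    """Return (speedup, efficiency) for a multi-GPU training run.

    speedup    = single-GPU time / multi-GPU time
    efficiency = speedup / number of GPUs (1.0 means perfect linear scaling)
    """
    if single_gpu_seconds <= 0 or multi_gpu_seconds <= 0 or num_gpus < 1:
        raise ValueError("timings must be positive and num_gpus >= 1")
    speedup = single_gpu_seconds / multi_gpu_seconds
    return speedup, speedup / num_gpus


# Hypothetical timings chosen only to illustrate the formula.
speedup, efficiency = scaling_efficiency(
    single_gpu_seconds=100.0, multi_gpu_seconds=50.0, num_gpus=2
)
print(f"speedup={speedup:.2f}x, efficiency={efficiency:.1%}")
```

Under this definition, a 1.85× speedup on 2 GPUs corresponds to roughly 92.5% efficiency, which matches the reported 92.6% up to rounding of the speedup.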
RAG chatbot for U.S. immigration policy with hybrid retrieval (70% vector / 30% BM25) on BGE embeddings + Pinecone, Llama-3-8B generation, and a RAGAS evaluation framework. Weight tuning lifted context recall by 13% over a vector-only baseline.
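To make the 70/30 weighting above concrete, here is a minimal sketch of weighted score fusion between a vector retriever and BM25. It is an illustration under assumptions, not the project's code: the function names, the min-max normalization step, and the example scores are all hypothetical, and the real system may fuse results differently.

```python
from typing import Dict, List


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so the two retrievers are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal; treat every document as equally relevant.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_scores(
    vector_scores: Dict[str, float],
    bm25_scores: Dict[str, float],
    vector_weight: float = 0.7,  # assumed to mirror the 70% / 30% split
) -> Dict[str, float]:
    """Weighted sum of normalized vector and BM25 scores per document."""
    vec = min_max_normalize(vector_scores)
    bm25 = min_max_normalize(bm25_scores)
    bm25_weight = 1.0 - vector_weight
    # A document missing from one retriever contributes 0 from that side.
    return {
        doc_id: vector_weight * vec.get(doc_id, 0.0)
        + bm25_weight * bm25.get(doc_id, 0.0)
        for doc_id in set(vec) | set(bm25)
    }


def rank(scores: Dict[str, float]) -> List[str]:
    """Return document ids ordered from highest to lowest fused score."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical retriever outputs for three documents.
fused = hybrid_scores(
    vector_scores={"a": 0.9, "b": 0.5, "c": 0.1},
    bm25_scores={"a": 2.0, "b": 10.0, "c": 6.0},
)
print(rank(fused))
```

Normalizing before weighting matters because raw BM25 scores are unbounded while cosine similarities are not; without it, the nominal 70/30 split would not reflect the actual influence of each retriever.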
Coursework: Machine Learning, Large Language Models, Machine Learning Operations (MLOps), Cloud Computing, Data Structures & Algorithms, Intro to Data Management.
Foundation in signal processing, embedded systems, and applied mathematics — the groundwork for later pivoting into data science and applied ML.
Ready to build something intelligent?
Open to full-time roles, collaborations, and interesting problems. If you're working on something that pushes AI/ML forward, let's talk.
Get In Touch