Raghav Gali · AI/ML Engineer

( about )

Who I am & what I build

I'm Raghav Gali, an AI/ML Engineer with a Master's in Data Science from Northeastern University. I'm currently a Machine Learning Engineer at Squark AI, working on Seer, a production AutoML platform. Previously, as an ML Intern at Transformly AI, I was the primary contributor on SKU rationalization across 50K+ SKUs and built a churn model deployed behind FastAPI services.

I work at the intersection of applied research and engineering: fine-tuning vision-language models, building multi-agent RAG platforms, and shipping end-to-end MLOps pipelines with real evaluation gates rather than hand-waving.

I care about making machine learning reliable, observable, and fast: from Dask-scale feature pipelines to Cloud Run deployments with model registries, rollback workflows, and meaningful quality metrics.

Multi-Agent & RAG Systems

Built LangGraph-based agent routers with provider fallbacks, hybrid retrieval (vector + BM25), and RAGAS / LLM-judge evaluation gates wired into CI/CD.

Generative AI & LLMs

Fine-tuning vision-language models with LoRA and 8-bit quantization, building RAG systems with hybrid retrieval and RAGAS evaluation, and deploying LLM applications for multi-tenant use.

End-to-End MLOps

Cloud Run deployments, 20+ GitHub Actions workflows, model registries with rollback, W&B tracing, and Dask-based feature pipelines at SKU scale.

( experience )

Where I've worked

jun 2026 - present

Machine Learning Engineer

Squark AI · Remote

Working on Seer, Squark's automated ML platform: a production AutoML pipeline that trains on datasets up to ~3.7M rows and ~150 features across three model backends. Took over clustering / feature-reduction R&D mid-handoff: mapped the 6-stage containerized pipeline end-to-end and diagnosed why the prior clustering-based feature reducer topped out below the baseline. Reframed the work around training-time reduction with AUC as a guardrail, producing a two-track plan that the project lead adopted.

Python scikit-learn AutoML Clustering Dimensionality Reduction

jan 2025 - jun 2025

Machine Learning Intern

Transformly AI · Boston, MA

Primary contributor on a SKU rationalization POC for Stonewall Kitchen. Built Dask-based metadata workflows across 50K+ SKUs and ~5M scoped transaction records, cutting runtime from 120 minutes to seconds. Engineered 10+ behavioral features, ran 20+ UMAP + K-Means variations to produce 10 SKU value tiers, and identified $1M+ in annual margin opportunity. Also built a CatBoost 30-day churn model and shipped FastAPI inference services with async handling, retries, and timeout controls.

Python Dask UMAP CatBoost FastAPI Clustering
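
The retry and timeout handling mentioned above can be illustrated with a minimal sketch. It uses plain asyncio rather than FastAPI, and it is not the code from the internship: `call_with_retries`, `flaky_predict`, and the default limits are hypothetical names and values chosen for illustration.

```python
import asyncio


async def call_with_retries(make_call, *, timeout_s=2.0, max_attempts=3, backoff_s=0.1):
    """Await make_call() with a per-attempt timeout and exponential backoff."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            # wait_for cancels the call if it exceeds the timeout.
            return await asyncio.wait_for(make_call(), timeout=timeout_s)
        except (asyncio.TimeoutError, ConnectionError) as exc:
            last_error = exc
            if attempt < max_attempts:
                # Back off 1x, 2x, 4x ... the base delay between attempts.
                await asyncio.sleep(backoff_s * 2 ** (attempt - 1))
    raise RuntimeError(f"all {max_attempts} attempts failed") from last_error


async def demo():
    """Hypothetical downstream call that fails twice, then succeeds."""
    calls = {"n": 0}

    async def flaky_predict():
        calls["n"] += 1
        if calls["n"] < 3:
            raise ConnectionError("transient failure")
        return "ok"

    result = await call_with_retries(flaky_predict, backoff_s=0.01)
    return result, calls["n"]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Only timeouts and connection errors are retried here; other exceptions propagate immediately, which is one reasonable choice among several.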

sep 2022 - feb 2023

Data Scientist

Axon Electric Corporation · Bengaluru, India

Owned end-to-end analytics for B2B SKU demand and market drivers across ~2,000 SKUs and 3 sales channels. Engineered 15+ demand features (velocity, seasonality, promotion dependency) from ERP, sales, and inventory data, built PCA + K-Means segmentation into 5 revenue tiers, and delivered an XGBoost demand forecaster. Replaced manual reporting with automated cross-channel views and data-quality checks, reducing reporting latency from hours to minutes.

Python SQL XGBoost PCA / K-Means Feature Engineering

sep 2021 - dec 2021

Data Science Intern

Digital Shark Technologies

Applied ML and deep learning to text classification and visual inspection tasks for a technology consulting firm. Built an NLP classification pipeline using TF-IDF and SVM, improving text classification accuracy by 20% and boosting downstream marketing personalization by 25%. Applied ResNet-based CNN models for automated visual inspection, improving validation accuracy by 15%.

Python TensorFlow scikit-learn NLP Computer Vision

( selected work )

Projects, with receipts

project domain proof

FrontShiftAI Multi-agent voice RAG platform cloud-deployed · 19 tenants

Cloud-deployed HR assistant with LangGraph agent routing (PTO, HR ticketing, RAG, website extraction) across 19 tenants. 7-stage PDF-to-vector ingestion pipeline, LLM provider fallback chains, RAG evaluation with quality gates, and 20+ GitHub Actions workflows on Cloud Run.

FastAPI LangGraph ChromaDB GCP
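
An LLM provider fallback chain of the kind described above can be sketched in a few lines. This is an illustration under assumptions, not FrontShiftAI's implementation: `generate_with_fallback` and the `(name, callable)` provider format are hypothetical.

```python
def generate_with_fallback(prompt, providers):
    """Try each (name, call) provider in order; return the first success.

    Returns (provider_name, response). Raises RuntimeError if every
    provider fails, with the collected error messages.
    """
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

In practice each callable would wrap one provider's client; recording which provider answered makes fallback rates observable.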

BAF Fraud Detection MLOps / classification ~0.90 ROC-AUC · ~0.55 recall @ 5% FPR

Production-shaped fraud detection on 1M bank-account-opening applications (~1.1% fraud prevalence). Engineered 30+ behavioral features, trained XGBoost with training-only SMOTE, and selected the decision threshold from the ROC curve at a fixed 5% FPR budget: ~0.55 recall at that operating point, with ~0.90 ROC-AUC overall. Full local MLOps stack on one docker-compose file: MLflow registry with gated @champion promotion, FastAPI serving, Prometheus/Grafana monitoring, Evidently drift detection, and GitHub Actions CI.

XGBoost MLflow Prometheus Docker
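
Selecting a threshold at a fixed FPR budget, as described above, can be sketched in pure Python. This is a minimal illustration, not the project's code: `rates_at` and `threshold_at_fpr` are hypothetical names, and labels are assumed to be 0/1 integers.

```python
def rates_at(threshold, scores, labels):
    """Return (recall, false-positive rate) when flagging score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    positives = sum(labels)
    negatives = len(labels) - positives
    return tp / positives, fp / negatives


def threshold_at_fpr(scores, labels, max_fpr=0.05):
    """Pick the lowest threshold whose FPR stays within max_fpr.

    Returns (threshold, recall, fpr), or None if no threshold fits the budget.
    """
    if not 0 < sum(labels) < len(labels):
        raise ValueError("need at least one positive and one negative label")
    best = None
    # Walk thresholds from strict to lenient; FPR and recall never decrease.
    for t in sorted(set(scores), reverse=True):
        recall, fpr = rates_at(t, scores, labels)
        if fpr > max_fpr:
            break
        best = (t, recall, fpr)
    return best
```

The threshold should be chosen on held-out scores. ROC-AUC is a separate, threshold-free summary, so it is reported alongside the operating-point recall rather than at it.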

Vehicle Damage VLM Vision-language fine-tuning 92.6% scaling efficiency

Fine-tuned GLM-4.6V-Flash with LoRA + 8-bit quantization on RunPod multi-GPU for vehicle damage assessment. Implemented PyTorch DDP via torchrun with 92.6% scaling efficiency (1.85× speedup) on 2 GPUs, with metadata-leakage cleanup and BLEU/METEOR/ROUGE evaluation on a corrected test split.

GLM-4.6V LoRA DDP W&B
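
For reference, scaling efficiency is speedup divided by GPU count. The sketch below works through the figures quoted above; the function names are hypothetical. With the rounded 1.85× speedup it gives 92.5%, so the quoted 92.6% presumably comes from the unrounded speedup (about 1.852×).

```python
def scaling_efficiency(speedup, n_gpus):
    """Fraction of ideal linear speedup achieved on n_gpus devices."""
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    return speedup / n_gpus


def implied_speedup(efficiency, n_gpus):
    """Speedup implied by a given efficiency on n_gpus devices."""
    return efficiency * n_gpus


if __name__ == "__main__":
    # Figures quoted above: 1.85x speedup on 2 GPUs, 92.6% efficiency.
    print(f"{scaling_efficiency(1.85, 2):.1%}")
    print(f"{implied_speedup(0.926, 2):.3f}x")
```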

VisaWise Policy RAG +13% context recall

RAG chatbot for U.S. immigration policy with hybrid retrieval (70% vector / 30% BM25) on BGE embeddings + Pinecone, Llama-3-8B generation, and a RAGAS evaluation framework. Weight tuning lifted context recall by 13% over a vector-only baseline.

LlamaIndex Pinecone Hybrid Retrieval RAGAS
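
The 70% vector / 30% BM25 weighting can be illustrated with a small score-fusion sketch. The source states only the weights, so the min-max normalization and the names `min_max` and `hybrid_rank` are assumptions, not VisaWise's actual fusion logic.

```python
def min_max(scores):
    """Rescale a {doc_id: score} mapping to the [0, 1] range."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(vector_scores, bm25_scores, vector_weight=0.7):
    """Fuse normalized vector and BM25 scores; return docs sorted best first.

    A document missing from one retriever contributes 0 for that side.
    """
    v = min_max(vector_scores) if vector_scores else {}
    b = min_max(bm25_scores) if bm25_scores else {}
    fused = {
        doc: vector_weight * v.get(doc, 0.0) + (1 - vector_weight) * b.get(doc, 0.0)
        for doc in set(v) | set(b)
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)
```

Normalizing first matters because cosine similarities and BM25 scores live on different scales; without it the weights would not mean 70/30 in practice.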

NYC Taxi Demand Intelligence Lakehouse & forecasting 8.43M trips · in progress

Databricks lakehouse for NYC Yellow Taxi demand: a medallion pipeline (Bronze / Silver / Gold Delta tables) built with PySpark over 8.43M cleaned trips and over $152M in recorded revenue, powering a 5-page, 37-query Databricks SQL dashboard. Compared Spark ML forecasters against a historical zone-day-hour baseline; the baseline won on holdout MAE and ships as v1. Solo build, actively expanding.

Databricks PySpark Delta Lake Spark ML
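
One plausible reading of a historical zone-day-hour baseline is the mean trip count per (zone, day of week, hour), scored by MAE on a holdout. The sketch below assumes that reading and uses plain Python instead of PySpark; `fit_baseline` and `mae` are hypothetical names.

```python
from collections import defaultdict


def fit_baseline(rows):
    """Average trips per (zone, day_of_week, hour).

    rows: iterable of (zone, day_of_week, hour, trips) tuples.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for zone, day_of_week, hour, trips in rows:
        entry = totals[(zone, day_of_week, hour)]
        entry[0] += trips
        entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def mae(baseline, rows, default=0.0):
    """Mean absolute error of the baseline on holdout rows.

    Keys unseen during fitting fall back to `default`.
    """
    errors = [
        abs(trips - baseline.get((zone, day_of_week, hour), default))
        for zone, day_of_week, hour, trips in rows
    ]
    return sum(errors) / len(errors)
```

Any learned forecaster would need to beat this MAE on the same holdout to justify replacing the baseline.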

( also on github: Store-Item Forecasting · Chicago Rideshare ETL · see everything → )

( hackathons )

Built in a weekend, judged well

1st

track 1 · brainstorm 2026 bci hackathon

BrainStorm Neural Decoder

Real-time neural decoder classifying auditory stimuli from streaming 1024-channel ECoG brain recordings under edge constraints (<25MB, causal-only). PCA channel reduction with a depthwise-separable EEGNet and sliding-window streaming inference. Team win against 10 teams of grad students and post-docs.

PyTorch EEGNet Edge ML
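
Causal sliding-window streaming inference can be sketched as follows. This is not the hackathon code: `SlidingWindowDecoder` is a hypothetical name, and `classify` stands in for the real model. The decoder only ever sees samples that have already arrived, which is what keeps it causal.

```python
from collections import deque


class SlidingWindowDecoder:
    """Buffer incoming samples and classify the most recent full window."""

    def __init__(self, window_size, hop, classify):
        self.buffer = deque(maxlen=window_size)  # oldest samples drop off automatically
        self.hop = hop                           # samples between predictions
        self.classify = classify                 # callable taking a list of samples
        self._since_last = 0

    def push(self, sample):
        """Add one sample; return a prediction when one is due, else None."""
        self.buffer.append(sample)
        self._since_last += 1
        window_full = len(self.buffer) == self.buffer.maxlen
        if window_full and self._since_last >= self.hop:
            self._since_last = 0
            return self.classify(list(self.buffer))
        return None
```

A smaller hop gives more frequent predictions at the cost of more model calls per second, which matters under edge constraints.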

2nd

bu questrom hackathon 2026 · solo build

Responza

Multilingual AI voice triage for city 911/311 services across 4 languages: Vapi orchestration, Deepgram Nova-3 STT, ElevenLabs TTS, and LLM tool-calling that hands true emergencies to 911 and files everything else as structured 311 tickets routed across ~40 Boston departments onto a live React operator dashboard.

Vapi Deepgram FastAPI React

top 3

world2data track · hacknation, 1500+ builders

RoboSight

AI/ML architect for a 3-model parallel vision pipeline (YOLO, SAM3, LFM2.5-VL) on Modal GPUs, with a temporal signal compiler and human-in-the-loop calibration for robot navigation ground truth. Placed top 3 across 1500+ builders from 60 countries.

YOLO SAM3 VLM Modal GPU

( more weekend builds: SharedLM · FlowSight · AlphaEarth )

( what i do )

Three ways I'm useful

01 GenAI & RAG Systems

LLM applications end to end: agent routing, hybrid retrieval, evaluation gates, and deployment. Not just prompts. Provider fallbacks, RAGAS / LLM-judge scoring wired into CI, and multi-tenant serving.

LangGraph LangChain Multi-Agent Systems RAG & Hybrid Retrieval ChromaDB / Pinecone / FAISS RAGAS LoRA & QLoRA Vision-Language Models

02 ML Engineering & MLOps

Training, serving, and operating models with the boring parts done right: model registries with gated promotion, distributed fine-tuning, monitoring and drift detection, and CI/CD that actually blocks bad models.

PyTorch / Lightning HF Transformers XGBoost / CatBoost / LightGBM FastAPI / AsyncIO Docker MLflow Weights & Biases Prometheus / Grafana GitHub Actions GCP / AWS

03 Data Science & Analytics

From raw tables to decisions: feature engineering at SKU scale, segmentation and clustering, demand forecasting, and dashboards that answer specific business questions.

Python SQL Pandas / NumPy Dask / PySpark PCA / UMAP / K-Means Time Series & Forecasting Tableau / Plotly PostgreSQL

( education )

Academic background

Master of Science in Data Science

Northeastern University · Boston, MA

sep 2023 - dec 2025 · gpa 3.9 / 4.0

Coursework: Machine Learning, Large Language Models, Machine Learning Operations (MLOps), Cloud Computing, Data Structures & Algorithms, Intro to Data Management.

Bachelor of Engineering in Electronics & Communication

Visvesvaraya Technological University · India

aug 2018 - may 2022

Foundation in signal processing, embedded systems, and applied mathematics: the groundwork for later pivoting into data science and applied ML.

( contact )

Ready to build something intelligent?

Open to full-time roles, collaborations, and interesting problems. If you're working on something that pushes AI/ML forward, let's talk.

get in touch

I build machine-learning systems that actually ship.

raghav gali.
