AI / ML Engineer

Raghav Gali

Building intelligent and scalable AI/ML systems - from research to production.

M.S. Data Science
AI/ML Engineering
+ Curiosity
About

Who I Am & What I Build

I'm Raghav Gali, an AI/ML Engineer based in Boston with a Master's in Data Science from Northeastern University. Most recently, I was an ML Intern at Transformly AI, where I led SKU rationalization across 50K+ SKUs and built churn models deployed behind FastAPI services.

I work at the intersection of applied research and engineering - fine-tuning vision-language models, building multi-agent RAG platforms, and shipping end-to-end MLOps pipelines with real evaluation gates rather than hand-waving.

I care about making machine learning reliable, observable, and fast - from Dask-scale feature pipelines to Cloud Run deployments with model registries, rollback workflows, and meaningful quality metrics.

Multi-Agent & RAG Systems

Built LangGraph-based agent routers with provider fallbacks, hybrid retrieval (vector + BM25), and RAGAS / LLM-judge evaluation gates wired into CI/CD.

Generative AI & LLMs

Fine-tuning with LoRA and QLoRA, designing hybrid-retrieval RAG architectures, and deploying large language models in enterprise environments.

End-to-End MLOps

Cloud Run deployments, 20+ GitHub Actions workflows, model registries with rollback, W&B tracing, and Dask-based feature pipelines at SKU scale.

Experience

Work History

Jan 2025 - Jun 2025

Machine Learning Intern

Transformly AI - Boston, MA

Primary contributor on a SKU rationalization POC for Stonewall Kitchen. Built Dask-based metadata workflows across 50K+ SKUs and ~5M scoped transaction records, cutting runtime from 120 minutes to seconds. Engineered 10+ behavioral features, ran 20+ UMAP + K-Means variations to produce 10 SKU value tiers, and identified $1M+ in annual margin opportunity. Also built a CatBoost 30-day churn model and shipped FastAPI inference services with async handling, retries, and timeout controls.

Python Dask UMAP CatBoost FastAPI Clustering
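The async handling, retries, and timeout controls mentioned above can be illustrated with a minimal sketch. This is not the production service code: `call_with_retries`, `fake_model_call`, the timeout and backoff values, and the returned payload are all hypothetical, and only the standard-library `asyncio` module is assumed.

```python
import asyncio


async def call_with_retries(make_call, *, attempts=3, timeout_s=0.05, backoff_s=0.01):
    """Await make_call() with a per-attempt timeout, retrying with exponential backoff."""
    last_error = None
    for attempt in range(attempts):
        try:
            # wait_for cancels the call if it exceeds the timeout.
            return await asyncio.wait_for(make_call(), timeout=timeout_s)
        except (asyncio.TimeoutError, ConnectionError) as exc:
            last_error = exc
            if attempt < attempts - 1:
                await asyncio.sleep(backoff_s * 2 ** attempt)
    raise RuntimeError(f"gave up after {attempts} attempts") from last_error


# Hypothetical stand-in for a model call: the first attempt hangs, the second succeeds.
calls = {"n": 0}


async def fake_model_call():
    calls["n"] += 1
    if calls["n"] == 1:
        await asyncio.sleep(1.0)  # deliberately longer than the timeout
    return {"churn_probability": 0.42}  # dummy value, not a real prediction


result = asyncio.run(call_with_retries(fake_model_call))
```

In a FastAPI handler the same helper would be awaited directly inside an `async def` endpoint rather than run through `asyncio.run`.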
Sep 2022 - Feb 2023

Data Scientist

Axon Electric Corporation - Bengaluru, India

Owned end-to-end analytics for B2B SKU demand and market drivers across ~2,000 SKUs and 3 sales channels. Engineered 15+ demand features (velocity, seasonality, promotion dependency) from ERP, sales, and inventory data, built PCA + K-Means segmentation into 5 revenue tiers, and delivered an XGBoost demand forecaster. Replaced manual reporting with automated cross-channel views and data-quality checks, reducing reporting latency from hours to minutes.

Python SQL XGBoost PCA / K-Means Feature Engineering
Sep 2021 - Dec 2021

Data Science Intern

Digital Shark Technologies

Applied ML and deep learning to text classification and visual inspection tasks for a technology consulting firm. Built an NLP classification pipeline using TF-IDF and SVM, improving text classification accuracy by 20% and boosting downstream marketing personalization by 25%. Applied ResNet-based CNN models for automated visual inspection, improving validation accuracy by 15%.

Python TensorFlow scikit-learn NLP Computer Vision
Projects

Selected Work

01

FrontShiftAI — Multi-Agent Voice RAG Platform

Cloud-deployed HR assistant with LangGraph agent routing (PTO, HR ticketing, RAG, website extraction) across 19 tenants. 7-stage PDF-to-vector ingestion pipeline, LLM provider fallback chains, RAG evaluation with quality gates, and 20+ GitHub Actions workflows on Cloud Run.

FastAPI LangGraph ChromaDB GCP
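An LLM provider fallback chain of the kind described above can be sketched in a few lines. This is an illustration under assumptions, not the FrontShiftAI implementation: `call_with_fallback`, `AllProvidersFailed`, and the two stand-in providers are hypothetical names, and real clients would replace the plain functions.

```python
from typing import Callable, Sequence


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain has failed."""


def call_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, generate) pair in order; return (name, reply) from the first success."""
    errors: list[str] = []
    for name, generate in providers:
        try:
            return name, generate(prompt)
        except Exception as exc:  # a real system would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for real LLM clients.
def flaky_primary(prompt: str) -> str:
    raise TimeoutError("primary provider timed out")


def stable_secondary(prompt: str) -> str:
    return f"answer to: {prompt}"


used, reply = call_with_fallback(
    "How many PTO days do I have left?",
    [("primary", flaky_primary), ("secondary", stable_secondary)],
)
```

Collecting the per-provider errors keeps the final exception useful for debugging when the whole chain is down.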
02

RoboSight — Vision Pipeline for Humanoid Robots

Top 3 in the Hacknation World2Data track (1,500+ builders, 60 countries). AI/ML architect for a 3-model parallel vision pipeline (YOLO, SAM3, LFM2.5-VL) on Modal GPUs, with a temporal signal compiler and human-in-the-loop calibration for robot navigation ground truth.

YOLO SAM3 VLM Modal GPU
03

BAF Fraud Detection — End-to-End ML System

Bank account opening fraud detector built on the BAF Suite dataset. Engineered 30+ behavioral features (velocity windows, device signals, risk scores), trained XGBoost with SMOTE and custom threshold tuning for ~78% recall at FPR < 5%. Full MLOps stack — MLflow tracking, S3 artifacts, FastAPI inference, and Docker.

XGBoost MLflow FastAPI Docker
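Threshold tuning against a false-positive budget, as mentioned above, can be sketched as a simple sweep. This is a toy illustration in pure Python, not the project's tuning code: `pick_threshold` and the scores and labels below are made up for the example and are unrelated to the BAF results.

```python
def pick_threshold(scores, labels, max_fpr=0.05):
    """Return (threshold, recall, fpr) with the highest recall whose FPR stays below max_fpr.

    Assumes labels contains at least one positive (1) and one negative (0).
    Returns None if no threshold satisfies the FPR budget.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best = None
    for t in sorted(set(scores), reverse=True):
        # Predict "fraud" when the score is at or above the candidate threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        recall, fpr = tp / positives, fp / negatives
        if fpr < max_fpr and (best is None or recall > best[1]):
            best = (t, recall, fpr)
    return best


# Toy data: 4 fraud cases and 40 legitimate ones (illustrative only).
scores = [0.9, 0.8, 0.6, 0.3] + [0.7, 0.5] + [0.1] * 38
labels = [1] * 4 + [0] * 40

threshold, recall, fpr = pick_threshold(scores, labels)
```

On real data the same sweep is usually run on a held-out validation split so the chosen threshold is not tuned on the test set.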
04

Insurance VLM — Damage Assessment Fine-Tune

Fine-tuned GLM-4.6V-Flash with LoRA + 8-bit quantization on RunPod multi-GPU for vehicle damage assessment. Implemented PyTorch DDP via torchrun with 92.6% scaling efficiency (1.85× speedup) on 2 GPUs, with metadata-leakage cleanup and BLEU/METEOR/ROUGE evaluation on a corrected test split.

GLM-4.6V LoRA DDP W&B
05

VisaWise — USCIS Policy RAG

RAG chatbot for U.S. immigration policy with hybrid retrieval (70% vector / 30% BM25) on BGE embeddings + Pinecone, Llama-3-8B generation, and a RAGAS evaluation framework. Weight tuning lifted context recall by 13% over a vector-only baseline.

LlamaIndex Pinecone Hybrid Retrieval RAGAS
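The 70% vector / 30% BM25 weighting above can be sketched as a weighted score fusion. This is a minimal illustration, assuming min-max normalization of each score set before mixing; the actual VisaWise fusion may differ, and `min_max`, `hybrid_rank`, and the document scores are hypothetical.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1] so vector and BM25 scores are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(vector_scores, bm25_scores, vector_weight=0.7):
    """Fuse the two score sets with the given vector weight; BM25 gets the remainder."""
    v, b = min_max(vector_scores), min_max(bm25_scores)
    fused = {
        doc: vector_weight * v.get(doc, 0.0) + (1 - vector_weight) * b.get(doc, 0.0)
        for doc in set(v) | set(b)
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Made-up retrieval scores for three document chunks.
vector_scores = {"a": 0.82, "b": 0.75, "c": 0.40}
bm25_scores = {"a": 2.0, "b": 9.0, "c": 5.0}

ranked = hybrid_rank(vector_scores, bm25_scores)
```

Here chunk "b" ranks first: it is nearly tied with "a" on vector similarity but far ahead on keyword match, which is the situation hybrid retrieval is meant to catch.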
More on GitHub
Skills

Tech Stack

ML / AI
  • PyTorch / Lightning
  • Hugging Face Transformers
  • Scikit-learn
  • XGBoost / CatBoost / LightGBM
  • LangChain / LangGraph
  • LoRA & QLoRA Fine-Tuning
Languages & Data
  • Python
  • SQL (PostgreSQL, Athena)
  • JavaScript / TypeScript
  • Pandas / NumPy
  • Dask (distributed)
  • ETL Pipelines
MLOps & Infra
  • FastAPI / AsyncIO
  • Docker
  • GCP (Cloud Run, GCS, Vertex AI)
  • AWS (S3, Glue, Athena)
  • MLflow / Weights & Biases
  • GitHub Actions CI/CD
Domains
  • Large Language Models
  • Multi-Agent Systems
  • RAG & Hybrid Retrieval
  • Vision-Language Models
  • Computer Vision
  • Time Series & Forecasting
Education

Academic Background

Master of Science in Data Science
Northeastern University — Boston, MA
Sep 2023 — Dec 2025 · GPA 3.9 / 4.0

Coursework: Machine Learning, Large Language Models, Machine Learning Operations (MLOps), Cloud Computing, Data Structures & Algorithms, Intro to Data Management.

Bachelor of Engineering in Electronics & Communication
Visvesvaraya Technological University — India
Aug 2018 — May 2022

Foundation in signal processing, embedded systems, and applied mathematics — the groundwork for later pivoting into data science and applied ML.

Contact

Let's Connect

Ready to build something intelligent?

Open to full-time roles, collaborations, and interesting problems. If you're working on something that pushes AI/ML forward, let's talk.

Get In Touch