AI / ML Engineer
I start by understanding what the user actually needs, studying the problem deeply, and then building solutions that solve it end-to-end. I work on AI systems like RAG pipelines and LLM fine-tuning, with a focus on building agents that can reason and take actions.
I’m also exploring post-training and scalable inference, while building voice AI and fine-tuning open-source models to compete with larger systems.
Currently: MS in Applied Machine Learning at University of Maryland, College Park
Previously: ML Engineer Intern at Plutomen Technologies · Research Assistant at CHARUSAT
Built RAG pipelines for document intelligence with ~89% data extraction accuracy. Used LLM-as-a-judge evaluation to cut hallucinations and improve relevance, and shipped low-latency (<200ms) inference APIs.
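The LLM-as-a-judge idea above can be sketched in a few lines. This is a minimal illustration, not the project's actual evaluation code: the rubric prompt, the 1–5 faithfulness scale, and the stubbed model call are all assumptions for demonstration.

```python
import re

# Hypothetical judge rubric; the real prompt and judge model are not from the original.
JUDGE_PROMPT = """You are grading an answer against its source context.
Context: {context}
Answer: {answer}
Rate faithfulness from 1 (hallucinated) to 5 (fully grounded).
Reply with only the number."""

def parse_score(judge_reply: str) -> int:
    """Extract the first digit 1-5 from the judge model's reply."""
    match = re.search(r"[1-5]", judge_reply)
    if not match:
        raise ValueError(f"no score found in: {judge_reply!r}")
    return int(match.group())

def judge_answer(context: str, answer: str, call_llm) -> int:
    """Score one (context, answer) pair using a judge LLM passed as a callable."""
    reply = call_llm(JUDGE_PROMPT.format(context=context, answer=answer))
    return parse_score(reply)

# Stubbed judge for demonstration; a real pipeline would call an LLM API here.
stub_llm = lambda prompt: "4"
print(judge_answer("Revenue was $10M in 2023.", "Revenue was $10M.", stub_llm))  # → 4
```

Scoring each retrieved-context/answer pair this way gives a cheap, automated signal for catching hallucinations before they reach users.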
Fine-tuned transformer models with PEFT (LoRA), cutting trainable parameters by 99% and speeding up training by 60% on multi-GPU setups, while improving multilingual code intelligence with rigorous evaluation (CodeBLEU) and scalable pipelines.
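The parameter reduction comes straight from LoRA's low-rank structure. A minimal NumPy sketch of the idea (dimensions and rank are illustrative, not the project's actual configuration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen pretrained weight of a single projection layer (dims illustrative).
d_model, rank, alpha = 1024, 4, 8
W = rng.standard_normal((d_model, d_model)).astype(np.float32) * 0.02

# LoRA trains only two low-rank factors; B starts at zero, so the
# effective weight equals W exactly at initialization.
A = rng.standard_normal((rank, d_model)).astype(np.float32) * 0.01
B = np.zeros((d_model, rank), dtype=np.float32)

def lora_forward(x):
    """y = x W^T + (alpha/rank) * x A^T B^T: frozen base path plus low-rank update."""
    return x @ W.T + (alpha / rank) * (x @ A.T) @ B.T

trainable = A.size + B.size  # 2 * rank * d_model
full = W.size                # d_model ** 2
print(f"trainable fraction: {trainable / full:.4%}")  # 0.7813% -> >99% reduction
```

Only A and B receive gradients, which is why LoRA fits large models onto modest multi-GPU setups without touching the base weights.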
An AI-powered mock interview platform combining real-time speech-to-speech AI with expert-led sessions. It uses ElevenLabs for voice synthesis, GCP for the backend, and Firebase for authentication, giving job seekers AI-driven practice and expert feedback in one place.
Fine-tuned Llama 3.1 8B for math reasoning using GRPO reinforcement learning + LoRA, achieving 78.5% GSM8K accuracy. Enabled efficient consumer-GPU training via 4-bit quantization without quality loss.
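The core of GRPO is that rewards are standardized within each group of completions sampled from the same prompt, so no separate value model is needed. A pure-Python sketch of that advantage computation (the binary reward values are illustrative, not the project's actual reward function):

```python
import statistics

def group_advantages(rewards, eps=1e-6):
    """GRPO-style advantages: standardize each reward within its sampled group,
    so every completion is judged relative to siblings from the same prompt."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

# Four sampled answers to one math prompt, rewarded 1.0 when the final
# answer is correct, 0.0 otherwise (reward scheme is illustrative).
rewards = [1.0, 0.0, 0.0, 1.0]
print(group_advantages(rewards))  # correct answers get positive advantage
```

These advantages then weight the policy-gradient update, pushing probability mass toward completions that beat their group's average.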
How floating-point and INT8 formats shape model quality, speed, and memory when you deploy and scale modern LLM systems.
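The memory/quality trade-off behind that question can be shown with symmetric per-tensor INT8 quantization. A minimal NumPy sketch under assumed settings (per-tensor scale, symmetric range; real deployments often use per-channel scales):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor INT8 quantization: one fp32 scale plus int8 weights,
    cutting memory ~4x vs fp32 at the cost of rounding error."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32) * 0.05
q, scale = quantize_int8(w)
err = np.abs(dequantize(q, scale) - w).max()
print(f"max abs error: {err:.6f} (memory: {q.nbytes}B int8 vs {w.nbytes}B fp32)")
```

The worst-case rounding error is bounded by half the scale, which is why outlier weights (a large `abs(w).max()`) degrade INT8 quality: they inflate the scale for every other weight in the tensor.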
A comprehensive 16-point checklist to evaluate and improve your RAG pipelines. Most people get at least 5 of these wrong.
PDF parsing is a bottleneck for most RAG systems. These techniques improve extraction accuracy and downstream retrieval quality.