AI/ML Engineer · LLMOps · Game AI · Reinforcement Learning
AI/ML Engineer finishing an M.S. in Applied Machine Learning at the University of Maryland (May 2026). I design and ship production-grade AI — multi-agent systems, LLMOps pipelines, RAG architectures, and reinforcement learning agents that run in real environments. I'm drawn to the hard problems: making AI reliable, fast, and useful where it actually matters.
Won 1st place and a $4K team prize at the 2026 UMD Smith Agentic AI Challenge. Built a LangGraph orchestration of 8 specialized agents for pharmaceutical cold-chain risk triage. Hybrid risk engine fuses 8 deterministic checks with XGBoost (ROC-AUC 0.9446) and SHAP explanations. RAG-based GDP/FDA compliance checks over 417 regulatory chunks. Human-in-the-loop approval gates — only 2–3 of the 8 agents use an LLM. The rest are deterministic by design.
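In miniature, the fused rule-plus-model routing looks like this. A framework-free sketch with illustrative field names and thresholds, not the competition code:

```python
# Hypothetical sketch of hybrid risk triage: deterministic rule checks
# fused with a model probability, escalating only ambiguous cases to an
# LLM-backed agent. All names and thresholds are illustrative.

def deterministic_checks(shipment: dict) -> list[str]:
    """Run hard rules; each failed rule becomes a flag."""
    flags = []
    if shipment["temp_c"] > 8.0:          # cold-chain temperature excursion
        flags.append("TEMP_EXCURSION")
    if shipment["transit_hours"] > 72:    # route exceeded transit budget
        flags.append("TRANSIT_OVERRUN")
    return flags

def triage(shipment: dict, model_risk: float) -> str:
    """Fuse rule flags with the model's risk score; only the middle band
    is routed to an LLM agent, keeping most decisions deterministic."""
    flags = deterministic_checks(shipment)
    if flags or model_risk > 0.8:
        return "ESCALATE"       # human-in-the-loop approval gate
    if model_risk > 0.4:
        return "LLM_REVIEW"     # one of the few LLM-backed agents
    return "AUTO_APPROVE"
```

Keeping the escalation logic in plain rules is what lets most of the 8 agents stay deterministic.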
PPO boss agent trained with 3-stage curriculum (300K steps, reward −2.4→13.2). Ray RLlib distributed across 8 workers for 2.6× speedup. Exported to ONNX → Unity Sentis for <2 ms/frame inference at 60 fps. Real-time LLM taunts via Groq + NeMo Guardrails (114–352 ms). RL boss outlasts scripted FSM by 85% across 50 episodes.
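The 3-stage curriculum boils down to a step-indexed schedule. Stage boundaries and reward weights below are illustrative stand-ins, not the actual training configs:

```python
# Hypothetical sketch of a 3-stage PPO curriculum over 300K steps:
# each stage switches on at a global-step boundary and re-weights the
# reward terms. Names, boundaries, and weights are made up for the example.

STAGES = [
    (0,       "movement",   {"survive": 1.0, "hit_player": 0.0}),
    (100_000, "attacks",    {"survive": 0.5, "hit_player": 1.0}),
    (200_000, "full_fight", {"survive": 0.2, "hit_player": 1.0, "win": 5.0}),
]

def stage_for_step(step: int):
    """Return the active (start, name, reward_weights) stage for a step."""
    current = STAGES[0]
    for start, name, weights in STAGES:
        if step >= start:
            current = (start, name, weights)
    return current
```

Each rollout worker queries the schedule so all 8 RLlib workers stay on the same stage.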
DeBERTa-v3-base fine-tuned on 232K NLI pairs — 0.9629 crisis recall across 30 adversarial probes in 6 attack categories. Captum Integrated Gradients attributions on every safety intercept. 5-stage emotion-conditioned RAG over 1.67M FAISS vectors (RoBERTa + LoRA). Ablation: 0.88 emotion alignment vs. 0.30 BM25 — Wilcoxon p = 3.62×10⁻⁸.
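The core of emotion-conditioned retrieval is blending semantic similarity with an emotion-alignment score before ranking. A minimal sketch with made-up vectors and a hypothetical 0.7/0.3 blend, not the production scorer:

```python
import numpy as np

# Illustrative emotion-conditioned reranking: each candidate passage
# carries a semantic embedding and an emotion embedding; the final score
# blends both. Weights and embeddings here are invented for the example.

def rerank(query_vec, emo_vec, docs):
    """docs: list of (text, semantic_vec, emotion_vec) triples.
    Returns passage texts sorted by blended score, best first."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    scored = [
        (0.7 * cos(query_vec, sem) + 0.3 * cos(emo_vec, emo), text)
        for text, sem, emo in docs
    ]
    return [text for _, text in sorted(scored, reverse=True)]
```

A pure-lexical ranker like BM25 has no emotion term at all, which is where the 0.88 vs. 0.30 alignment gap comes from.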
4-agent LangGraph state machine with retry loop. Critic applies PASS/STALE/CONTRADICTED/INSUFFICIENT verdicts, catching 52% of superseded ML claims vs. 0% for single-pass RAG (130-question eval). Linear decay proved optimal among 3 ablated weighting formulas. Position accuracy 43.9% vs. 32.3% baseline.
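The retry loop is the piece that separates this from single-pass RAG: the critic's verdict either accepts the draft or feeds criticism back into retrieval. The verdict names come from the project; the control flow below is an illustrative framework-free sketch:

```python
# Sketch of a critic-gated retry loop: draft an answer from retrieved
# evidence, let a critic judge it, and re-retrieve with feedback until
# it passes or the retry budget runs out. Signatures are hypothetical.

VERDICTS = {"PASS", "STALE", "CONTRADICTED", "INSUFFICIENT"}

def answer_with_critic(question, retrieve, draft, critique, max_retries=2):
    """retrieve(question, feedback) -> docs; draft(question, docs) -> answer;
    critique(question, docs, answer) -> (verdict, feedback)."""
    feedback = None
    for attempt in range(max_retries + 1):
        docs = retrieve(question, feedback)
        answer = draft(question, docs)
        verdict, feedback = critique(question, docs, answer)
        assert verdict in VERDICTS
        if verdict == "PASS":
            return answer, verdict, attempt
    return answer, verdict, attempt
```

A single-pass pipeline never sees the STALE or CONTRADICTED signal, which is why it catches 0% of superseded claims.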
Llama 3.1 8B fine-tuned with QLoRA (rank 16). Best checkpoint from 3 MLflow experiments: sim 0.826, ROUGE-L 0.466. FastAPI + LangChain + Pinecone RAG with guardrails and confidence-gated web fallback. Auto corpus pipeline ingests 840 docs → 6,876 vectors weekly via GitHub Actions.
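Why rank 16 is cheap enough to train on a single GPU comes down to arithmetic. A back-of-envelope sketch (the 4096 hidden size matches Llama 3.1 8B; everything else is illustrative):

```python
# QLoRA trains a low-rank update B @ A instead of the full weight matrix:
# A has shape (rank, d_in) and B has shape (d_out, rank), so the adapter
# adds only rank * (d_in + d_out) trainable parameters per matrix.

def lora_params(d_in: int, d_out: int, rank: int = 16) -> int:
    """Trainable parameters of one rank-r LoRA adapter."""
    return rank * (d_in + d_out)

full    = 4096 * 4096              # one square projection matrix
adapter = lora_params(4096, 4096)  # its rank-16 LoRA update
# adapter / full = 0.0078125 -> under 1% of that matrix is trainable
```

The base weights stay frozen in 4-bit, which is what the "Q" in QLoRA buys on top of this.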
Production-hardened QA architecture with prompt-injection defense, PII redaction, RBAC, and append-only audit logging. RoBERTa-squad2 optimized with ONNX INT8 to 90 ms P95 latency and 66.0% end-to-end F1. BM25 + BGE dense retrieval with cross-encoder reranking for source-grounded answers.
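The PII-redaction layer is a regex pass that runs before text reaches the model or the audit log. The patterns below are simplified examples, not the production rule set:

```python
import re

# Illustrative PII redaction: replace each match with a typed placeholder
# so downstream components (QA model, append-only audit log) never see
# raw identifiers. Patterns are deliberately simplified for the sketch.

PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact(text: str) -> str:
    """Substitute a [LABEL] placeholder for every PII match."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

Typed placeholders (rather than blanket masking) keep redacted logs auditable without re-exposing the original values.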
Speech → 360° image pipeline: Whisper (speech recognition) → GPT-Neo (text generation) → Stable Diffusion (image synthesis). Attention slicing and xFormers optimizations delivered 2.3× throughput within a 10 GB VRAM budget. Peer-reviewed and published at IEEE IDCIoT 2024.
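Structurally, the system is a straight-line composition of three models. A framework-free sketch where stub functions stand in for Whisper, GPT-Neo, and Stable Diffusion:

```python
from functools import reduce

# The speech -> 360° image system is a left-to-right pipeline: each
# stage's output feeds the next. The lambdas below are stand-ins for
# the real model calls, kept symbolic so the wiring is visible.

def compose(*stages):
    """Chain stages so compose(f, g, h)(x) == h(g(f(x)))."""
    return lambda x: reduce(lambda acc, f: f(acc), stages, x)

transcribe = lambda audio: f"transcript({audio})"    # Whisper stand-in
expand     = lambda text: f"prompt({text})"          # GPT-Neo stand-in
synthesize = lambda prompt: f"image360({prompt})"    # Stable Diffusion stand-in

speech_to_panorama = compose(transcribe, expand, synthesize)
```

In the real pipeline the VRAM savings come from stage-level tricks (attention slicing, xFormers) inside the final synthesis call, not from the wiring itself.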
M.S. Applied Machine Learning · University of Maryland · May 2026
Seeking: AI/ML Engineer · LLMOps · GenAI · Game AI · RL Engineer
STEM OPT eligible · Open to relocate
Or drop me a message directly: