
Jesse Doniel
San Francisco, United States
Full Stack AI Engineer
Category: Web development
Jesse Doniel — Senior Full Stack AI Engineer
I don't just build AI prototypes — I ship production-grade LLM systems that serve 300K+ monthly active users, cut latency by nearly half, and reduce cloud costs by a third. The gap between "we have an API key" and "AI is transforming our operations" is an engineering problem. That's my specialty.
What I Actually Do
🧩 Multi-Agent LLM Systems & Orchestration
Architected multi-agent GenAI platforms using Claude 3.7 Sonnet, GPT-4o, Gemini 2.0 Flash, and LangGraph — including MCP tool integrations serving 300K+ MAU. I handle the full chain: prompt engineering, context management, output parsing, structured validation, and ReAct reasoning loops for production reliability.
📚 RAG Pipelines & Knowledge Systems
Built and deployed 12+ production RAG microservices backed by Qdrant hybrid retrieval, reducing AI response latency by 47%. I turn your internal documents, databases, and domain knowledge into precise, business-aware AI — using Pinecone, Qdrant, Weaviate, FAISS, and Chroma.
🤖 AI Agent & Workflow Automation
Eliminated 41% of manual triage work for 200+ operations staff via LangChain + LangGraph agentic workflows. I design multi-step pipelines that autonomously handle document processing, enrichment, reporting, and CRM updates — reliably, in production.
⚙️ Fine-Tuning & Model Optimization
Implemented LoRA/QLoRA fine-tuning on Hugging Face Transformers for domain-specific financial NLP, achieving a 33% accuracy improvement over base models. I bring full LLM observability with LangSmith and Weights & Biases across all production AI endpoints.
🔌 Scalable Cloud-Native Backend
Designed and scaled backend services handling 5M+ daily API requests at 99.95% uptime on AWS ECS and Lambda, using Celery and Redis for async LLM workloads. Deep experience with Docker, Kubernetes, Terraform, and AWS Bedrock auto-scaling.
🚀 Full-Stack AI Application Delivery
Built production dashboards with Next.js 15 App Router streaming and WebSocket-based real-time model output — boosting GenAI UI performance by 38%. End-to-end ownership: FastAPI/Node.js backends, React 19 frontends, Docker deployments, GitHub Actions CI/CD.
My Stack
AI / LLMs: Claude 3.7 Sonnet · GPT-4o · Gemini 2.0 Flash · LLaMA · Mistral · Hugging Face
Orchestration: LangChain · LangGraph · LlamaIndex · MCP
Vector & RAG: Pinecone · Qdrant · FAISS · Weaviate · Chroma
Languages: Python 3.12 · TypeScript 5 · Rust · SQL · Bash
Frontend: React 19 · Next.js 15 · Tailwind CSS v4 · Zustand · Vercel AI SDK · WebSockets
Backend: FastAPI · Node.js 22 · tRPC · GraphQL · gRPC · Celery · Redis
Cloud & DevOps: AWS Bedrock · ECS · Lambda · SageMaker · Docker · Kubernetes · Terraform · GitHub Actions
ML / Evals: PyTorch 2 · LoRA/QLoRA · LangSmith · Weights & Biases · pandas · NumPy
Why Clients Hire Me
✅ Production specialist — not research, not theory. I ship AI systems that serve hundreds of thousands of users reliably.
✅ I've authored 60+ async architecture RFCs and design documents, communicating clearly across 4 distributed remote teams.
✅ I document everything. You'll own what I build and know how to maintain it.
✅ I scope honestly upfront — no surprise pivots, no silent scope creep.
✅ I move fast without cutting corners on architecture: from zero to 12 production microservices.
Not a Fit If
❌ You need a data scientist to train models from scratch
❌ You want "magic AI" without understanding your own workflows first
❌ You're not ready to iterate — real integration projects require feedback loops
💬 Let's Work Together
Send me a message with:
1. The tools and systems you're currently using
2. The process you want AI to improve or automate
3. Your timeline
I'll respond within a few hours with a clear plan — and tell you honestly if I'm the right fit.
Working hours
- Monday: 08:00 to 18:00
- Tuesday: 08:00 to 18:00
- Wednesday: 08:00 to 18:00
- Thursday: 08:00 to 18:00
- Friday: 08:00 to 18:00
- Saturday: Not available
- Sunday: Not available
- 🇬🇧 English