Writing
Notes on shipping production AI
Longer-form writing about RAG, fine-tuning, LLM architecture, and the operational work that keeps AI systems running in production.
Retrieval-Augmented Generation in production: lessons from shipping
Jan 2025
Placeholder: replace with a real Medium article. What worked, what didn't, and the boring ops work that actually matters.
RAG, LLMs, FastAPI
Fine-tuning Llama 2 for low-resource languages
Jun 2024
Placeholder: replace with a real Medium article. Data prep, LoRA config, evaluation, and what the numbers actually mean.
Fine-tuning, LLMs, Amharic
Building an LLM gateway: multi-provider routing and retries
Oct 2024
Placeholder: replace with a real Medium article about routing between OpenAI, Claude, and Gemini with structured outputs.
LLMs, Architecture