Large Language Models: Deep Dive
Architecture, training, fine-tuning, evaluation, and deployment of LLMs
What you'll be able to do
- Explain how transformers and LLMs work under the hood
- Build RAG and tool-using applications
- Fine-tune and evaluate models
- Deploy LLM-powered features responsibly
Before you start
- Python fundamentals
- Comfort with basic machine-learning concepts
- Familiarity with using an LLM API helps
Phase 1 · How LLMs Work
Neural Networks & the Transformer
Covers embeddings, attention, and the full transformer block: the architecture behind nearly all modern LLMs.
- 3Blue1Brown: Neural Networks + Transformers (YouTube) · video · free
- The Illustrated Transformer (Jay Alammar) · article · free
- Attention Is All You Need (paper) · article · free
- Explain self-attention with a worked example
- Diagram a transformer block
- Implement scaled dot-product attention in NumPy
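As a starting point for the last exercise above, here is a minimal sketch of scaled dot-product attention for a single head, following the formula softmax(QK^T / sqrt(d_k)) V from "Attention Is All You Need". The function names, the 2-D shapes, and the optional `causal` flag are choices made for this illustration, not part of any course material.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max before exponentiating for numerical stability.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def scaled_dot_product_attention(q, k, v, causal=False):
    """q: (T_q, d_k), k: (T_k, d_k), v: (T_k, d_v) -> (output, weights)."""
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)  # (T_q, T_k) similarity scores
    if causal:
        # Mask positions j > i so a token cannot attend to later tokens.
        mask = np.triu(np.ones(scores.shape, dtype=bool), k=1)
        scores = np.where(mask, -np.inf, scores)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ v, weights

# Toy inputs: 4 tokens, 8-dimensional queries, keys, and values.
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
k = rng.normal(size=(4, 8))
v = rng.normal(size=(4, 8))
out, w = scaled_dot_product_attention(q, k, v, causal=True)
```

With `causal=True`, the first token can only attend to itself, so the first output row equals the first row of `v`. That makes a handy sanity check when you write your own version.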
Build GPT From Scratch
Follow Karpathy's nanoGPT to implement, train, and sample from a small GPT. Building one yourself is one of the most effective ways to understand how LLMs work.
- Karpathy: Let's build GPT (YouTube) · video · free
- nanoGPT (GitHub) · repo · free
- Karpathy: Neural Networks Zero to Hero · course · free
- Train a character-level GPT
- Implement multi-head attention
- Sample coherent text from your model
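The sampling step in the exercises above can be sketched as follows. This is an illustrative version of temperature and top-k sampling in NumPy, not nanoGPT's own code; `sample_next_token`, `generate`, and the `next_logits` callback (standing in for your trained model) are hypothetical names.

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, top_k=None, rng=None):
    """Draw one token id from a vector of logits (temperature must be > 0)."""
    if rng is None:
        rng = np.random.default_rng()
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    if top_k is not None:
        # Keep the top_k largest logits; everything below the cutoff
        # gets probability zero.
        cutoff = np.sort(scaled)[-top_k]
        scaled = np.where(scaled < cutoff, -np.inf, scaled)
    # Softmax with the max subtracted for numerical stability.
    probs = np.exp(scaled - np.max(scaled))
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

def generate(next_logits, prompt_ids, max_new_tokens, **sampling_kwargs):
    """Autoregressive loop: next_logits(ids) returns logits for the next token."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        ids.append(sample_next_token(next_logits(ids), **sampling_kwargs))
    return ids
```

Lower temperatures sharpen the distribution toward the most likely token, while higher ones flatten it. Setting `top_k=1` reduces to greedy decoding.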
Phase 2 · Training & Fine-tuning
Pretraining, SFT & RLHF
The full training pipeline: next-token pretraining, supervised fine-tuning, and alignment with RLHF/DPO.
- HuggingFace LLM Course · course · free
- Karpathy: State of GPT (talk) · video · free
- InstructGPT / RLHF (paper) · article · free
- Explain pretraining vs. SFT vs. RLHF
- Describe reward modeling
- Compare RLHF and DPO
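When comparing RLHF and DPO, it can help to see how compact the DPO objective is: it needs no separate reward model, only sequence log-probabilities from the policy and from a frozen reference model. The sketch below computes the loss -log sigmoid(beta * [(log pi(y_w) - log pi_ref(y_w)) - (log pi(y_l) - log pi_ref(y_l))]) in NumPy. The function name, the argument names, and the default `beta=0.1` are assumptions for illustration, and a real training loop would compute this in an autodiff framework.

```python
import numpy as np

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Mean DPO loss over a batch of (chosen, rejected) response pairs.

    Each argument is an array of summed token log-probabilities for a
    whole response, one entry per preference pair.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(m)) == log(1 + exp(-m)); logaddexp evaluates it stably.
    return float(np.mean(np.logaddexp(0.0, -margin)))
```

When the policy equals the reference model the margin is zero and the loss is log 2. The loss falls as the policy raises the chosen response's likelihood relative to the rejected one.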
Efficient Fine-tuning with LoRA & PEFT
Adapt open models on a single GPU with LoRA/QLoRA, including dataset preparation and evaluation of the result.
- HuggingFace PEFT Docs · doc · free
- QLoRA (paper) · article · free
- HuggingFace: Fine-tune an LLM (recipe) · article · free
- Prepare an instruction dataset
- Fine-tune a 7B model with QLoRA
- Evaluate before/after on held-out data
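To make the idea behind LoRA concrete, here is a minimal NumPy sketch of a linear layer with a low-rank update: the pretrained weight W stays frozen and only two small matrices A and B are trained, giving an effective weight W + (alpha / r) * B @ A. The `LoRALinear` class and its methods are hypothetical and exist only for this illustration; this is not the PEFT library's API, and it covers the forward pass only.

```python
import numpy as np

class LoRALinear:
    """Frozen linear layer plus a trainable low-rank update (forward only)."""

    def __init__(self, weight, r=8, alpha=16, seed=0):
        rng = np.random.default_rng(seed)
        d_out, d_in = weight.shape
        self.weight = weight  # frozen pretrained weight, (d_out, d_in)
        # A starts small and random, B starts at zero, so the layer
        # initially behaves exactly like the pretrained one.
        self.A = rng.normal(scale=0.01, size=(r, d_in))
        self.B = np.zeros((d_out, r))
        self.scale = alpha / r

    def __call__(self, x):
        # Base path plus the scaled low-rank path.
        return x @ self.weight.T + self.scale * (x @ self.A.T) @ self.B.T

    def merged_weight(self):
        # After training, the update can be folded into a single matrix.
        return self.weight + self.scale * self.B @ self.A

    def num_trainable(self):
        return self.A.size + self.B.size
```

For a 4096 x 4096 weight with r = 8, the adapter has 8 * (4096 + 4096) = 65,536 trainable parameters, against roughly 16.8 million in the frozen matrix.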
Phase 3 · Evaluation, Ethics & Deployment
Evaluating & Red-teaming LLMs
Benchmarks, LLM-as-judge, bias, toxicity, and building task-specific eval harnesses.
- HuggingFace Evaluate Docs · doc · free
- Open LLM Leaderboard · link · free
- Holistic Evaluation of Language Models (HELM) · article · free
- Build a task-specific eval set
- Use LLM-as-judge with rubrics
- Document a bias/safety finding
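A task-specific eval set can start very small. The sketch below scores a model with normalized exact match over (prompt, expected answer) pairs. Everything in it is illustrative: `generate` stands for whatever function calls your model, and the stub model and the three examples are invented for this demonstration.

```python
def normalize(text):
    # Lowercase and collapse whitespace so trivial formatting
    # differences do not count as errors.
    return " ".join(text.lower().split())

def exact_match_accuracy(examples, generate):
    """examples: list of (prompt, expected); generate: prompt -> model output."""
    hits = [normalize(generate(prompt)) == normalize(expected)
            for prompt, expected in examples]
    return sum(hits) / len(hits)

# Hypothetical eval set and a stub standing in for a real model call.
examples = [
    ("2+2?", "4"),
    ("Capital of France?", "Paris"),
    ("Color of a clear daytime sky?", "blue"),
]
stub_outputs = {
    "2+2?": " 4 ",
    "Capital of France?": "paris",
    "Color of a clear daytime sky?": "green",
}
accuracy = exact_match_accuracy(examples, stub_outputs.get)
```

Exact match only suits tasks with short, unambiguous answers. Open-ended outputs usually need a rubric-based judge or human review instead.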
Serving & Scaling LLMs in Production
Quantization, vLLM, KV-cache, batching, and cost/latency tradeoffs for real deployments.
- vLLM Docs · doc · free
- HuggingFace Text Generation Inference · doc · free
- Ollama (run models locally) · repo · free
- Serve a model with vLLM
- Quantize and measure latency change
- Estimate cost per 1M tokens
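The cost estimate in the last exercise is plain arithmetic once you have measured throughput. The sketch below derives the cost per 1M tokens from a GPU's hourly price and a measured tokens-per-second figure, and also estimates KV-cache memory, which limits how many sequences fit in a batch. The function names and the example numbers are assumptions for illustration, not measurements or quotes from any provider.

```python
def cost_per_million_tokens(gpu_hourly_usd, tokens_per_second):
    """USD per 1M generated tokens at a sustained throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hourly_usd / tokens_per_hour * 1_000_000

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_value=2):
    """Memory for cached keys and values (bytes_per_value=2 means fp16/bf16)."""
    # The leading 2 accounts for storing both a key and a value per position.
    return (2 * n_layers * n_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_value)

# Illustrative inputs: a $2.00/hour GPU sustaining 1,000 tokens/s.
usd = cost_per_million_tokens(gpu_hourly_usd=2.0, tokens_per_second=1000)

# Illustrative model shape: 32 layers, 32 KV heads, head dimension 128.
per_token = kv_cache_bytes(32, 32, 128, seq_len=1)
```

Under these assumed inputs the cost comes to about $0.56 per 1M tokens and the cache to 0.5 MiB per token. Both scale linearly, so doubling throughput through batching halves the cost.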
Frequently asked
Is the Large Language Models: Deep Dive roadmap free?
Yes. The entire Large Language Models: Deep Dive roadmap and every curated resource are free to follow on Commit. You can track your progress, keep a daily streak, and earn a shareable certificate at no cost. There is no paywall.
How long does the Large Language Models: Deep Dive roadmap take to complete?
About 160 hours of focused study across 6 courses and 3 phases. At roughly one hour a day that is about 160 days, or a little over 5 months; you can move faster by studying more each day.
Do I get a certificate for finishing the Large Language Models: Deep Dive roadmap?
Yes. When you complete the roadmap on Commit you receive a verifiable certificate of completion that you can add to LinkedIn and your public Commit profile as proof of what you finished.
Make it stick
Copy this roadmap into Commit and turn it into a tracked program with a streak graph, study logging, and a shareable certificate when you finish. Free forever.