entry-level~3h total10 lessonsfree to read

Learn: AI.

The track for the people building with LLMs and being honest about what works. Prompt engineering past one-shot tricks, model ops on a budget, evals that actually catch regressions, and the ethics conversation you can’t skip.

Each lesson is a 1-line briefing on what you'll practice. Read the briefing, then click Practice in terminal to drop into a real sandbox.

module · entry-level

Prompts in production

Prompt regression after model updates, hallucinations on real customer data, token-cost spikes. The work nobody told you was going to be 60% of the job.

3 lessons

01Prompt drifts after a model updatefree

A prompt that worked yesterday silently regresses after a model update. Build the eval that catches it before users do.

Prompt regression · ~10m

Practice in terminal →

02Hallucinated fact in a customer replyfree

A hallucinated fact in a customer reply is a grounding failure. RAG + retrieval that actually retrieves the right context.

Grounding · RAG · ~8m

Practice in terminal →

03Token cost spike: diagnose

Token costs spike : find whether it is context growth, retry loops, or a tool that returns too much.

Context · caching · ~10m

Practice in terminal →

module · mid-level

Build evals + safety

Eval set design, tool-use streaming, PII redaction, open-weights vs hosted trade-offs. The boring infrastructure that lets your demo survive contact with users.

4 lessons

04Build an eval set for one task

A real eval set is a labeled distribution of inputs your users actually send. Build one in 15 minutes.

Eval design · ~15m

Practice in terminal →

05Streaming response cuts off mid-tool

Streaming responses cut off mid-tool-call because the parser gives up too early. Fix the consumer, not the model.

Tool use · ~12m

Practice in terminal →

06PII leaks into a logged prompt

PII in logged prompts is a regulatory issue, not a stylistic one. Redaction filter before the log, not after.

Redaction · DLP · ~12m

Practice in terminal →

07Choose: open-weights vs hosted

Open-weights vs hosted is a trade-off across cost, latency, control, and compliance. Pick by the column that hurts most.

Trade-offs · ~15m

Practice in terminal →

module · senior-level

Ship responsibly

Prompt injection, vector DB drift at scale, end-to-end feature launch. The questions that decide whether your launch lands on the front page for the right or wrong reason.

3 lessons

08Prompt-injection in a help doc

Prompt injection in an internal doc is the OWASP LLM-01 attack. Sanitize, sandbox, isolate untrusted content.

OWASP LLM-01 · ~15m

Practice in terminal →

09Vector DB drift: re-embed at scale

Vector DB drift after a model swap means re-embedding at scale. Plan the rollout so search keeps working through it.

RAG ops · ~18m

Practice in terminal →

10Capstone: ship one feature responsibly · capstone

Capstone: ship one AI feature responsibly. Eval, guardrail, fallback, rollout plan, rollback plan, comms plan.

End-to-end · ~25m

Practice in terminal →

ready to test?

Done reading. Now do it.

The graded assessment uses the same terminal sandbox as these lessons : only it scores you on accuracy, methodology, tool fluency, communication, and real-world fit.

Take the graded assessment →See full track page

// loading

Learn: AI.

Each lesson is a 1-line briefing on what you'll practice. Read the briefing, then click Practice in terminal to drop into a real sandbox.