For people who can already code

You already think in algorithms.
Now learn to think in agents.

You can solve a Gold-level problem at 2am. That brain is exactly what makes you dangerous with an AI coding agent — if you drive it instead of letting it drive you. This is a short, practical course on doing that well, ending with a real app deployed to the internet.

~45 min read · Python assumed · 8 core patterns · 1 project, shipped live

1 · You're the architect, not the typist mindset

The single mental shift the rest of this course depends on.

"Vibe coding" is letting an AI agent write the code while you describe what you want. Done lazily, you get a pile of code you don't understand that breaks the moment a real user touches it. Done well, you ship 5x faster than your classmates and actually understand every line.

The difference is one idea: the AI is a fast, tireless junior engineer with no judgment. You are the senior engineer. The agent can type 100x faster than you, has read every library's docs, and never gets bored. But it will confidently do the wrong thing, miss the edge case, and trust input it shouldn't — exactly the mistakes that cost you points in a contest. Your job is the part that isn't typing: deciding what to build, how to structure it, and whether the output is actually correct.

⬡ Under the hood

Serious agent frameworks formalize this as human-in-the-loop orchestration: the human sets goals and is the final quality gate; the agent executes. Everything in this course is a practical version of that one principle.

Your competitive-programming instincts are an unfair advantage here. You already (1) decompose hard problems into sub-problems, (2) reason about edge cases and complexity, and (3) trace code in your head to find the bug. Those are the three skills the AI is worst at. Lean on them.

2 · The vibe coding loop workflow

Every feature you build with an agent runs through the same five steps.

Beginners type a vague request, paste whatever comes out, and run it. When it breaks they paste the error and pray. That's not a workflow, it's gambling. Here's the loop that actually works:

  1. Spec. Decide precisely what this piece should do — inputs, outputs, edge cases — before the agent writes anything.
  2. Generate. Hand the agent a tight brief and let it write the code.
  3. Review. Read what it produced like you're reviewing a teammate's pull request. Don't run it yet.
  4. Run & test. Execute it on real and adversarial inputs. Write a test or two.
  5. Iterate. Feed back specific corrections — not "it's broken," but "it crashes when the list is empty; handle that."

The whole course is just these five steps, zoomed in. Most beginners skip step 1 and step 3 — the two steps where your brain matters most. Don't.

◇ Our running project

We'll build Contest Companion — a tiny web app that shows upcoming programming contests and lets you star the ones you want to do. Small enough to finish, real enough to deploy. We'll vibe-code it piece by piece.

3 · Write a brief, not a wish context engineering

The #1 lever on output quality. Bigger than which model you use.

The agent only knows what you tell it. A weak prompt gets weak code — not because the model is dumb, but because you left it guessing. Compare:

✗ A wish

"make a function to get upcoming contests"

The agent invents an API that may not exist, guesses the return shape, ignores errors, and picks a date format you didn't want.

✓ A brief

"Write a Python function get_contests() that calls the public Codeforces API https://codeforces.com/api/contest.list, returns only contests with phase BEFORE, as a list of dicts with keys name, start (ISO string), and url. Sort by soonest first. Use requests. If the call fails, return an empty list — don't raise."

The good version isn't longer for the sake of it. Every extra sentence removes a decision the agent would otherwise guess wrong. A good brief names four things:

  • The exact interface — function name, inputs, and the precise output shape.
  • The data source — the real endpoint or file, not "an API."
  • The rules — filtering, sorting, formats.
  • The failure behavior: what happens when something goes wrong (the part beginners always forget).

⬡ Under the hood

This is context engineering: the discipline of assembling everything the model needs before it generates. Pros keep reusable briefs in their repo and point the agent at real files and docs instead of describing them from memory.

◇ Try it

For Contest Companion, write the brief for a second function: star_contest(contest_id) that saves a starred contest to a local starred.json file. Name the interface, the storage, the rules, and the failure behavior before you let the agent touch it.
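
For comparison once you've drafted your own brief, here is a minimal sketch of what one reasonable brief might produce. The choices are assumptions, not part of the course's spec: contest_id is a positive int, starred.json holds a JSON list of ids, duplicates are ignored, and the function returns True on success and False on any failure instead of raising. The path parameter exists only so the function can be tested against a temporary file.

```python
import json
from pathlib import Path

STARRED_FILE = Path("starred.json")  # assumed location: next to the app


def star_contest(contest_id, path=STARRED_FILE):
    """Add contest_id to the starred list on disk. True on success, False on failure."""
    # Guardrail: reject anything that is not a positive integer id.
    if isinstance(contest_id, bool) or not isinstance(contest_id, int) or contest_id <= 0:
        return False
    path = Path(path)
    try:
        # A missing file just means nothing has been starred yet.
        starred = json.loads(path.read_text()) if path.exists() else []
        if not isinstance(starred, list):
            starred = []  # the file holds something unexpected: start over
        if contest_id not in starred:
            starred.append(contest_id)
        path.write_text(json.dumps(starred))
        return True
    except (OSError, ValueError):
        return False  # unreadable file, bad JSON, or disk error: fail soft
```

Notice that every branch in the code maps back to a sentence in the brief. If a branch surprises you, the brief was missing a rule.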

4 · Break the big ask down prompt chaining · planning

If you wouldn't solve it in one function, don't ask the agent to.

"Build me Contest Companion" in one shot gets you a tangled 300-line file that half-works and is impossible to debug — same reason you don't write a whole contest solution in main() with no helper functions. Decompose first, then generate one piece at a time.

For Contest Companion, the chain is obvious once you think like a problem-setter:

the plan, before any code
1. get_contests()      -> fetch + clean the contest data
2. star/unstar         -> save choices to starred.json
3. render(contests)    -> turn data into HTML
4. tiny web server     -> serve the page
5. deploy              -> put it on the internet

Now each step is a separate, well-scoped brief. You build and verify step 1, confirm it returns clean data, then move to step 2. If step 3 breaks, you know it's not step 1 — because step 1 already passed. This is the same "isolate the failing subroutine" instinct you use when debugging a wrong-answer verdict.

▲ The trap

When you let the agent build everything at once and it breaks, you have no idea which part is wrong — so you end up regenerating the whole thing repeatedly, getting a different bug each time. Small steps turn one impossible debug into five easy ones.

⬡ Under the hood

Two patterns hiding here. Planning: have the agent (or yourself) lay out the steps before coding. Prompt chaining: feed each step's verified output into the next step's input. Together they're how every non-trivial agent system is built.
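
To make the chaining concrete, here is a minimal sketch of step 3 consuming the verified output of step 1. It assumes the list-of-dicts shape with keys name, start, and url from the lesson 3 brief; the render signature and the HTML layout are illustrative guesses, not a fixed design.

```python
import html


def render(contests):
    """Turn the list of contest dicts from step 1 into an HTML fragment."""
    if not contests:
        return "<p>No contests right now.</p>"
    items = []
    for c in contests:
        # Escape everything that came from outside before it goes into HTML.
        name = html.escape(c["name"])
        start = html.escape(c["start"])
        url = html.escape(c["url"], quote=True)
        items.append(f'<li><a href="{url}">{name}</a> ({start})</li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


# Step 1's output shape, hard-coded so step 3 can be checked on its own.
sample = [{
    "name": "Round <1>",
    "start": "2030-01-01T12:00:00",
    "url": "https://codeforces.com/contests/1",
}]
print(render(sample))
```

Because the sample data is hard-coded, a bug found here belongs to render() and nothing else, which is the point of building one verified piece at a time.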

5 · Give it tools tool use

An LLM alone can't fetch live data, do exact math, or touch a file. Tools fix that.

A language model is a text predictor. Ask it "what contests are on this week?" and it will make some up, because it has no live data — it's pattern-matching, not looking anything up. The fix is to give it a tool: a real function it can call to get real data. Our get_contests() from lesson 3 is that tool.

contests.py — a tool that returns real data
import datetime

import requests

def get_contests():
    """Upcoming Codeforces contests, soonest first. Never raises."""
    try:
        r = requests.get("https://codeforces.com/api/contest.list", timeout=10)
        rows = r.json()["result"]
        # startTimeSeconds may be absent in the API response, so skip rows without it
        upcoming = [c for c in rows if c["phase"] == "BEFORE" and "startTimeSeconds" in c]
        upcoming.sort(key=lambda c: c["startTimeSeconds"])
        return [{
            "name":  c["name"],
            # timezone-aware UTC; utcfromtimestamp() is deprecated since Python 3.12
            "start": datetime.datetime.fromtimestamp(
                c["startTimeSeconds"], tz=datetime.timezone.utc).isoformat(),
            "url":   f"https://codeforces.com/contests/{c['id']}",
        } for c in upcoming]
    except Exception:
        return []                          # fail soft: empty list, not a crash

The key idea isn't this specific function — it's the boundary. The messy, unreliable part (the network, the outside world) is sealed inside one small tool with a clean, predictable output. Everything downstream just gets a tidy list of dicts. When the agent builds the rest of the app, it never has to reason about the network again.

⬡ Under the hood

This is tool use (a.k.a. function calling) — the pattern that turns a chatbot into something that can act in the real world: search the web, run code, query a database, hit an API. Whenever you want an agent to do something true rather than plausible, you give it a tool.
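
A minimal sketch of the mechanics, with no real model involved: the framework advertises a tool by name, the model replies with a JSON request naming the tool and its arguments, and your code runs the matching Python function. The request format, the TOOLS registry, and run_tool_call are all hypothetical; real function-calling APIs differ in detail. A stub stands in for get_contests() so the example runs offline.

```python
import json


def get_contests_stub():
    """Stand-in for the real get_contests(); returns canned data."""
    return [{"name": "Sample Round", "start": "2030-01-01T12:00:00",
             "url": "https://codeforces.com/contests/1"}]


# Registry: the only functions the model is allowed to call.
TOOLS = {"get_contests": get_contests_stub}


def run_tool_call(request_json):
    """Execute one tool request from the model and return a JSON reply."""
    try:
        request = json.loads(request_json)
        tool = TOOLS[request["tool"]]
        result = tool(**request.get("args", {}))
    except Exception as exc:  # bad JSON, unknown tool, wrong arguments
        return json.dumps({"ok": False, "error": f"{type(exc).__name__}: {exc}"})
    return json.dumps({"ok": True, "result": result})


# What a model's tool request might look like, and the reply it gets back.
print(run_tool_call('{"tool": "get_contests", "args": {}}'))
print(run_tool_call('{"tool": "delete_everything"}'))
```

The registry doubles as a guardrail: a request for a tool that isn't listed gets an error reply instead of running anything.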

6 · Make it check its own work reflection

The cheapest quality upgrade: ask the agent to grade itself before you do.

An agent's first answer is a first draft. Models are often better at spotting a flaw in code they can see than at avoiding it the first time, the same way you re-read your own solution and go "oh, that breaks on n=0." You can trigger that for free with a follow-up:

a reflection prompt
Before I run this, review your own get_contests() as if you were
a strict code reviewer. List concrete failure cases: What if the
API returns 200 but no "result" key? What if startTimeSeconds is
missing on a contest? What if the network is fine but slow? Then
show me the fixed version.

That one message routinely surfaces problems like a missing KeyError guard, an absent timeout, or an unhandled empty case: the stuff that would have crashed in front of a real user. You're not doing extra work; you're making the agent do the review pass it skipped.

◇ Make it a habit

After any non-trivial generation, ask: "What are the three most likely ways this breaks in production, and fix them." It's the highest return-on-effort prompt you'll ever paste.

⬡ Under the hood

This is the reflection (self-correction) pattern: the agent critiques its own output and revises. Advanced systems automate the critique→revise loop; you're doing the manual, controllable version — which is better while you're still learning what "good" looks like.
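
A minimal sketch of what automating that loop could look like. ask_model is a hypothetical callable (prompt in, text out) standing in for whatever agent or API you use; the prompts and the "LGTM" stop signal are assumptions, not a standard.

```python
def reflect(draft, ask_model, max_rounds=3):
    """Critique-then-revise loop: stop when the critic has no complaints."""
    for _ in range(max_rounds):
        critique = ask_model(
            "Review this code as a strict reviewer. "
            "Reply LGTM if you find no concrete failure case.\n\n" + draft
        )
        if critique.strip() == "LGTM":
            break
        draft = ask_model(
            "Revise the code to fix these issues. Return only code.\n\n"
            f"Issues:\n{critique}\n\nCode:\n{draft}"
        )
    return draft


def fake_model(prompt):
    """Toy stand-in for a model so the loop can run offline."""
    if prompt.startswith("Review"):
        # Complain until the draft contains a timeout.
        return "LGTM" if "timeout" in prompt else "No timeout on the request."
    return "requests.get(url, timeout=10)"


print(reflect("requests.get(url)", fake_model))
```

The max_rounds cap matters: without it, a critic that never says LGTM would loop forever and burn tokens.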

7 · Read the diff like a judge code review

The agent's confidence is not evidence. Verify, don't vibe.

This is where your contest brain earns its keep. AI code looks great — clean names, neat comments, confident tone — and is wrong just often enough to hurt. The danger isn't the obvious crash; it's the plausible code that's subtly incorrect. Read every diff with the same suspicion you'd give your own untested submission. Hunt specifically for:

  • Off-by-one and boundary bugs — empty lists, single elements, the last index. The AI's favorite blind spot, and yours-to-catch.
  • Invented APIs. Models hallucinate plausible function and parameter names. If you've never seen requests.get_json(), it's because it doesn't exist. Check the real docs.
  • Silent wrong answers. Code that runs fine and returns the wrong thing — wrong sort order, wrong timezone, off-by-a-factor. Trace one real input by hand.
  • Unhandled failure. What happens when the input is empty, the file is missing, the network is down? If the brief didn't say and the code doesn't handle it, it'll surface in front of a user.
  • Security smells. Hardcoded secrets, building SQL or shell commands by string-concatenating user input, trusting anything that came from outside.

▲ The hardest habit

If you don't understand a line, do not ship it. Ask the agent "explain line 14 and why it's needed." Code you don't understand is code you can't debug at 11pm when it's down — and "the AI wrote it" is not a defense your future self will accept.

⬡ Under the hood

You're acting as the final quality gate in the human-agent loop. The agent proposes; you dispose. Treat its output as a pull request from a talented intern who's never been burned in production — because that's exactly what it is.
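
The security-smell item in the list above is easy to demonstrate. This sketch uses Python's built-in sqlite3 with a made-up starred table; the table and function names are illustrative only and are not part of Contest Companion as described.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE starred (user TEXT, contest TEXT)")
db.executemany("INSERT INTO starred VALUES (?, ?)",
               [("alice", "Round 1"), ("bob", "Round 2")])


def starred_unsafe(user):
    # SMELL: user input is pasted into the SQL text itself.
    query = f"SELECT contest FROM starred WHERE user = '{user}'"
    return db.execute(query).fetchall()


def starred_safe(user):
    # Parameterized query: the value travels separately from the SQL text.
    return db.execute("SELECT contest FROM starred WHERE user = ?", (user,)).fetchall()


evil = "nobody' OR '1'='1"
print(starred_unsafe(evil))  # leaks every user's rows
print(starred_safe(evil))    # no user has that name, so no rows
```

Both functions look equally tidy in a diff, which is why this one has to be hunted for rather than noticed.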

8 · Test it or it's broken testing

"It ran once" is not "it works." You know this from stress-testing.

You'd never trust a solution because it passed the sample case — you'd stress-test it against brute force on random inputs. Same energy here. The good news: writing tests is the thing AI is genuinely great at, because it's bounded and mechanical. Make the agent write them, then you design the nasty cases.

test_contests.py — the agent writes these, you add the evil ones
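from unittest.mock import MagicMock, patch

from contests import get_contests

def test_network_failure_returns_empty():
    # the case beginners forget: the internet is down
    with patch("contests.requests.get", side_effect=Exception):
        assert get_contests() == []

def test_results_sorted_soonest_first():
    # canned, deliberately unsorted API response, so the test never touches the network
    fake = MagicMock()
    fake.json.return_value = {"result": [
        {"id": 2, "name": "Later", "phase": "BEFORE", "startTimeSeconds": 2_000_000_000},
        {"id": 1, "name": "Sooner", "phase": "BEFORE", "startTimeSeconds": 1_900_000_000},
    ]}
    with patch("contests.requests.get", return_value=fake):
        out = get_contests()
    starts = [c["start"] for c in out]
    assert [c["name"] for c in out] == ["Sooner", "Later"]
    assert starts == sorted(starts)   # ordering is a silent-bug magnet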

Notice the two tests cover exactly the failure modes from lessons 5 and 7: the network dying, and the silent wrong-order bug. Tests are where your edge-case paranoia becomes permanent. Run them with pytest. When you later ask the agent to change the code, the tests catch it if it breaks something — which it will, eventually.

◇ Rule of thumb

For every feature, ask the agent for tests covering: the normal case, the empty/zero case, and the failure case. Then add one test for the specific bug you're most worried about. That's 80% of the value for 5 minutes of work.
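
The stress-testing habit carries over directly. As a sketch, here is the contest-style harness applied to a stand-in problem (maximum subarray sum, chosen only because it has an obvious brute force). Swap in whatever function the agent wrote and a slow version you trust.

```python
import random


def max_subarray_fast(a):
    """Kadane's algorithm: best sum of a non-empty contiguous run, O(n)."""
    best = cur = a[0]
    for x in a[1:]:
        cur = max(x, cur + x)
        best = max(best, cur)
    return best


def max_subarray_brute(a):
    """Obviously-correct cubic reference: sum every (i, j) range."""
    return max(sum(a[i:j]) for i in range(len(a)) for j in range(i + 1, len(a) + 1))


def stress(trials=500, seed=0):
    """Compare fast vs brute on small random inputs; return the first mismatch or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = [rng.randint(-5, 5) for _ in range(rng.randint(1, 8))]
        if max_subarray_fast(a) != max_subarray_brute(a):
            return a  # a small failing input is the whole point
    return None


print(stress())  # None means no mismatch was found
```

Small inputs and a fixed seed are deliberate: a mismatch on a five-element list is something you can trace by hand and reproduce on the next run.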

9 · Guardrails & failure guardrails · exception handling

Real software meets real users, bad input, and flaky networks. Plan for all three.

In a contest, the input is guaranteed to match the spec. In the real world it never does — users paste garbage, APIs go down, files vanish. The gap between "works on my machine" and "works for strangers on the internet" is almost entirely about handling the inputs you didn't plan for. Bake three habits in:

  • Never trust input. Validate anything from a user or an API before you use it. Assume it's malformed, oversized, or malicious until proven otherwise.
  • Fail soft, on purpose. Decide what happens when something goes wrong — return a default, show a friendly message, retry once — instead of letting a stack trace reach the user. Notice get_contests() already does this: a dead API yields an empty list and an honest "no contests right now," not a 500 error.
  • Respect other people's servers. Add a timeout to every network call, don't hammer an API in a loop, and cache results you'll reuse. Getting your IP rate-limited mid-demo is a rite of passage you can skip.

⬡ Under the hood

Two patterns: guardrails (validate and constrain what goes in and comes out) and exception handling & recovery (degrade gracefully instead of crashing). They're what separate a weekend hack from something you'd let real people use.
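
A minimal sketch of both patterns around the contest data. The field checks mirror the keys used by get_contests() above; clean_rows, cached, the 500-row cap, and the 5-minute TTL are hypothetical names and values chosen for illustration.

```python
import time


def clean_rows(rows, max_rows=500):
    """Guardrail: keep only rows that have the shape we expect."""
    good = []
    if not isinstance(rows, list):
        return good
    for c in rows[:max_rows]:  # cap the size: never trust "it will be small"
        if not isinstance(c, dict):
            continue
        if not isinstance(c.get("name"), str) or not isinstance(c.get("id"), int):
            continue
        if not isinstance(c.get("startTimeSeconds"), int):
            continue  # the start time can be missing; skip instead of crashing
        good.append(c)
    return good


def cached(fetch, ttl_seconds=300, clock=time.monotonic):
    """Wrap a zero-argument fetch function so it runs at most once per TTL."""
    state = {"at": None, "value": None}

    def wrapper():
        now = clock()
        if state["at"] is None or now - state["at"] >= ttl_seconds:
            state["value"] = fetch()
            state["at"] = now
        return state["value"]

    return wrapper
```

Wrapping the fetch as cached(get_contests) would then hit the API at most once every five minutes, however many page loads arrive in between.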

▲ Secrets

If your app ever uses an API key, it goes in an environment variable, never hardcoded and never committed to Git. Anything pushed to a public repo is public forever — bots scrape GitHub for leaked keys within minutes. More on this in the next lesson.
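
A minimal sketch of reading a key that way. CONTEST_API_KEY is a made-up variable name (Contest Companion as described needs no key at all), and raising at startup is one design choice among several.

```python
import os


def get_api_key(name="CONTEST_API_KEY"):
    """Read a secret from the environment; fail with a clear message if absent."""
    key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"Set the {name} environment variable before starting the app.")
    return key
```

Failing loudly at startup is the one place a crash beats failing soft: a missing key is your mistake to fix, not something a user should ever see as a half-working page.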

10 · Ship it: Git + Vercel deploy

An app on your laptop helps no one. Let's put Contest Companion on a real URL.

Shipping is a skill, and it's mostly two tools: Git (save and version your code) and a host (run it on the internet). We'll use Vercel — it's free for personal projects, and it deploys automatically every time you push to GitHub.

Step A — Put your code on GitHub

From your project folder, this is the whole ritual. (Ask your agent to explain any line you don't recognize.)

terminal
# one-time: tell Git who you are
git config --global user.name "YOUR NAME"
git config --global user.email "you@example.com"

# save your project
git init
printf '.env\n.venv/\n__pycache__/\n' > .gitignore   # never commit junk or secrets
git add .
git commit -m "Contest Companion: first working version"

# create an empty repo on github.com, then connect + push:
git remote add origin https://github.com/YOUR_NAME/contest-companion.git
git branch -M main          # make sure the local branch is named main
git push -u origin main

◇ Why Git matters for vibe coding

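When the agent makes a change that breaks everything, Git lets you throw away its uncommitted edits and return to your last commit in one command (git restore ., or the older git checkout .). Files the agent newly created are untracked, so that command leaves them in place; delete those yourself. It's your undo button for the whole project. Commit every time something works, because those commits are your save points.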

Step B — Deploy on Vercel (the easy path)

  1. Sign up at vercel.com with your GitHub account. One click; no credit card for the free Hobby tier.
  2. Click "Add New… → Project" and import your contest-companion repo. Vercel reads the repo and auto-detects how to build it.
  3. Add secrets as Environment Variables, not in code. If your app needs an API key, paste it into the project's Settings → Environment Variables. Vercel injects it at runtime.
  4. Click Deploy. A minute later you get a live your-app.vercel.app URL you can send to anyone.
  5. Every future git push redeploys automatically. You push; Vercel rebuilds and updates the URL. That's continuous deployment: the thing real teams pay for, free.

◇ The CLI path (faster, once you're comfortable)

Prefer the terminal? npm i -g vercel, then run vercel in your project folder and answer the prompts. vercel --prod pushes a production deploy. Same result, no clicking.

▲ Reality check

Static sites and Node apps deploy to Vercel with near-zero config. A pure-Python backend needs a bit more setup (Vercel runs Python as serverless functions in an /api folder) — a perfect thing to hand your agent: "Set up this Flask app to deploy on Vercel as a serverless function; show me the folder structure and config." Then review what it gives you.
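
For a sense of what the agent might hand back, here is a minimal sketch that avoids Flask entirely. As Vercel's documentation describes its Python runtime at the time of writing, a file such as api/index.py can define a class named handler that subclasses BaseHTTPRequestHandler. Treat the file path and that convention as assumptions to verify against the current docs. build_page is a hypothetical helper.

```python
# api/index.py (assumed location)
from http.server import BaseHTTPRequestHandler


def build_page():
    """Return the HTML body; a real app would call get_contests() and render it."""
    return "<h1>Contest Companion</h1><p>No contests right now.</p>"


class handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = build_page().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

Keeping the page logic in a plain function means you can test build_page() locally without deploying anything.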

The cheat sheet keep this open

The whole course in checkboxes. Tick them off as they become automatic.

  • Spec before you generate. Name the interface, data, rules, and failure behavior.
  • Decompose. Build one small, verified piece at a time — never the whole app at once.
  • Give it tools for anything that needs to be true, not plausible (live data, exact math, files).
  • Ask it to review itself — "how does this break in production?" — before you run it.
  • Read every diff like a judge. Hunt boundaries, invented APIs, silent wrong answers.
  • Never ship code you don't understand. Ask for an explanation of any line you can't trace.
  • Test the normal, empty, and failure cases. Make the agent write them; you design the evil ones.
  • Never trust input. Fail soft. Respect rate limits. Keep secrets in env vars, never in Git.
  • Commit every working version. Git is your project-wide undo button.
  • Ship it. A deployed app teaches you more than ten that never leave your laptop.

That's the whole loop: spec → generate → review → test → iterate, with your judgment on the parts that matter. The patterns here — context engineering, tool use, reflection, guardrails — are the same ones used to build the agents you're vibing with. You're not just using the tools; you're learning how they're made.

Next move: open an empty folder, brief your agent on step 1 of Contest Companion, and don't stop until it's live on a .vercel.app URL. Build something real this week.