Vibe Coding Fundamentals — Prompts, Pairing, Context & Tokens
🎯 After reading this lesson
After finishing this lesson, you will be able to confidently do the following three things.
- ▸✅ Explain vibe coding as AI pair programming
- ▸✅ Save tokens by writing CLAUDE.md / .cursorrules
- ▸✅ Apply a 5-item checklist for handling hallucinations
Keep the learning objectives as a checklist and close the lesson once you can answer all of them.
What is Vibe Coding — AI Pair Programming
One line: Vibe coding = the practice of building code together with AI. Code is the output; humans handle intent, validation, and architecture.
3-Role Model:
Where AI excels:
- ▸✅ Boilerplate (REST API CRUD, test cases)
- ▸✅ Transformation and refactoring (TS↔JS, class↔hook)
- ▸✅ Writing docs, comments, and CHANGELOG
- ▸✅ Interpreting error messages and forming debugging hypotheses
Where AI falls short:
- ▸❌ Business logic outside its domain knowledge
- ▸❌ Large-scale architecture decisions
- ▸❌ Performance and security evaluation (hypotheses OK, verification requires a human)
- ▸❌ Matching intent reliably: it can generate plausible-looking code that misses the point (hallucination)
> 💡 Iron Law: Code produced by AI is still your responsibility. Code review skills become more important than ever.
Saving Tokens — Affects *Cost, Speed, and Accuracy* Alike
What is a Token — A Quick Recap
Token = the unit AI uses to read text. One English word ≈ 1.3 tokens; one Korean character ≈ 1–2 tokens. Every request is metered and billed on both its input tokens and its output tokens.
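The ratios above can be turned into a rough estimator. This is a minimal sketch, assuming the lesson's rule-of-thumb numbers; the function name and the 1.5 midpoint for Korean characters are illustrative choices, and a real tokenizer will give different counts depending on the model.

```python
import math
import re

# Rule-of-thumb ratios from the lesson; real tokenizers vary by model.
TOKENS_PER_ENGLISH_WORD = 1.3
TOKENS_PER_KOREAN_CHAR = 1.5  # assumed midpoint of the 1-2 range

HANGUL_SYLLABLE = re.compile(r"[\uac00-\ud7a3]")
LATIN_WORD = re.compile(r"[A-Za-z0-9_']+")


def estimate_tokens(text: str) -> int:
    """Return a rough, rounded-up token estimate for mixed English/Korean text."""
    korean_chars = len(HANGUL_SYLLABLE.findall(text))
    english_words = len(LATIN_WORD.findall(text))
    estimate = (
        english_words * TOKENS_PER_ENGLISH_WORD
        + korean_chars * TOKENS_PER_KOREAN_CHAR
    )
    return math.ceil(estimate)
```

Treat the result as a budgeting aid only; for billing-accurate numbers, use the token counts reported by the provider.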
Output Tokens Cost 5× More
→ The longer AI responds, the more costs explode. A vague question → AI plays it safe and explains every possibility → output token explosion.
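To see how output length drives the bill, here is a small cost sketch. The prices are hypothetical placeholders chosen only to reflect the 5× output-to-input ratio described above; check the provider's current price list before relying on any number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Price in USD per one million tokens (hypothetical example values)."""

    input_per_million: float
    output_per_million: float


# Assumed prices: output costs 5x input, as in the lesson.
EXAMPLE_PRICING = Pricing(input_per_million=3.0, output_per_million=15.0)


def request_cost(
    input_tokens: int,
    output_tokens: int,
    pricing: Pricing = EXAMPLE_PRICING,
) -> float:
    """Return the cost in USD of a single request."""
    return (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
    ) / 1_000_000


# Same 500-token question, long answer vs short answer.
long_answer = request_cost(input_tokens=500, output_tokens=2_000)
short_answer = request_cost(input_tokens=500, output_tokens=100)
```

With these assumed prices, the long answer costs several times more than the short one even though the input is identical, which is why constraining the response pays off.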
❌ Bad vs ✅ Good Prompts — Token Difference
Example 1: Code Fix
❌ Bad (estimated output: 2,000 tokens):
> "Take a look at this code"
→ AI tries to rewrite and show the entire file, including parts that didn't need to change.
✅ Good (estimated output: 100 tokens):
> "Fix only the type error on line 47 of auth.ts. Don't touch any other code. Show only the changed lines."
→ AI responds with just that one line. 20× fewer output tokens.
Example 2: Adding a Feature
❌ Bad (estimated output: 3,000 tokens):
> "Build a login feature"
→ AI explains every option (OAuth, JWT, sessions, password hashing, email verification, etc.) and produces a full implementation.
✅ Good (estimated output: 800 tokens):
> "Based on the stack in @CLAUDE.md. POST /api/auth/login.
> Input: zod schema (email, password).
> Processing: bcrypt compare → JWT access 15 min + refresh 7 days.
> Response: httpOnly cookie.
> Include Vitest tests."
→ AI implements exactly the stated requirements.
CLAUDE.md / .cursorrules — Save Repeated Context
You no longer need to repeat "I'm using Next.js 14, TypeScript, and Tailwind" at the start of every conversation.
Create CLAUDE.md or .cursorrules in the project root:
Claude Code loads CLAUDE.md and Cursor loads .cursorrules automatically at the start of each session, so the project context is applied without you retyping it. Zero repeated explanations.
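As one way to bootstrap such a file, the sketch below renders a small context document from a stack list and a rule list. The section headings, rule wording, and helper names are assumptions for illustration; neither tool requires this exact layout.

```python
from pathlib import Path


def render_project_context(stack: list[str], rules: list[str]) -> str:
    """Build the text of a project context file from plain lists."""
    lines = ["# Project context", "", "## Stack"]
    lines += [f"- {item}" for item in stack]
    lines += ["", "## Rules"]
    lines += [f"- {rule}" for rule in rules]
    return "\n".join(lines) + "\n"


def write_context_file(root: Path, content: str, filename: str = "CLAUDE.md") -> Path:
    """Write the context file into the project root and return its path."""
    path = root / filename
    path.write_text(content, encoding="utf-8")
    return path


content = render_project_context(
    stack=["Next.js 14", "TypeScript", "Tailwind"],
    rules=["Show diffs only", "Do not touch files outside the requested scope"],
)
```

Calling `write_context_file(Path("."), content)` would create the file in the current directory; pass `filename=".cursorrules"` for Cursor.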
7 Practical Token-Saving Tips
1. Name specific files: Don't say "look at everything" → say "src/auth.ts only"
2. Limit the change scope: "Edit only this part"
3. Specify the output format: "Show diff only" · "Code only, no explanation"
4. Use CLAUDE.md: Handle repeated context in one shot
5. Prompt caching: reuse the same system prompt via caching (Anthropic bills cached reads at roughly 10% of the normal input price, about a 90% discount on the cached portion)
6. Try smaller models first: Start with Haiku → upgrade to Sonnet if insufficient
7. Prune context: Carry only a summary into the next conversation after a long chat
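Tip 7 can be sketched as a small helper that swaps older turns for a summary. The `Message` type and function name are hypothetical, and the summary text itself is assumed to come from you or from a separate summarization request.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str  # "user" or "assistant"
    content: str


def prune_history(
    messages: list[Message],
    summary: str,
    keep_last: int = 4,
) -> list[Message]:
    """Replace older turns with one summary message and keep the newest turns."""
    if len(messages) <= keep_last:
        return list(messages)
    recent = messages[-keep_last:] if keep_last > 0 else []
    summary_message = Message(
        role="user",
        content=f"Summary of the earlier conversation: {summary}",
    )
    return [summary_message, *recent]
```

The trade-off is deliberate: the summary loses detail, so keep anything the next task depends on (file names, decisions, constraints) in the summary text.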
Summary
- ▸Vague prompt = token bomb
- ▸Specific prompt = AI responds briefly and accurately
- ▸CLAUDE.md is the foundation of all token savings
Hallucination — *AI Doesn't Admit When It Doesn't Know*
Core Takeaway
LLMs do not honestly admit ignorance. They make up the most plausible-sounding answer. Function names, library versions, and API responses can all be fabricated.
Why This Happens — The Probability Prediction Mechanism
An LLM predicts the most likely next token. It has no built-in check for whether a statement is true, so when it lacks the facts it still produces the continuation that sounds most plausible. It operates on the principle of "if it sounds right, that's the answer."
Example: "What is React's useSnapshot hook?"
- ▸Fact: That hook does not exist (though Valtio has one)
- ▸AI: "React's useSnapshot is a hook that saves a snapshot of component state. Usage is..."
- ▸→ Confidently lying
Checklist for When You Encounter One
✅ 1. Verify the Function or Library Exists
- ▸grep or Ctrl+F the function name in official documentation
- ▸Check the version with npm view <pkg> or pip show <pkg>
- ▸Click the link: if the URL AI gave returns 404, it's fake
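For Python projects, the same existence check can be scripted with the standard library. This is a minimal sketch of a local check, closer to pip show than to a registry lookup: it only tells you what is installed in the current environment, and the helper names are illustrative.

```python
import importlib
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def symbol_exists(module_name: str, attribute: str) -> bool:
    """Return True only if the module imports and exposes the attribute.

    Note: this imports the module, so only use it on packages you trust.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attribute)
```

For example, `symbol_exists("json", "loads")` is true, while a made-up name such as `symbol_exists("json", "load_snapshot")` is false.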
✅ 2. Actually Run It
- ▸Immediately run the received code
- ▸For TypeScript, check for compile errors
- ▸Runtime errors expose fabricated functions and options immediately
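A quick way to "actually run it" for a Python snippet is to execute it in a separate interpreter and capture the result. This is a sketch under stated assumptions: the helper name is hypothetical, and a subprocess is not a sandbox, so only run code you have read.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_snippet(code: str, timeout_seconds: float = 10.0) -> tuple[bool, str]:
    """Run Python code in a fresh interpreter; return (succeeded, combined output)."""
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "snippet.py"
        script.write_text(code, encoding="utf-8")
        try:
            result = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                text=True,
                timeout=timeout_seconds,
                cwd=workdir,
            )
        except subprocess.TimeoutExpired:
            return False, f"timed out after {timeout_seconds} seconds"
    return result.returncode == 0, result.stdout + result.stderr
```

A fabricated call shows up as a failed run with the traceback in the output, which you can paste back into the conversation as evidence.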
✅ 3. Ask AI to Verify
> "Does this function really exist? Give me a link to the official docs."
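AI will often retract the claim ("Upon checking, it does not exist"), but it can also double down or wrongly retract something real, so confirm against the linked docs yourself.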
✅ 4. When in Doubt, Cross-Check with Another Model
- ▸Ask the same question to GPT after Claude
- ▸If the answers differ, at least one of them is wrong; if they agree, still confirm against the docs, since both can share the same mistake
✅ 5. Specify the Version
> ❌ "In Next.js..." → AI guesses which version
> ✅ "In Next.js 15 App Router..." → explicit context
Top 5 Things AI Frequently Fabricates
1. Non-existent npm packages (plausible-sounding names like react-magic-form)
2. Incorrect import paths (non-existent paths like from 'next/legacy')
3. Non-existent options (options like { strictMode: 'super-strict' } that don't exist)
4. API response fields (absent fields like response.data.user.premiumLevel)
5. Version confusion (calling a Tailwind v3 option a v4 feature)
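Item 4 (invented response fields) is easy to catch mechanically once you have a real response in hand. The sketch below walks dotted paths through a parsed JSON payload; the helper names and the sample payload are illustrative.

```python
from typing import Any

_MISSING = object()


def get_field(payload: Any, path: str, default: Any = None) -> Any:
    """Walk a dotted path through nested dicts; return default if a key is absent."""
    current = payload
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def missing_fields(payload: dict, paths: list[str]) -> list[str]:
    """Return the paths that the generated code expects but the payload lacks."""
    return [p for p in paths if get_field(payload, p, _MISSING) is _MISSING]


# Sample payload standing in for a real API response.
sample = {"data": {"user": {"id": 1, "email": "a@example.com"}}}
absent = missing_fields(sample, ["data.user.id", "data.user.premiumLevel"])
```

Here `absent` contains only `data.user.premiumLevel`, the field the AI-written code assumed but the API never returned.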
Real-World Example — Common in Interviews
> Q: "Have you experienced AI hallucination?"
>
> A: "In code generated by v0, a non-existent shadcn/ui component (Slider3D) was used. I caught it by cross-referencing the official docs, and since then I always specify which components are available in the prompt.
>
> Also, Claude confidently recommended npm install zod-extras, a package that doesn't exist. After confirming with npm view, I replaced it with zod's actual superRefine."
Summary
- ▸AI makes things up when it doesn't know (a limitation of the probability mechanism)
- ▸Validate with official docs, actual execution, and cross-checking with another model
- ▸Specify version and exact names to reduce room for guessing
5 Principles of Prompt Engineering
The 5 elements of a good prompt (CRISP):
Bad prompt vs Good prompt:
❌ "Build a login API"
✅ "Next.js 14 App Router, TypeScript, Drizzle ORM environment.
POST /api/auth/login endpoint:
- ▸Input: {email: string, password: string} (zod validation)
- ▸Processing: query users table → bcrypt.compare → issue JWT (15 min) + refresh (7 days)
- ▸Response: 200 success + httpOnly cookie, 401 failure
- ▸Include Vitest tests (success, failure, and invalid input cases)"
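If you write prompts like the good example often, a small template keeps the structure consistent. This is a sketch, not a standard: the class name and field names are assumptions chosen to mirror the parts of the prompt above.

```python
from dataclasses import dataclass


@dataclass
class EndpointPrompt:
    """Structured fields for an endpoint-implementation prompt (illustrative)."""

    environment: str
    endpoint: str
    input_spec: str
    processing: str
    response: str
    tests: str = ""

    def render(self) -> str:
        """Assemble the fields into a prompt with one requirement per line."""
        lines = [
            f"{self.environment} environment.",
            f"{self.endpoint} endpoint:",
            f"- Input: {self.input_spec}",
            f"- Processing: {self.processing}",
            f"- Response: {self.response}",
        ]
        if self.tests:
            lines.append(f"- Tests: {self.tests}")
        return "\n".join(lines)


login_prompt = EndpointPrompt(
    environment="Next.js 14 App Router, TypeScript, Drizzle ORM",
    endpoint="POST /api/auth/login",
    input_spec="{email: string, password: string} (zod validation)",
    processing="query users table, bcrypt.compare, issue JWT (15 min) + refresh (7 days)",
    response="200 success + httpOnly cookie, 401 failure",
    tests="Vitest (success, failure, and invalid input cases)",
).render()
```

Because the template lives in code, it can be reviewed and version-controlled alongside the project.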
Additional techniques:
- ▸One-shot example: "Respond in this format: [example]"
- ▸Chain-of-thought: "Explain step by step, then write the code"
- ▸Reasoning: "Analyze the trade-offs of this approach first, then decide"
- ▸Constraint: "No external libraries · under 50 lines"
- ▸Verification: "After writing the code, run it yourself and tell me the result" (only meaningful when the AI has a code-execution tool; without one it may invent the output)
> 💡 A prompt is code. It's worth version-controlling like a PR (Cursor Rules, CLAUDE.md).
Context Window + Token Economics
Context window = the amount of text (in tokens) an LLM can read in one go.
Context by model:
1 token ≈ 0.75 English words; one Korean character ≈ 1–2 tokens. 1,000 lines of code ≈ 4–8K tokens.
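The lines-of-code rule of thumb gives a quick fit check before you paste files into a conversation. This is a rough sketch using the 4–8 tokens-per-line range above; the 200,000-token window in the usage lines is an assumed example value, not a claim about any specific model.

```python
def estimate_code_tokens(line_count: int) -> tuple[int, int]:
    """Return a (low, high) token estimate using 4-8 tokens per line of code."""
    return line_count * 4, line_count * 8


def fits_in_context(
    line_count: int,
    context_window: int,
    reserved_for_output: int = 0,
) -> bool:
    """Conservatively compare the high estimate with the remaining window."""
    _, high = estimate_code_tokens(line_count)
    return high <= context_window - reserved_for_output


# Assumed 200K-token window, with 8K tokens kept free for the answer.
small_repo_fits = fits_in_context(1_000, 200_000, reserved_for_output=8_000)
large_repo_fits = fits_in_context(30_000, 200_000, reserved_for_output=8_000)
```

Using the high end of the range errs on the side of sending less, which is usually the cheaper mistake.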
4 Principles of Token Economics:
Price comparison (2025) (1M input + 1M output):
> 💡 Fast and cheap: Haiku → Balanced: Sonnet → Quality: Opus. Division of labor is the right approach.
How LLMs Work — Why You Should Know
One line: LLMs predict the probability of the next token. They don't think — they generate the most plausible answer.
4 Fundamentals:
1. Hallucination — makes up plausible answers rather than admitting ignorance
- Verify function existence (grep in ./docs/api.md), validate doc links, run actual tests
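2. Context length limits: even with a 1M-token window, the model does not weigh every part of the context equally, and details buried mid-prompt are the easiest to lose
- Place important information at the end of the prompt, or restate it there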
3. Probabilistic responses — the same question can yield different answers
- For more consistent output, set temperature=0 and, where the API supports one, a fixed seed; even then, identical output is not guaranteed
4. Training data cutoff — Claude Opus 4.7 = 2026-01 cutoff
- For recent information, web search is required (WebSearch tool, Perplexity)
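Fundamental 3 (probabilistic responses) can be measured rather than assumed. The sketch below asks the same question several times and counts the distinct answers; `generate` is a hypothetical callable standing in for whatever client you use, and the 0.8 threshold is an arbitrary illustrative choice.

```python
from collections import Counter
from typing import Callable


def answer_distribution(
    generate: Callable[[str], str],
    prompt: str,
    runs: int = 5,
) -> Counter:
    """Call the model several times and count distinct normalized answers."""
    return Counter(generate(prompt).strip().lower() for _ in range(runs))


def is_consistent(distribution: Counter, threshold: float = 0.8) -> bool:
    """True if the most common answer accounts for at least `threshold` of runs."""
    total = sum(distribution.values())
    if total == 0:
        return False
    top_count = distribution.most_common(1)[0][1]
    return top_count / total >= threshold
```

A low agreement rate is a signal to tighten the prompt or verify the answer by other means; a high rate is not proof of correctness, since a model can be consistently wrong.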
Strengths and Weaknesses of LLMs:
> 💡 LLMs are fast, smart tools — but they require supervision.
🤖 Try Asking AI Like This
Understanding the concepts in this lesson lets you give AI specific instructions. Instead of a vague "fix this", make a request that uses precise vocabulary; that is where token savings begin.
- ▸"Rewrite this vague prompt using the 4 elements: scope, context, constraints, and output format"
- ▸"This prompt has a high hallucination risk — add an evidence requirement to it"
Why This Reduces Tokens
If you don't understand the concepts, you end up asking "What does that mean?" after every AI response, and each follow-up question eats tokens. Learn the concepts once and the conversation ends in one shot.