Infrastructure — CDN · Load Balancer · Proxy · Browser Rendering
🎯 After Reading This Lesson
By the end of this lesson, you will be able to confidently do the following 3 things.
- ▸✅ Load Balancer L4 vs L7 + algorithms (Round Robin · Least Connection)
- ▸✅ CDN (Cloudflare · CloudFront) caching strategies
- ▸✅ The role of Reverse Proxy (Nginx)
Keep the learning objectives as a checklist and close the lesson once you can answer all of them.
CDN — Cache Closer to the User
CDN (Content Delivery Network) — a geographically distributed network of cache servers.
How it works:
1. User requests image.jpg
2. DNS routes to the nearest CDN edge
3. Edge cache hit? → immediate response
4. Miss → request from origin server → cache the response + deliver to user
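The hit/miss flow above can be sketched in a few lines. This is a toy model, not how any particular CDN is implemented: the EdgeCache class and the origin function are hypothetical names, and a real edge also honors TTLs and cache headers.

```python
class EdgeCache:
    """Toy CDN edge: serve from cache on a hit, fetch from origin on a miss."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # callable: path -> bytes
        self._store = {}                 # path -> cached response body
        self.hits = 0
        self.misses = 0

    def get(self, path):
        if path in self._store:          # step 3: cache hit, answer immediately
            self.hits += 1
            return self._store[path]
        self.misses += 1                 # step 4: miss, go to the origin
        body = self._fetch(path)
        self._store[path] = body         # cache the response for the next user
        return body


origin_calls = []

def origin(path):
    """Stand-in for the origin server; records every request it receives."""
    origin_calls.append(path)
    return f"contents of {path}".encode()


edge = EdgeCache(origin)
edge.get("/image.jpg")  # first request: miss, origin is contacted
edge.get("/image.jpg")  # second request: hit, origin is not contacted
print(edge.hits, edge.misses, len(origin_calls))  # 1 1 1
```

The origin sees only one request no matter how many users ask for the same file afterwards, which is where the reduced origin load comes from.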
Why it's fast:
- ▸Geographic proximity (e.g., a Seoul user reaching a Tokyo edge in about 50 ms vs a US origin in about 200 ms)
- ▸Uses uncongested backbone (peering · private connections)
- ▸Cache hit rate ~95% (when well designed)
Major CDN providers:
Cache policy (HTTP headers):
- ▸Cache-Control: public, max-age=31536000, immutable (1 year · no changes)
- ▸Cache-Control: private, no-cache (always validate with origin)
- ▸Cache-Control: no-store (never cache)
- ▸ETag: "abc123" (content hash for change detection)
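As a rough illustration of how these headers might be chosen on the server side, here is a minimal sketch. The path conventions (a /account/ prefix for sensitive pages, an 8+ character hex fingerprint in static filenames) and the function names are assumptions made for this example, not part of any framework.

```python
import hashlib
import re
from typing import Optional

ONE_YEAR = 31536000  # seconds in 365 days

# Assumed convention: fingerprinted assets look like app.3f9a1c2b.js
FINGERPRINTED = re.compile(r"\.[0-9a-f]{8,}\.(js|css|png|jpg|webp|woff2)$")


def cache_control_for(path: str) -> str:
    """Pick one of the three policies listed above for a request path."""
    if path.startswith("/account/"):       # sensitive data: never cache
        return "no-store"
    if FINGERPRINTED.search(path):         # content never changes under this URL
        return f"public, max-age={ONE_YEAR}, immutable"
    return "private, no-cache"             # everything else: revalidate each time


def make_etag(body: bytes) -> str:
    """Derive an ETag from a content hash, quoted as the header requires."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str] = None):
    """Return (status, headers, body); 304 when the client's copy is current."""
    etag = make_etag(body)
    if if_none_match == etag:
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


status, headers, _ = respond(b"hello")
revalidated_status, _, revalidated_body = respond(b"hello", headers["ETag"])
```

The second call models a revalidation: the client sends back the ETag it already holds, and the server answers 304 with an empty body instead of resending the content.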
Cache Invalidation:
- ▸On new deployment, change the URL (e.g., app.v2.js · app.HASH.js)
- ▸Or call the purge API (Cloudflare · CloudFront)
- ▸Poor invalidation = stale cache bugs
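The app.HASH.js approach can be sketched as follows. Build tools normally do this for you; the fingerprint function below is a hypothetical helper that only shows the idea, assuming a SHA-256 digest truncated to 8 characters.

```python
import hashlib
from pathlib import PurePosixPath


def fingerprint(filename: str, content: bytes, length: int = 8) -> str:
    """Insert a short content hash before the extension: app.js -> app.<hash>.js"""
    digest = hashlib.sha256(content).hexdigest()[:length]
    p = PurePosixPath(filename)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


v1 = fingerprint("static/app.js", b"console.log('v1')")
v2 = fingerprint("static/app.js", b"console.log('v2')")
print(v1)
print(v2)
```

Because the URL changes whenever the content changes, the old file can keep its 1-year cache lifetime and no purge is needed; only the HTML that references the new name has to be fresh.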
Modern Trend — Edge Computing:
- ▸Cloudflare Workers · Vercel Edge Functions · AWS Lambda@Edge
- ▸Cache + computation at the edge (personalization · authentication)
- ▸Lower response time · reduced server load
Load Balancer + Algorithms
Load Balancer — distributes traffic across multiple servers:
By layer:
- ▸L4 (Transport) — TCP/UDP level. Fast. AWS NLB
- ▸L7 (Application) — HTTP level. URL · header-based routing. AWS ALB · Nginx
Distribution algorithms:
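The two algorithms named in the learning objectives, Round Robin and Least Connection, can be sketched like this. The class and method names are illustrative only; production load balancers add weights, health state, and thread safety.

```python
import itertools


class RoundRobin:
    """Hand out servers in a fixed rotation, ignoring current load."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Send each new request to the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {s: 0 for s in servers}

    def acquire(self):
        # Ties go to the server listed first (dicts keep insertion order).
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        self.active[server] -= 1


rr = RoundRobin(["a", "b", "c"])
rr_order = [rr.pick() for _ in range(4)]   # ['a', 'b', 'c', 'a']

lc = LeastConnections(["a", "b", "c"])
first_three = [lc.acquire() for _ in range(3)]  # ['a', 'b', 'c']
lc.release("b")                                  # b finishes its request
next_pick = lc.acquire()                         # 'b', now the least loaded
```

Round Robin suits requests of similar cost; Least Connection tends to behave better when some requests run much longer than others.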
Health checks:
- ▸LB periodically calls /health
- ▸N consecutive failures → unhealthy → excluded from traffic
- ▸Re-included upon recovery
- ▸Typical settings: check interval 5–30 seconds, failure threshold 2–5 consecutive checks
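The failure-counting rule can be sketched as a small state tracker. The HealthTracker name and its default threshold of 3 are assumptions for illustration; the actual HTTP probe to /health is left out and its result is passed in as a boolean. For simplicity a single success marks the server healthy again, whereas real load balancers often require several consecutive successes.

```python
class HealthTracker:
    """Track one backend's health from a stream of probe results."""

    def __init__(self, unhealthy_threshold: int = 3):
        self.threshold = unhealthy_threshold
        self.failures = 0      # consecutive failed probes
        self.healthy = True

    def record(self, ok: bool) -> bool:
        """Record one probe result and return the current health state."""
        if ok:
            self.failures = 0
            self.healthy = True            # re-included upon recovery
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.healthy = False       # excluded from traffic
        return self.healthy


tracker = HealthTracker(unhealthy_threshold=3)
states = [tracker.record(ok) for ok in (False, False, False, True)]
print(states)  # [True, True, False, True]
```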
Sticky Session:
- ▸Same user always routed to the same server
- ▸Required when session data is stored in memory
- ▸Downside: if that server goes down, every user pinned to it loses their session
- ▸Alternative: shared session store (Redis)
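To see why in-memory sessions force stickiness and how a shared store removes the need, consider this toy model. The Server class is hypothetical, and a plain dict stands in for Redis here purely as an assumption for the example.

```python
class Server:
    """App server that keeps sessions in its own memory unless given a shared store."""

    def __init__(self, name, store=None):
        self.name = name
        self.sessions = store if store is not None else {}

    def login(self, session_id, user):
        self.sessions[session_id] = user

    def whoami(self, session_id):
        return self.sessions.get(session_id)


# Case 1: in-memory sessions. Only the server that handled login knows the user.
a, b = Server("a"), Server("b")
a.login("s1", "kim")
in_memory_result = b.whoami("s1")      # None: b has never seen this session

# Case 2: shared store (a dict standing in for Redis). Any server can answer.
shared = {}
a2, b2 = Server("a", store=shared), Server("b", store=shared)
a2.login("s1", "kim")
shared_result = b2.whoami("s1")        # 'kim'
```

With the shared store, the load balancer is free to send each request to any server, and losing one server no longer logs its users out.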
Major LB options:
HTTP/2 · gRPC load balancing pitfalls:
- ▸Many streams are multiplexed over one long-lived TCP connection → an L4 LB balances connections, not requests, so load ends up uneven
- ▸Solution: L7 LB (Envoy · Nginx) or client-side LB (built into gRPC)
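The imbalance is easy to reproduce in a small simulation. In this sketch each entry in connections is the number of requests a client sends over one TCP connection; the function names and the round-robin choice are assumptions for illustration.

```python
from itertools import cycle


def l4_distribute(connections, backends):
    """L4 view: one backend is chosen per TCP connection."""
    load = {b: 0 for b in backends}
    rotation = cycle(backends)
    for request_count in connections:
        backend = next(rotation)
        load[backend] += request_count   # every request on it follows along
    return load


def l7_distribute(connections, backends):
    """L7 view: one backend is chosen per request (stream)."""
    load = {b: 0 for b in backends}
    rotation = cycle(backends)
    for request_count in connections:
        for _ in range(request_count):
            load[next(rotation)] += 1
    return load


backends = ["a", "b", "c"]
one_grpc_client = [900]   # a single long-lived connection carrying 900 requests

l4_load = l4_distribute(one_grpc_client, backends)
l7_load = l7_distribute(one_grpc_client, backends)
print(l4_load)  # {'a': 900, 'b': 0, 'c': 0}
print(l7_load)  # {'a': 300, 'b': 300, 'c': 300}
```

With HTTP/1.1-style traffic (many short connections) the two approaches give similar results; the gap appears once a few connections carry most of the requests.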
Browser Rendering + Performance
Critical Rendering Path (browser from HTML → screen):
JS loading impact:
- ▸Default <script>: blocks the parser (parsing pauses until download and execution finish)
- ▸<script defer>: downloads in parallel, executes in document order after HTML parsing finishes
- ▸<script async>: downloads in parallel, executes as soon as the download finishes (no order guarantee)
- ▸<script type="module">: deferred by default (behaves like defer)
3 Web Vitals (Google):
Optimization techniques:
- ▸Image lazy loading: <img loading="lazy">
- ▸Code splitting: separate JS chunks per route
- ▸Preload · Preconnect: start critical resources early
- ▸WebP · AVIF: modern image formats (typically 25–50% smaller than JPEG at similar quality)
- ▸Critical CSS: inline CSS for the first screen
- ▸CDN: serve static assets from nearby locations
- ▸HTTP/2 · 3: multiplexed requests + header compression
Measurement tools:
- ▸Lighthouse (built into Chrome DevTools)
- ▸WebPageTest (various regions · devices)
- ▸PageSpeed Insights (Google)
- ▸Real User Monitoring (Datadog · Sentry)
🤖 Try Asking AI Like This
Once you understand the concepts in this lesson, you can give AI specific instructions. Rather than a vague "fix this for me," use vocabulary-driven requests — that's where token savings begin.
- ▸"Apply Cloudflare CDN + 1-year cache headers to these static files"
- ▸"Create an Nginx load balancing (least_conn) configuration for me"
Why This Reduces Tokens
If you don't know the concepts, you end up asking "What does that mean?" after every AI response. Those follow-up questions are what eat tokens. Learn the concepts once, and the conversation can end in one turn.