Generator

🎯 By the end of this lesson

After reading this lesson, you will be able to do the following three things:

  • ✅ Explain how yield turns a function into a generator
  • ✅ Save memory by using a generator instead of a large list
  • ✅ Use itertools (chain, islice, groupby)

Keep these learning goals as a checklist, and close the lesson once you can do all of them.

5 Core Generator Concepts — Code + Output

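A generator is an iterator that produces values lazily, one at a time. You create one with a function that uses yield or with a generator expression. Generators save memory and make infinite sequences possible.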


1. list vs generator — Memory Difference

python
import sys

# list: all 100 million items are in memory at once
large_list = [x * x for x in range(100_000_000)]
# Roughly 4 GB on 64-bit CPython ⚠️ The computer might freeze

# generator: produces 1 item at a time, as needed
large_gen = (x * x for x in range(100_000_000))
# Memory is almost 0: only one small generator object ✅

print(sys.getsizeof(large_gen))     # Roughly 100-200 bytes, depending on the Python version

[ ] → list, ( ) → generator. Only the brackets differ, but in this example the memory usage differs by a factor of tens of millions (about 4 GB versus a couple of hundred bytes).
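
The comparison above is too large to run safely, so here is a scaled-down sketch you can actually try. The size n = 100_000 and the variable names are assumptions chosen for illustration. Note that sys.getsizeof reports only the container itself, not the integer objects a list points to, so the real gap is even larger than what this prints.

```python
import sys

n = 100_000  # small enough to run anywhere (illustrative value)

as_list = [x * x for x in range(n)]   # every value is built up front
as_gen = (x * x for x in range(n))    # nothing is computed yet

list_bytes = sys.getsizeof(as_list)   # size of the list container only
gen_bytes = sys.getsizeof(as_gen)     # size of the generator object

print(list_bytes > gen_bytes)         # True

# Both produce the same values, so aggregates such as sum() agree.
totals_match = sum(as_list) == sum(as_gen)
print(totals_match)                   # True
```

Try changing n: the list size grows with it, while the generator size stays the same.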


2. yield — Creating a Generator

python
def count(n):
    """Yields numbers from 1 to n, one by one"""
    for i in range(1, n + 1):
        yield i              # ← Yields one at a time

# Calling the function does not run its body; it only creates a generator object
gen = count(5)
print(gen)                   # <generator object count at 0x...>

# The body runs only when the generator is iterated, for example with for
for n in count(5):
    print(n, end=" ")        # 1 2 3 4 5

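When execution reaches yield, the function hands a value to the caller and pauses. On the next next() call (a for loop makes these calls for you), it resumes from the line right after that yield, with its local variables intact.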
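
To see the pause-and-resume behavior directly, the following sketch records the order in which things happen. The function name traced and the events list are made up for this illustration.

```python
def traced(events):
    """Record each step so the pause/resume order becomes visible."""
    events.append("start")
    yield 1
    events.append("resumed after first yield")
    yield 2
    events.append("finished")

events = []
gen = traced(events)
events.append("created")        # the body has not started running yet

first = next(gen)               # runs until the first yield
events.append(f"got {first}")

second = next(gen)              # resumes right after the first yield
events.append(f"got {second}")

leftover = list(gen)            # runs the rest; no more values are produced

for event in events:
    print(event)
```

"created" is printed before "start", which shows that calling the function did not execute its body.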


3. next() — Pulling Values One at a Time

python
def infinite_count():
    n = 1
    while True:              # Infinite loop OK!
        yield n
        n += 1

one_by_one = infinite_count()
print(next(one_by_one))         # 1
print(next(one_by_one))         # 2
print(next(one_by_one))         # 3
# No memory growth: you can keep calling next() indefinitely

⚠️ A list cannot hold an infinite sequence. A generator can represent one because it computes each value only when asked.
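
Calling next() by hand gets tedious, and a plain for loop over an infinite generator never ends. One common way to take a bounded slice is itertools.islice from the standard library. This is a minimal sketch; the variable names are illustrative.

```python
from itertools import islice

def infinite_count():
    n = 1
    while True:
        yield n
        n += 1

# Take only the first 5 values; the generator is never run past them.
first_five = list(islice(infinite_count(), 5))
print(first_five)   # [1, 2, 3, 4, 5]

# islice(iterable, start, stop) skips ahead lazily, like a slice.
window = list(islice(infinite_count(), 10, 13))
print(window)       # [11, 12, 13]
```

Unlike list slicing, islice works on any iterator and does not need to know its length.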


4. Real-World Pattern — Reading a Large File Line by Line

python
def read_large_log(filepath):
    """Process a 1GB log file memory-efficiently"""
    with open(filepath) as f:
        for line in f:
            if "ERROR" in line:
                yield line.strip()

# Usage
for error_line in read_large_log("server.log"):
    print(error_line)
# Memory: one line at a time — 1GB file is OK
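
The same idea extends to several log sources with itertools.chain, and itertools.groupby can summarize the stream. The sketch below uses in-memory io.StringIO objects as hypothetical stand-ins for two log files so that it runs anywhere; the log contents and the date-first line format are assumptions.

```python
import io
from itertools import chain, groupby

def error_lines(lines):
    """Yield stripped lines containing "ERROR" from any iterable of lines."""
    for line in lines:
        if "ERROR" in line:
            yield line.strip()

# Hypothetical stand-ins for two log files (a real file object iterates the same way).
log_a = io.StringIO("2024-01-01 INFO start\n2024-01-01 ERROR disk full\n")
log_b = io.StringIO("2024-01-02 ERROR timeout\n2024-01-02 ERROR retry failed\n")

# chain() walks log_a, then log_b, without building a combined list.
all_errors = error_lines(chain(log_a, log_b))

# groupby() groups *consecutive* items with the same key, so this assumes
# the lines already arrive ordered by date.
errors_per_day = {
    day: sum(1 for _ in group)
    for day, group in groupby(all_errors, key=lambda line: line.split()[0])
}
print(errors_per_day)   # {'2024-01-01': 1, '2024-01-02': 2}
```

If the input is not already ordered by the grouping key, groupby produces a separate group each time the key changes, so sort first or use a dictionary counter instead.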

5. return vs yield Difference

python
# return — creates and returns all at once
def all_at_once(n):
    return [i * i for i in range(n)]    # 100 million would explode memory

# yield — one at a time
def one_by_one(n):
    for i in range(n):
        yield i * i                      # 100 million is OK

# The values are the same; the memory usage is vastly different
# all_at_once(5) → [0, 1, 4, 9, 16] (built immediately)
# one_by_one(5)  → <generator> → 0, 1, 4, 9, 16 (produced when needed)
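
"When needed" also means that values nobody asks for are never computed. The sketch below adds a hypothetical computed list to one_by_one purely to count how much work was done; it is an illustration, not part of the original function.

```python
def one_by_one(n, computed):
    for i in range(n):
        computed.append(i)   # record that this step actually ran
        yield i * i

computed = []

# Ask for the first square greater than 50, then stop pulling values.
first_big = next(sq for sq in one_by_one(1_000_000, computed) if sq > 50)

print(first_big)        # 64
print(len(computed))    # 9  (only i = 0..8 were ever computed)
```

A version that returns a full list would have computed all one million squares before the search could even begin.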

One-Line Summary

Comparison   list               generator
Creating     [x for x in xs]    (x for x in xs)
Function     return list        yield x (repeated)
Memory       All at once        One at a time
Infinite     Not possible       Possible
Reuse        OK                 Once only (exhausted)

Key point: Large data, infinite sequences, streaming → generator. Small data, or data you need to index or iterate more than once → list.
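
The "Reuse" row is the one that most often surprises people, so here is a minimal sketch of what "exhausted" means. The helper make_squares is a hypothetical name used only for this example.

```python
squares = (x * x for x in range(4))

first_pass = list(squares)
second_pass = list(squares)   # the generator is already exhausted

print(first_pass)    # [0, 1, 4, 9]
print(second_pass)   # []

# To iterate again, create a fresh generator each time.
def make_squares(n):
    return (x * x for x in range(n))

again = list(make_squares(4))
print(again)         # [0, 1, 4, 9]
```

No error is raised on the second pass; the generator simply has nothing left, which can hide bugs if you expect data to be there.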

💻 Bad Example: Loading Large Data into Memory as a List All at Once

python
# A list of 100 million numbers consumes several GBs of memory
def get_all_numbers(n):
    result = []
    for i in range(n):
        result.append(i * i)
    return result  # All values exist in memory

# Immediately uses several GBs of memory when called
numbers = get_all_numbers(100_000_000)
for num in numbers:
    process(num)  # process is a placeholder for your own handling code

💻 Good Example: Lazy Evaluation with yield

python
from typing import Generator, Iterator
import sys

# Generator function — generates values one by one (memory O(1))
def square_numbers(n: int) -> Generator[int, None, None]:
    """Generates n square numbers one by one"""
    for i in range(n):
        yield i * i  # Returns value to caller, then pauses here

# Generator expression — more concise
squares = (i * i for i in range(100_000_000))

print(sys.getsizeof(list(range(1000))))    # ~8056 bytes
print(sys.getsizeof(range(1000)))          # 48 bytes (lazy range object; not an iterator)
print(sys.getsizeof(squares))             # ~100-200 bytes (generator; varies by Python version)

# yield from — delegates to an inner iterable
def flatten(nested: list) -> Generator:
    for item in nested:
        if isinstance(item, list):
            yield from flatten(item)  # Recursive delegation
        else:
            yield item

print(list(flatten([1, [2, [3, 4]], 5])))  # [1, 2, 3, 4, 5]

# Real-world: Processing large CSV files
def read_large_csv(filepath: str) -> Iterator[dict]:
    """Reads a large CSV file memory-efficiently, one row at a time"""
    import csv
    # newline='' is what the csv module documentation recommends for files
    with open(filepath, encoding='utf-8', newline='') as f:
        reader = csv.DictReader(f)
        for row in reader:
            yield row  # Yields one row at a time, memory O(1)

# Combining data pipelines
def process_pipeline(filepath: str):
    rows = read_large_csv(filepath)
    filtered = (row for row in rows if int(row['age']) >= 18)  # Filter
    transformed = ({**row, 'name': row['name'].upper()} for row in filtered)  # Transform
    for item in transformed:  # Nothing runs until this loop pulls items
        save_to_db(item)      # save_to_db is a placeholder for your own storage function
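
The pipeline above needs a real file and a save_to_db function to run. As a self-contained sketch of the same filter-then-transform chain, the version below reads from an in-memory io.StringIO and appends to a list instead of a database. The sample data, read_rows, adults_uppercased, and the saved list are all assumptions made for illustration.

```python
import csv
import io

def read_rows(lines):
    """Yield one dict per CSV row from any iterable of text lines."""
    yield from csv.DictReader(lines)

def adults_uppercased(rows):
    """Lazily keep rows with age >= 18 and upper-case their names."""
    adults = (row for row in rows if int(row["age"]) >= 18)
    return ({**row, "name": row["name"].upper()} for row in adults)

# Hypothetical sample data standing in for a large CSV file.
sample = io.StringIO("name,age\nkim,17\nlee,30\npark,18\n")

saved = []                      # stand-in for a database
for item in adults_uppercased(read_rows(sample)):
    saved.append(item)          # each row flows through every stage one at a time

print(saved)
# [{'name': 'LEE', 'age': '30'}, {'name': 'PARK', 'age': '18'}]
```

Each stage holds only the current row, so swapping the StringIO for an open file keeps memory use flat regardless of file size.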

🐍 Try It Out — Generator

Run the concepts above as actual code. The fastest way to learn is to change the values and observe how the behavior changes firsthand.

🤖 Try Asking AI Like This

Knowing the concepts from this lesson lets you give AI specific instructions. Instead of a vague "fix this," you can phrase requests in the precise vocabulary of the problem, which is the starting point for saving tokens.

  • "Convert this large list build to a generator (yield) to save memory"
  • "Use itertools to make this more efficient"

Why This Reduces Tokens

If you do not understand the concept, you end up asking "What is that?" after the AI responds, and that follow-up question is what burns tokens. Learn the concept once, and the conversation can often end in a single exchange.
