C
OS/Synchronization/Lesson 04

Concurrency — Locks, Deadlocks, Sync/Async, Blocking/Non-blocking

60 min·theory

Concurrency — Locks, Deadlocks, Sync/Async, Blocking/Non-blocking

🎯 After reading this lesson

After finishing this lesson, you will be able to confidently handle the following three topics.

  • ✅ Mutex vs Semaphore vs Monitor
  • ✅ Race Condition + 4 conditions for Deadlock
  • ✅ Sync vs Async + Blocking vs Non-blocking — all 4 combinations

Keep the learning objectives as a checklist, and close the lesson once you can answer all of them.

Race Condition — The Tragedy of Concurrent Modification

In one line: Two threads modifying the same data simultaneously → unpredictable result.

Example:

code
Initial: counter = 0
Thread A: counter = counter + 1    (1)
Thread B: counter = counter + 1    (2)
Expected: counter = 2
Actual: counter = 1 (possible!)

Reason: counter + 1 involves 3 instructions:
1. read counter → register
2. register + 1
3. register → counter

If A executes steps 1 and 2, then B executes step 1 → both read 0, add 1, and only 1 gets stored.

Solutions:

ApproachUse Case
Mutex (Lock)Only one thread enters at a time. Most common
SemaphoreAllow N concurrent accesses (e.g., DB connection pool)
AtomicSingle variable (CAS / AtomicInteger) — no lock, fast
Channel/ActorMessage passing (Go, Erlang) — avoids shared data
ImmutableImmutable data (Rust, functional) — no race to begin with

Common race condition traps:

  • ❌ Bug disappears when you attach a debugger (Heisenbug) — the debugger changes timing
  • ❌ "Works on my machine" — different number of CPU cores or load
  • ✅ Always run concurrency tests (-race flag, ThreadSanitizer)

Mutex — The Most Common Lock

Mutex (MUTual EXclusion): Only one thread at a time may enter the critical section.

Usage flow:

python
import threading
lock = threading.Lock()
counter = 0

def increment():
    global counter
    with lock:           # acquire
        counter += 1     # critical section
    # release (automatic)

Mutex by language:

LanguageSyntax
Javasynchronized(obj) { ... } or ReentrantLock
Pythonwith threading.Lock():
Gosync.Mutex with Lock()/Unlock()
RustMutex<T> with .lock() (ownership releases automatically)
C++std::lock_guard<std::mutex>

Common mutex mistakes:

MistakeConsequence
❌ Forgetting to unlockOther threads wait forever
❌ Holding a lock during I/O (e.g., DB call)Long processing time lowers throughput
❌ Acquiring another lock while holding oneRisk of deadlock
❌ Same lock in recursive callsNon-recursive mutex deadlocks with itself

ReadWriteLock: Multiple concurrent readers + exclusive writer → advantageous for read-heavy workloads.
Optimistic Lock: Commonly used in databases. Retry on conflict (similar to CAS).

Deadlock — Waiting Forever

4 necessary conditions (deadlock occurs when all are met):
1. Mutual Exclusion — A resource can be held by only one thread at a time
2. Hold and Wait — A thread holds a resource while waiting for another
3. No Preemption — Resources cannot be forcibly taken away
4. Circular Wait — A → B → C → A cycle

Classic example: Two threads acquire the same two locks in different orders:

code
Thread 1: lock A → lock B
Thread 2: lock B → lock A

→ Thread 1 holds A and waits for B / Thread 2 holds B and waits for A → neither proceeds

6 solutions:

ApproachPrinciple
Consistent lock orderingAll threads acquire locks in the same order
TimeouttryLock(5s) — give up and retry on failure
Deadlock detectionAnalyze a wait-for graph (like a DB), then select a victim
Acquire all at onceAcquire all required locks atomically (no partial acquisition)
Avoid locksUse no locks at all (Actor, Channel)
Timed lockProtect with lock.tryLock(time)

Deadlock debugging:

  • Java: jstack <PID> → automatically detects "Found one Java-level deadlock"
  • Python: faulthandler + thread dump
  • Go: go run -race (race detector)
  • DB: PostgreSQL auto-detects and aborts the victim transaction

Sync vs Async · Blocking vs Non-blocking

4 combinations (frequently confused — understand each separately):

CombinationMeaningExample
Sync + BlockingCaller waits until result is readyresult = db.query(...)
Sync + Non-blockingReturns immediately (current state). Caller pollssocket.recv() (NONBLOCK)
Async + BlockingStart task, do other work. Block when collecting resultFuture.get()
Async + Non-blockingStart task, receive result via callback/eventasync/await, callback

Sync/Async = how completion is notified
Blocking/Non-blocking = whether the caller can proceed

4 Linux I/O models:

ModelSystem CallNotes
Blocking I/Oread() defaultSimplest. 1 thread = 1 connection
Non-blocking I/OO_NONBLOCK + read()Requires polling (wastes CPU)
I/O Multiplexingselect / poll / epoll / kqueue1 thread handles N sockets (Nginx, Redis)
Async I/Oio_uring (Linux 5.1+) / POSIX AIOTruly async. Notifies via callback/event

How epoll works (efficient I/O multiplexing on Linux):
1. epoll_create() — create an instance
2. epoll_ctl(ADD) — register file descriptors to watch
3. epoll_wait() — block until an event fires
4. Handle the fd → wait again

Why Node.js and Nginx are fast = single thread + epoll event loop → no context switching.

💻 📌 Debugging Concurrency Through Scenarios
# ============================================================
# Scenario 1: "Java app has stopped. Is it a deadlock?"
# ============================================================
# jstack — stack trace of all threads + *automatic deadlock detection*
jstack 28391 > /tmp/dump.txt
grep -A 50 'Found one Java-level deadlock' /tmp/dump.txt
# Example output:
# "Thread-1" waiting to lock <0xabc> (a Account)
#   which is held by "Thread-2"
# "Thread-2" waiting to lock <0xdef> (a Order)
#   which is held by "Thread-1"
# → Two threads locking in *opposite order* → deadlock

# Thread state analysis (BLOCKED·WAITING·RUNNABLE)
grep -E 'java.lang.Thread.State' /tmp/dump.txt | sort | uniq -c
# Example:  150 RUNNABLE  / 5 BLOCKED  / 845 WAITING
# Many BLOCKED = suspected lock contention

# Friendlier tools
jcmd 28391 Thread.print                 # Modern version of jstack
jconsole &                              # GUI (real-time threads and MBeans)

# ============================================================
# Scenario 2: "Throughput suddenly dropped — where is time being spent?"
# ============================================================
# Find hotspots by system call count
strace -c -p 28391                # Ctrl+C after about 30 seconds
# Example output:
#   % time  calls  syscall
#   45.2    1820   futex      ← High lock waiting! (mutex)
#   22.1    8500   read       ← High disk I/O
#   15.3    8500   write
#   ...

# If *futex* is #1 → suspected lock contention
# If *read·write* is #1 → disk I/O bottleneck

# Track specific system calls in real time
strace -e trace=futex,epoll_wait -p 28391

# CPU hotspots (which function is consuming CPU)
perf top -p 28391                 # Interactive
perf record -p 28391 -g sleep 30  # Record for 30 seconds
perf report                        # Analysis

# ============================================================
# Scenario 3: "Suspected race condition in Go app"
# ============================================================
# Go's race detector — compile-time option
go run -race ./cmd/server          # Development environment
go test -race ./...                 # Always recommended in CI

# Output when race is found:
# WARNING: DATA RACE
# Read at 0x00c000010080 by goroutine 7:
#   main.counter
# Previous write at 0x00c000010080 by goroutine 6:
#   main.increment
# ⚠️ Two goroutines accessing concurrently = race

# ============================================================
# Scenario 4: "Python app is hung"
# ============================================================
# Dump current stack of all threads with faulthandler
# At code start:
#   import faulthandler
#   faulthandler.enable()
# Then from outside:
kill -SIGABRT 28391                 # Print stack to stderr

# py-spy — real-time profiling without code changes (powerful)
# pip install py-spy
py-spy dump --pid 28391            # Current stack
py-spy top --pid 28391             # Real-time top (which function uses CPU)
py-spy record -o profile.svg --pid 28391  # flamegraph

# ============================================================
# Scenario 5: "Check if I/O is the bottleneck"
# ============================================================
# Which process is doing a lot of disk I/O?
sudo iotop -o                            # Active processes only (-o)

# IOPS and wait time per disk
iostat -x 1
# %util > 80%·await > 50ms = disk bottleneck

# Socket statistics
ss -s                              # Summary (TCP, UDP, established counts)
ss -tnp                             # TCP connections + PID
ss -tn state established sport :443 # Active connections coming into 443

# Number of open fds (frequently encountered 'too many open files')
ls /proc/28391/fd/ | wc -l
cat /proc/28391/limits | grep 'open files'

# ============================================================
# Scenario 6: "epoll-based server — what events are waiting?"
# ============================================================
# Event loop servers like Node.js, nginx, Redis
cat /proc/28391/fdinfo/<epoll_fd>   # epoll instance info
# Or directly with strace:
strace -e trace=epoll_wait,epoll_ctl -p 28391

# Find all open epolls
ls -la /proc/28391/fd/ | grep eventpoll

🤖 Try asking AI like this

Knowing the concepts in this lesson lets you give AI specific instructions.

  • "Analyze the concurrency issues in this method and recommend synchronized vs ReentrantLock."
  • "This code might have a deadlock — diagnose it by examining the resource acquisition order."
  • "Parallelize this task using CompletableFuture + thenCombine."

Why this reduces token usage

Knowing vocabulary like "lock / deadlock / volatile / atomic" means AI responses come back as direct code. Without it, you get a "what is concurrency..." explanation first.

Concurrency — Lock·Deadlock·Synchronous/Asynchronous - OS