Collections + Functional — List · Set · Map · Lambda · Stream

60 min·theory

🎯 After reading this lesson

After reading through this lesson, you will be able to confidently do the following three things.

  • ✅ When to use each of List · Set · Map
  • ✅ Refactoring for-loops into Stream API map/filter/reduce chains
  • ✅ The Iterator vs for-each pitfall and ConcurrentModificationException

Keep these learning goals as a checklist — once you can answer all of them, close the lesson.

What Is the Java Collections Framework — the *container* for your data

The Core in One Line

Java Collections Framework = a standard set of containers for holding multiple pieces of data. Introduced in Java 2 in 1998, it remains the most fundamental tool every Java developer uses every single day.

Why It Is Divided into Four Interfaces

The way you hold data differs by purpose.

  • List: ordered, duplicates allowed. Think of it as "items in a shopping cart." Use it when the index (first, second, third) is meaningful.
  • Set: no duplicates, usually no order. Think of it as "a list of user IDs who visited today." Use it when you want to count the same person only once, no matter how many times they show up.
  • Map: key → value mapping. Like "user ID → user info," for when you want to handle data in pairs. The most frequently used structure.
  • Queue: first in, first out. Job queues, event processing. Use it when FIFO is the natural order.

Thanks to these four abstractions, you handle any implementation the same way. Whether a List is an ArrayList or a LinkedList, the code stays identical: list.add() · list.get().
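
The four container types can be sketched in a few lines. This is a minimal illustration using only java.util classes; the cart, visitor, and job names are assumptions made up for the example, not part of the lesson.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

class FourContainersDemo {

    // Works with any List implementation: the caller picks ArrayList or LinkedList.
    static String secondItem(List<String> cart) {
        return cart.get(1);
    }

    // A Set keeps one copy of each visitor ID, however often it is added.
    static int uniqueVisitors(List<String> visits) {
        Set<String> unique = new HashSet<>(visits);
        return unique.size();
    }

    public static void main(String[] args) {
        List<String> arrayCart = new ArrayList<>(List.of("apple", "milk", "apple"));
        List<String> linkedCart = new LinkedList<>(arrayCart);
        System.out.println(secondItem(arrayCart));   // milk
        System.out.println(secondItem(linkedCart));  // milk

        System.out.println(uniqueVisitors(List.of("u1", "u2", "u1"))); // 2

        Map<String, String> userInfo = new HashMap<>();
        userInfo.put("u1", "Alice");
        System.out.println(userInfo.get("u1")); // Alice

        Queue<String> jobs = new ArrayDeque<>();
        jobs.offer("job-1");
        jobs.offer("job-2");
        System.out.println(jobs.poll()); // job-1 (first in, first out)
    }
}
```

Note that secondItem is declared against List, so swapping the implementation requires no change to it.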

Most Common: ArrayList vs LinkedList

The names look similar, but their behavior is completely different.

ArrayList uses an array internally. That means it can jump directly to an index — retrieving the nth element takes one step (O(1)). However, inserting in the middle requires shifting every element after it, which is slow. Still, because memory is laid out contiguously, it is CPU cache-friendly and faster in most real-world cases.

LinkedList is a linked list where nodes point to each other. Middle insertion looks like O(1) since you only change links, but finding that position takes O(n). Plus, each node occupies separate memory, so cache efficiency is poor.

Conclusion: Use ArrayList by default, and certainly when indexed access is frequent. Consider LinkedList only when insertions and removals at the ends or through an iterator truly dominate, which in practice is rare. Despite the O(1) insertion it advertises, LinkedList is often the slower choice.

HashMap — the most-used data structure in Java

Code like HashMap<String, User> is something you see every day. It is worth understanding how it works at least once.

Internally it is a combination of an array and linked lists. It uses the key's hashCode() to pick an array index; when multiple keys land on the same index, they are chained as a linked list in that slot. Since Java 8, once a slot's chain grows past 8 entries (and the table has at least 64 slots), that slot is converted to a red-black tree, which bounds lookups in that slot at O(log n) even in the worst case.

Most common mistake: objects used as keys must implement both hashCode() and equals(), and the two must agree. Overriding only one of them means a lookup with an equal but distinct key object can fail. Lombok's @EqualsAndHashCode or records (preview in Java 14, standard since Java 16) generate both automatically.
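
The key contract can be seen in a small experiment. The sketch below uses two hypothetical key classes (BrokenKey and GoodKey are invented for illustration): one overrides neither method, the other overrides both.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class KeyContractDemo {

    // Hypothetical key that overrides neither method: equality is object identity.
    static final class BrokenKey {
        final String id;
        BrokenKey(String id) { this.id = id; }
    }

    // Hypothetical key that overrides both methods consistently.
    static final class GoodKey {
        final String id;
        GoodKey(String id) { this.id = id; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof GoodKey)) return false;
            return id.equals(((GoodKey) o).id);
        }

        @Override
        public int hashCode() {
            return Objects.hash(id);
        }
    }

    static String lookupWithBrokenKey() {
        Map<BrokenKey, String> map = new HashMap<>();
        map.put(new BrokenKey("u1"), "Alice");
        return map.get(new BrokenKey("u1")); // a different object, so no match
    }

    static String lookupWithGoodKey() {
        Map<GoodKey, String> map = new HashMap<>();
        map.put(new GoodKey("u1"), "Alice");
        return map.get(new GoodKey("u1")); // equal key with the same hash: found
    }

    public static void main(String[] args) {
        System.out.println(lookupWithBrokenKey()); // null
        System.out.println(lookupWithGoodKey());   // Alice
    }
}
```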

Multi-threaded Environments — ConcurrentHashMap

If multiple threads concurrently modify the same HashMap, you can end up in an infinite loop or with corrupted data. A common accident is sharing a HashMap as a server-side cache.

The solution is ConcurrentHashMap. Internally it acquires locks per bucket, allowing concurrency. There is also Collections.synchronizedMap(), which uses a full lock, but ConcurrentHashMap is the de-facto standard when higher concurrency is needed.

For similar reasons, CopyOnWriteArrayList also exists — used when reads are frequent but writes are rare (e.g., event listener lists).
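
As a rough sketch of the shared-cache scenario, the example below counts page hits from several threads. The "/home" key and the countHits helper are assumptions for illustration; the point is that ConcurrentHashMap.merge updates a key atomically, so no increments are lost.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SharedCounterMapDemo {

    // Counts hits for one page from several threads; merge() is atomic per key.
    static int countHits(int threads, int hitsPerThread) {
        Map<String, Integer> hits = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < hitsPerThread; i++) {
                    hits.merge("/home", 1, Integer::sum);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return hits.getOrDefault("/home", 0);
    }

    public static void main(String[] args) {
        System.out.println(countHits(4, 10_000)); // 40000
    }
}
```

With a plain HashMap in place of ConcurrentHashMap, the same code could lose updates or corrupt the map.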

Summary

| Interface | Most common implementation | When to use |
|---|---|---|
| List | ArrayList | Need index, order, and duplicates |
| Set | HashSet / LinkedHashSet | Remove duplicates / preserve order too |
| Map | HashMap / ConcurrentHashMap | key→value; use the latter for concurrency |
| Queue | ArrayDeque | FIFO job queue |
| Sorted | TreeMap / TreeSet | When key sorting is required |

> 💡 In practice: start with ArrayList + HashMap in 90% of cases. Use ConcurrentHashMap for multi-threaded scenarios. Switch to TreeMap only when you need sorting.

Lambdas and Functional Style — the moment Java *was reborn*

Why Java 8 Was a Turning Point

In 2014, Java 8 introduced lambda expressions and the Stream API. Until then, Java was exclusively an object-oriented language. Passing a single function required five lines of anonymous class boilerplate like new Runnable() { public void run() { ... } }.

Lambdas collapse that to one line.

java
// Old style
new Thread(new Runnable() {
    public void run() { System.out.println("hi"); }
}).start();

// Lambda — Java 8+
new Thread(() -> System.out.println("hi")).start();

The shorter code is not the point. The real change is that functions can now be treated as data: stored in variables, passed as arguments to other functions, and returned as values.

What a Functional Interface Actually Is

Lambdas are possible because of functional interfaces: interfaces with exactly one abstract method. When the compiler sees a lambda, it infers from the target type which functional interface is meant, and the JVM supplies the implementing object at runtime, so you never write the class yourself.

Four commonly used ones:

  • Function<T, R>: takes T, returns R. Used for transformation. Example: String → length.
  • Predicate<T>: takes T, returns boolean. Used for filtering. Example: "is this an adult?"
  • Consumer<T>: takes T, returns nothing. Used for side effects only. Example: logging output.
  • Supplier<T>: takes nothing, returns T. Used for supplying a value. Example: getting the current time.

The names are intuitive. Together they cover the basic shapes: take a value and return one (Function, Predicate), take a value and return nothing (Consumer), and take nothing and return a value (Supplier). The remaining shape, take nothing and return nothing, is Runnable.
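
Each of the four interfaces can be tried with a one-line lambda. This is a minimal sketch; the adult threshold of 18 and the "LOG: " prefix are assumptions chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;

class FunctionalInterfacesDemo {

    static final Function<String, Integer> LENGTH = String::length;   // T -> R
    static final Predicate<Integer> IS_ADULT = age -> age >= 18;      // T -> boolean
    static final Supplier<String> DEFAULT_NAME = () -> "(no name)";   // () -> T

    // A Consumer only has a side effect; here it appends to the given log.
    static Consumer<String> loggerInto(List<String> log) {
        return message -> log.add("LOG: " + message);
    }

    public static void main(String[] args) {
        System.out.println(LENGTH.apply("stream"));   // 6
        System.out.println(IS_ADULT.test(17));        // false
        System.out.println(DEFAULT_NAME.get());       // (no name)

        List<String> log = new ArrayList<>();
        loggerInto(log).accept("started");
        System.out.println(log);                      // [LOG: started]
    }
}
```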

Stream API — treating collections as a flow

Stream is a tool for processing collection data as a pipeline. Old-style code looked like this:

java
List<String> result = new ArrayList<>();
for (User u : users) {
    if (u.getAge() >= 30) {
        result.add(u.getName().toUpperCase());
    }
}
Collections.sort(result);

With Stream, the intent appears directly in the code:

java
List<String> result = users.stream()
    .filter(u -> u.getAge() >= 30)
    .map(u -> u.getName().toUpperCase())
    .sorted()
    .toList();

.filter passes only matching elements. .map transforms each element. .sorted sorts. .toList (Java 16+; on earlier versions use .collect(Collectors.toList())) collects the result into a List. Reading from top to bottom, you can visualize data flowing through the pipeline.

groupingBy — a frequently used magic trick

The real power of Stream shines in aggregation. Grouping employees by department:

java
Map<String, List<Employee>> byDept = employees.stream()
    .collect(Collectors.groupingBy(Employee::getDept));

Done in one line. Think of it as SQL's GROUP BY coming to Java. Getting the average salary by department is similar:

java
Map<String, Double> avgSalaryByDept = employees.stream()
    .collect(Collectors.groupingBy(
        Employee::getDept,
        Collectors.averagingDouble(Employee::getSalary)));
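
To run the grouping examples end to end, here is a self-contained sketch. The Employee class below is a hypothetical stand-in (the lesson does not show its own), and headcountByDept adds Collectors.counting() as one more downstream collector.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingDemo {

    // Hypothetical Employee type; the lesson's own Employee class is not shown.
    static final class Employee {
        private final String dept;
        private final double salary;
        Employee(String dept, double salary) { this.dept = dept; this.salary = salary; }
        String getDept() { return dept; }
        double getSalary() { return salary; }
    }

    // Like SELECT dept, COUNT(*) ... GROUP BY dept
    static Map<String, Long> headcountByDept(List<Employee> employees) {
        return employees.stream()
            .collect(Collectors.groupingBy(Employee::getDept, Collectors.counting()));
    }

    // Like SELECT dept, AVG(salary) ... GROUP BY dept
    static Map<String, Double> avgSalaryByDept(List<Employee> employees) {
        return employees.stream()
            .collect(Collectors.groupingBy(
                Employee::getDept,
                Collectors.averagingDouble(Employee::getSalary)));
    }

    public static void main(String[] args) {
        List<Employee> staff = List.of(
            new Employee("eng", 100.0),
            new Employee("eng", 200.0),
            new Employee("ops", 50.0));
        System.out.println(headcountByDept(staff).get("eng"));  // 2
        System.out.println(avgSalaryByDept(staff).get("eng"));  // 150.0
    }
}
```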

Common Pitfalls — good to know

Never modify external variables inside a Stream. Doing so causes race conditions during parallel processing and breaks the functional mindset entirely. Instead, use collect or reduce to produce immutable results.
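
A small sketch of the side-effect-free style: instead of adding to a list or counter declared outside the pipeline, the result is produced by reduce or collect. The method names are invented for the example.

```java
import java.util.List;
import java.util.stream.Collectors;

class NoSideEffectsDemo {

    // Sums with reduce: no shared mutable state, so it is safe even in parallel.
    static int sumOfSquares(List<Integer> numbers) {
        return numbers.parallelStream()
            .map(n -> n * n)
            .reduce(0, Integer::sum);
    }

    // Builds the result with collect instead of adding to an outside list.
    static List<String> upperCased(List<String> names) {
        return names.stream()
            .map(String::toUpperCase)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(List.of(1, 2, 3, 4))); // 30
        System.out.println(upperCased(List.of("ann", "bob"))); // [ANN, BOB]
    }
}
```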

parallelStream() is not a silver bullet. For small datasets or I/O-bound operations, it can actually be slower: the cost of splitting the work, coordinating threads, and merging results outweighs the benefit. It mainly helps with large datasets that are CPU-intensive.

Optional was introduced in the same release. It is a box that explicitly signals that a value may be absent, instead of using null. When you receive an Optional<User>, you immediately know the value might be missing.

java
Optional<User> u = userRepo.findById(id);
String name = u.map(User::getName).orElse("(no name)");

Many NullPointerExceptions disappear with this pattern, because the absent case has to be handled explicitly.

Multithreading — how to make things run *concurrently*

Why Threads Are Needed

Modern CPUs have multiple cores (typically 8–16). But single-threaded Java code runs on only one core at a time; the rest sit idle. Use threads well and you can put those idle cores to work, raising throughput by up to a factor of N for CPU-bound work.

Another reason is waiting. During the 1 second spent waiting for a DB response, the CPU does nothing. Put it to work on something else during that time and throughput skyrockets. This is how a web server handles 1,000 simultaneous requests.

Three Ways to Create Threads

1. Direct creation: new Thread(() -> { ... }).start(). Simplest, but manual management is tedious. Create 10,000 of them and the OS will scream.

2. ExecutorService: Create a thread pool and submit tasks to it. The industry standard.

java
ExecutorService pool = Executors.newFixedThreadPool(10);
for (int i = 0; i < 1000; i++) {
    pool.submit(() -> doWork());
}
pool.shutdown();

Ten threads work through the 1,000 queued tasks, picking them up in submission order (they may finish in a different order). The overhead of creating a new thread for every task disappears.

3. CompletableFuture — the ultimate tool for composing async operations. It lets you cleanly express flows like "fetch A and B concurrently, then combine them and call C when both finish."

java
CompletableFuture<User> userFut    = CompletableFuture.supplyAsync(() -> fetchUser());
CompletableFuture<Order> orderFut  = CompletableFuture.supplyAsync(() -> fetchOrders());
CompletableFuture<UserDto> dto = userFut.thenCombine(orderFut, UserDto::new);
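
For a version that runs as-is, the sketch below replaces the remote calls with hypothetical stand-ins (fetchUserName and fetchOrderCount are invented here) and uses join() to wait for the combined result.

```java
import java.util.concurrent.CompletableFuture;

class CombineDemo {

    // Hypothetical stand-ins for remote calls.
    static String fetchUserName() { return "alice"; }
    static int fetchOrderCount() { return 3; }

    static String buildSummary() {
        CompletableFuture<String> userFut =
            CompletableFuture.supplyAsync(CombineDemo::fetchUserName);
        CompletableFuture<Integer> ordersFut =
            CompletableFuture.supplyAsync(CombineDemo::fetchOrderCount);
        // thenCombine runs its function once both futures have completed.
        CompletableFuture<String> summary =
            userFut.thenCombine(ordersFut, (name, count) -> name + " has " + count + " orders");
        return summary.join(); // blocks the caller until the result is ready
    }

    public static void main(String[] args) {
        System.out.println(buildSummary()); // alice has 3 orders
    }
}
```

In server code you would usually keep chaining (thenApply, thenAccept) rather than call join(), which blocks the calling thread.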

The True Revolution in Java 21 — Virtual Threads

For a long time, creating thousands of threads in Java was risky because each thread consumed OS resources. The practical limit was usually a few hundred.

Java 21 Virtual Threads broke that barrier. The JVM manages threads virtually, making it easy to spin up tens or hundreds of thousands. During I/O waits, they automatically yield so other work can proceed.

java
Thread.startVirtualThread(() -> doWork());

This single line gives Java lightweight concurrency in the same spirit as Go goroutines and Kotlin coroutines. Note, however, that it does not help with CPU-bound tasks, because the actual number of CPU cores is unchanged.

The Most Important Pitfall — Race Conditions

When two threads modify the same variable concurrently, the result is unpredictable.

java
int counter = 0;
// Thread A: counter = counter + 1;  → reads 0 → writes 1
// Thread B: counter = counter + 1;  → reads 0 → writes 1
// Result: counter = 1 (expected 2)

counter = counter + 1 breaks down into three operations (read · add · write), and another thread can run in between them.

The solution is a lock (synchronized · Lock) or an atomic operation (AtomicInteger):

java
private final AtomicInteger counter = new AtomicInteger(0);
counter.incrementAndGet();   // atomic +1
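
A runnable sketch of the atomic fix: two threads increment the same AtomicInteger, and no update is lost. The helper name and iteration counts are assumptions for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;

class AtomicCounterDemo {

    static int incrementFromTwoThreads(int perThread) {
        AtomicInteger counter = new AtomicInteger(0);
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.incrementAndGet(); // read, add, write as one atomic step
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(incrementFromTwoThreads(100_000)); // 200000
    }
}
```

With a plain int field and counter++ instead, the final value would typically come out lower than 200000.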

Deadlock — Waiting Forever

When two threads wait for each other's lock, they halt forever.

If A holds lock X and waits for Y, while B holds lock Y and waits for X — neither can proceed. That is a deadlock.

The most common cause is acquiring locks in different orders. If every thread always acquires locks in the same order, deadlock cannot occur.
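
One way to enforce a fixed lock order is to sort by a stable identifier. The sketch below assumes a hypothetical Account class with a unique id; both transfer directions lock the smaller id first, so opposing transfers cannot deadlock.

```java
class LockOrderingDemo {

    // Hypothetical account type used only for this sketch; ids are assumed unique.
    static final class Account {
        final int id;
        long balance;
        Account(int id, long balance) { this.id = id; this.balance = balance; }
    }

    // Always lock the account with the smaller id first, whatever the direction.
    static void transfer(Account from, Account to, long amount) {
        Account first = from.id < to.id ? from : to;
        Account second = from.id < to.id ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }

    static long[] runOpposingTransfers(int rounds) {
        Account x = new Account(1, 1_000);
        Account y = new Account(2, 1_000);
        Thread a = new Thread(() -> { for (int i = 0; i < rounds; i++) transfer(x, y, 1); });
        Thread b = new Thread(() -> { for (int i = 0; i < rounds; i++) transfer(y, x, 1); });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return new long[] { x.balance, y.balance };
    }

    public static void main(String[] args) {
        long[] balances = runOpposingTransfers(10_000);
        System.out.println(balances[0] + " " + balances[1]); // 1000 1000
    }
}
```

If transfer locked from before to instead, thread A (x then y) and thread B (y then x) could each hold one lock and wait forever for the other.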

When a deadlock is suspected in production, run jstack <PID> to dump the stack traces of all threads. Java automatically detects deadlocks and prints a "Found one Java-level deadlock" message.

Summary

Threads = a tool for making better use of the CPU. The key is knowing how to handle shared data safely. Picking the right approach — locks, atomic operations, or message passing — for the situation is a core practical skill.

Starting with Java 21, Virtual Threads have significantly lowered the barrier to concurrency. For new projects, they are well worth considering.

💻 📌 Frequently Used Code (no need to memorize — reference only)
// ========================================
// 1. Stream — Most frequent pattern: filter, transform, collect
// ========================================
List<Order> orders = orderRepo.findAll();

// User emails of paid orders (deduplicated)
List<String> emails = orders.stream()
    .filter(o -> o.getStatus() == OrderStatus.PAID)
    .map(o -> o.getUser().getEmail())
    .distinct()
    .toList();

// ========================================
// 2. Aggregation — Grouping, average, sum
// ========================================
// Group employees by department
Map<String, List<Employee>> byDept = employees.stream()
    .collect(Collectors.groupingBy(Employee::getDept));

// Average salary by department
Map<String, Double> avgSalary = employees.stream()
    .collect(Collectors.groupingBy(
        Employee::getDept,
        Collectors.averagingDouble(Employee::getSalary)));

// Total sales sum
BigDecimal total = orders.stream()
    .map(Order::getAmount)
    .reduce(BigDecimal.ZERO, BigDecimal::add);

// ========================================
// 3. Optional — Null-safe handling
// ========================================
Optional<User> u = userRepo.findById(id);
String name = u.map(User::getName).orElse("(No name)");

u.ifPresent(user -> sendEmail(user.getEmail()));

// ========================================
// 4. CompletableFuture — Asynchronous composition
// ========================================
CompletableFuture<User> userFut    = CompletableFuture.supplyAsync(() -> fetchUser(id));
CompletableFuture<List<Order>> ordersFut = CompletableFuture.supplyAsync(() -> fetchOrders(id));

CompletableFuture<UserDto> dto = userFut.thenCombine(ordersFut,
    (user, orders) -> new UserDto(user, orders));

dto.thenAccept(d -> System.out.println(d))
   .exceptionally(ex -> { log.error("Failed", ex); return null; });

// ========================================
// 5. Virtual Thread (Java 21+) — 10k concurrent requests
// ========================================
try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
    List<Future<Response>> results = userIds.stream()
        .map(id -> executor.submit(() -> httpClient.get("/api/users/" + id)))
        .toList();
    // Even with 10,000 users, it's handled without 10,000 OS threads
}

The Iterator Pattern — how for-each actually works

for-each Is Syntactic Sugar for Iterator

java
List<String> list = List.of("a", "b", "c");
for (String s : list) {
    System.out.println(s);
}

At compile time, the code above is actually transformed into this:

java
Iterator<String> it = list.iterator();
while (it.hasNext()) {
    String s = it.next();
    System.out.println(s);
}

Any collection that implements the Iterable interface can be used in a for-each loop. That includes ArrayList, HashSet, and LinkedList (for HashMap, use entrySet/keySet/values).

The Three Methods of Iterator

java
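public interface Iterator<E> {
    boolean hasNext();   // is there a next element?
    E next();            // retrieve the next element and advance the cursor
    default void remove() {   // remove the last element returned by next() (optional)
        throw new UnsupportedOperationException("remove");
    }
    // Java 8 also added a default forEachRemaining(Consumer) method.
}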

The Modification-During-Iteration Pitfall — ConcurrentModificationException

java
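List<String> list = new ArrayList<>(List.of("a", "b", "c", "d"));
for (String s : list) {
    if (s.equals("b")) list.remove(s);   // ❌ the next call to next() throws
}

Modifying the collection during iteration makes the iterator throw ConcurrentModificationException on its next call to next(). (With only "a", "b", "c", removing "b" would not throw: the loop would end early and silently skip "c", which is arguably worse.) Two solutions: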

java
// ✅ 1. Use Iterator directly + Iterator.remove()
Iterator<String> it = list.iterator();
while (it.hasNext()) {
    if (it.next().equals("b")) it.remove();
}

// ✅ 2. removeIf (Java 8+)
list.removeIf(s -> s.equals("b"));
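
The failure and both fixes can be checked in one runnable sketch. The method names are invented for the example, and the failing case uses a four-element list so that the removal is followed by another call to next().

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class SafeRemovalDemo {

    // Returns true if removing inside for-each triggered the exception.
    static boolean removeInsideForEachThrows() {
        List<String> list = new ArrayList<>(List.of("a", "b", "c", "d"));
        try {
            for (String s : list) {
                if (s.equals("b")) list.remove(s);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    static List<String> removeWithIterator(List<String> source, String target) {
        List<String> list = new ArrayList<>(source);
        Iterator<String> it = list.iterator();
        while (it.hasNext()) {
            if (it.next().equals(target)) it.remove(); // the iterator stays in sync
        }
        return list;
    }

    static List<String> removeWithRemoveIf(List<String> source, String target) {
        List<String> list = new ArrayList<>(source);
        list.removeIf(s -> s.equals(target));
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeInsideForEachThrows());                      // true
        System.out.println(removeWithIterator(List.of("a", "b", "c"), "b"));  // [a, c]
        System.out.println(removeWithRemoveIf(List.of("a", "b", "c"), "b"));  // [a, c]
    }
}
```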

Creating Your Own Iterable — Custom Collections

java
class Range implements Iterable<Integer> {
    private final int start, end;
    Range(int s, int e) { this.start = s; this.end = e; }

    @Override
    public Iterator<Integer> iterator() {
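        return new Iterator<>() {
            int cur = start;
            public boolean hasNext() { return cur < end; }
            public Integer next() {
                // The Iterator contract requires this exception once exhausted
                if (!hasNext()) throw new java.util.NoSuchElementException();
                return cur++;
            }
        };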
    }
}

for (int i : new Range(1, 5)) System.out.println(i);  // 1,2,3,4

Implementing Iterable is all it takes to enable for-each — this is the core design of the Java collections framework.


🤖 Try Asking AI Like This

Knowing the concepts from this lesson lets you give AI specific, precise instructions. Instead of a vague "fix this," you can make vocabulary-driven requests — and that is where token savings begin.

  • "Rewrite this for-loop as a Stream API map/filter/reduce chain"
  • "Fix this ArrayList operation so it safely removes elements inside an Iterator"
  • "Refactor this List to an immutable List.of()"

Why This Saves Tokens

Without the concepts, you have to ask "what does that mean?" again after receiving an AI answer. Those follow-up questions eat tokens. Learn the concept once and the conversation ends in a single exchange.
