Collections + Functional — List · Set · Map · Lambda · Stream
🎯 After reading this lesson
After reading through this lesson, you will be able to confidently do the following three things.
- ▸✅ When to use each of List · Set · Map
- ▸✅ Refactoring for-loops into Stream API map/filter/reduce chains
- ▸✅ The Iterator vs for-each pitfall and ConcurrentModificationException
Keep these learning goals as a checklist — once you can answer all of them, close the lesson.
What Is the Java Collections Framework — the *container* for your data
The Core in One Line
Java Collections Framework = a standard set of containers for holding multiple pieces of data. Introduced in Java 2 in 1998, it remains the most fundamental tool every Java developer uses every single day.
Why It Is Divided into Four Interfaces
The way you hold data differs by purpose.
- ▸List — ordered, duplicates allowed. Think of it as "items in a shopping cart." Use it when the index — first, second, third — is meaningful.
- ▸Set — no duplicates, usually no order. Think of it as "a list of user IDs who visited today." When you want to count the same person only once, no matter how many times they show up.
- ▸Map — key → value mapping. Like "user ID → user info" — when you want to handle data in pairs. The most frequently used structure.
- ▸Queue — first in, first out. Job queues, event processing. Used when FIFO is the natural order.
Thanks to these four abstractions, you handle any implementation the same way. Whether a List is an ArrayList or a LinkedList, the code stays identical: list.add() · list.get().
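To make the four shapes concrete, here is a minimal sketch. The class name, method names, and sample data are made up for illustration; only the java.util types come from the JDK.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

class CollectionKinds {
    // List: keeps insertion order and allows duplicates.
    static List<String> cart() {
        List<String> cart = new ArrayList<>();
        cart.add("apple");
        cart.add("milk");
        cart.add("apple"); // the duplicate is kept
        return cart;
    }

    // Set: the same visitor is stored only once.
    static Set<String> visitors() {
        Set<String> visitors = new HashSet<>();
        visitors.add("u1");
        visitors.add("u2");
        visitors.add("u1"); // ignored, already present
        return visitors;
    }

    // Map: look up a value by its key.
    static Map<String, String> users() {
        Map<String, String> users = new HashMap<>();
        users.put("u1", "Alice");
        users.put("u2", "Bob");
        return users;
    }

    // Queue: elements leave in the order they arrived (FIFO).
    static String firstJob() {
        Queue<String> jobs = new ArrayDeque<>();
        jobs.offer("job-1");
        jobs.offer("job-2");
        return jobs.poll(); // removes and returns the head of the queue
    }

    public static void main(String[] args) {
        System.out.println(cart().size());     // 3
        System.out.println(visitors().size()); // 2
        System.out.println(users().get("u1")); // Alice
        System.out.println(firstJob());        // job-1
    }
}
```

Each variable is declared with the interface type (List, Set, Map, Queue), so swapping the implementation later only changes the `new ...` expression.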
Most Common: ArrayList vs LinkedList
The names look similar, but their behavior is completely different.
ArrayList uses an array internally. That means it can jump directly to an index — retrieving the nth element takes one step (O(1)). However, inserting in the middle requires shifting every element after it, which is slow. Still, because memory is laid out contiguously, it is CPU cache-friendly and faster in most real-world cases.
LinkedList is a linked list where nodes point to each other. Middle insertion looks like O(1) since you only change links, but finding that position takes O(n). Plus, each node occupies separate memory, so cache efficiency is poor.
Conclusion: Default to ArrayList, especially when indexed access is frequent. Consider LinkedList only when middle insertions are truly numerous; in practice, you almost never need it. Despite its reputation for fast insertion, LinkedList is often the slower choice.
HashMap — the most-used data structure in Java
Code like HashMap<String, User> is something you see every day. It is worth understanding how it works at least once.
Internally it is a combination of an array and linked lists. It uses the key's hashCode() to pick an array index (a bucket); when multiple keys land in the same bucket, they are chained as a linked list at that slot. Since Java 8, once a bucket's chain grows past 8 entries (and the table has at least 64 buckets), that bucket is converted to a Red-Black Tree, which bounds lookups in it at O(log n) even in the worst case.
Most common mistake: objects used as keys must implement hashCode() and equals() consistently. If you override only one of them, two keys that look equal can fail to match, and lookups silently miss. Lombok's @EqualsAndHashCode or records (a standard feature since Java 16) generate both for you.
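The key pitfall can be reproduced in a few lines. This is a sketch with hypothetical key types (PlainKey and RecordKey are invented names), not code from the lesson.

```java
import java.util.HashMap;
import java.util.Map;

class KeyEquality {
    // Hypothetical key type that does NOT override equals()/hashCode(),
    // so it falls back to identity-based comparison from Object.
    static final class PlainKey {
        final String id;
        PlainKey(String id) { this.id = id; }
    }

    // A record gets equals() and hashCode() generated from its components.
    record RecordKey(String id) {}

    static boolean plainLookupWorks() {
        Map<PlainKey, String> map = new HashMap<>();
        map.put(new PlainKey("u1"), "Alice");
        // A second instance with the same id is a different key by identity.
        return map.containsKey(new PlainKey("u1"));
    }

    static boolean recordLookupWorks() {
        Map<RecordKey, String> map = new HashMap<>();
        map.put(new RecordKey("u1"), "Alice");
        return map.containsKey(new RecordKey("u1"));
    }

    public static void main(String[] args) {
        System.out.println(plainLookupWorks());  // false
        System.out.println(recordLookupWorks()); // true
    }
}
```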
Multi-threaded Environments — ConcurrentHashMap
If multiple threads concurrently modify the same HashMap, you can end up in an infinite loop or with corrupted data. A common accident is sharing a HashMap as a server-side cache.
The solution is ConcurrentHashMap. Internally it acquires locks per bucket, allowing concurrency. There is also Collections.synchronizedMap(), which uses a full lock, but ConcurrentHashMap is the de-facto standard when higher concurrency is needed.
For similar reasons, CopyOnWriteArrayList also exists — used when reads are frequent but writes are rare (e.g., event listener lists).
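As a sketch of ConcurrentHashMap used as a shared counter: the class name, the "home" key, and the thread counts are assumptions chosen for the example. The point is that merge() performs the read-modify-write for a key atomically, so no external lock is needed.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SharedCounter {
    static int countHits(int threads, int hitsPerThread) {
        Map<String, Integer> hits = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < hitsPerThread; j++) {
                    // Atomic per key: no update is lost even under contention.
                    hits.merge("home", 1, Integer::sum);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join(); // wait for every worker before reading the total
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return hits.get("home");
    }

    public static void main(String[] args) {
        System.out.println(countHits(4, 1000)); // 4000
    }
}
```

With a plain HashMap in the same code, the total would not be reliable.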
Summary
> 💡 In practice: start with ArrayList + HashMap in 90% of cases. Use ConcurrentHashMap for multi-threaded scenarios. Switch to TreeMap only when you need sorting.
Lambdas and Functional Style — the moment Java *was reborn*
Why Java 8 Was a Turning Point
In 2014, Java 8 introduced lambda expressions and the Stream API. Until then, Java was exclusively an object-oriented language. Passing a single function required five lines of anonymous class boilerplate like new Runnable() { public void run() { ... } }.
Lambdas collapse that to one line.
The shorter code is not the point. The real change is that functions can now be treated as data: stored in variables, passed as arguments to other functions, and returned as values.
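A small sketch of both points, the shorter syntax and functions as values. All names here (LambdaBasics, adder, applyTwice) are invented for illustration.

```java
import java.util.function.Function;

class LambdaBasics {
    // Anonymous class: the pre-Java 8 way to pass behavior.
    static Runnable oldStyle(StringBuilder log) {
        return new Runnable() {
            @Override
            public void run() {
                log.append("old");
            }
        };
    }

    // Lambda: the same behavior in one line.
    static Runnable newStyle(StringBuilder log) {
        return () -> log.append("new");
    }

    // A function returned as a value...
    static Function<Integer, Integer> adder(int n) {
        return x -> x + n;
    }

    // ...and a function received as an argument.
    static int applyTwice(Function<Integer, Integer> f, int value) {
        return f.apply(f.apply(value));
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        oldStyle(log).run();
        newStyle(log).run();
        System.out.println(log);                       // oldnew
        Function<Integer, Integer> addThree = adder(3); // stored in a variable
        System.out.println(applyTwice(addThree, 10));   // 16
    }
}
```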
What a Functional Interface Actually Is
Lambdas are possible because of Functional Interfaces, which are interfaces with exactly one abstract method. When the compiler sees a lambda, it infers from the context which functional interface is meant, and the implementation is generated for you, so you never write the class yourself.
Four commonly used ones:
- ▸Function<T, R> — takes T, returns R. Used for transformation. Example: String → length.
- ▸Predicate<T> — takes T, returns boolean. Used for filtering. Example: "is this an adult?"
- ▸Consumer<T> — takes T, returns nothing. Used for side effects only. Example: logging output.
- ▸Supplier<T> — takes nothing, returns T. Used for supplying a value. Example: getting the current time.
The names are intuitive. Together they cover the basic shapes: takes and returns a value · takes and returns a boolean · takes only · returns only.
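The four interfaces side by side, as a sketch that follows the examples in the list. The age threshold of 18 and the in-memory "log" list are assumptions for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;

class FourInterfaces {
    // Function<T, R>: String -> length
    static final Function<String, Integer> LENGTH = String::length;

    // Predicate<T>: "is this an adult?" (18 is an assumed threshold)
    static final Predicate<Integer> IS_ADULT = age -> age >= 18;

    // Supplier<T>: takes nothing, returns the current time in milliseconds
    static final Supplier<Long> NOW = System::currentTimeMillis;

    // Consumer<T>: takes a value, returns nothing; here it appends to a list
    // that stands in for a log.
    static List<String> logAll(List<String> messages) {
        List<String> sink = new ArrayList<>();
        Consumer<String> logger = sink::add;
        for (String m : messages) {
            logger.accept(m);
        }
        return sink;
    }

    public static void main(String[] args) {
        System.out.println(LENGTH.apply("hello")); // 5
        System.out.println(IS_ADULT.test(17));     // false
        System.out.println(logAll(List.of("a", "b")));
        System.out.println(NOW.get() > 0);
    }
}
```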
Stream API — treating collections as a flow
Stream is a tool for processing collection data as a pipeline. Old-style code spells out the mechanics: an explicit loop, an if inside it, and a temporary list to collect into.
With Stream, the intent appears directly in the code:
.filter passes only matching elements. .map transforms each element. .sorted sorts. .toList collects the result into a List. Reading from top to bottom, you can visualize data flowing through the pipeline.
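As a sketch of the loop-versus-pipeline contrast, assuming a hypothetical User record and the task "names of adult users, sorted alphabetically" (both invented for the example):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class AdultNames {
    record User(String name, int age) {}

    // Old style: loop, condition, temporary list, then sort.
    static List<String> withLoop(List<User> users) {
        List<String> names = new ArrayList<>();
        for (User u : users) {
            if (u.age() >= 18) {
                names.add(u.name());
            }
        }
        Collections.sort(names);
        return names;
    }

    // Stream style: each step of the pipeline names its intent.
    static List<String> withStream(List<User> users) {
        return users.stream()
                .filter(u -> u.age() >= 18) // keep adults only
                .map(User::name)            // User -> name
                .sorted()                   // natural (alphabetical) order
                .toList();                  // Java 16+; returns an unmodifiable List
    }

    public static void main(String[] args) {
        List<User> users = List.of(
                new User("Bob", 30), new User("Tim", 12), new User("Ann", 25));
        System.out.println(withLoop(users));   // [Ann, Bob]
        System.out.println(withStream(users)); // [Ann, Bob]
    }
}
```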
groupingBy — a frequently used magic trick
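The real power of Stream shines in aggregation. Grouping employees by department takes a single collect call with Collectors.groupingBy.
Think of it as SQL's GROUP BY coming to Java. Getting the average salary by department is similar: pass a second, downstream collector such as Collectors.averagingDouble.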
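A minimal sketch of both aggregations, assuming a hypothetical Employee record with name, department, and salary fields:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class DepartmentStats {
    record Employee(String name, String department, double salary) {}

    // department -> employees in that department
    static Map<String, List<Employee>> byDepartment(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.groupingBy(Employee::department));
    }

    // department -> average salary, using a downstream collector
    static Map<String, Double> averageSalary(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.groupingBy(
                        Employee::department,
                        Collectors.averagingDouble(Employee::salary)));
    }

    public static void main(String[] args) {
        List<Employee> staff = List.of(
                new Employee("Ann", "eng", 100.0),
                new Employee("Bob", "eng", 200.0),
                new Employee("Cy", "ops", 50.0));
        System.out.println(byDepartment(staff).get("eng").size()); // 2
        System.out.println(averageSalary(staff).get("eng"));       // 150.0
    }
}
```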
Common Pitfalls — good to know
Never modify external variables inside a Stream. Doing so causes race conditions during parallel processing and breaks the functional mindset entirely. Instead, use collect or reduce to produce immutable results.
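The difference can be sketched as follows. The class and method names are invented; the first method shows the discouraged shape, the other two let the pipeline produce the result itself.

```java
import java.util.ArrayList;
import java.util.List;

class StreamResults {
    // Discouraged: the lambda mutates a list that lives outside the pipeline.
    // This happens to work sequentially, but is not thread-safe if the
    // stream is ever switched to parallel().
    static List<Integer> squaresWithSideEffect(List<Integer> numbers) {
        List<Integer> out = new ArrayList<>();
        numbers.stream().map(n -> n * n).forEach(out::add);
        return out;
    }

    // Preferred: the terminal operation builds the result.
    static List<Integer> squares(List<Integer> numbers) {
        return numbers.stream().map(n -> n * n).toList();
    }

    // Preferred: reduce folds the elements into a single value.
    static int sum(List<Integer> numbers) {
        return numbers.stream().reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(squares(List.of(1, 2, 3))); // [1, 4, 9]
        System.out.println(sum(List.of(1, 2, 3)));     // 6
    }
}
```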
parallelStream() is not a silver bullet. For small datasets or I/O-bound operations, it can actually be slower, because the cost of splitting the work, coordinating threads, and merging results outweighs the benefit. It tends to help only with large, CPU-intensive workloads.
Optional was introduced in the same release. It is a box that explicitly signals that a value may be absent, instead of using null. When you receive an Optional<User>, you immediately know the value might be missing and that you must handle that case.
Used consistently for return values that may be absent, this pattern eliminates a large share of NullPointerExceptions.
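One way the pattern can look, as a sketch: the User record, the in-memory USERS map, and the "guest" fallback are assumptions for the example.

```java
import java.util.Map;
import java.util.Optional;

class UserLookup {
    record User(String id, String name) {}

    // Stand-in for a real data source.
    static final Map<String, User> USERS = Map.of("u1", new User("u1", "Alice"));

    // The return type tells the caller that "not found" is a normal outcome.
    static Optional<User> findById(String id) {
        return Optional.ofNullable(USERS.get(id));
    }

    // No null check needed: map() is skipped when the Optional is empty.
    static String displayName(String id) {
        return findById(id)
                .map(User::name)
                .orElse("guest");
    }

    public static void main(String[] args) {
        System.out.println(displayName("u1"));   // Alice
        System.out.println(displayName("nope")); // guest
    }
}
```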
Multithreading — how to make things run *concurrently*
Why Threads Are Needed
Modern CPUs have multiple cores (typically 8–16). But an ordinary single-threaded Java program runs its own code on just one of them while the rest sit idle. Use threads well and you can harness those idle cores, raising throughput by up to N times for work that parallelizes cleanly.
Another reason is waiting. During the 1 second spent waiting for a DB response, the CPU does nothing. Put it to work on something else during that time and throughput skyrockets. This is how a web server handles 1,000 simultaneous requests.
Three Ways to Create Threads
1. Direct creation: new Thread(() -> { ... }).start(). Simplest, but manual management is tedious. Create 10,000 of them and the OS will scream.
2. ExecutorService: Create a thread pool and submit tasks to it. The industry standard.
With a pool of ten threads, for example, 1,000 submitted tasks are taken from the queue in submission order, and the overhead of creating a new thread for every task disappears.
3. CompletableFuture — the ultimate tool for composing async operations. It lets you cleanly express flows like "fetch A and B concurrently, then combine them and call C when both finish."
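Options 2 and 3 can be sketched as follows. The task (summing squares) and the fetchA/fetchB/callC helpers are hypothetical stand-ins for real work such as remote calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class AsyncDemo {
    // Option 2: a fixed pool of 10 threads works through all submitted tasks.
    static int sumOfSquares(int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(10);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 1; i <= tasks; i++) {
                final int n = i;
                futures.add(pool.submit(() -> n * n)); // submitted as a Callable
            }
            int total = 0;
            for (Future<Integer> f : futures) {
                total += f.get(); // blocks until that task has finished
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown(); // let the pool threads exit
        }
    }

    // Hypothetical operations standing in for real I/O.
    static String fetchA() { return "A"; }
    static String fetchB() { return "B"; }
    static String callC(String a, String b) { return "C(" + a + "," + b + ")"; }

    // Option 3: start A and B concurrently, call C once both have finished.
    static String fetchThenCombine() {
        CompletableFuture<String> a = CompletableFuture.supplyAsync(AsyncDemo::fetchA);
        CompletableFuture<String> b = CompletableFuture.supplyAsync(AsyncDemo::fetchB);
        return a.thenCombine(b, AsyncDemo::callC).join();
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1000));  // 333833500
        System.out.println(fetchThenCombine());  // C(A,B)
    }
}
```

In server code you would normally avoid the final join() and return the CompletableFuture to the caller; it is used here only to print the result.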
The True Revolution in Java 21 — Virtual Threads
For a long time, creating thousands of threads in Java was risky because each thread consumed OS resources. The practical limit was usually a few hundred.
Java 21 Virtual Threads broke that barrier. The JVM manages threads virtually, making it easy to spin up tens or hundreds of thousands. During I/O waits, they automatically yield so other work can proceed.
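A single call such as Executors.newVirtualThreadPerTaskExecutor() is enough to run each submitted task on its own virtual thread, which brings Java to a level of concurrency comparable to Go goroutines · Kotlin coroutines. Note, however, that this does not help with CPU-bound tasks, because the actual number of CPU cores is unchanged.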
The Most Important Pitfall — Race Conditions
When two threads modify the same variable concurrently, the result is unpredictable.
counter + 1 breaks down into three operations (read · add · write), and another thread can interrupt in between.
The solution is a lock (synchronized · Lock) or an atomic operation (AtomicInteger):
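Both fixes can be sketched like this. The class name, thread counts, and the private run helper are invented for the example. An unsynchronized counter in the same setup would typically end up below the expected total, by an amount that varies from run to run.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CounterDemo {
    // Fix 1: AtomicInteger makes read + add + write one indivisible step.
    static int atomicCount(int threads, int incrementsPerThread) {
        AtomicInteger counter = new AtomicInteger();
        run(threads, () -> {
            for (int i = 0; i < incrementsPerThread; i++) {
                counter.incrementAndGet();
            }
        });
        return counter.get();
    }

    // Fix 2: a lock lets only one thread at a time run the increment.
    static int synchronizedCount(int threads, int incrementsPerThread) {
        int[] counter = {0};
        Object lock = new Object();
        run(threads, () -> {
            for (int i = 0; i < incrementsPerThread; i++) {
                synchronized (lock) {
                    counter[0]++;
                }
            }
        });
        return counter[0];
    }

    // Starts the given number of threads on the same body and waits for all.
    private static void run(int threads, Runnable body) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(body);
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(atomicCount(4, 10_000));       // 40000
        System.out.println(synchronizedCount(4, 10_000)); // 40000
    }
}
```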
Deadlock — Waiting Forever
When two threads wait for each other's lock, they halt forever.
If A holds lock X and waits for Y, while B holds lock Y and waits for X — neither can proceed. That is a deadlock.
The most common cause is acquiring locks in different orders. If every thread always acquires locks in the same order, deadlock cannot occur.
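A sketch of the same-order rule, using a hypothetical Account type and assuming every account has a distinct id. Both threads lock the lower id first, regardless of the transfer direction, so the circular wait described above cannot form.

```java
class Transfers {
    static final class Account {
        final int id;
        long balance;
        Account(int id, long balance) {
            this.id = id;
            this.balance = balance;
        }
    }

    static void transfer(Account from, Account to, long amount) {
        // Fixed global order: always lock the account with the smaller id first.
        Account first = from.id < to.id ? from : to;
        Account second = (first == from) ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }

    // Two threads transfer in opposite directions, the classic deadlock setup.
    static long[] runOpposingTransfers(int rounds) {
        Account a = new Account(1, 1000);
        Account b = new Account(2, 1000);
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) transfer(a, b, 1);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) transfer(b, a, 1);
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] {a.balance, b.balance};
    }

    public static void main(String[] args) {
        long[] balances = runOpposingTransfers(1000);
        System.out.println(balances[0] + " " + balances[1]); // 1000 1000
    }
}
```

If transfer() locked `from` and then `to` instead, the two threads could each hold one lock and wait for the other.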
When a deadlock is suspected in production, run jstack <PID> to dump the stack traces of all threads. Java automatically detects deadlocks and prints a "Found one Java-level deadlock" message.
Summary
Threads = a tool for making better use of the CPU. The key is knowing how to handle shared data safely. Picking the right approach — locks, atomic operations, or message passing — for the situation is a core practical skill.
Starting with Java 21, Virtual Threads have significantly lowered the barrier to concurrency. For new projects, they are well worth considering.
The Iterator Pattern — how for-each actually works
for-each Is Syntactic Sugar for Iterator
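At compile time, a for-each loop over a collection is rewritten into an explicit Iterator loop: the compiler calls iterator() once, then repeats hasNext() and next() until the elements run out.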
Any collection that implements the Iterable interface can be used in a for-each loop. That includes ArrayList, HashSet, and LinkedList (for HashMap, use entrySet/keySet/values).
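The two forms side by side, as a sketch. The second method is roughly what the compiler produces for the first; the class and method names are invented.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Map;

class ForEachDesugar {
    // What you write.
    static int sumForEach(List<Integer> numbers) {
        int total = 0;
        for (int n : numbers) {
            total += n;
        }
        return total;
    }

    // Roughly what it becomes.
    static int sumIterator(List<Integer> numbers) {
        int total = 0;
        Iterator<Integer> it = numbers.iterator();
        while (it.hasNext()) {
            int n = it.next();
            total += n;
        }
        return total;
    }

    // A Map is not Iterable itself; iterate one of its views instead.
    static int sumValues(Map<String, Integer> scores) {
        int total = 0;
        for (Map.Entry<String, Integer> e : scores.entrySet()) {
            total += e.getValue();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumForEach(List.of(1, 2, 3)));        // 6
        System.out.println(sumIterator(List.of(1, 2, 3)));       // 6
        System.out.println(sumValues(Map.of("a", 1, "b", 2)));   // 3
    }
}
```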
The Three Methods of Iterator
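The three methods are hasNext(), next(), and remove(). The sketch below uses all of them on a mutable list; the class and method names are made up for the example. Note that remove() is optional in the interface, and unmodifiable collections such as List.of(...) throw UnsupportedOperationException from it.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class IteratorMethods {
    // Concatenates every element and empties the list along the way.
    static String joinAndClear(List<String> items) {
        StringBuilder joined = new StringBuilder();
        Iterator<String> it = items.iterator();
        while (it.hasNext()) {        // 1. hasNext(): is there another element?
            joined.append(it.next()); // 2. next(): return it and advance
            it.remove();              // 3. remove(): delete the element just returned
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        List<String> items = new ArrayList<>(List.of("a", "b", "c"));
        System.out.println(joinAndClear(items)); // abc
        System.out.println(items.isEmpty());     // true
    }
}
```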
The Modification-During-Iteration Pitfall — ConcurrentModificationException
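Structurally modifying a collection while a for-each loop is iterating over it typically throws ConcurrentModificationException. Two common solutions: remove through the iterator's own remove() method, or call removeIf() on the collection.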
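The lesson's original listing is not available here, so the following is a sketch of the failure and of two commonly used fixes, Iterator.remove() and Collection.removeIf(). The class name and the "remove even numbers" task are assumptions.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class SafeRemoval {
    // The pitfall: removing through the list while for-each is iterating it.
    // Returns true if the exception was thrown.
    static boolean naiveRemovalFails(List<Integer> numbers) {
        try {
            for (Integer n : numbers) {
                if (n % 2 == 0) {
                    numbers.remove(n); // bypasses the iterator
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Fix 1: remove through the iterator, which keeps its bookkeeping in sync.
    static void removeEvensWithIterator(List<Integer> numbers) {
        Iterator<Integer> it = numbers.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
    }

    // Fix 2: let the collection do the whole removal in one call.
    static void removeEvensWithRemoveIf(List<Integer> numbers) {
        numbers.removeIf(n -> n % 2 == 0);
    }

    public static void main(String[] args) {
        System.out.println(naiveRemovalFails(new ArrayList<>(List.of(1, 2, 3, 4)))); // true
        List<Integer> a = new ArrayList<>(List.of(1, 2, 3, 4));
        removeEvensWithIterator(a);
        System.out.println(a); // [1, 3]
        List<Integer> b = new ArrayList<>(List.of(1, 2, 3, 4));
        removeEvensWithRemoveIf(b);
        System.out.println(b); // [1, 3]
    }
}
```

The exception is detected on a best-effort basis: with some inputs (for example, removing the second-to-last element of an ArrayList) the naive loop ends silently and skips an element instead, which is arguably worse.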
Creating Your Own Iterable — Custom Collections
Implementing Iterable is all it takes to enable for-each — this is the core design of the Java collections framework.
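A minimal sketch of a custom Iterable. IntRange is a hypothetical class invented for this example (a half-open range of ints), not a JDK type.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

class IntRange implements Iterable<Integer> {
    private final int start; // inclusive
    private final int end;   // exclusive

    IntRange(int start, int end) {
        this.start = start;
        this.end = end;
    }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<>() {
            private int current = start;

            @Override
            public boolean hasNext() {
                return current < end;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current++;
            }
            // remove() is not overridden, so the default implementation
            // throws UnsupportedOperationException.
        };
    }

    // Because IntRange is Iterable, for-each works on it directly.
    static int sum(int start, int end) {
        int total = 0;
        for (int n : new IntRange(start, end)) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(1, 5)); // 10 (1 + 2 + 3 + 4)
    }
}
```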
🤖 Try Asking AI Like This
Knowing the concepts from this lesson lets you give AI specific, precise instructions. Instead of a vague "fix this," you can make vocabulary-driven requests — and that is where token savings begin.
- ▸"Rewrite this for-loop as a Stream API map/filter/reduce chain"
- ▸"Fix this ArrayList operation so it safely removes elements inside an Iterator"
- ▸"Refactor this List to an immutable List.of()"
Why This Saves Tokens
Without the concepts, you have to ask "what does that mean?" again after receiving an AI answer. Those follow-up questions eat tokens. Learn the concept once and the conversation ends in a single exchange.