In this article we’ll cover the Java 8 Streams API and the new Optional wrapper class.
The point of an Optional comes in two parts: managing nullable values, and explicitly representing the absence of a value. This might sound a little strange, but hopefully after a few code snippets you’ll come to see why they’re useful and also why you may NOT want to use them. Either way, it is good to know what they are, how they work and how to handle situations where you’re forced to use them.
So, in a nutshell, an Optional is a wrapper that either holds an instance (a.k.a., an element) or indicates the absence of one.
I’m probably not going to advocate using Optional.ofNullable except in exceptional circumstances. The code executes fine, but null values are usually a smell and should be avoided. In a nutshell, it lets us create an Optional whose initial value may be null, in which case we get back an empty Optional.
Optional.empty() is one of the two creation methods you should use. It returns a non-null Optional that wraps the absence of a value.
Optional.of() does what it says on the tin: it returns an Optional wrapping another instance. Passing null to it throws a NullPointerException.
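As a minimal sketch of the three creation methods side by side (the class and method names here are hypothetical, picked purely for illustration):

```java
import java.util.Optional;

class OptionalCreation {
    // Wraps a value that is known to be non-null.
    static Optional<String> present() {
        return Optional.of("bob");
    }

    // Represents "no value" without ever touching null.
    static Optional<String> absent() {
        return Optional.empty();
    }

    // Accepts a possibly-null reference; null becomes an empty Optional.
    static Optional<String> maybe(String value) {
        return Optional.ofNullable(value);
    }

    public static void main(String[] args) {
        System.out.println(present().isPresent());   // true
        System.out.println(absent().isPresent());    // false
        System.out.println(maybe(null).isPresent()); // false
    }
}
```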
Methods on the Optional class
get() returns the value inside an Optional, and throws a NoSuchElementException if the Optional is empty:
Throwing exceptions from optionals, on the absence of elements:
There’s a family of orElse methods for supplying a fallback in the absence of a value; here’s a good set of examples:
Note: ofNullable(null) counts as absence too, by the way, but we shouldn’t really ever be calling this ourselves.
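Here is a rough sketch of get, orElse, orElseGet and orElseThrow together. The class name, method names and fallback values are assumptions made up for this example:

```java
import java.util.NoSuchElementException;
import java.util.Optional;

class OptionalAccess {
    // orElse: use a ready-made fallback when the Optional is empty.
    static String nameOrDefault(Optional<String> name) {
        return name.orElse("unknown");
    }

    // orElseGet: compute the fallback lazily, only when it is needed.
    static String nameOrComputed(Optional<String> name) {
        return name.orElseGet(() -> "computed-" + 42);
    }

    // orElseThrow: raise an exception of our choosing when empty.
    static String nameOrThrow(Optional<String> name) {
        return name.orElseThrow(() -> new IllegalStateException("no name present"));
    }

    // get() on an empty Optional throws NoSuchElementException.
    static boolean getThrowsOnEmpty() {
        try {
            Optional.empty().get();
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(nameOrDefault(Optional.of("bob")));   // bob
        System.out.println(nameOrDefault(Optional.empty()));     // unknown
        System.out.println(nameOrComputed(Optional.empty()));    // computed-42
        System.out.println(getThrowsOnEmpty());                  // true
    }
}
```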
Optionals have some pretty f**king cool utility if you’re a fan of the functional paradigm, as above, I’ve put together a bunch of examples explaining the outputs (it follows on directly from our orElseExample above):
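To give a flavour of that functional side, here is a small stand-alone sketch using map, filter and flatMap on an Optional. The names are hypothetical and this is only one way of illustrating it:

```java
import java.util.Optional;

class OptionalFunctional {
    // map transforms the value if present; an empty Optional stays empty.
    static Optional<Integer> nameLength(Optional<String> name) {
        return name.map(String::length);
    }

    // filter keeps the value only if it passes the predicate.
    static Optional<String> onlyShort(Optional<String> name) {
        return name.filter(n -> n.length() <= 3);
    }

    // flatMap unwraps a nested Optional into a single Optional.
    static Optional<String> flatten(Optional<Optional<String>> nested) {
        return nested.flatMap(inner -> inner);
    }

    public static void main(String[] args) {
        System.out.println(nameLength(Optional.of("bob")));            // Optional[3]
        System.out.println(onlyShort(Optional.of("alexander")));       // Optional.empty
        System.out.println(flatten(Optional.of(Optional.of("bob"))));  // Optional[bob]
    }
}
```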
I think it’s worth learning optionals so we’re aware of them and can handle the Stream API more confidently when it returns optional values, but other than wrapping an HTTP response, or some outside serialized input in an Optional, I do struggle to see the full benefit of them. Sometimes you just can’t beat a good ol’ trusty try/catch and if == null.
Varargs a.k.a., Spread Operator
Before we jump into Streams, I just wanna’ make sure we’re aware of the ... operator, as it’s used within the Streams API. It effectively allows us to spread a list of values that aren’t actually a list or collection into a method; see the below:
This is literally it: it allows us to take a list (not a literal List) of values and pass them as a variable-length argument to a function, which is ultimately resolved into an array in the given method.
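A minimal varargs sketch, assuming two made-up helper methods (sum and count) that exist only for this illustration:

```java
class VarargsDemo {
    // "int..." accepts zero or more ints; inside the method it is a plain int[].
    static int sum(int... numbers) {
        int total = 0;
        for (int n : numbers) {
            total += n;
        }
        return total;
    }

    // Works the same way for reference types.
    static int count(String... names) {
        return names.length;
    }

    public static void main(String[] args) {
        System.out.println(sum(1, 2, 3));            // 6
        System.out.println(sum());                   // 0
        System.out.println(sum(new int[] {4, 5}));   // 9, an existing array is accepted too
        System.out.println(count("bob", "alice"));   // 2
    }
}
```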
Note: If you want to be absolutely safe, do not use the array created from generic varargs elsewhere, i.e., returning it from the method and using it in another class. Doing so risks Heap Pollution, which occurs when a variable of a parameterized type refers to an object that is not of that type. Because generic type information is erased at runtime, the JVM cannot check what the varargs array really holds, so the mistake only surfaces later as a ClassCastException.
Streams API Overview
Note: Streams make heavy use of Lambdas, see my Lambda guide for more info
The Streams API brings the functional paradigm to collections of objects in Java. So far in our other guides we’ve covered Java’s functional callbacks within CompletableFutures & Method References and Lambdas.
We won’t cover every method in the API, but we’ll cover the most useful ones (at least in my opinion from previous experience in other languages).
What is the Streams API in a more technical description?
It represents a way for us to take or create a dynamic sequence of objects and manipulate it through a pipeline of chained method calls. The keyword in this description, bar dynamic, is Objects: it works with Objects and not primitives. Throughout this guide you’ll see me use Integer over int; this is purposeful. We’ll cover primitives at the end.
Let’s dive straight in and create some Streams via a number of ways.
Creating a Stream
We can create streams in a number of ways; check out the following code examples:
Yup, the Stream.of(T…) is vararg based.
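As a hedged sketch, these are a few of the common ways to create a stream. The class and method names are hypothetical, and each stream is collected into a List just so the result can be printed:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamCreation {
    // From varargs.
    static List<Integer> fromVarargs() {
        return Stream.of(1, 2, 3).collect(Collectors.toList());
    }

    // From an existing collection.
    static List<Integer> fromCollection() {
        return Arrays.asList(4, 5, 6).stream().collect(Collectors.toList());
    }

    // From an array.
    static List<Integer> fromArray() {
        Integer[] values = {7, 8};
        return Arrays.stream(values).collect(Collectors.toList());
    }

    // From a generator function; infinite, so it must be limited.
    static List<Integer> fromIterate() {
        return Stream.iterate(1, n -> n * 2).limit(4).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(fromVarargs());    // [1, 2, 3]
        System.out.println(fromCollection()); // [4, 5, 6]
        System.out.println(fromArray());      // [7, 8]
        System.out.println(fromIterate());    // [1, 2, 4, 8]
    }
}
```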
How do we use a Stream? (The juicy bit :P)
Let’s look at a basic stream example and cover the opening, streaming, and termination of the stream (a stream is terminated by a terminal operation, i.e. a method such as forEach or toArray that returns something other than another Stream). For this example, we’ll be using Bob again :P:
Let’s begin with map. Map is basically a forEach (from the collection framework) which, instead of returning void for each element, hands an element back for each one it visits, whether it’s been modified, left the same, or is a brand new element. Map is one of the most powerful operations we can perform within a stream:
So what happened here?
- It iterated through each element of our Stream collection
- Modified the name of each bob
- Was converted from Stream<SpecialBob> -> Object[]
That’s it. Do notice, though, that the output order matches the input order. I.e., a sequential stream processes elements one at a time, in the order they were supplied:
Element 1 -> Map, Map modified element 1.name -> Sits at the end of stream
Element 2 -> Map, Map modified element 2.name -> Sits at the end of stream
Stream.toArray() -> Terminates the stream
The above outputs:
// Output:
// bob0.032052957686273365 -- Our special bob printed from the stream
// bob0.47736778571943084 -- Our special bob printed from reference
If we were to flip element 1 and 2 in the creation:
Stream<SpecialBob> myStream = Stream.of(
Then the output would look as follows:
// Output:
// bob0.5406010870099155
// bob0.032052957686273365 -- Our special bob printed from the stream
// bob0.47736778571943084 -- Our special bob printed from reference
Now we’re familiar with the following concepts:
- They stream 1 object at a time (think of it literally as an observable stream)
- They are opened and closed (very much like ordinary I/O)
- They make heavy use of Lambdas, see my Lambda guide for more info
The above code could be simplified with use of a new method, .collect, to:
Don’t worry about the .collect(…) or Collectors.* argument for now; just know it’s another way to turn our stream into a collection without having to call stream.toArray(). Up until we explicitly teach .collect, we’ll be using Collectors.toList() each time, which, as the name implies, converts our stream into a list. The return type for collect(Collectors.toList()) from our above example is:
Streams API Usage
Given what we’ve learnt so far, we should be good to move into using more methods from the API to manipulate our collections. We’ll go over each individually with code as an example. I’ve broken the intermediate operations and terminal operations up into their own sections so it’s explicitly clear which is which. Also, as a reminder:
- Intermediate operation: returns a Stream after performing some operation, so more methods can be chained onto it
- Terminal operation: returns void or a non-Stream result (e.g. a List or an Optional), therefore no more stream methods can be chained after it
Intermediate Operation Methods
.distinct() returns only ‘distinct’ i.e., no duplicated values (same as a Set):
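A tiny sketch of distinct, using made-up values and a hypothetical class name:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DistinctDemo {
    // Duplicates (as judged by equals) are removed; first occurrences keep their order.
    static List<Integer> unique() {
        return Stream.of(1, 1, 2, 3, 3, 3)
                .distinct()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(unique()); // [1, 2, 3]
    }
}
```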
Sorted sorts the items in the stream according to natural order, unless given a Comparator:
And with a comparator to reverse them:
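Both flavours of sorted could look roughly like this (the values and names are assumptions for the example):

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SortedDemo {
    // Natural order: ascending for Integer.
    static List<Integer> natural() {
        return Stream.of(3, 1, 2)
                .sorted()
                .collect(Collectors.toList());
    }

    // A Comparator lets us override the order, here reversing it.
    static List<Integer> reversed() {
        return Stream.of(3, 1, 2)
                .sorted(Comparator.reverseOrder())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(natural());  // [1, 2, 3]
        System.out.println(reversed()); // [3, 2, 1]
    }
}
```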
Filters based on a given predicate:
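For instance, a sketch that keeps only the even numbers (the predicate and values are picked arbitrarily for illustration):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FilterDemo {
    // Only elements for which the predicate returns true survive.
    static List<Integer> evens() {
        return Stream.of(1, 2, 3, 4, 5, 6)
                .filter(n -> n % 2 == 0)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(evens()); // [2, 4, 6]
    }
}
```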
Many often become confused by dropWhile (added in Java 9, alongside takeWhile). Here’s what it does in a nutshell: it drops/filters/removes, whatever term we want to use, elements whilst the predicate is true, and this only happens once, at the start of the sequence. Consider the following:
I know it likely looks confusing, but I hope this relays how it works. To further explain it, we’ll change the order:
Now none of them are dropped, as the predicate is never true for the FIRST element in our stream. Finally, check this final example out:
Notice the 100 is DROPPED.
To summarise dropWhile, it DROPS the values WHILST the PREDICATE is TRUE from the FIRST ELEMENT and AS SOON AS it becomes FALSE, it will NO LONGER DROP any subsequent VALUES in the STREAM.
As such, the order of your collection that is streamed matters and this should be used carefully and intentionally.
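The three situations described above can be sketched as follows. The input values and the "is even" predicate are my own assumptions, chosen to mirror the prose, and dropWhile needs Java 9 or later:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DropWhileDemo {
    // Drops leading elements while they are even, then keeps everything after.
    static List<Integer> dropLeadingEvens(Integer... values) {
        return Stream.of(values)
                .dropWhile(n -> n % 2 == 0)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Leading evens are dropped; the 8 and 10 after the 7 are kept.
        System.out.println(dropLeadingEvens(2, 4, 6, 7, 8, 10)); // [7, 8, 10]
        // First element is odd, so nothing is dropped at all.
        System.out.println(dropLeadingEvens(7, 2, 4, 6));        // [7, 2, 4, 6]
        // Only the leading 100 is dropped.
        System.out.println(dropLeadingEvens(100, 7, 12, 14, 16)); // [7, 12, 14, 16]
    }
}
```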
Unlike .dropWhile(), .takeWhile() continues to take whilst the predicate is true, I’ll use all of our examples from dropWhile in the same order:
Similar to .dropWhile(), .takeWhile() checks the first element, and as our first element % 2 != 0, it takes nothing at all and as such the output is empty.
Now we’ll begin with a truthy value, and observe its behaviour:
As the first element is true, it is taken, and despite 12, 14, 16 also matching the predicate, as the next value was 7, it takes no more. Very much like dropWhile.
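A comparable sketch for takeWhile, again with assumed input values and an "is even" predicate (Java 9 or later):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class TakeWhileDemo {
    // Takes leading elements while they are even, then stops for good.
    static List<Integer> takeLeadingEvens(Integer... values) {
        return Stream.of(values)
                .takeWhile(n -> n % 2 == 0)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // First element is odd, so nothing is taken.
        System.out.println(takeLeadingEvens(7, 2, 4, 6));         // []
        // 100 is taken; 7 stops the taking, so 12, 14, 16 are never reached.
        System.out.println(takeLeadingEvens(100, 7, 12, 14, 16)); // [100]
    }
}
```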
For each value in the stream, it returns a mapped value, this can be a brand new value, modified (like from earlier), or exactly the same:
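A short sketch of map, once producing a modified value of the same type and once producing a brand new type (names and values are hypothetical):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MapDemo {
    // Same type out as in: String -> String.
    static List<String> shout(String... names) {
        return Stream.of(names)
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    // Different type out: String -> Integer.
    static List<Integer> lengths(String... names) {
        return Stream.of(names)
                .map(String::length)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(shout("bob", "alice"));   // [BOB, ALICE]
        System.out.println(lengths("bob", "alice")); // [3, 5]
    }
}
```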
FlatMap maps each element to a Stream and flattens the results into a single Stream, so a Stream of Streams (or of collections) becomes one Stream of values that we can map over as usual.
Similar to the Optionals guide above, we can flatMap nested optionals into a single optional, it’s the same concept here, consider these examples:
Note: If you’re brand new to the concept of flatMapping, it’s perfectly fine that it may take a while to process.
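If it helps, here is a minimal flatMap sketch that turns a stream of lists into one flat stream. The nested lists are invented for the example:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FlatMapDemo {
    // Each inner list becomes its own stream, and flatMap merges them in order.
    static List<Integer> flatten() {
        return Stream.of(Arrays.asList(1, 2), Arrays.asList(3), Arrays.asList(4, 5))
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    // After flattening, an ordinary map works on the individual values.
    static List<Integer> flattenAndDouble() {
        return Stream.of(Arrays.asList(1, 2), Arrays.asList(3))
                .flatMap(List::stream)
                .map(n -> n * 2)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(flatten());          // [1, 2, 3, 4, 5]
        System.out.println(flattenAndDouble()); // [2, 4, 6]
    }
}
```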
One of the more powerful operations in Streams is limit. It caps our stream at the specified number of elements, keeping only the first N. Consider the following:
If we additionally wanted 2, or even 3, we can simply increase our Stream element limit.
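A sketch of limit applied to an infinite stream, where it is the only thing stopping the pipeline from running forever (the firstN helper is hypothetical):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LimitDemo {
    // Stream.iterate produces 1, 2, 3, ... endlessly; limit keeps the first n.
    static List<Integer> firstN(int n) {
        return Stream.iterate(1, i -> i + 1)
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstN(1)); // [1]
        System.out.println(firstN(3)); // [1, 2, 3]
    }
}
```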
Very similar to the forEach below, but instead of returning void, it returns the stream back to us and is therefore not terminal. The use case for something like this would be to see how values are being modified and whether your sequence is behaving as expected. Consider the following:
We can literally see each element goes through the stream sequentially and how it is modified via peek.
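One way to observe this, sketched with a hypothetical log list instead of printing directly, is to peek before and after a map on a sequential stream:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PeekDemo {
    // Records what each element looks like before and after the map step.
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        List<Integer> result = Stream.of(1, 2, 3)
                .peek(n -> log.add("before: " + n))
                .map(n -> n * 10)
                .peek(n -> log.add("after: " + n))
                .collect(Collectors.toList());
        log.add("result: " + result);
        return log;
    }

    public static void main(String[] args) {
        // Each element travels the whole pipeline before the next one starts:
        // before: 1, after: 10, before: 2, after: 20, before: 3, after: 30, result: [10, 20, 30]
        trace().forEach(System.out::println);
    }
}
```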
Skips the first N elements; the example should speak for itself:
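A minimal sketch with assumed values:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SkipDemo {
    // Discards the first n elements and keeps the rest.
    static List<Integer> skipFirst(int n) {
        return Stream.of(1, 2, 3, 4, 5)
                .skip(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(skipFirst(2)); // [3, 4, 5]
    }
}
```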
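Streams provide us a way to run a callback on stream close via onClose. Note that the callback only runs when close() is called on the stream (for example by a try-with-resources block), not when a terminal operation finishes: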
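A small sketch of a close handler, assuming a try-with-resources block is what triggers the close (the log list and names are made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class OnCloseDemo {
    static List<String> closeLog() {
        List<String> log = new ArrayList<>();
        // Stream is AutoCloseable, so try-with-resources calls close() for us.
        try (Stream<Integer> stream = Stream.of(1, 2, 3).onClose(() -> log.add("closed"))) {
            log.add("sum=" + stream.mapToInt(Integer::intValue).sum());
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(closeLog()); // [sum=6, closed]
    }
}
```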
Terminal Operation Methods
One of, if not the most, powerful operations we can perform in streams is reduce. It takes a function which considers the previous value (at the beginning this is the first value) and the next value, accumulates them, and uses the result as the previous value for the next step until the stream is exhausted, terminating into an Optional:
Note: in Java there is no extra first iteration that starts from an undefined value; when no identity is supplied, the previous value on the initial step IS the FIRST value of the stream.
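As a sketch, here is reduce both without and with an identity value. The method names are hypothetical:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class ReduceDemo {
    // No identity: the first element seeds the accumulation, and the result
    // is an Optional because the stream might be empty.
    static Optional<Integer> sum(List<Integer> values) {
        return values.stream().reduce((previous, next) -> previous + next);
    }

    // With an identity: the result is a plain value, even for an empty stream.
    static int sumWithIdentity(List<Integer> values) {
        return values.stream().reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3, 4)));             // Optional[10]
        System.out.println(sum(Arrays.<Integer>asList()));              // Optional.empty
        System.out.println(sumWithIdentity(Arrays.asList(1, 2, 3, 4))); // 10
    }
}
```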
Performs the provided lambda for each element in the stream. Effectively the same as forEach on a collection (Iterable.forEach):
Checks if any element in the stream matches a given predicate. It terminates the stream and returns a boolean, stopping as soon as a match is found:
Checks if all elements in the stream match a given predicate. It terminates the stream and returns a boolean, stopping as soon as an element fails to match:
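A quick forEach sketch. It collects greetings into a list rather than printing inside the lambda, purely so the effect is easy to inspect; the names are assumptions:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ForEachDemo {
    // forEach returns void, so it is only useful for its side effects.
    static List<String> greetAll(List<String> names) {
        List<String> greetings = new ArrayList<>();
        names.stream().forEach(name -> greetings.add("hello " + name));
        return greetings;
    }

    public static void main(String[] args) {
        System.out.println(greetAll(Arrays.asList("bob", "alice"))); // [hello bob, hello alice]
    }
}
```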
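Both matchers side by side, as a sketch with an assumed "is even" predicate:

```java
import java.util.Arrays;
import java.util.List;

class MatchDemo {
    // True if at least one element is even.
    static boolean hasEven(List<Integer> values) {
        return values.stream().anyMatch(n -> n % 2 == 0);
    }

    // True only if every element is even (and also true for an empty stream).
    static boolean allEven(List<Integer> values) {
        return values.stream().allMatch(n -> n % 2 == 0);
    }

    public static void main(String[] args) {
        System.out.println(hasEven(Arrays.asList(1, 3, 4))); // true
        System.out.println(allEven(Arrays.asList(1, 3, 4))); // false
        System.out.println(allEven(Arrays.asList(2, 4, 6))); // true
    }
}
```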
Returns an Optional (an empty Optional if the stream is empty). It selects an element non-deterministically, but in most scenarios you can expect it to return the first element of a sequential stream:
The reason it is non-deterministic is we also have the concept of ‘parallel’ streams, in which it will get any value that fork-joins FIRST. Ultimately resulting in an inconsistent value being returned. Don’t worry for now, we’ll go over parallel streams soon.
Does what it says on the tin: gets us the first element of the stream. The difference between findAny and findFirst is that findFirst GUARANTEES it (for a stream with a defined order):
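A rough sketch contrasting the two. The findAny call uses a parallel stream here, so no particular element is promised in its output:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class FindDemo {
    // Always the first element in encounter order, if there is one.
    static Optional<Integer> first(List<Integer> values) {
        return values.stream().findFirst();
    }

    // Some element of the stream; which one is not guaranteed.
    static Optional<Integer> any(List<Integer> values) {
        return values.parallelStream().findAny();
    }

    public static void main(String[] args) {
        System.out.println(first(Arrays.asList(3, 1, 2)));       // Optional[3]
        System.out.println(first(Arrays.<Integer>asList()));     // Optional.empty
        System.out.println(any(Arrays.asList(3, 1, 2)).isPresent()); // true
    }
}
```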
Returns the ‘maximum/minimum’ value in a given stream as an Optional, i.e., for the numbers 1->3, 3 is the maximum. It accepts a Comparator for us to determine this:
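A minimal sketch using natural ordering as the Comparator (the class and method names are hypothetical):

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class MaxMinDemo {
    static Optional<Integer> largest(List<Integer> values) {
        return values.stream().max(Comparator.naturalOrder());
    }

    static Optional<Integer> smallest(List<Integer> values) {
        return values.stream().min(Comparator.naturalOrder());
    }

    public static void main(String[] args) {
        System.out.println(largest(Arrays.asList(1, 2, 3)));  // Optional[3]
        System.out.println(smallest(Arrays.asList(1, 2, 3))); // Optional[1]
    }
}
```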
Working With Primitives
So you’ve likely noticed we haven’t worked with primitives yet. This is because the Streams API separates reference and primitive streams. To work with primitives we have dedicated stream types (IntStream, LongStream and DoubleStream) with their own creation methods, plus methods such as mapToInt to convert to them. Here’s a few examples:
The methods are mostly the same, the main difference being that the lambdas and Optionals used are PRIMITIVE SPECIFIC (e.g. OptionalInt). Other than that, they do what they say on the tin.
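Here is a hedged sketch of a few IntStream basics, including converting to and from an object stream. The helper names are invented for this example:

```java
import java.util.List;
import java.util.OptionalInt;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class PrimitiveStreams {
    // rangeClosed includes both ends: 1 + 2 + 3 + 4 + 5.
    static int sumRange() {
        return IntStream.rangeClosed(1, 5).sum();
    }

    // Primitive streams return primitive-specific optionals.
    static OptionalInt maxOf(int... values) {
        return IntStream.of(values).max();
    }

    // mapToInt converts a Stream<String> into an IntStream.
    static int totalLength(String... words) {
        return Stream.of(words).mapToInt(String::length).sum();
    }

    // boxed converts an IntStream back into a Stream<Integer>.
    static List<Integer> boxed() {
        return IntStream.range(0, 3).boxed().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(sumRange());                // 15
        System.out.println(maxOf(3, 9, 4));            // OptionalInt[9]
        System.out.println(totalLength("bob", "al"));  // 5
        System.out.println(boxed());                   // [0, 1, 2]
    }
}
```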
Stream Concurrency — ForkJoinPool
The final piece to the Streams API is the parallelism provided. Rather than making us manage raw threads, the Streams API is built on top of the fork/join framework, effectively allowing us to achieve LITERAL PARALLELISM across multiple operations. If you’re unsure about this, check out my Concurrency Tutorial. It’s basically mandatory if you wish to understand Streams API concurrency properly. If not, just continue ahead and pick it up later.
Your machine, if it has 2 or more logical processing units (in a hyperthreaded CPU scenario) or 2 or more PHYSICAL processing units, is capable of parallel processing. This means what it says on the tin: running two tasks at the same time, in parallel, without a scheduler forcing the threads to wait upon one another on a single core.
To discover the parallelism level available for your machine, you can run:
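I can’t be sure which call the original snippet used, so treat this as one plausible sketch: it reads both the number of logical processors and the parallelism of the common ForkJoinPool that parallel streams use by default.

```java
import java.util.concurrent.ForkJoinPool;

class ParallelismLevel {
    // Logical processors visible to the JVM.
    static int logicalProcessors() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Target parallelism of the shared common pool.
    static int commonPoolParallelism() {
        return ForkJoinPool.commonPool().getParallelism();
    }

    public static void main(String[] args) {
        System.out.println("processors: " + logicalProcessors());
        System.out.println("common pool parallelism: " + commonPoolParallelism());
    }
}
```

The two numbers will vary from machine to machine, so no fixed output is shown.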
We won’t cover how to make custom ForkJoinTasks and utilise parallelism via them for the Streams API but I wanted you to be aware of it for the next step of this tutorial.
Streams Concurrency API Usage
The Streams API utilises the ForkJoinPool underneath to parallelise the tasks (a.k.a., operations). Consider this example:
By calling .parallel(), the Streams API will delegate the operations to ForkJoinPool workers and, for an ordered stream collected with an order-preserving terminal operation such as collect, maintain our order for us. That’s really it to be honest: unlike manually making RecursiveTasks/Actions within the ForkJoin API, the Streams API handles this for us, recursively forking and joining back into the final, sorted collection. Unfortunately it does appear sorts are purely handled in the main thread.
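As a small sketch of that behaviour (values and names are assumptions), the map step below may run on several worker threads, yet collect still hands back the results in the original order:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ParallelDemo {
    // The work is split across ForkJoinPool workers, but collect on an
    // ordered stream reassembles the results in encounter order.
    static List<Integer> doubledInParallel() {
        return Stream.of(1, 2, 3, 4, 5)
                .parallel()
                .map(n -> n * 2)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(doubledInParallel()); // [2, 4, 6, 8, 10]
    }
}
```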
Finally, we can check whether a stream is set to execute in parallel via the isParallel method:
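A minimal sketch of isParallel, with hypothetical method names:

```java
import java.util.stream.Stream;

class IsParallelDemo {
    // Streams are sequential unless we ask otherwise.
    static boolean sequentialByDefault() {
        return Stream.of(1, 2, 3).isParallel();
    }

    // parallel() flips the whole pipeline into parallel mode.
    static boolean afterParallel() {
        return Stream.of(1, 2, 3).parallel().isParallel();
    }

    // sequential() flips it back again.
    static boolean afterSequential() {
        return Stream.of(1, 2, 3).parallel().sequential().isParallel();
    }

    public static void main(String[] args) {
        System.out.println(sequentialByDefault()); // false
        System.out.println(afterParallel());       // true
        System.out.println(afterSequential());     // false
    }
}
```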
Overall I hope this tutorial has explained, in somewhat decent depth, how Optionals, the Streams API (for both primitive and reference types), varargs, and ForkJoinPool parallelism via the Streams API work. If it did, or if you have any questions, you can hit me up on my discord server here.