The Java Stream API part 2: the Map phase

Introduction

In the previous post we started looking into the new Stream API of Java 8, which makes working with collections easier. LINQ to Objects in .NET makes it a breeze to run queries on lists, maps (dictionaries in .NET) and other list-like objects, and Java 8 now comes with something similar. My overall impression is that LINQ in .NET is more concise and straightforward than the Stream API in Java.

In this post we’ll investigate Streams in greater detail.

Lazy execution of streams

If you’re familiar with LINQ statements in .NET then the notion of lazy or deferred execution is nothing new to you. Just because you have a LINQ statement, such as…

IEnumerable<Customer> customers = from c in DbContext.Customers where c.Id > 30 select c;

…the variable “customers” will not hold any data yet. The query only runs when you enumerate it or call a non-deferring operator such as “ToList()”. We have a similar situation in the Stream API. Recall our Java code from the previous part:

Stream<Integer> of = Stream.of(1, 2, 4, 2, 10, 4, 40);
Predicate<Integer> pred = Predicate.isEqual(4);
Stream<Integer> filter = of.filter(pred);

The object called “filter” will at this point not hold any data. Writing the C# LINQ statement above won’t execute anything, and writing of.filter(pred) in Java won’t execute anything either. They are simply declarations that describe what we want to do with the source. This is true for all methods in the Stream interface that return another Stream. Such operations are called intermediate operations. Methods that actually “do something”, i.e. pull the elements through the pipeline and produce a result or a side effect, are called terminal operations or final operations.
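A small sketch can make the deferred execution visible. It reuses the numbers from the snippet above; the class name LazyStreamDemo and the call counter are illustrative assumptions that are not part of the original example.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class, not taken from the original post.
class LazyStreamDemo {

    // Returns {predicate calls before the terminal operation,
    //          predicate calls after it, number of matches}.
    static int[] countPredicateCalls() {
        AtomicInteger calls = new AtomicInteger();

        // Intermediate operation: this only describes the filtering.
        Stream<Integer> filtered = Stream.of(1, 2, 4, 2, 10, 4, 40)
                .filter(n -> {
                    calls.incrementAndGet();
                    return n == 4;
                });
        int callsBefore = calls.get();

        // Terminal operation: the elements are now pulled through the filter.
        List<Integer> matches = filtered.collect(Collectors.toList());
        int callsAfter = calls.get();

        return new int[] {callsBefore, callsAfter, matches.size()};
    }

    public static void main(String[] args) {
        int[] result = countPredicateCalls();
        // prints "before=0, after=7, matches=2"
        System.out.println("before=" + result[0] + ", after=" + result[1]
                + ", matches=" + result[2]);
    }
}
```

The predicate is not called at all until collect() runs, after which it has been called once for each of the seven source elements.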

Recall our Employee class from the previous part. We also had a list of employees:

List<Employee> employees = new ArrayList<>();
employees.add(new Employee(UUID.randomUUID(), "Elvis", 50));
.
.
.
employees.add(new Employee(UUID.randomUUID(), "Anna", 20));

Based on the above statements about a Stream object, can you guess what the List object called “filteredNames” will contain?

List<String> filteredNames = new ArrayList<>();
Stream<Employee> stream = employees.stream();

Stream<Employee> peekEmployees = stream.peek(System.out::println);
Stream<Employee> filteredEmployees = peekEmployees.filter(emp -> emp.getAge() > 30);
Stream<Employee> peekFilteredEmployees = filteredEmployees.peek(emp -> filteredNames.add(emp.getName()));

The “peek” method is similar to forEach, but it returns a Stream whereas forEach is void. Here we simply build Stream objects from other Stream objects. Those who answered “nothing” in response to the above question were correct. “filteredNames” will remain an empty collection as we only declared our intention to filter the source. The first “peek” call, which invokes println, won’t be executed either, so nothing will be printed to the output window.

So if you’d like to “execute your intentions” then you’ll need to pick a terminal operation, such as forEach:

List<String> filteredNames = new ArrayList<>();
Stream<Employee> stream = employees.stream();

Stream<Employee> peekEmployees = stream.peek(System.out::println);
Stream<Employee> filteredEmployees = peekEmployees.filter(emp -> emp.getAge() > 30);
filteredEmployees.forEach(emp -> filteredNames.add(emp.getName()));

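The forEach call will fill the filteredNames list with the names of the employees older than 30. The System.out::println passed to peek will be executed as well, once for every employee in the source list, because it sits before the filter.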
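To see how the elements travel through such a pipeline, here is a self-contained sketch that records every peek and forEach call in a list. The nested Employee class is only a minimal stand-in whose shape is assumed from the constructor and getters used in this post; the class name PeekOrderDemo and the three sample employees are likewise chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.UUID;

// Hypothetical demo class, not taken from the original post.
class PeekOrderDemo {

    // Minimal stand-in for the Employee class of the previous post (assumed shape).
    static class Employee {
        private final UUID id;
        private final String name;
        private final int age;

        Employee(UUID id, String name, int age) {
            this.id = id;
            this.name = name;
            this.age = age;
        }

        UUID getId() { return id; }
        String getName() { return name; }
        int getAge() { return age; }
    }

    // Records which stage saw which employee, in the order the calls happen.
    static List<String> traceEvents() {
        List<Employee> employees = Arrays.asList(
                new Employee(UUID.randomUUID(), "Elvis", 50),
                new Employee(UUID.randomUUID(), "Anna", 20),
                new Employee(UUID.randomUUID(), "Mario", 43));

        List<String> events = new ArrayList<>();
        employees.stream()
                .peek(emp -> events.add("peek " + emp.getName()))
                .filter(emp -> emp.getAge() > 30)
                .forEach(emp -> events.add("forEach " + emp.getName()));
        return events;
    }

    public static void main(String[] args) {
        // prints: peek Elvis, forEach Elvis, peek Anna, peek Mario, forEach Mario
        System.out.println(String.join(", ", traceEvents()));
    }
}
```

In this sequential stream each employee passes through peek, filter and forEach before the next one is processed, so the peek and forEach entries are interleaved rather than grouped by stage. Anna is seen by peek but dropped by the filter.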

The map() operation

We mentioned the MapReduce algorithm in the previous post as it is extensively used in data mining. The idea is to extract meaningful information from a data set in steps such as Map, Filter and Reduce. We don’t always need all of these steps, and we saw some very simple examples before. The Map step is represented by the map() intermediate operation, which returns another Stream, so on its own it won’t execute anything:

Stream<Employee> employeeStream = employees.stream();
Stream<String> employeeNamesStream = employeeStream.map(emp -> emp.getName());

Our intention is to collect the names of the employees. We can do it as follows:

List<String> employeeNames = new ArrayList<>();
Stream<Employee> employeeStream = employees.stream();
employeeStream.map(emp -> emp.getName()).forEach(employeeNames::add);

We can also perform other string operations inside the mapping function, as here:

List<String> employeeNames = new ArrayList<>();
Stream<Employee> employeeStream = employees.stream();
employeeStream.map(emp -> emp.getId().toString().concat(": ").concat(emp.getName())).forEach(employeeNames::add);

…where the employeeNames list will contain concatenated strings of the employee ID and name.
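As an alternative to filling a list from forEach, the mapped stream can be gathered with the collect() terminal operation and Collectors.toList(). The sketch below is one possible way to write it, not the approach used in this post; the nested Employee class is again an assumed minimal stand-in, and MapCollectDemo, collectNames and sampleNames are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.UUID;
import java.util.stream.Collectors;

// Hypothetical demo class, not taken from the original post.
class MapCollectDemo {

    // Minimal stand-in for the Employee class of the previous post (assumed shape).
    static class Employee {
        private final UUID id;
        private final String name;
        private final int age;

        Employee(UUID id, String name, int age) {
            this.id = id;
            this.name = name;
            this.age = age;
        }

        UUID getId() { return id; }
        String getName() { return name; }
        int getAge() { return age; }
    }

    // Maps each employee to its name and lets the collector build the list.
    static List<String> collectNames(List<Employee> employees) {
        return employees.stream()
                .map(Employee::getName)
                .collect(Collectors.toList());
    }

    // Small fixed sample so the result is easy to check.
    static List<String> sampleNames() {
        return collectNames(Arrays.asList(
                new Employee(UUID.randomUUID(), "Elvis", 50),
                new Employee(UUID.randomUUID(), "Anna", 20)));
    }

    public static void main(String[] args) {
        // prints "[Elvis, Anna]"
        System.out.println(sampleNames());
    }
}
```

With collect() the pipeline produces its own result list, so the lambda no longer has to modify a list declared outside the stream.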

The flatMap() operation

You can use the flatMap operation to flatten a stream of streams: it maps each element to a stream and merges those streams into a single one. Say we have 3 different Employee lists:

List<Employee> employeesOne = new ArrayList<>();
employeesOne.add(new Employee(UUID.randomUUID(), "Elvis", 50));
employeesOne.add(new Employee(UUID.randomUUID(), "Marylin", 18));
employeesOne.add(new Employee(UUID.randomUUID(), "Freddie", 25));
employeesOne.add(new Employee(UUID.randomUUID(), "Mario", 43));
        
List<Employee> employeesTwo = new ArrayList<>();
employeesTwo.add(new Employee(UUID.randomUUID(), "John", 35));
employeesTwo.add(new Employee(UUID.randomUUID(), "Julia", 55));        
employeesTwo.add(new Employee(UUID.randomUUID(), "Lotta", 52));
        
List<Employee> employeesThree = new ArrayList<>();
employeesThree.add(new Employee(UUID.randomUUID(), "Eva", 42));
employeesThree.add(new Employee(UUID.randomUUID(), "Anna", 20));

Then suppose that we have a list of lists of employees:

List<List<Employee>> employeeLists = Arrays.asList(employeesOne, employeesTwo, employeesThree);

We can collect all employee names as follows:

List<String> allEmployeeNames = new ArrayList<>();
        
employeeLists.stream()
                .flatMap(empList -> empList.stream())
                .map(emp -> emp.getId().toString().concat(": ").concat(emp.getName()))
                .forEach(allEmployeeNames::add);

We first flatten the streams from the individual Employee lists, then run the map function to build the concatenated IDs and names. Finally, we put the elements into the allEmployeeNames collection.
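The difference between map() and flatMap() is easier to see side by side. To keep the sketch short it uses plain name strings instead of Employee objects, which is a simplification of the example above; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class, not taken from the original post.
class FlatMapDemo {

    // Three lists of names, mirroring the three employee lists above.
    static List<List<String>> nameLists() {
        return Arrays.asList(
                Arrays.asList("Elvis", "Marylin", "Freddie", "Mario"),
                Arrays.asList("John", "Julia", "Lotta"),
                Arrays.asList("Eva", "Anna"));
    }

    // map() keeps the nesting: the result is a stream of three inner streams.
    static long countWithMap() {
        return nameLists().stream()
                .map(List::stream)
                .count();
    }

    // flatMap() merges the inner streams into one stream of names.
    static List<String> flatten() {
        return nameLists().stream()
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(countWithMap());     // prints 3
        System.out.println(flatten().size());   // prints 9
        System.out.println(flatten());
    }
}
```

map(List::stream) yields one element per list, whereas flatMap(List::stream) yields one element per name, in the order of the source lists.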

Find the next post here where we go through the Reduce phase.

View all posts related to Java here.

About Andras Nemes
I'm a .NET/Java developer living and working in Stockholm, Sweden.
