The Java Stream API part 5: collection reducers

Introduction

In the previous post we saw how to handle an ambiguous terminal reduction result of a Stream. There’s in fact another type of reducer in Java 8 that we haven’t discussed so far: collectors, which are used through the collect() function available on Stream objects. One overload of the collect function accepts an object that implements the Collector interface.

Implementing the Collector interface involves implementing 5 functions: a supplier, an accumulator, a combiner, a finisher and a characteristics provider. At this point we won’t write all of those by hand. Luckily for us the Collectors class provides a long list of ready-made implementations, exposed as static factory methods, that can be supplied to the collect function.
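To make the five pieces less abstract, here is a minimal sketch of a hand-made collector that joins strings with a comma. It relies on the Collector.of factory method, so no class has to implement the interface directly. The names CustomCollectorSketch and commaJoiner are made up for this illustration and are not part of the JDK.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;
import java.util.stream.Collector;

final class CustomCollectorSketch {

    // Hypothetical helper: builds a collector that joins strings with ", ".
    static Collector<String, StringJoiner, String> commaJoiner() {
        return Collector.of(
                () -> new StringJoiner(", "), // supplier: creates the mutable container
                StringJoiner::add,            // accumulator: folds one element into the container
                StringJoiner::merge,          // combiner: merges two containers (used by parallel streams)
                StringJoiner::toString);      // finisher: turns the container into the final result
        // No characteristics are passed, so none of UNORDERED, CONCURRENT or IDENTITY_FINISH is set.
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Elvis", "Marilyn", "Freddie");
        String joined = names.stream().collect(commaJoiner());
        System.out.println(joined); // Elvis, Marilyn, Freddie
    }
}
```

In everyday code the ready-made Collectors.joining(", ") does the same job; the sketch only shows where each of the functions fits.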

Purpose and first example

Collectors combine aspects of both the mapping and the reducing operations we’ve seen up to now in this series. Depending on the exact implementation you pick, the collect function can e.g. extract a certain numeric field from a custom object and calculate the average of that field in one step.

Let’s see that in action. We’ll revisit our Employee class:

public class Employee
{
    private UUID id;
    private String name;
    private int age;

    public Employee(UUID id, String name, int age)
    {
        this.id = id;
        this.name = name;
        this.age = age;
    }
        
    public UUID getId()
    {
        return id;
    }

    public void setId(UUID id)
    {
        this.id = id;
    }

    public String getName()
    {
        return name;
    }

    public void setName(String name)
    {
        this.name = name;
    }    
    
    public int getAge()
    {
        return age;
    }

    public void setAge(int age)
    {
        this.age = age;
    }
    
    public boolean isCool(EmployeeCoolnessJudger coolnessJudger)
    {
        return coolnessJudger.isCool(this);
    }
    
    public void saySomething(EmployeeSpeaker speaker)
    {
        speaker.speak();
    }
}

We’ve seen that some aggregation functions have ready-made methods in the Stream class: min, max, count and some others. However, Stream itself has no method for calculating an average. What if I’d like to calculate the average age of my employees?

List<Employee> employees = new ArrayList<>();
employees.add(new Employee(UUID.randomUUID(), "Elvis", 50));
employees.add(new Employee(UUID.randomUUID(), "Marilyn", 18));
employees.add(new Employee(UUID.randomUUID(), "Freddie", 25));
employees.add(new Employee(UUID.randomUUID(), "Mario", 43));
employees.add(new Employee(UUID.randomUUID(), "John", 35));
employees.add(new Employee(UUID.randomUUID(), "Julia", 55));
employees.add(new Employee(UUID.randomUUID(), "Lotta", 52));
employees.add(new Employee(UUID.randomUUID(), "Eva", 42));
employees.add(new Employee(UUID.randomUUID(), "Anna", 20));

It may not be obvious at first but the collect function can do that, and a lot more. The Collectors class includes a ready-made Collector for this: averagingInt, which accepts a ToIntFunction of T. The ToIntFunction extracts an int from each T object. In our case we need the age values so we can calculate the average age as follows:

ToIntFunction<Employee> toInt = Employee::getAge;
Double averageAge = employees.stream().collect(Collectors.averagingInt(toInt));     

averageAge will be approximately 37.78 (340 / 9).
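The calculation can be tried end to end with the following sketch. The nested Employee class is a pared-down stand-in for the class shown earlier (name and age only), and AverageAgeSketch with its method names is made up for this illustration. The second method shows an alternative route through IntStream, which does offer an average() method of its own.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class AverageAgeSketch {

    // Pared-down stand-in for the post's Employee class (assumption: only name and age matter here).
    static final class Employee {
        private final String name;
        private final int age;

        Employee(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }

        int getAge() { return age; }
    }

    // Same names and ages as the first employee list in the post.
    static List<Employee> sampleEmployees() {
        return Arrays.asList(
                new Employee("Elvis", 50), new Employee("Marilyn", 18), new Employee("Freddie", 25),
                new Employee("Mario", 43), new Employee("John", 35), new Employee("Julia", 55),
                new Employee("Lotta", 52), new Employee("Eva", 42), new Employee("Anna", 20));
    }

    // Average via a collector; averagingInt yields 0 for an empty stream.
    static double averageWithCollector(List<Employee> employees) {
        return employees.stream().collect(Collectors.averagingInt(Employee::getAge));
    }

    // Average via IntStream; average() returns an OptionalDouble, hence the orElse fallback.
    static double averageWithIntStream(List<Employee> employees) {
        return employees.stream().mapToInt(Employee::getAge).average().orElse(0.0);
    }

    public static void main(String[] args) {
        List<Employee> employees = sampleEmployees();
        System.out.println(averageWithCollector(employees));
        System.out.println(averageWithIntStream(employees));
    }
}
```

Both methods divide the same sum (340) by the same count (9), so they should print the same value.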

Other examples

Collect all the names into a string list:

List<String> names = employees.stream().map(Employee::getName).collect(Collectors.toList());     

Compute the sum of all ages in a different way:

int totalAge = employees.stream().collect(Collectors.summingInt(Employee::getAge));

Let’s change the age values a little before the next example:

employees.clear(); // empty the list first so the new values replace the old ones
employees.add(new Employee(UUID.randomUUID(), "Elvis", 50));
employees.add(new Employee(UUID.randomUUID(), "Marilyn", 20));
employees.add(new Employee(UUID.randomUUID(), "Freddie", 20));
employees.add(new Employee(UUID.randomUUID(), "Mario", 30));
employees.add(new Employee(UUID.randomUUID(), "John", 30));
employees.add(new Employee(UUID.randomUUID(), "Julia", 50));
employees.add(new Employee(UUID.randomUUID(), "Lotta", 30));
employees.add(new Employee(UUID.randomUUID(), "Eva", 40));
employees.add(new Employee(UUID.randomUUID(), "Anna", 20));

We can group the employees by age into a Map keyed by Integer:

Map<Integer, List<Employee>> employeesByAge = employees.stream().collect(Collectors.groupingBy(Employee::getAge));  

Here you’ll see that the key 20 will have 3 employees, key 50 will have 2 employees etc.

You can also supply another Collector to the groupingBy function if you want to have some different type as the value in the Map. E.g. the following will do the same as above except that the value will show the number of employees within an age group:

Map<Integer, Long> employeeCountByAge = employees.stream().collect(Collectors.groupingBy(Employee::getAge, Collectors.counting()));
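Both grouping variants can be checked with a small self-contained sketch. As before, the nested Employee class is a simplified stand-in for the real one, and GroupingSketch and its method names are only illustrative.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class GroupingSketch {

    // Pared-down stand-in for the post's Employee class (assumption: only name and age matter here).
    static final class Employee {
        private final String name;
        private final int age;

        Employee(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }

        int getAge() { return age; }
    }

    // Same names and ages as the second employee list in the post.
    static List<Employee> sampleEmployees() {
        return Arrays.asList(
                new Employee("Elvis", 50), new Employee("Marilyn", 20), new Employee("Freddie", 20),
                new Employee("Mario", 30), new Employee("John", 30), new Employee("Julia", 50),
                new Employee("Lotta", 30), new Employee("Eva", 40), new Employee("Anna", 20));
    }

    // One list of employees per distinct age.
    static Map<Integer, List<Employee>> groupByAge(List<Employee> employees) {
        return employees.stream().collect(Collectors.groupingBy(Employee::getAge));
    }

    // The downstream collector counting() replaces each list with its size.
    static Map<Integer, Long> countByAge(List<Employee> employees) {
        return employees.stream().collect(Collectors.groupingBy(Employee::getAge, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Employee> employees = sampleEmployees();
        groupByAge(employees).forEach((age, group) ->
                System.out.println(age + " -> " + group.size() + " employee(s)"));
        System.out.println(countByAge(employees));
    }
}
```

With the sample data the counts come out as 3 for age 20, 3 for age 30, 1 for age 40 and 2 for age 50. The order in which the keys are printed is up to the Map implementation that groupingBy returns.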

You can partition the collection based on some boolean condition. Here we build a Map by putting the employees into one of two groups: younger than 40, or 40 and older. The partitioningBy function will help solve this:

Map<Boolean, List<Employee>> agePartitioning = employees.stream().collect(Collectors.partitioningBy(emp -> emp.getAge() >= 40));

agePartitioning will hold 6 employees under the key false (younger than 40) and 3 under the key true (40 or older), which is the correct result.
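The partitioning can be verified with the sketch below. The Employee class is again a simplified stand-in, and PartitioningSketch, partitionByAge and the threshold parameter are names invented for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class PartitioningSketch {

    // Pared-down stand-in for the post's Employee class (assumption: only name and age matter here).
    static final class Employee {
        private final String name;
        private final int age;

        Employee(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }

        int getAge() { return age; }
    }

    // Same names and ages as the second employee list in the post.
    static List<Employee> sampleEmployees() {
        return Arrays.asList(
                new Employee("Elvis", 50), new Employee("Marilyn", 20), new Employee("Freddie", 20),
                new Employee("Mario", 30), new Employee("John", 30), new Employee("Julia", 50),
                new Employee("Lotta", 30), new Employee("Eva", 40), new Employee("Anna", 20));
    }

    // Key true: age at or above the threshold; key false: everyone else.
    static Map<Boolean, List<Employee>> partitionByAge(List<Employee> employees, int threshold) {
        return employees.stream().collect(Collectors.partitioningBy(emp -> emp.getAge() >= threshold));
    }

    public static void main(String[] args) {
        Map<Boolean, List<Employee>> parts = partitionByAge(sampleEmployees(), 40);
        System.out.println("40 or older: " + parts.get(true).size());      // 40 or older: 3
        System.out.println("younger than 40: " + parts.get(false).size()); // younger than 40: 6
    }
}
```

Unlike groupingBy, the Map returned by partitioningBy contains both the true and the false key even when one of the groups is empty, so get(true) and get(false) can be called without a null check.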

You can create something like an ad-hoc toString() function:

String allEmployees = employees.stream().map(emp -> emp.getName().concat(", ").concat(Integer.toString(emp.getAge()))).collect(Collectors.joining(" | "));

The above function will go through each employee, create a “name + , + age” string of each of them and then join all individual strings by a pipe character. The result will look like this:

Elvis, 50 | Marilyn, 20 | Freddie, 20 | Mario, 30 | John, 30 | Julia, 50 | Lotta, 30 | Eva, 40 | Anna, 20

Notice that the collector was smart enough not to put the pipe character after the last element.
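The delimiter handling can be seen in isolation with a short sketch. It also uses the three-argument overload of Collectors.joining, which takes a prefix and a suffix in addition to the delimiter. JoiningSketch and its method names are made up for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

final class JoiningSketch {

    // Joins the items with " | "; the delimiter only appears between items.
    static String plain(List<String> items) {
        return items.stream().collect(Collectors.joining(" | "));
    }

    // Same delimiter, with the whole result wrapped in square brackets.
    static String bracketed(List<String> items) {
        return items.stream().collect(Collectors.joining(" | ", "[", "]"));
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("Elvis, 50", "Marilyn, 20", "Freddie, 20");
        System.out.println(plain(items));     // Elvis, 50 | Marilyn, 20 | Freddie, 20
        System.out.println(bracketed(items)); // [Elvis, 50 | Marilyn, 20 | Freddie, 20]

        // A single element produces no delimiter at all.
        System.out.println(plain(Collections.singletonList("Elvis, 50"))); // Elvis, 50

        // With no elements only the prefix and suffix remain.
        System.out.println(bracketed(Collections.<String>emptyList())); // []
    }
}
```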

The Collectors class has a lot more ready-made collectors. Just type “Collectors.” in an IDE with code completion and you’ll be able to view the whole list. Chances are that if you need to perform a composite map-and-reduce operation on a collection then you’ll find something useful here.
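As a taste of what else is on that list, the sketch below tries two further collectors: summarizingInt, which gathers count, sum, min, max and average in one pass, and mapping, which transforms the elements before handing them to a downstream collector. The choice of these two is mine, and the Employee stand-in, MoreCollectorsSketch and the method names are illustrative only.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class MoreCollectorsSketch {

    // Pared-down stand-in for the post's Employee class (assumption: only name and age matter here).
    static final class Employee {
        private final String name;
        private final int age;

        Employee(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }

        int getAge() { return age; }
    }

    // Same names and ages as the second employee list in the post.
    static List<Employee> sampleEmployees() {
        return Arrays.asList(
                new Employee("Elvis", 50), new Employee("Marilyn", 20), new Employee("Freddie", 20),
                new Employee("Mario", 30), new Employee("John", 30), new Employee("Julia", 50),
                new Employee("Lotta", 30), new Employee("Eva", 40), new Employee("Anna", 20));
    }

    // Count, sum, min, max and average of the ages in a single pass.
    static IntSummaryStatistics ageStatistics(List<Employee> employees) {
        return employees.stream().collect(Collectors.summarizingInt(Employee::getAge));
    }

    // Groups by age but keeps only the names in each group.
    static Map<Integer, List<String>> namesByAge(List<Employee> employees) {
        return employees.stream().collect(
                Collectors.groupingBy(Employee::getAge,
                        Collectors.mapping(Employee::getName, Collectors.toList())));
    }

    public static void main(String[] args) {
        IntSummaryStatistics stats = ageStatistics(sampleEmployees());
        System.out.println(stats.getMin() + " to " + stats.getMax() + ", sum " + stats.getSum());
        System.out.println(namesByAge(sampleEmployees()).get(20)); // [Marilyn, Freddie, Anna]
    }
}
```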

This post concludes our discussion on the new Stream API in Java 8.

About Andras Nemes
I'm a .NET/Java developer living and working in Stockholm, Sweden.
