Creating an Amazon Beanstalk wrapper around a Kinesis application in Java

Introduction

Suppose that you have a Java Amazon Kinesis application which handles messages from the Amazon message queue handler Kinesis. This means that you have a method that starts a Worker object in the com.amazonaws.services.kinesis.clientlibrary.lib.worker library.

If you are developing an application that is meant to process messages from Amazon Kinesis and you don’t know what I mean by a Kinesis app then check out the Amazon documentation on the Kinesis Client Library (KCL) here.

The starting point for this post is that you have a KCL application and want to host it somewhere. One possibility is to deploy it on Amazon Elastic Beanstalk. You cannot simply deploy a KCL application as it is. You’ll need to wrap it within a special Kinesis Beanstalk worker wrapper.

The Beanstalk wrapper application

The wrapper is a very thin Java Maven web application which can be deployed as a .war file. If you’ve done any web-based Java development then the .war file extension will be familiar to you. It’s really like a .zip file that contains the project – you can even rename a .war file to a .zip file and unpack it like you would do with any compressed .zip file.

The wrapper can be cloned from GitHub. Once you have cloned it onto your computer you can open with a Java IDE such as NetBeans or Eclipse. I personally use NetBeans so the snapshots will show that environment. You’ll see the following folders after opening the project:

Beanstalk wrapper in NetBeans

NetBeans will load the dependencies in the POM file automatically. If you’re using something else or prefer to run Maven from the command prompt then here’s the mvn command to execute:

mvn clean compile war:war

The .war file will be placed in the “target” folder as expected. The wrapper will have all the basic dependencies to run an Amazon app such as the AWS SDK, Jackson, various Apache packages etc. In case your KCL app has some external dependencies on its own then those will need to be part of the wrapper app as well. Example: in my case one dependency of my KCL app was commons-collections4-4.0.jar. As this particular Apache dependency wasn’t by default available in the Beanstalk KCL wrapper I had to add it to its POM file. Do that for any such dependency.

The Source Packages folder includes a single Java file called KinesisWorkerServletInitiator.java. It is very short and has the following characteristics:

  • The overridden contextInitialized method will be executed automatically upon application start
  • It will look for a value in a query parameter called “PARAM1”
  • PARAM1 is supposed to be a fully qualified class name
  • The class name refers to a class in the Kinesis application that includes a public parameterless method called “run”
  • KinesisWorkerServletInitiator.java will look for this method and execute it through Reflection

We’ll come back to PARAM1 shortly.

So you’ll need to have a public method called “run” in the KCL app that the wrapper can call upon. Note that you of course change the body of the Kinesis wrapper as you wish. In my case I had to pass in a string parameter to the “run” method so I modified the Reflection code to look for a “run” method which accept a single string argument:

final Class consumerClass = (Class) Class.forName(consumerClassName);
final Method runMethod = consumerClass.getMethod("run", String.class);
runMethod.setAccessible(true);
final Object consumer = consumerClass.newInstance();

.
.
.

@Override
public void run()
{
         try
         {
               m.invoke(o, messageType);
         } catch (Exception e)
         {
               e.printStackTrace();
               LOG.error(e);
         }
}

You can put the run method anywhere – or even change its name and the wrapper app implementation will need to follow. I’ve put mine in the same place as the main method – the “run” method is really nothing else than the KCL application entry point from the Beanstalk wrapper’s point of view. When you test the KCL app locally then the main method will be executed first. When you run it from Beanstalk “run” will be executed first. Therefore the easiest implementation is simply to call “run” from “main” but you may have different needs for local execution. Anyway, you probably get the idea with the “run” method: it will start a Worker which in turn will process the Kinesis messages as you implemented the IRecordProcessor.processRecords method.

Take note of the full name of the class that has the run method. Open the containing class and check the package name, say “com.company.kinesismessageprocessor”. Then check the class name such as KinesisApplication so the full name will be com.company.kinesismessageprocessor.KinesisApplication. You can even put this as a default consumer class name in the Beanstalk wrapper in case PARAM1 is not available:

@Override
public void contextInitialized(ServletContextEvent arg0)
{
        String consumerClassName = System.getProperty(param);
        if (consumerClassName == null) 
        {
            consumerClassName = defaultConsumingClass;
        }
.
.
.
}

…where defaultConsumingClass is a private String holding the above mentioned class name.

The actual wrapping

Now we need to put the KCL application into the wrapper. Compile the KCL app into a JAR file. Copy the JAR file into the following directory of the Beanstalk wrapper web app:

drive:\directory-to-wrapper\src\main\WebContent\WEB-INF\lib

The JAR file should be visible in the project. In my case it looks as follows:

Drop KCL app into Beanstalk wrapper

Compile the wrapper app and the .war file should be ready for upload

Upload

While creating a new application in Beanstalk you will be able to upload the .war file. You’ll be able to upload a new version through the UI of the application:

Beanstalk deployment UI

You’ll be able to configure the Beanstalk app using the Configuration link on the left hand panel:

Configuration link for a Beanstalk app

This is where you can set the value for PARAM1:

Software configuration link in Beanstalk

Define PARAM1 for Beanstalk app

You’ll be able to enter the fully qualified name of the consumer class with the method “run” in the above table. If you don’t like the name “PARAM1” you can add your own parameters in the bottom of the screen and modify the name in code as well.

Troubleshooting

You can always look at the logs:

Request logs from Beanstalk app

You can then search for “exception” or “error” in the log file to check if e.g. an unhandled exception occurred in the application which stops it from functioning correctly.

A common issue is related to roles. When you created the Beanstalk app you have to select a specific IAM role here:

Select IAM role in Beanstalk app

The Beanstalk app will run under the selected role. If the KCL app needs to access other Amazon services, such as S3 or DynamoDb then the selected role must have access to those resources at the level defined by the KCL app. E.g. if the KCL app needs to put a record into a DynamoDb table then the Beanstalk role must have “dynamodb:PutItem” defined. You can edit this in the IAM console available here. Select the appropriate role and extend the role JSON under “Manage policy”:

Modify role in IAM console

View all posts related to Amazon Web Services here.

Advertisement
Elliot Balynn's Blog

A directory of wonderful thoughts

Software Engineering

Web development

Disparate Opinions

Various tidbits

chsakell's Blog

WEB APPLICATION DEVELOPMENT TUTORIALS WITH OPEN-SOURCE PROJECTS

Once Upon a Camayoc

Bite-size insight on Cyber Security for the not too technical.

%d bloggers like this: