Introduction to Amazon Code Pipeline with Java part 8: the job agent communication process
May 8, 2016
Introduction
In the previous post we went through the details of the communication between your web site and CodePipeline when a CP user selects your third party action. The third party action developer is responsible for setting up and maintaining an external configuration page where CP users can complete their action configuration. CP and the external page communicate with each other using HTTPS POST calls and URL-encoded JSON objects. CP sends a number of properties to the configuration page in the JSON object, including the client ID and client token that you'll need to save in your data store and map to a user in your system. The configuration page then redirects the user back to the CP GUI with the list of key-value pairs that contain the configuration values for the test runner. These values will be available to the job agent when the job runs.
In this post we’ll look into the details of the communication between CP and the job agent. We’ll go through the process at a high level; the upcoming code examples will show a lot more detail.
Job agent communication process
Recall that the job agent is a long-running process that constantly polls a CodePipeline endpoint for new jobs. The endpoint to monitor depends on the AWS region. CP is currently offered in two regions: us-east-1 (North Virginia) and us-west-2 (Oregon). Their CP endpoints are the following:

- us-east-1: https://codepipeline.us-east-1.amazonaws.com
- us-west-2: https://codepipeline.us-west-2.amazonaws.com
You’ll find the complete list of endpoints for all AWS services, including CodePipeline, on this page.
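The regional endpoints follow a simple naming pattern, which the short sketch below illustrates. The two hostnames it produces match the regions listed above; treat the pattern as a naming convention, and check the AWS endpoint list before relying on it for other regions:

```java
// Builds the CodePipeline endpoint URL for a given region id. The
// codepipeline.<region>.amazonaws.com pattern matches the two regions
// above; it is a convention, so verify new regions against the AWS
// endpoint list rather than assuming the pattern always holds.
public class CodePipelineEndpoints {
    public static String endpointFor(String regionId) {
        return "https://codepipeline." + regionId + ".amazonaws.com";
    }
}
```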
So now we’re at the stage where the third-party action has been included in a customer’s pipeline. Before we look at the communication process there’s an important detail you need to be aware of: each pipeline has a corresponding bucket in Amazon S3. What is S3? It’s a document storage service by Amazon; there’s a series devoted to S3 on this blog starting here if you want to learn more. Otherwise it’s enough to know that S3 – Simple Storage Service – stores files in so-called buckets, where a bucket is something like a top-level folder on a file system.
Here’s what the top S3 bucket can look like per CP region:
Each top bucket will include a folder – strictly speaking a key prefix, since S3 buckets cannot be nested – for each pipeline, like here:
The pipeline folders include the artifacts for the actions of a pipeline stage to work on. If the third party action needs to work directly with the artifacts – as a build action would – then the artifacts will be available during the action execution.
OK, we’re now at the point where the pipeline has been triggered, either manually or by a new commit to the configured source, such as S3 or GitHub. The pipeline execution has reached your third party action – what happens next? At that point, assuming that the job agent has been configured correctly, i.e. it monitors the correct CP endpoint, the function in the AWS SDK code that is responsible for periodically polling CP will receive a concrete job in the response.
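The polling step can be modeled with a short sketch. The interfaces below are stand-ins of my own, not AWS SDK types: in a real agent, `pollOnce()` would wrap the SDK's `PollForThirdPartyJobs` call against the regional endpoint. The stand-ins let the control flow be shown and tested without AWS credentials:

```java
import java.util.List;
import java.util.concurrent.TimeUnit;

// Minimal model of the job agent's polling loop. JobSource and JobHandler
// are hypothetical stand-ins; the real agent would call the AWS SDK's
// PollForThirdPartyJobs operation where pollOnce() appears below.
public class JobAgentLoop {

    /** The two job properties the agent acts upon: the CP job ID and the client ID. */
    public static final class ThirdPartyJob {
        public final String jobId;
        public final String clientId;

        public ThirdPartyJob(String jobId, String clientId) {
            this.jobId = jobId;
            this.clientId = clientId;
        }
    }

    /** Stand-in for the monitored CodePipeline endpoint. */
    public interface JobSource {
        List<ThirdPartyJob> pollOnce();
    }

    /** Stand-in for whatever executes a job, e.g. starting a load test. */
    public interface JobHandler {
        void handle(ThirdPartyJob job);
    }

    /** Polls a fixed number of times, backing off briefly after empty polls. */
    public static int runFor(JobSource source, JobHandler handler, int polls)
            throws InterruptedException {
        int handled = 0;
        for (int i = 0; i < polls; i++) {
            List<ThirdPartyJob> jobs = source.pollOnce();
            for (ThirdPartyJob job : jobs) {
                handler.handle(job);
                handled++;
            }
            if (jobs.isEmpty()) {
                TimeUnit.MILLISECONDS.sleep(100); // don't hammer the endpoint
            }
        }
        return handled;
    }
}
```

A real agent would of course loop indefinitely rather than for a fixed number of polls; the bounded loop is only there to keep the sketch testable.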
The job will include the following details that you can act upon:
- The CP job ID: this is a long Base64-encoded string that you can easily decode if you want. It includes some encrypted material about that job. Honestly, I’ve never needed to look at the CP job ID for anything up to this point. However, it can be a useful reference if you need to find out why a job has failed in CP – in that case AWS will probably ask for the job ID
- The AWS client ID that was originally provided when the user set up the third party action, and which you presumably saved in your data store
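As the first bullet notes, the job ID is Base64-encoded, and decoding it with the standard library is a one-liner. Note that real CP job IDs decode to mostly encrypted binary material, so don't expect readable text; the sample input in the comment is made up purely to show the mechanics:

```java
import java.util.Base64;

// Decodes a Base64 string into its raw bytes. For real CP job IDs the
// result is mostly encrypted material rather than readable text; e.g.
// a made-up input like "aGVsbG8=" decodes to the bytes of "hello".
public class JobIdPeek {
    public static byte[] decode(String jobId) {
        return Base64.getDecoder().decode(jobId);
    }
}
```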
The next step for the job agent is to look up the client ID in the data store and retrieve the matching client token. There’s a separate method in the AWS SDK to retrieve the job details, such as the list of key-value objects that hold the job configuration. That call must include the client token so that CP can confirm the validity of the request; if the validation succeeds, the job details are returned.
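This lookup-and-validate step can be sketched as follows. The maps are stand-ins of my own: one plays the role of your data store, the other two play the role of CP itself, which in reality you would query through the SDK's `GetThirdPartyJobDetails` call with the job ID and the client token. All field and method names here are illustrative:

```java
import java.util.Map;
import java.util.Optional;

// Models the client-token lookup and the job-details request. In a real
// agent, jobDetails() would issue the AWS SDK's GetThirdPartyJobDetails
// call; here plain maps stand in for the data store and for CP so the
// token check can be shown without credentials.
public class JobDetailsStep {
    private final Map<String, String> tokenByClientId;            // your data store
    private final Map<String, String> expectedTokenByJobId;       // CP's view of the token
    private final Map<String, Map<String, String>> configByJobId; // job configuration in CP

    public JobDetailsStep(Map<String, String> tokenByClientId,
                          Map<String, String> expectedTokenByJobId,
                          Map<String, Map<String, String>> configByJobId) {
        this.tokenByClientId = tokenByClientId;
        this.expectedTokenByJobId = expectedTokenByJobId;
        this.configByJobId = configByJobId;
    }

    /** Returns the job configuration key-value pairs, or empty if the token check fails. */
    public Optional<Map<String, String>> jobDetails(String jobId, String clientId) {
        String token = tokenByClientId.get(clientId);
        if (token == null || !token.equals(expectedTokenByJobId.get(jobId))) {
            return Optional.empty(); // CP rejects requests with a missing/invalid token
        }
        return Optional.ofNullable(configByJobId.get(jobId));
    }
}
```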
This is where the real job execution begins. It is up to each job agent to implement the action it performs; in our case we initiate a load test with the predefined job configuration details. At the end of the job execution the job agent returns either a success or a failure result to CP.
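The end-of-job report boils down to a simple branch. The two CP operation names in the sketch are the real ones from the CodePipeline API (`PutThirdPartyJobSuccessResult` and `PutThirdPartyJobFailureResult`, both sent with the job ID and client token); the surrounding class is an illustrative stand-in:

```java
// Sketch of the end-of-job report. In a real agent the success branch maps
// to the SDK's PutThirdPartyJobSuccessResult call and the failure branch to
// PutThirdPartyJobFailureResult; this stand-in only shows which operation
// the agent would invoke for each outcome.
public class JobResultReporter {

    public enum Outcome { SUCCESS, FAILURE }

    /** Returns the name of the CP operation the agent would invoke. */
    public static String resultCallFor(Outcome outcome) {
        return outcome == Outcome.SUCCESS
                ? "PutThirdPartyJobSuccessResult"
                : "PutThirdPartyJobFailureResult";
    }
}
```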
There’s one implementation detail though that all job agent developers must get familiar with: the so-called continuation token. I think the continuation token is such an important concept in this job execution process that it deserves its own dedicated post.
We’ll take a closer look at it in the next post.
View all posts related to Amazon Web Services and Big Data here.