Blog

Creating a CI/CD pipeline for serverless applications

TL;DR

You can clone or download the template from my GitHub repository: Lambda Serverless CI/CD

Motivation

Serverless applications using AWS Lambda let you run code without provisioning or managing servers [1].

Not having to provision or manage servers doesn't imply that you can fire your DevOps team. It merely means that some aspects of running your application lifecycle got a little bit easier, but you still need to think about how to manage and operate serverless applications.

Two years ago we didn't have the ecosystem of tools that we have nowadays. I am specifically talking about the Serverless Framework.

With the Serverless Framework, you can quickly set up and deploy a serverless application, but that is as far as the framework goes; you still need to think about building, packaging, and continuously integrating and deploying the application.

CI/CD pipeline

I have created a sample CloudFormation template that generates a CI/CD pipeline using AWS CodePipeline and AWS CodeBuild. It deploys a Lambda function to a staging environment, and after an approval step the function is deployed to a production environment. These are the steps in more detail:

  1. Deploy the application to a staging environment
    1. Install and package dependencies
    2. Deploy using the serverless framework
  2. Approval process
    1. User must manually approve the deployment to production (ideally after staging has been tested)
  3. Deploy the application to a production environment
    1. Install and package dependencies
    2. Deploy using the serverless framework

Each deployment step runs on CodeBuild, and you can select the Docker image it uses for the build process. By default it runs a custom image that I have created, which wraps the Amazon Linux 2017.09 image and has Node.js 8.8, Python 3.6, NumPy, and pandas pre-installed. While this image is aimed at data science applications, it can be used to build any Python 3.6 application.

To get started with the template all you have to do is:

  1. Copy the buildspec.yml and deploy.sh files to the root of your repository
  2. Create the CloudFormation stack using the serverless-codepipeline-cicd_cfm.yml template and plug in the parameters for your repository
  3. Make sure your repository contains a valid serverless.yml in the root of the project

In the future, I am planning to add more deployment strategies to be able to compile, package, and deploy Scala and .NET Core apps.

The "AutoMaTweet"

At MindTouch, the company I currently work for, we have been hosting a series of hands-on meetups focused on C# and AWS Lambda called λ# (Lambda Sharp). These are three-hour hackathons where people are placed in random teams and get to learn, collaborate, and have a lot of fun! Since I have been learning Scala in my own time, I decided to implement one of the most fun challenges to date: The AutoMaTweet.

The AutoMaTweet is a simple AWS Lambda function that is invoked every time an image is uploaded to an S3 bucket. The function downloads the image and uses a service called Rekognition to analyze it and get "tags" or "labels" for the image. Finally, the image is posted to Twitter with a custom message created from the labels.
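
To make the flow concrete, here is a minimal sketch of such a handler. It is not the actual solution linked below; the class name, label count, and message format are illustrative, and the Twitter call is left out (see the setup guide for the credentials it would need):

import com.amazonaws.services.lambda.runtime.Context
import com.amazonaws.services.lambda.runtime.events.S3Event
import com.amazonaws.services.rekognition.AmazonRekognitionClientBuilder
import com.amazonaws.services.rekognition.model.{DetectLabelsRequest, Image, S3Object}

import scala.collection.JavaConverters._

class AutoMaTweetSketch {
  private val rekognition = AmazonRekognitionClientBuilder.defaultClient()

  def lambdaHandler(event: S3Event, context: Context): Unit = {
    // The S3 event tells us which bucket and key triggered the invocation
    val record = event.getRecords.asScala.head
    val bucket = record.getS3.getBucket.getName
    val key    = record.getS3.getObject.getKey

    // Rekognition can read the image straight from S3, which keeps this sketch short;
    // the real function downloads the image first
    val request = new DetectLabelsRequest()
      .withImage(new Image().withS3Object(new S3Object().withBucket(bucket).withName(key)))
      .withMaxLabels(5)
    val labels = rekognition.detectLabels(request).getLabels.asScala.map(_.getName)

    // Build the tweet text from the labels; posting to Twitter is omitted here
    val message = s"I think this picture contains: ${labels.mkString(", ")}"
    context.getLogger.log(message)
  }
}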

You can find the challenge in this GitHub repository, and the Scala solution in my GitHub account.

Setup Guide

Clone repository

Clone or fork the following GitHub repository:

git clone https://github.com/onema/TheAutoMaTweet

Create the AWS S3 buckets

aws s3api create-bucket --bucket <DEPLOYMENT BUCKET NAME>
aws s3api create-bucket --bucket <IMAGES BUCKET NAME>

Create a role for the lambda function

This role will contain the policy allowing the Lambda function to get objects from S3 and use the Rekognition API.

aws iam create-role \
    --role-name scala-lambda-role \
    --assume-role-policy-document file://trust-policy.json

{
    "Role": {
        "AssumeRolePolicyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "sts:AssumeRole",
                    "Effect": "Allow",
                    "Principal": {
                        "Service": "lambda.amazonaws.com"
                    }
                }
            ]
        },
        "RoleId": "FEDCBA987654321",
        "CreateDate": "2017-03-24T04:39:08.518Z",
        "RoleName": "scala-lambda-role",
        "Path": "/",
        "Arn": "arn:aws:iam::123456789012:role/scala-lambda-role"
    }
}

Create a policy

I have included a simple policy file. PLEASE NOTE THIS POLICY IS VERY OPEN AND SHOULD ONLY BE USED FOR TEST APPS

aws iam create-policy \
    --policy-name scala-lambda-s3-rekognition-policy \
    --policy-document file://policy.json

{
    "Policy": {
        "PolicyName": "scala-lambda-s3-rekognition-policy",
        "CreateDate": "2017-03-24T05:03:31.319Z",
        "AttachmentCount": 0,
        "IsAttachable": true,
        "PolicyId": "ABCDEF123456",
        "DefaultVersionId": "v1",
        "Path": "/",
        "Arn": "arn:aws:iam::123456789012:policy/scala-lambda-s3-rekognition-policy",
        "UpdateDate": "2017-03-24T05:03:31.319Z"
    }
}

NOTE: Take note of the ARN returned by this call!

Attach the policy to the role

aws iam attach-role-policy \
    --role-name scala-lambda-role \
    --policy-arn <POLICY ARN>

Compile the project and upload it to S3

sbt compile
sbt assembly
aws s3 cp target/scala-2.12/LambdaScala-assembly-1.0.jar s3://<DEPLOYMENT BUCKET NAME>/LambdaScala-assembly-1.0.jar

Create the lambda function

aws lambda create-function \
    --function-name scala-lambda-function \
    --runtime java8 \
    --timeout 30 \
    --memory-size 256 \
    --handler "lambda.AutoMaTweet::lambdaHandler" \
    --code S3Bucket=<DEPLOYMENT BUCKET NAME>,S3Key=LambdaScala-assembly-1.0.jar \
    --role <ROLE ARN>

Wrapping it all up

Create a trigger

At this point, log in to the AWS Console, go to Lambda, and click on your new function.

Select Triggers > Add trigger > S3

Make sure to select the Bucket: <IMAGES BUCKET NAME> and the Event type: Object Created (All), and click on Submit.

Set the environment variables

These values are used to access your Twitter application (go to https://apps.twitter.com/ if you don't have one). Create the following environment variables in the Code section of your Lambda function: CONSUMER_KEY, CONSUMER_SECRET, ACCESS_TOKEN, TOKEN_SECRET, and fill their values with your Twitter application keys and access tokens. Save!
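
For reference, the function can read these variables with sys.env and use them to build a Twitter client. The snippet below is only an illustration using twitter4j; the actual solution may wire up its client differently:

import twitter4j.{Twitter, TwitterFactory}
import twitter4j.conf.ConfigurationBuilder

object TwitterClient {
  // Build a Twitter client from the environment variables configured above
  // (illustrative only; the real handler may construct this differently)
  lazy val twitter: Twitter = new TwitterFactory(
    new ConfigurationBuilder()
      .setOAuthConsumerKey(sys.env("CONSUMER_KEY"))
      .setOAuthConsumerSecret(sys.env("CONSUMER_SECRET"))
      .setOAuthAccessToken(sys.env("ACCESS_TOKEN"))
      .setOAuthAccessTokenSecret(sys.env("TOKEN_SECRET"))
      .build()
  ).getInstance()
}

// Usage: TwitterClient.twitter.updateStatus("I think this picture contains: ...")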

Upload an image

Upload a jpg or png image to the <IMAGES BUCKET NAME> bucket.

Troubleshooting

If you are not seeing anything happen, go to the Monitoring section of your Lambda function and click on View logs in CloudWatch; this will open the CloudWatch logs for your function. Select the most recent Log Stream.

Scala Case Classes to CSV

For the past eight months, I have been playing and programming in Scala for fun. One of the many things I wanted to do was to convert a collection of case classes into a CSV file. This sounds like a problem that has already been solved (see PureCSV [1]), and it has... to an extent.

In my specific use case, I wanted to parse some data from JSON into a case class and eventually convert a collection of these classes into a CSV file. The problem I was running into was that the serialization library I was using, Gson, sets JSON null values as Java null values, but PureCSV did not like these null values and was throwing ReferenceNullExceptions.

I ended up creating a simple implementation based on this SO answer where I assign default values to different value types. Here is what I ended up doing:
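
In rough form it goes something like the following sketch; take it as a minimal outline, where the CaseClassCsv object and the writeCsv helper are just illustrative names:

import java.io.File

import scala.collection.immutable.ListMap

import com.github.tototoshi.csv.CSVWriter
import org.apache.commons.lang3.StringEscapeUtils

object CaseClassCsv {

  // Default formatting function: replace nulls with an empty string and
  // escape strings so they are CSV-safe; everything else passes through
  def defaultFormat(value: Any): Any = value match {
    case null      => ""
    case s: String => StringEscapeUtils.escapeCsv(s)
    case other     => other
  }

  // Turn a case class into an ordered map of field name -> formatted value,
  // pairing getDeclaredFields with productIterator as in the SO answer
  def toMap(cc: Product, format: Any => Any = defaultFormat): Map[String, Any] =
    ListMap(cc.getClass.getDeclaredFields.map(_.getName)
      .zip(cc.productIterator.map(format).toSeq): _*)

  // Write a collection of case classes to a CSV file, using the field names
  // of the first element as the header row
  def writeCsv(file: String, rows: Seq[Product], format: Any => Any = defaultFormat): Unit =
    if (rows.nonEmpty) {
      val writer = CSVWriter.open(new File(file))
      try {
        val maps   = rows.map(toMap(_, format))
        val header = maps.head.keys.toSeq
        writer.writeRow(header)
        maps.foreach(m => writer.writeRow(header.map(m(_))))
      } finally writer.close()
    }
}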

Here I'm using com.github.tototoshi.csv.CSVWriter to write to a CSV file, and org.apache.commons.lang3.StringEscapeUtils to format values, but other than that it should be all plain Scala.

The intention of the toMap method is to allow you to pass a function that checks the value for null and sets a default one. I have provided a default formatting method that should cover some cases, but you can create your own and pass it to toMap.
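
For instance, a custom formatter (the names below are made up) could replace nulls with "N/A" and round doubles before writing the file:

object Example extends App {
  case class Record(name: String, score: java.lang.Double)

  // Custom formatting function passed to toMap/writeCsv:
  // nulls become "N/A" and doubles are rounded to two decimal places
  val myFormat: Any => Any = {
    case null      => "N/A"
    case d: Double => f"$d%.2f"
    case other     => other
  }

  CaseClassCsv.writeCsv("records.csv", Seq(Record("alice", 9.5), Record("bob", null)), myFormat)
}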

AWS security group nuances

When adding an ingress rule to a security group A in a VPC that grants access to another security group B, you can only connect from an instance b in B (ClassicLink-ed or not) to an instance a in A using a's private IP address.

If you need to talk to instance a using its public IP, you must add an ingress rule using CIDR notation to group A.
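
As an illustration with the AWS SDK for Java (the group IDs, port, and CIDR below are placeholders), the two kinds of rules look like this:

import com.amazonaws.services.ec2.AmazonEC2ClientBuilder
import com.amazonaws.services.ec2.model.{AuthorizeSecurityGroupIngressRequest, IpPermission, IpRange, UserIdGroupPair}

object SecurityGroupRules extends App {
  val ec2 = AmazonEC2ClientBuilder.defaultClient()

  // Rule referencing security group B: instances in B can reach instances
  // in A only through A's private IP addresses
  ec2.authorizeSecurityGroupIngress(new AuthorizeSecurityGroupIngressRequest()
    .withGroupId("sg-aaaaaaaa") // group A (placeholder)
    .withIpPermissions(new IpPermission()
      .withIpProtocol("tcp").withFromPort(443).withToPort(443)
      .withUserIdGroupPairs(new UserIdGroupPair().withGroupId("sg-bbbbbbbb")))) // group B (placeholder)

  // CIDR-based rule: required if instance b needs to reach instance a
  // through a's public IP address
  ec2.authorizeSecurityGroupIngress(new AuthorizeSecurityGroupIngressRequest()
    .withGroupId("sg-aaaaaaaa")
    .withIpPermissions(new IpPermission()
      .withIpProtocol("tcp").withFromPort(443).withToPort(443)
      .withIpv4Ranges(new IpRange().withCidrIp("203.0.113.10/32")))) // b's public IP (placeholder)
}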