Testing and Deploying Applications with GitHub, CodePipeline and Elastic Beanstalk


For the last two and a half years, I have been doing a lot of development work around automation, tooling, microservices, and serverless applications. Before that, I was developing applications and services in PHP. Back in those days, I used a traditional set of tools to integrate, test, and deploy my code: GitHub, Travis, Chef (AWS OpsWorks), and a whole lot of manual setup.

Recently I was wondering what it would take to launch a PHP application taking advantage of some new and old technologies that I have had the opportunity to learn and master both at work and in my own time.

The objective of this post is to define a Continuous Integration/Continuous Deployment pipeline and the application infrastructure using the following technologies:

  • AWS CloudFormation to define the infrastructure as code
  • CodePipeline as the primary orchestration mechanism
  • CodeBuild to run tests and install dependencies
  • S3 to save deployment assets
  • SNS as the notification mechanism
  • GitHub webhooks, protected branches and status checks.

Infrastructure as Code

All the infrastructure is defined as CloudFormation templates, and the code is hosted in the GitHub repository "Elastic Beanstalk CICD".



Continuous integration refers to the practice of merging code into a central repository or integration branch after a series of code quality checks have passed. There are many methodologies that enable this process, including Git Flow and GitHub Flow.

In GitHub, you can enforce code quality checks by protecting branches and requiring status checks to pass before pull requests can be merged.

Some of these checks include but are not limited to:

  1. Unit tests
  2. Code Reviews
  3. Security validation
  4. Code analysis
  5. Linting
  6. Others

Enabling Required GitHub Status Checks:

Check the article "Enabling required status checks" for more information.


AWS CodeBuild is a simple build service, and unlike some of its more popular counterparts, you only pay for the time that the service is in use. Another advantage of CodeBuild is that you can define custom Docker images for your build container, and these can be hosted on Docker Hub or on Elastic Container Registry (ECR). This can be very powerful and quite simple to do if you have a bit of experience with Docker.

To add a GitHub check, we are going to use CodeBuild and trigger it every time code is pushed to the GitHub repository and every time there is a pull request. At this time, I do not know of a way to automatically set up the GitHub webhook without creating a CloudFormation custom resource, so I am just going to define the CodeBuildTest resource and set up the webhook manually.

A CodeBuild definition is straightforward.
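A trimmed sketch of what the CodeBuildTest resource might look like is shown below; the full definition is in the repository, and the logical and parameter names here are illustrative.

CodeBuildTest:
  Type: AWS::CodeBuild::Project
  Properties:
    Name: !Sub "${ApplicationName}-tests"
    ServiceRole: !GetAtt CodeBuildRole.Arn
    Source:
      Type: GITHUB
      Location: !Ref GitHubRepositoryUrl
      BuildSpec: buildspec-test.yml        # custom buildspec used only for the test run
    Environment:
      Type: LINUX_CONTAINER
      ComputeType: BUILD_GENERAL1_SMALL
      Image: !Ref BuildImage               # e.g. a custom PHP image hosted on Docker Hub or ECR
      EnvironmentVariables:
        - Name: APP_ENV
          Value: test
    Artifacts:
      Type: NO_ARTIFACTS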

There are a few things to notice here:

  1. The environment variables passed to CodeBuild can be used to perform custom actions
  2. We specify a custom Source.BuildSpec file name; this lets us define a different series of steps for testing the code, while the default buildspec.yml defines the steps used to build the code.

The buildspec-test.yml defines the steps used to test the code.
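The exact file is in the repository; a minimal sketch for a PHP project could look like this (the specific commands, such as phpunit, are assumptions):

version: 0.2

phases:
  install:
    commands:
      - composer install --no-interaction    # assumes Composer is available in the build image
  build:
    commands:
      - vendor/bin/phpunit                    # run the unit tests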

Since CodeBuild does not provide a managed image to run PHP, I have created a PHP-Build image that contains several tools required to prepare a PHP application for deployment.


Continuous Deployment is the process in which teams reliably and continuously release code to a production environment. In this instance, every change that passes the automated tests and is merged into the integration branch is automatically deployed.


The main orchestration mechanism is going to be AWS CodePipeline. In CodePipeline we can define different types of steps, but for the sake of this example the following process is defined:

Source > Build > Deploy


This step listens for code changes in the GitHub repository. The code is downloaded into an S3 bucket. CodePipeline then triggers CodeBuild to start the build process.


CodeBuild is used once again, but this time the steps differ from the ones used for the automated test run. In this instance, the CodeBuild "phases" include:

  1. Install: install required dependencies like OS packages, updates, and more.
  2. Pre-build: set up the environment to run the tests, build the code, generate configuration, etc.
  3. Build: build the code or run unit tests.
  4. Post-build: perform any cleanup steps.

There is an additional, optional section called artifacts. While it is optional, it is essential here: it passes the code we prepared during the build phase to the next step in the pipeline.
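As an illustration of how these phases and the artifacts section map onto a buildspec.yml, here is a minimal sketch; the commands are placeholders rather than the exact ones from the repository:

version: 0.2

phases:
  install:
    commands:
      - echo "Install OS packages, updates, and other dependencies here"
  pre_build:
    commands:
      - composer install --no-dev --no-interaction
  build:
    commands:
      - echo "Build the code or run tests here"
  post_build:
    commands:
      - echo "Perform any cleanup steps here"

artifacts:
  files:
    - '**/*'    # everything prepared here is handed to the next step in the pipeline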


This is the act of publishing the latest version of the repository, prepared by CodeBuild, to the application servers!
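Putting the three steps together, the pipeline section of the CloudFormation template roughly follows this shape (a trimmed sketch with illustrative logical and parameter names, not the full template):

Pipeline:
  Type: AWS::CodePipeline::Pipeline
  Properties:
    RoleArn: !GetAtt PipelineRole.Arn
    ArtifactStore:
      Type: S3
      Location: !Ref ArtifactsBucket
    Stages:
      - Name: Source
        Actions:
          - Name: GitHubSource
            ActionTypeId: { Category: Source, Owner: ThirdParty, Provider: GitHub, Version: "1" }
            OutputArtifacts: [ { Name: SourceOutput } ]
            Configuration:
              Owner: !Ref GitHubOwner
              Repo: !Ref GitHubRepo
              Branch: !Ref GitHubBranch
              OAuthToken: !Ref GitHubToken
      - Name: Build
        Actions:
          - Name: CodeBuild
            ActionTypeId: { Category: Build, Owner: AWS, Provider: CodeBuild, Version: "1" }
            InputArtifacts: [ { Name: SourceOutput } ]
            OutputArtifacts: [ { Name: BuildOutput } ]
            Configuration:
              ProjectName: !Ref CodeBuildBuild
      - Name: Deploy
        Actions:
          - Name: ElasticBeanstalk
            ActionTypeId: { Category: Deploy, Owner: AWS, Provider: ElasticBeanstalk, Version: "1" }
            InputArtifacts: [ { Name: BuildOutput } ]
            Configuration:
              ApplicationName: !Ref EBApplicationName
              EnvironmentName: !Ref EBEnvironmentName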


Elastic Beanstalk is a Platform as a Service offering from AWS. It makes it easy to launch and scale applications developed in various programming languages, including Java, .NET, PHP, Node.js, Python, and more.

This post focuses on PHP, but it can be modified to work with other platforms.

The template defines parameters to set up the following options (a trimmed sketch of how they are used follows the list):

  1. Application Name (this must match the CodePipeline application name)
  2. Solution Stack Version (The AWS version of the stack. See Supported Platforms for more information)
  3. Instance type
  4. Environment Type (single instance or auto-scaling)
  5. Min and Max number of instances
  6. Document Root
  7. PHP version
  8. PHP Memory Limit
  9. Environment variables for a database (user, pass, host, DB name)
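To show how a few of these parameters end up on the environment, here is a trimmed sketch of the relevant option settings; parameter names are illustrative, while the namespaces are the standard Elastic Beanstalk ones:

Environment:
  Type: AWS::ElasticBeanstalk::Environment
  Properties:
    ApplicationName: !Ref ApplicationName
    SolutionStackName: !Ref SolutionStackName
    OptionSettings:
      - Namespace: aws:elasticbeanstalk:container:php:phpini
        OptionName: document_root
        Value: !Ref DocumentRoot
      - Namespace: aws:elasticbeanstalk:container:php:phpini
        OptionName: memory_limit
        Value: !Ref PHPMemoryLimit
      - Namespace: aws:autoscaling:asg
        OptionName: MinSize
        Value: !Ref MinInstances
      - Namespace: aws:autoscaling:asg
        OptionName: MaxSize
        Value: !Ref MaxInstances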

The diagram shows the application setup with load balancing enabled, but it can also be deployed as a single instance. It also shows an RDS instance, but the CloudFormation template does not define one.

Additional resources

The template defines an S3 bucket, the files bucket. It is intended as a place to store any assets required by the application, such as images, CSS, JS, and others. Ideally, the bucket would have a CloudFront distribution in front of it, but that is left as an exercise for the reader.


The stack generates and exports a few values:

  1. The EB application name
  2. The EB environment name
  3. The artifacts bucket
  4. The files output bucket
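The exports make it easy for other stacks, such as the pipeline stack, to consume these values via Fn::ImportValue. A trimmed sketch of the outputs, with illustrative logical names:

Outputs:
  EBApplicationName:
    Value: !Ref Application
    Export:
      Name: !Sub "${AWS::StackName}-eb-application-name"
  EBEnvironmentName:
    Value: !Ref Environment
    Export:
      Name: !Sub "${AWS::StackName}-eb-environment-name"
  FilesBucketName:
    Value: !Ref FilesBucket
    Export:
      Name: !Sub "${AWS::StackName}-files-bucket"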

Final thoughts

I believe it is essential to remove as many barriers as possible for developers to safely put their code into production, enabling them to get quick feedback on their work and respond faster when issues arise.

Setting up a CI/CD pipeline process early on in the life of any application can foster this process.

At the same time, it is vital for every organization to invest in good engineering practices, testing and reviewing code to mention just a couple; it is everyone's responsibility to ensure the code is held to a very high standard, and no single team (QA or DevOps) can carry that responsibility alone.

I hope this pipeline can serve as a template to start new projects taking advantage of several tools and best practices for CI/CD, but it is by no means exhaustive, and you are encouraged to improve upon it.

Happy Coding!

Creating a CI/CD pipeline for serverless applications


You can clone or download the template from my GitHub repository: Lambda Serverless CI/CD


Serverless applications using AWS Lambda let you run code without provisioning or managing servers [1].

Not having to provision or manage servers doesn't imply that you can fire your DevOps team. It merely means that some aspects of running your application lifecycle got a little bit easier, but you still need to think about how to manage and operate serverless applications.

Two years ago, we didn't have the ecosystem of tools that we have nowadays. I am specifically talking about the Serverless Framework.

With the Serverless Framework, you can quickly set up and deploy a serverless application, but that is as far as the framework goes; you still need to think about building, packaging, and continuously integrating and deploying the application.

CI/CD pipeline

I have created a sample CloudFormation template that generates a CI/CD pipeline using AWS CodePipeline and AWS CodeBuild. It deploys a Lambda function to a staging environment; after an approval step, the function is deployed to a production environment. These are the steps in more detail:

  1. Deploy the application to a staging environment
    1. Install and package dependencies
    2. Deploy using the serverless framework
  2. Approval process
    1. User must manually approve the deployment to production (ideally after staging has been tested)
  3. Deploy the application to a production environment
    1. Install and package dependencies
    2. Deploy using the serverless framework

Each deployment process runs on CodeBuild, and you can select the Docker image that it uses for the build. By default, it runs a custom image I have created, which wraps the Amazon Linux 2017.09 image and has Node.js 8.8, Python 3.6, NumPy, and pandas pre-installed. While this image is aimed at data science applications, it can be used to build any Python 3.6 application.
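A minimal sketch of what one of these buildspecs could look like (the STAGE environment variable and the exact commands are assumptions; the real files are in the repository):

version: 0.2

phases:
  install:
    commands:
      - npm install -g serverless          # the Serverless Framework CLI
      - npm install                        # the application's own dependencies
  build:
    commands:
      - serverless deploy --stage $STAGE   # STAGE is provided as a CodeBuild environment variable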

To get started with the template, all you have to do is:

  1. Copy the buildspec.yml and related files to the root of your repository
  2. Create the CloudFormation stack for your repository, plugging in the required parameters, using the serverless-codepipeline-cicd_cfm.yml CloudFormation template
  3. Make sure your repo contains a valid serverless.yml in the root of the project (a minimal example follows)
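For reference, a minimal serverless.yml looks something like this; the service and handler names are placeholders:

service: my-service

provider:
  name: aws
  runtime: python3.6

functions:
  hello:
    handler: handler.hello    # handler.py exposing a hello(event, context) function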

In the future, I am planning to add more deployment strategies to be able to compile, package, and deploy Scala and .NET Core apps.

The "AutoMaTweet"

At MindTouch, the company where I currently work, we have been hosting a series of hands-on meetups focused on C# and AWS Lambda called λ# (Lambda Sharp). These are 3-hour hackathons where people are placed in random teams and get to learn, collaborate, and have a lot of fun! Since I have been learning Scala in my own time, I decided to implement one of the most fun challenges to date: the AutoMaTweet.

The AutoMaTweet is a simple AWS Lambda function that is invoked every time an image is uploaded to an S3 bucket. The function downloads the image and uses a service called Rekognition to analyze it and extract "tags" or "labels". Finally, the image is posted to Twitter with a custom message created from those labels.
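The full solution is in the repository linked below, but the core of the handler boils down to something like this sketch, using the AWS SDK for Java and twitter4j (error handling and attaching the image to the tweet are omitted):

package lambda

import com.amazonaws.services.lambda.runtime.Context
import com.amazonaws.services.lambda.runtime.events.S3Event
import com.amazonaws.services.rekognition.AmazonRekognitionClientBuilder
import com.amazonaws.services.rekognition.model.{DetectLabelsRequest, Image, S3Object}
import twitter4j.TwitterFactory
import twitter4j.conf.ConfigurationBuilder

import scala.collection.JavaConverters._

// Sketch only: the repository version also downloads the image and attaches it to the tweet.
class AutoMaTweet {
  private val rekognition = AmazonRekognitionClientBuilder.defaultClient()

  // Twitter credentials come from the environment variables described later in the setup guide
  private val twitter = new TwitterFactory(
    new ConfigurationBuilder()
      .setOAuthConsumerKey(sys.env("CONSUMER_KEY"))
      .setOAuthConsumerSecret(sys.env("CONSUMER_SECRET"))
      .setOAuthAccessToken(sys.env("ACCESS_TOKEN"))
      .setOAuthAccessTokenSecret(sys.env("TOKEN_SECRET"))
      .build()
  ).getInstance()

  def lambdaHandler(event: S3Event, context: Context): Unit = {
    val record = event.getRecords.get(0)
    val bucket = record.getS3.getBucket.getName
    val key    = record.getS3.getObject.getKey

    // Ask Rekognition to label the uploaded image directly from S3
    val request = new DetectLabelsRequest()
      .withImage(new Image().withS3Object(new S3Object().withBucket(bucket).withName(key)))
      .withMaxLabels(5)
      .withMinConfidence(75f)
    val labels = rekognition.detectLabels(request).getLabels.asScala.map(_.getName)

    // Post a message built from the labels
    twitter.updateStatus(s"I think I see: ${labels.mkString(", ")}")
  }
}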

You can find the challenge in this GitHub repository, and the Scala solution in my GitHub account.

Setup Guide

Clone repository

Clone or fork the following GitHub repository:

git clone

Create the AWS S3 buckets

aws s3api create-bucket --bucket <DEPLOYMENT BUCKET NAME>
aws s3api create-bucket --bucket <IMAGES BUCKET NAME>

Create a role for the lambda function

This role will contain the policy allowing the Lambda function to get objects from S3 and use the Rekognition API.

aws iam create-role \
    --role-name scala-lambda-role \
    --assume-role-policy-document file://trust-policy.json

    "Role": {
        "AssumeRolePolicyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                    "Action": "sts:AssumeRole",
                    "Effect": "Allow",
                    "Principal": {
                        "Service": ""
        "RoleId": "FEDCBA987654321",
        "CreateDate": "2017-03-24T04:39:08.518Z",
        "RoleName": "scala-lambda-role",
        "Path": "/",
        "Arn": "arn:aws:iam::123456789012:role/scala-lambda-role"

Create a policy


aws iam create-policy \
    --policy-name scala-lambda-s3-rekognition-policy \
    --policy-document file://policy.json

    "Policy": {
        "PolicyName": "scala-lambda-s3-rekognition-policy",
        "CreateDate": "2017-03-24T05:03:31.319Z",
        "AttachmentCount": 0,
        "IsAttachable": true,
        "PolicyId": "ABCDEF123456",
        "DefaultVersionId": "v1",
        "Path": "/",
        "Arn": "arn:aws:iam::123456789012:policy/scala-lambda-s3-rekognition-policy",
        "UpdateDate": "2017-03-24T05:03:31.319Z"

NOTE: Take note of the ARN returned by this call!

Attach the policy to the role

aws iam attach-role-policy \
    --role-name scala-lambda-role \
    --policy-arn <POLICY ARN>

Compile the project and upload it to S3

sbt compile
sbt assembly
aws s3 cp target/scala-2.12/LambdaScala-assembly-1.0.jar s3://<DEPLOYMENT BUCKET NAME>/LambdaScala-assembly-1.0.jar

Create the lambda function

aws lambda create-function \
    --function-name scala-lambda-function \
    --runtime java8 \
    --timeout 30 \
    --memory-size 256 \
    --handler "lambda.AutoMaTweet::lambdaHandler" \
    --code S3Bucket=<DEPLOYMENT BUCKET NAME>,S3Key=LambdaScala-assembly-1.0.jar \
    --role <ROLE ARN>

Wrapping it all up

Create a trigger

At this point, log in to the AWS console, go to Lambda, and click on your new function.

Select Triggers > Add trigger > S3

Make sure to select the Bucket: <IMAGES BUCKET NAME> and the Event type: Object Created (All), and click on Submit.

Set the environment variables

These values are used to access your Twitter application (create one first if you don't have one). Create the following environment variables in the Code section of your Lambda function: CONSUMER_KEY, CONSUMER_SECRET, ACCESS_TOKEN, TOKEN_SECRET, and fill in their values with your Twitter application keys and access tokens. Save!

Upload an image

Upload an image (jpg or png) to the <IMAGES BUCKET NAME> bucket.


If you do not see anything happen, go to the Monitoring section of your Lambda function and click on View logs in CloudWatch; this will open the CloudWatch logs for your function. Select the most recent Log Stream.

Scala Case Classes to CSV

For the past eight months, I have been playing and programming in Scala for fun. One of the many things I wanted to do was to convert a collection of case classes into a CSV file. This sounds like a problem that has already been solved (see PureCSV [1]), and it has... to an extent.

In my specific use case, I wanted to parse some data from JSON into a case class and eventually convert a collection of these classes into a CSV file. The problem I was running into was that the serialization library I was using, GSON, sets JSON null values as Java null values, and PureCSV did not like these null values, throwing null reference exceptions.

I ended up creating a simple implementation based on this SO answer where I assign default values to different value types. Here is what I ended up doing:
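The following is a minimal sketch of that approach; helper names such as defaultFormat and writeCsv are chosen for illustration and are not necessarily the original ones.

import java.io.File

import com.github.tototoshi.csv.CSVWriter
import org.apache.commons.lang3.StringEscapeUtils

import scala.collection.immutable.ListMap

object CaseClassCsv {

  // Default formatter: replaces nulls (and empty Options) with a default value
  // and escapes strings so they are safe to put in a CSV cell.
  def defaultFormat(value: Any): String = value match {
    case null      => ""
    case None      => ""
    case Some(v)   => defaultFormat(v)
    case s: String => StringEscapeUtils.escapeCsv(s)
    case other     => other.toString
  }

  // Turns a case class instance into an ordered map of field name -> formatted value.
  // Field order relies on getDeclaredFields returning fields in declaration order,
  // which holds for plain top-level Scala case classes in practice.
  def toMap(cc: Product, format: Any => String = defaultFormat): ListMap[String, String] = {
    val names  = cc.getClass.getDeclaredFields.map(_.getName)
    val values = cc.productIterator.toSeq.map(format)
    ListMap(names.zip(values): _*)
  }

  // Writes a collection of case classes to a CSV file, header row included.
  def writeCsv[T <: Product](file: File, items: Seq[T], format: Any => String = defaultFormat): Unit = {
    val writer = CSVWriter.open(file)
    try {
      items.headOption.foreach { first =>
        writer.writeRow(toMap(first, format).keys.toSeq)
        items.foreach(item => writer.writeRow(toMap(item, format).values.toSeq))
      }
    } finally writer.close()
  }
}

// Example usage:
// case class Person(name: String, nickname: String, age: Int)
// CaseClassCsv.writeCsv(new File("people.csv"), Seq(Person("Ada", null, 36)))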

Here I'm using com.github.tototoshi.csv.CSVWriter to write to a CSV file and org.apache.commons.lang3.StringEscapeUtils to format values, but other than that it is plain Scala.

The intention of the toMap method is to allow you to pass a function that checks each value for null and sets a default. I have provided a default formatting method that should cover most cases, but you can create your own and pass it to the toMap method.