Hassy Veldstra

Building Artillery.io • Interested in OSS, SRE, product design, SaaS • always up for coffee ☕ • h@veldstra.org • @hveldstra


Meet Chaos Llama

I’m excited to release v1 of Chaos Llama, my latest open-source project under the banner of Shoreditch Ops.


                 V
                /'>>>
               /*/  _____ _____ _____ _____ _____
              / /  |     |  |  |  _  |     |   __|
             /*/   |   --|     |     |  |  |__   |
            / /    |_____|__|__|__|__|_____|_____|
    -------/*/      __    __    _____ _____ _____
 --/  *  * */      |  |  |  |  |  _  |     |  _  |
  /* * *  */       |  |__|  |__|     | | | |     |
  -  --- -/        |_____|_____|__|__|_|_|_|__|__|
   H    H
   H    H
   --   --

What is Chaos Llama?

Chaos Llama is a small utility for testing the resilience of AWS architectures to random failures.

Once deployed, Chaos Llama picks and terminates instances at random at a configurable interval. The idea is to constantly test your system’s ability to keep running despite the partial failure of some components, making the system more resilient to outages overall.

If this sounds familiar, that’s because Chaos Llama is inspired by Netflix’s notorious Chaos Monkey. The main difference between Chaos Monkey and Chaos Llama is simplicity. Whereas Chaos Monkey requires an EC2 instance to be created, configured and maintained to run, Chaos Llama takes advantage of AWS Lambda and can be installed and deployed in a matter of minutes. The flipside of that is that Chaos Llama has a smaller feature set and only runs on AWS.

How Chaos Llama Works

There are two parts to Chaos Llama: the CLI that lets you deploy and configure Llama, and the AWS Lambda function which picks and terminates an instance when it’s run.

  1. The CLI: the llama-cli package is a Node.js CLI application (built with the excellent yargs library) that uses the AWS SDK for Node.js to create and update the lambda function, and to set up an invocation schedule for it with CloudWatch Events.
  2. The lambda function: contains the logic for selecting and terminating EC2 instances.
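The selection step in the lambda can be sketched roughly as follows. This is an illustrative sketch, not Llama’s actual implementation: the `pickVictim` helper and the instance objects are made up for the example. In the real lambda, the chosen instance would then be terminated via the AWS SDK’s `ec2.terminateInstances` call.

```javascript
// Sketch of the lambda's selection logic (hypothetical names).
// Filter candidate instances by ASG rules, then pick one at random.
function pickVictim(instances, { enableForASGs = [], disableForASGs = [] } = {}) {
  let candidates = instances;
  if (enableForASGs.length > 0) {
    // If enableForASGs is set, only instances in those ASGs are fair game
    // (it takes precedence over disableForASGs).
    candidates = instances.filter((i) => enableForASGs.includes(i.asg));
  } else if (disableForASGs.length > 0) {
    // Otherwise, leave the listed ASGs alone.
    candidates = instances.filter((i) => !disableForASGs.includes(i.asg));
  }
  if (candidates.length === 0) return null;
  return candidates[Math.floor(Math.random() * candidates.length)];
}

// The real lambda would follow up with something like:
//   ec2.terminateInstances({ InstanceIds: [victim.id] }, callback);
module.exports = { pickVictim };
```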

Getting Started With Chaos Llama

Installation

Install the CLI with:

npm install -g llama-cli

(If you don’t have Node.js/npm installed, grab an installer for your platform from nodejs.org.)

AWS Config

To deploy Llama, you’ll need an IAM User (for the CLI to run as) and an IAM Role (for the lambda).

Set up an IAM user (if you don’t have one already):

  1. Log into the AWS Console
  2. Navigate to IAM -> Users -> Create New Users
    • Name the new user something like chaos_llama
    • Copy the Access Key ID and Secret Access Key into ~/.aws/credentials:
     [llama]
     aws_access_key_id=YOUR_KEY_ID_HERE
     aws_secret_access_key=YOUR_ACCESS_KEY_HERE
    

Then, create a Role for Llama’s lambda function:

  1. Navigate to Roles -> Create New Role
  2. Name the new role something like chaos_llama
  3. Select ‘EC2’ under ‘AWS Service Roles’
  4. Select AmazonEC2FullAccess in the list of policies
  5. Take note of the Role ARN

Deploy the llama

Once the IAM User is set up and you have the Role ARN, run:

AWS_PROFILE=llama llama deploy -r $LAMBDA_ROLE_ARN

This will deploy Chaos Llama to your AWS environment, but it won’t actually do anything by default.

Configure Chaos Llama

To configure termination rules, run deploy with a Llamafile:

AWS_PROFILE=llama llama deploy -c Llamafile.json

Llama Configuration

A Llamafile is a JSON file that configures your Chaos Llama:

{
  "interval": "60",
  "enableForASGs": [
  ],
  "disableForASGs": [
  ]
}

The options are:

  • interval: how often the lambda is invoked, and hence how often an instance gets terminated
  • enableForASGs: a list of Auto Scaling Groups in which Llama may terminate instances
  • disableForASGs: a list of Auto Scaling Groups that Llama will leave alone

If both enableForASGs and disableForASGs are specified, then only enableForASGs rules are applied.
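For example, a Llamafile that restricts Llama to a couple of Auto Scaling Groups could look like this (the ASG names here are made up for illustration):

```json
{
  "interval": "60",
  "enableForASGs": [
    "web-frontend-asg",
    "api-workers-asg"
  ],
  "disableForASGs": []
}
```

Because enableForASGs is non-empty, only instances in those two groups are candidates for termination.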

Further Plans

Would you like to contribute? The GitHub Issues page is a good place to start for ideas. Feel free to email me at h@veldstra.org if you have any questions.


P.S. Only 90s kids will understand

Nobody whips this llama’s ass.