3 Tools to Validate CloudFormation

Hello!

Note: If you just want the script and don’t need the background, go to the gist.

If you found this page through a search engine, you probably already found the AWS page on validating CloudFormation templates. If you haven't, read that first. It's a better starting place.

I run three tools before applying CF templates.

#1 AWS CLI’s validator

This is the native tool. It's ok. It's really only a syntax checker; there are plenty of errors you won't see until you apply a template to a stack. Still, it's fast and catches some things.

aws cloudformation validate-template --template-body file://./my_template.yaml

Notes:

  • The CLI has to be configured with access keys or it won’t run the validator.
  • If the template is JSON, this will ignore some requirements (e.g. it’ll allow trailing commas). However, the CF service ignores the same things.

#2 Python’s JSON library

Because the AWS CLI validator ignores some JSON requirements, I like to pass JSON templates through Python's parser to make sure they're strictly valid. In the past, I've had to load templates and search them for things like unused parameters while cleaning up and refactoring legacy code. That's not ideal, but it's happened a couple of times, and it's much easier when the JSON is actually valid JSON (there's a sketch of that kind of check below).
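
For example, a contrived fragment like this one sails through the CLI validator (and, per the notes above, the CF service), but json.load rejects the trailing comma:

{
  "Parameters": {
    "InstanceType": {"Type": "String"},
  }
}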

It's fiddly to run this check from a shell script. I do it with a heredoc so I don't have to write a second script to the filesystem:

python - <<END
import json
with open('my_template.json') as f:
    json.load(f)
END

Notes:

  • I use Python for this because it’s a dependency of the AWS CLI so I know it’s already installed. You could use jq or another tool, though.
  • I don’t do the YAML equivalent of this because it errors on CF-specific syntax like !Ref.
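
Once a template parses, scripted checks like the unused-parameter search I mentioned are easy. Here's a minimal sketch (it only follows Ref, not Fn::Sub or other intrinsics, so treat it as a starting point):

import json

with open('my_template.json') as f:
    template = json.load(f)

# Collect every name that's the target of a {"Ref": ...} anywhere in the template.
refs = set()

def walk(node):
    if isinstance(node, dict):
        for key, value in node.items():
            if key == 'Ref' and isinstance(value, str):
                refs.add(value)
            walk(value)
    elif isinstance(node, list):
        for item in node:
            walk(item)

walk(template)

for parameter in template.get('Parameters', {}):
    if parameter not in refs:
        print('Unused parameter: {}'.format(parameter))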

#3 cfn-nag

This is a linter for CloudFormation. It’s not perfect. I’ve seen it generate false positives like “don’t use * in IAM policy resources” even when * is the only option because it’s all that’s supported by the service I’m writing a policy for. Still, it’s one more way to catch things before you deploy, and it catches some good stuff.

cfn_nag_scan --input-path my_template.yaml

Notes:

  • Annoyingly, this is a Ruby gem so you need a new dependency chain to install it. I highly recommend setting up RVM and creating a gemset to isolate this from your system and other projects (just like you'd do with a Python venv); the sketch below shows the setup.
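
If you haven't used gemsets before, the setup looks something like this (a sketch; the Ruby version is arbitrary, use whatever recent one you like):

rvm install 2.7.2
rvm use 2.7.2@cfn-nag --create
gem install cfn-nag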

Happy automating!

Adam

Python on Mac OS X: One of the Good Ways

Good morning!

When I start Python development on a new Apple, I immediately hit two problems:

  1. I need a version of Python that’s not installed.
  2. I need to install a bunch of packages from PyPI for ProjectA and a different bunch for ProjectB.

Virtualenv is not the answer! That’s the first tool you’ll hear about but it only partially solves one of these problems. You need more. There are a ton of tools and a ton of different ways to use them. Here’s how I do it on Apple’s Mac OS X.

If you’re asking questions like, “Why do you need multiple versions installed? Isn’t latest enough?” or “Why not just pip install all the packages for ProjectA and ProjectB?” then this article probably isn’t where you should start. Great answers to those questions have already been written. This is just a disambiguation page that shows you which tools to use for which problems and how to use them.

Installing Python Versions

I use pyenv, which is available in homebrew. It allows me to install arbitrary versions of Python and switch between them without replacing what’s included with the OS.

Note: You can use homebrew to install other versions of Python, but only a single version of Python 2 and a single version of Python 3 at a time. You can’t easily switch between two projects each frozen at 3.4 and 3.6 (for example). There’s also a limited list of versions available.

Install pyenv:

$ brew update
$ brew install pyenv

Ensure pyenv loads when you log in by adding this line to ~/.profile:

eval "$(pyenv init -)"

Activate pyenv now by either closing and re-opening Terminal or running:

$ source ~/.profile

List which versions are available and install one:

$ pyenv install --list
$ pyenv install 3.6.4

If the version you wanted was missing, update pyenv via homebrew:

$ brew update && brew upgrade pyenv

If you get weird errors about missing gcc or zlib, install the Xcode Command Line Tools and try again:

$ xcode-select --install

I always set my global (aka default) version to the latest 3:

$ pyenv global 3.6.4

Update 2018-10-23: If I need several versions available, for example to run tests in tox:

$ pyenv global 3.6.4 3.7.0

Setting these makes versioned Python commands available:

$ python3.6 --version
$ python3.7 --version

Pyenv has lots of great features, like support for setting a different version whenever you’re in a specific directory. Check out its commands reference.
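
For example, pinning a project directory to a version looks like this (my_project is a placeholder):

$ cd ~/my_project
$ pyenv local 3.6.4

That writes a .python-version file, and pyenv switches to 3.6.4 automatically whenever you're in that directory.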

Installing PyPI Packages

In the old days, virtualenv was always the right solution. Today, it depends on the version of Python you’re using.

Python <= 3.3 (Including 2)

This is legacy Python, from before environment management was native. In those ancient times, you needed a third-party tool called virtualenv.

$ pyenv global 2.7.14
$ pip install virtualenv
$ virtualenv ~/my_env
$ source ~/my_env/bin/activate
(my_env) $ pip install <project dependencies>

This installs the virtualenv Python package into the root environment for the legacy version of Python I need, then creates a virtual Python environment where I can install project-specific dependencies.

Python >= 3.3

With PEP 405 (Python 3.3), an environment manager called venv was added to the core library. It works pretty much like virtualenv.

Note: Virtualenv works with newer versions of Python, but it's better to use a core library than to add a dependency. I only use the third-party tool when I have to.

$ pyenv global 3.6.4
$ python -m venv ~/my_env
$ source ~/my_env/bin/activate
(my_env) $ pip install <project dependencies>
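
When you switch projects, deactivate the current environment and activate the other project's (other_env here stands in for a second environment created the same way):

(my_env) $ deactivate
$ source ~/other_env/bin/activate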

Happy programming!

Adam

Terraform: Get Data with Python

Update 2017-12-26: There’s now a more complete, step-by-step example of how to use terraform’s data resource, pip, and this decorator in the source.

Good morning!

Sometimes I have data I need to assemble during terraform’s apply phase, and I like to use Python helper scripts to do that. Awesomely, terraform natively supports using Python to populate the data resource:

data "external" "cars_count" {
  program = ["python", "${path.module}/get_cool_data.py"]

  query = {
    thing_to_count = "cars"
  }
}

output "cars_count" {
  value = "${data.external.cars_count.result.cars}"
}

A slick, easy way to drop out of terraform and use Python to grab what you need (although it can get you into trouble if you abuse it).

The Python script has to follow a protocol that defines formats, error handling, etc. It's minimal but fiddly, and if you need more than one external data script it's better to modularize than copy and paste. So I wrote a pip-installable package with a decorator that implements the protocol for you:

from terraform_external_data import terraform_external_data

@terraform_external_data
def get_cool_data(query):
    return {query['thing_to_count']: '3'}

if __name__ == '__main__':
    get_cool_data()
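
Under the hood, terraform writes the query to the program's stdin as a JSON object and expects a JSON object whose values are all strings on stdout; errors should go to stderr with a non-zero exit. Without the decorator, the same script would look roughly like this sketch:

import json
import sys

def get_cool_data(query):
    return {query['thing_to_count']: '3'}

if __name__ == '__main__':
    # Terraform passes the query as a JSON object on stdin.
    query = json.load(sys.stdin)
    try:
        result = get_cool_data(query)
    except Exception as error:
        # On failure, write to stderr and exit non-zero so terraform aborts.
        sys.stderr.write(str(error))
        sys.exit(1)
    # The result must be a JSON object whose values are all strings.
    json.dump(result, sys.stdout)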

And it’s available on PyPI! Just pip install terraform_external_data. Here’s the source.

Happy terraforming,

Adam

EC2: Stateful Statelessness


Hello!

While studying for an AWS certification, I rediscovered a fiddly networking detail: although ICMP’s ping is stateless, EC2 security groups will pass return ping traffic even when only one direction is defined in their rules. I wanted to see this in action, so I built a lab.

If you just asked, “Wat❓”, keep reading. Skip to the next section if you just want the code.

Background

Network hosts using stateful protocols (like TCP) distinguish between packets that are part of an established connection and packets that are new. For example, when I SSH (which runs on TCP) from A to B:

  1. A asks B to start a new connection.
  2. B agrees.
  3. A and B exchange bunches of packets that are part of the connection they agreed to.
  4. A and B agree to close the connection.

There’s a difference between a new packet and a packet that’s part of an ongoing connection. That means the connection, and its packets, have state (e.g. new vs established). Stateful firewalls (vs stateless) are aware of this:

  1. A asks B to start a new connection.
  2. Firewalls in between allow these packets if there is an explicit rule allowing traffic from A to B.
  3. A and B exchange bunches of packets.
  4. Firewalls in between allow the packets from A to B because of the explicit rule above. However, they allow the return traffic from B to A even if there is no explicit rule to allow it. Since B agreed to the connection the firewall assumes that packets in that connection should be allowed.

In EC2, this is why you only need an outgoing rule on A’s Security Group (SG) and an incoming rule on B’s Security Group to SSH from A to B. EC2 SGs are stateful, and allow the return traffic implicitly.

Ok, here's the gnarly bit. ICMP (the protocol behind ping) is stateless. Hosts don't have a negotiation phase where they agree to establish a connection. They just send packets and hope. So, doesn't that mean I need to write explicit firewall rules in the SGs to allow the return traffic? If the firewall can't see the state of the connection, it won't be able to implicitly figure out to allow that traffic, right?

Nope, they infer state based on timeouts and packet types. ICMP pings are ECHO requests answered by ECHO replies. If the SG has seen a request within the timeout, it makes the educated guess that replies are essentially part of “established” connections and allows them. This is what I wanted to see in action.

The Lab

I set up a VPC with two hosts, A (10.0.1.70) and B (10.0.2.35). They're in different subnets, but the ACLs allow all traffic so they don't influence the test. Here are the SG rules for A (10.0.0.0/16 covers the entire VPC):

(Screenshots: A's inbound and outbound security group rules.)

And the rules for B:

(Screenshots: B's inbound and outbound security group rules.)

A allows outgoing ICMP to B, and B allows incoming ICMP from A. The return traffic is not allowed by any rules.

The Test Script

I didn’t find a way to send just replies without requests in Linux, so I bodged together a Python script:

"""
This is a stripped-down version of ping that allows you to send a reply without responding to a request. I needed it
to test the details of how Amazon EC2 security groups handle state with ICMP traffic. You shouldn't use this for normal
pings.

The ping implementation was based on Samuel Stauffer's python-ping: https://github.com/samuel/python-ping (which only
works with Python 2, as does this stripped-down version).

This must be run as root.
You must tell the Linux kernel to ignore ICMP before you run this or it'll eat some of the traffic:
    echo 1 > /proc/sys/net/ipv4/icmp_echo_ignore_all
"""

import argparse, socket, struct, time

def get_arguments():
    parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser.add_argument('--send-request', metavar='IP_ADDRESS', type=str, help='IP address to send ECHO request.')
    parser.add_argument('--receive', action='store_true', help='Wait for a reply.')
    parser.add_argument('--send-reply', metavar='IP_ADDRESS', type=str, help='IP address to send ECHO reply.')
    return parser.parse_args()

def receive(my_socket):
    while True:
        recPacket, addr = my_socket.recvfrom(1024)
        icmpHeader = recPacket[20:28]
        icmp_type, code, checksum, packetID, sequence = struct.unpack("bbHHh", icmpHeader)
        print('Received type {}.'.format(icmp_type))

def ping(my_socket, dest_addr, icmp_type):
    dest_addr = socket.gethostbyname(dest_addr)
    bytesInDouble = struct.calcsize("d")
    data = (192 - bytesInDouble) * "Q"
    data = struct.pack("d", time.time()) + data
    dummy_checksum = 1 & 0xffff
    dummy_id = 1 & 0xFFFF
    # Header is type (8), code (8), checksum (16), id (16), sequence (16)
    header = struct.pack("bbHHh", icmp_type, 0, socket.htons(dummy_checksum), dummy_id, 1)
    packet = header + data
    my_socket.sendto(packet, (dest_addr, 1))

if __name__ == '__main__':
    args = get_arguments()
    icmp = socket.getprotobyname("icmp")
    my_socket = socket.socket(socket.AF_INET, socket.SOCK_RAW, icmp)
    if args.send_request:
        ping(my_socket, args.send_request, icmp_type=8)  # Type 8 is ECHO request.
    if args.receive:
        receive(my_socket)
    if args.send_reply:
        ping(my_socket, args.send_reply, icmp_type=0)  # Type 0 is ECHO reply.

You can skip reading the code; the important thing is that we can individually choose to listen for packets, send ECHO requests, or send ECHO replies:

python ping.py --help
usage: ping.py [-h] [--send-request IP_ADDRESS] [--receive]
               [--send-reply IP_ADDRESS]

optional arguments:
  -h, --help            show this help message and exit
  --send-request IP_ADDRESS
                        IP address to send ECHO request. (default: None)
  --receive             Wait for a reply. (default: False)
  --send-reply IP_ADDRESS
                        IP address to send ECHO reply. (default: None)

The Experiment

SSH to each host and tell Linux to ignore ICMP traffic so I can use the script to capture it (see docstring in the script above):

sudo su -
echo 1 > /proc/sys/net/ipv4/icmp_echo_ignore_all
exit
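
When you're done with all the tests, restore normal ICMP handling by writing the default back:

echo 0 > /proc/sys/net/ipv4/icmp_echo_ignore_all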

Normal Ping

I send a request from A to B and expect the reply from B to A to be allowed. Here’s what happened:

Remember A is 10.0.1.70 and B is 10.0.2.35.

(Screenshots: the normal ping test output on A and B.)
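
In terminal terms, the test looked roughly like this (a sketch reconstructed from the screenshots; run the script as root on each host). On A, send a request to B and listen:

python ping.py --send-request 10.0.2.35 --receive

Then, a little later on B, send a reply back to A:

python ping.py --send-reply 10.0.1.70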

Ok, nothing surprising. I sent a request from A to B and started listening on A; a little later I sent a reply from B to A and it was allowed. You can do this test with the normal Linux ping command (but not until you tell the kernel to stop ignoring ICMP traffic). This test just validates that my bodged Python actually works.

Reply Only

First we wait a bit. The previous test sent a request from A to B, which started a timer in the SG. Until that timer expires, reply traffic will be allowed. We need to wait for that expiration before this next test is valid.

(Screenshots: the reply-only test output on A and B.)
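
Again as a sketch of what the screenshots showed, this time A only listens:

python ping.py --receive

And B sends a reply with no request to prompt it:

python ping.py --send-reply 10.0.1.70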

Boom! I start listening on A, without sending a request. On B I send a reply to A but it never arrives. The Security Group didn’t allow it. This demonstrates that EC2 Security Groups are inferring the state of ICMP pings by reading their type.

Other Tests

I also tried a couple other things that I’ll leave to you to reproduce in your own lab if you want to see them.

  • Start out like the normal test. Send a request from A to B and start listening on A. Then send several replies from B to A. They’re all allowed. This shows that the SG isn’t counting to ensure it only allows one reply for each request; if it has seen just one request within the timeout it allows replies even if there are multiple.
  • Edit the script above to set the hardcoded ID to be different on A than it is on B. Then nothing works at all. I’m not actually sure what causes this. Could be that the SG is looking at more than just the type, but it could also be in the kernel or the network drivers or somewhere else I haven’t thought of. If you figure it out, message me!

Conclusion

I had free time over the holidays this year! Realistically, understanding this demo isn’t a priority for doing good work on AWS. I just enjoy unwrapping black boxes to see how the parts move.

Be well!

Adam

Production-ready Scripts and Python

Production is hard. Even a simple script that looks up queue state and sends it to an API gets complex in prod. Without tests, the divide-by-zero case you missed will mask queue overloads. Someone will miss the required argument you didn't enforce and break everything by accidentally publishing a null value. You'll forget to timestamp one of your output lines, and then when the queue goes down you won't be able to correlate queue status to network events.

Python can help! Out of the box it can give you essential but often-skipped features, like these (there's a sketch after the list):

  • Automated tests for multiple platforms.
  • A --simulate option.
  • Command line sanity like a --help option and enforcement of required arguments.
  • Informative log output.
  • An easy way to build and package.
  • An easy way to install a build without a git clone.
  • A command that you can just run like any other command. No weird shell setup or invocation required.
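
To give you a taste, here's a minimal sketch of just the command-line pieces (hypothetical names; the demo project covers the rest, like tests and packaging):

import argparse
import logging

def main():
    parser = argparse.ArgumentParser(description='Publish queue state to an API.')
    # argparse gives us --help for free and rejects missing required arguments.
    parser.add_argument('--queue-name', required=True, help='Queue to look up.')
    parser.add_argument('--simulate', action='store_true',
                        help='Log what would be published without publishing it.')
    args = parser.parse_args()

    # Timestamped log lines make it possible to correlate output with other events.
    logging.basicConfig(format='%(asctime)s %(levelname)s %(message)s', level=logging.INFO)

    if args.simulate:
        logging.info('Simulation: would publish state of queue %s.', args.queue_name)
        return
    logging.info('Publishing state of queue %s.', args.queue_name)

if __name__ == '__main__':
    main()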

It can be a little tricky, though, if you haven’t already done it. So I wrote a project that demonstrates it for you. It includes an example of a script that isn’t ready for prod.

Hopefully this will save you from some of the many totally avoidable, horrible problems that bad scripts have caused in my prods.

Thanks for reading!

Adam

Pear: A Better Way to Deploy to AWS

Update 29 September 2018: Since the release of AWS Step Functions, this pattern is out of date. Orchestrating deployment with a slim lambda function is still a solid pattern, but Step Functions could make the implementation simpler and could provide more features (like graphical display of deployment status). You can even see some of the ways this might work in AWS’s DevOps blog. Those implementation changes would drive some re-architecture, too. Because those are major changes, for now Pear is an interesting exploration of a pattern but isn’t ready for use in new deployments.

A while back I wanted to put a voice interface in front of my deployment automation. I think passwords on voice interfaces are annoying and aren’t secure, so I wanted an unauthenticated system. I didn’t want to lay awake at night worried about all the things people could break with it, so I set out to engineer a deployment infrastructure that I could put behind voice control and still feel was secure and reliable.

After a lot of learning (the journey towards this is where the Life Coach and the Better Alexa Quick Start came from) and several revisions, I had built my new infrastructure. For reasons I can’t remember, I called it Pear.

Pear is designed to make it easy to add slick features like voice interfaces while giving you enough control to stay secure and enough stability to operate in production. It’s meant for infrastructures too complex to fit in tools like Heroku or Elastic Beanstalk, for situations where you need to be able to turn all the knobs. I think it achieves those goals, so I decided to publish it. Check out the repo for an example implementation and more complete documentation of its features, but to give you a taste here’s a diagram of the basic architecture:

(Diagram: Pear's basic architecture.)

Cheers!

Adam

The Better Alexa Quick Start

You’ve seen the Life Coach. That was my second Alexa project. The one I used to learn the platform.

I began with Amazon's quick start tutorial, but I didn't like the alexa-skills-kit-color-expert-python example code. It feels like a spaghetti-nest and is more complicated than it needs to be for a quick start (it's 200 lines; the Life Coach is 40).

Hoping to make the intro process smoother for you, I've turned the Life Coach into my own quick start for Alexa. It gives you an example that's simple but still has the tools you need to do your own development: essential things like error log capturing and scripts to build and publish code updates, which aren't included in the Amazon guide.

Setting up AWS well enough that you don't want to pull out your hair can be harder than writing your first skill, especially if you don't have an Ops or AWS background. I've included automation to handle the infrastructure. You can skip that and just use the Life Coach code if you'd rather.

I’ve also included the better lambda packaging practices from my previous post.

Hopefully this makes your Alexa journey easier. Check it out!

If you want an intro to how Alexa works before you dive into the code, check out the slides from my lightning talk on this project at the May 2017 meeting of the San Diego Python group.

Talk to you soon,

Adam