Python on Lambda: Better Packaging Practices

Update 2018-10-22: This is out of date! Since I wrote this, lambda released support for Python 3 and in the new version I don’t have to do the handler import described below (although I don’t know if that’s because of a difference in Python 3 or because of a change in how lambda imports modules). In a future post I’ll cover Python 3 lambda functions in more detail.

Lambda is an AWS service that runs your code for you, without you managing servers. It’s my new favorite tool, but the official docs encourage a code structure that I think is an anti-pattern in Python. Fortunately, after some fiddling I found what I think is a better way. Originally I presented it to the San Diego Python meetup (if you’re in Southern California you should come to the next one!), this post is a recap.

The Lambda Getting Started guide starts you off with simple code, like this from the Hello World “blueprint” (you can find the blueprints in the AWS web console):

def lambda_handler(event, context):
    #print("Received event: " + json.dumps(event, indent=2))
    print("value1 = " + event['key1'])
    print("value2 = " + event['key2'])
    print("value3 = " + event['key3'])
    return event['key1']  # Echo back the first key value
    #raise Exception('Something went wrong')

This gets uploaded to Lambda as a flat file, then you set a “handler” (the function to run). They tell you the handler has this format:

python-file-name.handler-function

So for the Hello World blueprint we set the handler to this:

lambda_function.lambda_handler

Then Lambda knows to look for the lambda_handler function in the lambda_function file. For something as small as a hello world app this is fine, but it doesn’t scale.

For example, here’s a piece of the Alexa skill blueprint (I’ve taken out most of it to keep this short):

from __future__ import print_function

def build_speechlet_response(title, output, reprompt_text, should_end_session):
    ...human interface stuff

def build_response(session_attributes, speechlet_response):
    ...more human interface stuff

def get_welcome_response():
    ...more human interface stuff

def handle_session_end_request():
    ...session stuff

def create_favorite_color_attributes(favorite_color):
    ...helper function

def set_color_in_session(intent, session):
    ...the real logic of the tool part 1

def get_color_from_session(intent, session):
    ...the real logic of the tool part 2

etc...

This doesn’t belong in one long module, it belongs in a Python package. The session stuff goes in its own module, the human interface stuff in a different module, and probably a bunch of modules or packages for the logic of the skill itself. Something like the pypa sample project:

sampleproject/
├── LICENSE.txt
├── MANIFEST.in
├── README.rst
├── data
│   └── data_file
├── sample
│   ├── __init__.py  <-- main() ('handler') here
│   ├── package_data.dat
├── setup.cfg
├── setup.py
├── tests
│   ├── __init__.py
│   └── test_simple.py
└── tox.ini

The project is super simple (all its main() function does is print "Call your main application code here"), but the code is organized, the setup.py tracks dependencies and excludes tests from packaging, there's a clear place to put docs and data, etc. All the awesome things you get from packages. This is how I write my projects, so I want to just pip install and then set the handler to sample.main and let Lambda find main() because it's part of the package namespace.

It turns out this is possible, and the Deployment Package doc sort of tells you how to do it when they talk about dependencies.

Their example is the requests package. They tell you to create a virtualenv, pip install requests, then zip the contents of the site-packages directory along with your module (that’s the directory inside the virtualenv where installed packages end up). Then you can import requests in your code. If you do the same thing with your package, Lambda can run the handler function and you don’t have to upload a module outside the package.

I’m going to use the pypa sample project as an example because it follows a common Python package structure, but we need a small tweak because Lambda calls the handler with args and the main() function of the sample project doesn’t take any.

Change this:

def main():

To this:

def main(*args):

Then do what you usually do with packages (here I’m going to create and install a wheel because I think that’s the current best practice, but there are other ways):

  1. Make a wheel.
    python setup.py -q bdist_wheel --universal
    
  2. Create and activate a virtualenv.
  3. Install the wheel.
    pip install sample-1.2.0-py2.py3-none-any.whl
    
  4. Zip up the contents of this directory:
    $VIRTUAL_ENV/lib/python2.7/site-packages
  5. Upload the zip.
  6. Set the handler to sample.main.

Then it works!

START RequestId: bba5ce1b-9b16-11e6-9773-7b181414ea96 Version: $LATEST
Call your main application code here
END RequestId: bba5ce1b-9b16-11e6-9773-7b181414ea96
REPORT RequestId: bba5ce1b-9b16-11e6-9773-7b181414ea96	Duration: 0.37 ms...

… except when it doesn’t.

The pypa sample package defines its main() in __init__.py, so the handler path is sample.main.

sampleproject/
├── sample
│   ├── __init__.py  <-- main() here

But what if your entry function is in some other module?

sampleproject/
├── sample
│ ├── __init__.py
│ ├── cool.py <– main() here

We can just set the handler to sample.cool.main, right? Nope! Doesn't work.

{
"errorMessage": "Unable to import module 'sample.cool'"
}

I'm working on figuring out why this happens, but the workaround is to import the function I need in __init__.py so my hanlder path only needs one dot. That's annoying but not too bad; lots of folks do that to simplify their package namespace anyway so most Python readers know to look for it. I met a chap at the SD Python group who has some great ideas on why this might be happening, if we figure it out I'll post the details.

To sum up:

  • You don’t have to cram all your code into one big file.
  • If you pip install your package, all you need to upload is the site-packages directory.
  • Remember to import your handler in the root __init__.py if you get import errors.

Thanks for reading!

Adam

One thought on “Python on Lambda: Better Packaging Practices

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s