AWS Lambda Layers and Ruby

I’ve been building a few things using lambda functions in ruby recently. Some of these are just standalone functions, some end up as more complicated stacks, with multiple functions working together, usually with a lot of common dependencies.

This was annoying to work with: on each sam build invocation, bundler re-downloaded and reinstalled these dependencies for every function, even if only one of the functions had changed. By the time I got to 4-5 functions in a stack, it was really starting to slow down the fast iteration that is so beneficial. This got me thinking - could I put the vendored bundle in a layer that was used by all my functions?

One Gemfile to rule them all

At the start my directory looked a little like this - the standard sam setup ¹

.
├── Gemfile
├── Gemfile.lock
├── function1
│   ├── Gemfile
│   ├── Gemfile.lock
│   └── app.rb
├── function2
│   ├── Gemfile
│   ├── Gemfile.lock
│   └── app.rb
└── template.yaml

The first thing I did was merge the Gemfiles into a single one that I could use to build the layer. In order to make sure this would never diverge from what the functions are using, I added this at the top level and symlinked it into each function ². I then added a folder called dependencies for my layer, containing just the Gemfile and Gemfile.lock (also as symlinks), and added the layer itself to the template:

Dependencies:
  Type: AWS::Serverless::LayerVersion
  Properties:
    LayerName: !Sub "${AWS::StackName}-dependencies"
    ContentUri: dependencies/
    CompatibleRuntimes:
      - ruby2.7
    LicenseInfo: 'MIT'
    RetentionPolicy: Retain
  Metadata:
    BuildMethod: ruby2.7

Each function needs to be told to use this new layer, by adding this to its properties:

Layers:
  - !Ref Dependencies
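
The symlinking described above can be scripted rather than done by hand. This is only a minimal sketch of one way to do it: the method name link_shared_gemfiles and its parameters are hypothetical, and it assumes the shared Gemfile and Gemfile.lock sit at the top level of the project.

```ruby
require 'fileutils'
require 'pathname'

# Symlink the shared top-level Gemfile and Gemfile.lock into each directory.
# `root` is the project root; `dirs` are directory names relative to it.
# Both are hypothetical parameters for this sketch.
def link_shared_gemfiles(root, dirs)
  root = Pathname.new(root).expand_path
  dirs.each do |dir|
    target_dir = root.join(dir)
    FileUtils.mkdir_p(target_dir)
    %w[Gemfile Gemfile.lock].each do |name|
      # Use a relative link (e.g. ../Gemfile) so the checkout can live anywhere.
      relative = root.join(name).relative_path_from(target_dir)
      # force: true replaces an existing per-function Gemfile or stale link.
      FileUtils.ln_s(relative.to_s, target_dir.join(name).to_s, force: true)
    end
  end
end

# Example call for the layout above (run from the project root):
#   link_shared_gemfiles(Dir.pwd, %w[function1 function2 dependencies])
```

Relative links keep the repository portable, but they are still symlinks, so the caveat in footnote ² about sam build -u applies.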

sam build / sam deploy ran ok - this all seems too easy!

Trouble in paradise

Unfortunately I’ve simultaneously broken things and made them slower:

sam build still wants to run bundler in each of my function directories
The layer doesn’t put gems where ruby is looking for them (see this issue)
The layer installed all the gems, including test/development ones

It’s tempting to fix the first issue by just removing the Gemfile from each function, but I didn’t want to do that - I really value bundler enforcing that the correct versions of gems and ruby are installed and the work it does to avoid various forms of dependency hell.

Luckily there is a way out: while sam build has a default way of building functions for a given runtime (i.e. run bundler for ruby, pip for python etc.) you can override this by setting the build method to ‘makefile’, in which case it will just run make instead. This is most useful for custom runtimes but pretty handy here too.

This results in functions that look like

Function1:
  Type: AWS::Serverless::Function
  Properties:
    CodeUri: function1/
    Handler: app.lambda_handler
    Runtime: ruby2.7
    Layers:
      - !Ref Dependencies
  Metadata:
    BuildMethod: 'makefile'

and a Makefile (in the directory for the function) that just copies the source into the destination directory

build-Function1:
	cp -pLR . "$(ARTIFACTS_DIR)"

You do end up with a bunch of nearly identical makefiles, which is a bit tedious but ok. Don’t forget that makefiles need to use tabs, not spaces!
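
If the repetition bothers you, the makefiles can be generated. This is a rough sketch rather than part of my actual setup: makefile_for and write_makefiles are hypothetical helpers, and the mapping from directory name to logical ID is assumed to match template.yaml.

```ruby
# Build the Makefile text for one function. sam looks for a target named
# build-<LogicalId>; the recipe line must begin with a literal tab ("\t").
def makefile_for(logical_id)
  "build-#{logical_id}:\n\tcp -pLR . \"$(ARTIFACTS_DIR)\"\n"
end

# `functions` maps a function directory to its logical ID in template.yaml,
# e.g. { 'function1' => 'Function1' } (hypothetical input for this sketch).
def write_makefiles(root, functions)
  functions.each do |dir, logical_id|
    File.write(File.join(root, dir, 'Makefile'), makefile_for(logical_id))
  end
end

# Example call (run from the project root):
#   write_makefiles(Dir.pwd, 'function1' => 'Function1', 'function2' => 'Function2')
```

Generating the files also removes the tabs-versus-spaces risk, since the tab is written by the script.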

Finding Nemo Rubygems

While researching the second issue I came across a post by Josh Kahn that seemed to do exactly what I wanted: it builds the layer using a custom makefile that runs bundler and then moves the output around so that the gems are in ruby’s search path.

This was so close to what I wanted! Unfortunately we have some private code that we install as gems built from git, so that approach doesn’t quite work - you need to preserve more of bundler’s structure. I ended up mostly letting bundler do its thing:

build-Dependencies:
	BUNDLE_WITHOUT="development:test" bundle install "--path=$(ARTIFACTS_DIR)/ruby/lib/vendor/bundle"
	rm -rf "$(ARTIFACTS_DIR)/ruby/lib/vendor/bundle/ruby/2.7.0/cache"
	rm -rf "$(ARTIFACTS_DIR)/ruby/lib/vendor/bundle/ruby/2.7.0/bin"

This uses BUNDLE_WITHOUT to avoid installing unneeded gems and --path to control the install location. The last two rm -rf steps just make the deployment package a little smaller by removing unneeded files.

On its own this isn’t quite enough, because the lambda function can’t see the gems. It also doesn’t know that it shouldn’t expect to find the development / test only gems, so it will complain about that too. The magic step is to create .bundle/config in each of the lambda functions with these contents:

---
BUNDLE_DEPLOYMENT: "true"
BUNDLE_WITHOUT: "development:test"
BUNDLE_PATH: "/opt/ruby/lib/vendor/bundle"

This does three things:

Puts bundler into deployment mode, so that it will complain if Gemfile and Gemfile.lock are out of sync
Tells bundler to ignore gems in development / test groups
Tells bundler where gems have been installed

And that’s it - with this the lambda function is able to load the gems from the layer.
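
Since every function needs the same .bundle/config, writing it can be scripted too. The sketch below is illustrative only: bundle_config_text and write_bundle_config are hypothetical helpers. It relies on the fact that Lambda extracts layer contents under /opt, which is why the ruby/lib/vendor/bundle path used in the layer build shows up as /opt/ruby/lib/vendor/bundle at runtime.

```ruby
require 'fileutils'

# Path of the bundle inside the layer, as passed to --path in the layer build.
LAYER_BUNDLE_PATH = 'ruby/lib/vendor/bundle'

# Build the contents of .bundle/config. Values are written as quoted strings,
# matching the file shown above.
def bundle_config_text
  settings = {
    'BUNDLE_DEPLOYMENT' => 'true',
    'BUNDLE_WITHOUT' => 'development:test',
    # Layers are unpacked under /opt in the Lambda execution environment.
    'BUNDLE_PATH' => File.join('/opt', LAYER_BUNDLE_PATH)
  }
  lines = settings.map { |key, value| "#{key}: \"#{value}\"" }
  (['---'] + lines).join("\n") + "\n"
end

# Write .bundle/config into one function directory (hypothetical helper).
def write_bundle_config(function_dir)
  config_dir = File.join(function_dir, '.bundle')
  FileUtils.mkdir_p(config_dir)
  File.write(File.join(config_dir, 'config'), bundle_config_text)
end

# Example call (run from the project root):
#   %w[function1 function2].each { |dir| write_bundle_config(dir) }
```

Deriving BUNDLE_PATH from the same constant as the layer build path is one way to keep the two from drifting apart.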

Summary

I’m pretty happy with this - it achieved all the goals I had when I started out:

One consistent gemfile across functions and in production vs running specs locally
Bundler runs once only per invocation of sam build
Bundler still enforcing gem versions etc.

Combined with the recent sam build --cached feature (sam 1.9.0 and higher) it gets even better: bundler is only run if Gemfile/Gemfile.lock has changed. While --cached is cool all on its own, without dependencies isolated in this fashion, it would run bundler for each function where at least one file has changed (since sam has no knowledge of what file changes should trigger bundler).

If you want to play around with this, I’ve made a very simple repo that demonstrates this.

¹ I don’t love the separate Gemfiles. More than once I’ve added something to the top-level Gemfile (used for running specs) but not the per-function Gemfile, so although my specs were green the dependencies weren’t correct when deployed. I’ve also made mistakes because the per-function Gemfiles had a ruby constraint but the top-level one didn’t, or because two functions accidentally ended up with different versions of the same dependency. While in theory there’s nothing wrong with this, I find it a lot easier if those variations between functions don’t exist.
² These symlinks will break if you need to run sam build -u - my workaround is to run the entire sam build process in docker (and not pass -u at all).

