Hundreds of Shopify developers work on our largest codebase, the monolithic Rails application that powers most of our product offering. There are various benefits to having a “majestic monolith,” but also a few downsides. Chief among them is the amount of time people spend waiting for Rails to boot.
During development, two of the most common tasks are running a development server and running a unit test file. By improving the performance of these tasks, we improve the experience for developers working on this codebase and help them iterate faster. We started measuring and profiling the following code paths:
- Development server: time to first request
- Unit testing: time to first unit test
The two actions have booting the app’s environment (bin/rails environment) in common. With over 6,000 Ruby source files loaded from 530 $LOAD_PATH entries, and over 200 YAML files that require parsing, this boot time is substantial to say the least. To compare, a vanilla Rails app has about 1,600 Ruby source files to load, spread over about 100 $LOAD_PATH entries, and only needs to load 3 YAML files.
Over the years, we’ve built up a small suite of optimizations to speed this boot process up, achieving reductions of 75% on our monolith. These 75% reductions generally translate to 50% reductions in smaller applications.
We recently wrapped the most important and generalizable of these optimizations into a single gem and released it publicly as bootsnap.
Presenting Bootsnap
Bootsnap is a library that plugs into a number of Ruby and (optionally) ActiveSupport and YAML methods to optimize and cache expensive computations.
Using Bootsnap consists of adding it to your Gemfile and adding a snippet, such as the one below, right after your require 'bundler/setup' line:
require 'bootsnap'
Bootsnap.setup(
  cache_dir:            'tmp/cache', # Path to your cache
  development_mode:     ENV['MY_ENV'] == 'development',
  load_path_cache:      true,        # Should we optimize the LOAD_PATH with a cache?
  autoload_paths_cache: true,        # Should we optimize ActiveSupport autoloads with cache?
  disable_trace:        false,       # Sets `RubyVM::InstructionSequence.compile_option = { trace_instruction: false }`
  compile_cache_iseq:   true,        # Should compile Ruby code into ISeq cache?
  compile_cache_yaml:   true         # Should compile YAML into a cache?
)
How Does This Work?
Bootsnap’s features can be grouped into two broad categories:
- Path pre-scanning
  - Kernel#require and Kernel#load are modified to eliminate $LOAD_PATH scans.
  - ActiveSupport::Dependencies.{autoloadable_module?,load_missing_constant,depend_on} are overridden to eliminate scans of ActiveSupport::Dependencies.autoload_paths.
- Compilation caching
  - RubyVM::InstructionSequence.load_iseq is implemented to cache the result of Ruby bytecode compilation.
  - YAML.load_file is modified to cache the result of loading a YAML object, in MessagePack format (or Marshal, if the document uses types unsupported by MessagePack).
Path Pre-Scanning
(This work is a minor evolution of bootscale).
Upon initialization of bootsnap or modification of the path (e.g. $LOAD_PATH), Bootsnap::LoadPathCache will fetch a list of requirable entries from a cache, or, if necessary, perform a full scan and cache the result.
Later, when we run (e.g.) require 'foo', Ruby would iterate through every item on our $LOAD_PATH ['x', 'y', ...], looking for x/foo.rb, y/foo.rb, and so on. Bootsnap instead looks at all the cached requirables for each $LOAD_PATH entry and substitutes the full expanded path of the match Ruby would have eventually chosen.
If you look at the syscalls generated by this behaviour, the net effect is that what would previously look like this:
open x/foo.rb # (fail)
# (imagine this with 500 $LOAD_PATH entries instead of two)
open y/foo.rb # (success)
close y/foo.rb
open y/foo.rb
...
becomes this:
open y/foo.rb
...
(This open/close/open behaviour is an implementation quirk in Ruby, which could be eliminated. We’ve posted a patch that eliminates it in some cases).
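To make the strategy concrete, here is a minimal sketch of the idea in plain Ruby (illustrative only, not Bootsnap's actual code): scan each $LOAD_PATH entry once, index every requirable file it contains, and resolve later requires with a single hash lookup. A real implementation would also index native extensions and persist the index to disk.
# Illustrative only: build an in-memory index of requirable features.
require_index = {}

$LOAD_PATH.each do |entry|
  Dir.glob('**/*.rb', base: entry).each do |relative|   # one directory scan per entry (Ruby 2.5+)
    feature = relative.sub(/\.rb\z/, '')                # "foo/bar.rb" -> "foo/bar"
    require_index[feature] ||= File.expand_path(relative, entry)
  end
end

# require 'foo/bar' can now be resolved to a full path with one lookup
# instead of probing every $LOAD_PATH entry on disk:
resolved = require_index['foo/bar']
require(resolved || 'foo/bar') # fall back to the normal scan on a cache miss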
Exactly the same strategy is employed for methods that traverse ActiveSupport::Dependencies.autoload_paths if the autoload_paths_cache option is given to Bootsnap.setup.
Cache Invalidation
The following diagram flowcharts the overrides that make the *_path_cache features work.
Bootsnap classifies path entries into two categories: volatile and stable. Volatile entries are scanned each time the application boots, and their caches are only valid for 30 seconds. Stable entries do not expire; once their contents have been scanned, they are assumed never to change.
The only directories considered "stable" are those under the Ruby install prefix (RbConfig::CONFIG['prefix'], e.g. /usr/local/ruby or ~/.rubies/x.y.z) and those under Gem.path (e.g. ~/.gem/ruby/x.y.z). Everything else is considered "volatile".
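As a rough sketch of that classification (the predicate name here is ours, not Bootsnap's):
require 'rbconfig'

# Illustrative: a path entry is "stable" if it lives under the Ruby install
# prefix or one of the RubyGems paths; everything else is "volatile".
def stable_path_entry?(path)
  roots = [RbConfig::CONFIG['prefix'], *Gem.path].map { |root| File.expand_path(root) }
  expanded = File.expand_path(path)
  roots.any? { |root| expanded.start_with?(root) }
end

stable_path_entry?(RbConfig::CONFIG['rubylibdir']) # => true  (ships with Ruby; never rescanned)
stable_path_entry?(Dir.pwd)                        # => false (application code; rescanned on boot)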
In addition to the Bootsnap::LoadPathCache::Cache source, this diagram may help clarify how entry resolution works:
Caching LoadErrors
It's also important to note how expensive LoadErrors can be. If Ruby invokes require 'something', but that file isn't on $LOAD_PATH, it takes 2 * $LOAD_PATH.length filesystem accesses to determine that. The coefficient of 2 is due to scanning for both 'something.rb' and 'something.bundle' for native extensions. Bootsnap caches this result too, raising a LoadError without touching the filesystem at all.
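A toy version of that negative cache might look like the following; Bootsnap hooks require differently and invalidates these entries along with the rest of the path cache, so treat this purely as an illustration of the concept.
# Toy illustration: remember features that failed to load and raise
# immediately on the next attempt, skipping the $LOAD_PATH scan entirely.
module Kernel
  REMEMBERED_LOAD_ERRORS = {}

  alias_method :require_without_miss_cache, :require

  def require(feature)
    if REMEMBERED_LOAD_ERRORS[feature]
      raise LoadError, "cannot load such file -- #{feature}"
    end
    require_without_miss_cache(feature)
  rescue LoadError
    REMEMBERED_LOAD_ERRORS[feature] = true
    raise
  end
  private :require
end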
Ruby Compilation Caching
(A simpler implementation of this concept can be found in yomikomu).
Ruby has a complex grammar, and parsing it is not a particularly cheap operation. Since 1.9, Ruby has translated Ruby source to an internal bytecode format, which is then executed by the Ruby VM. Since 2.3, Ruby exposes an API that allows caching that bytecode. This allows us to bypass the relatively expensive compilation step on subsequent loads of the same file.
These compilation results are stored using xattrs on the source files. This is likely to change in the future, as it has some limitations (notably precluding Linux support except where the user feels like changing mount flags). However, this is a very performant implementation.
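Here is a simplified version of how the load_iseq hook can be used this way. To keep the sketch portable it writes the binary iseq to a cache directory rather than to xattrs, and the cache location and key scheme are illustrative rather than what Bootsnap actually does.
require 'digest'
require 'fileutils'

class RubyVM::InstructionSequence
  CACHE_DIR = File.expand_path('tmp/iseq-cache')

  # Ruby calls this hook (when defined) for every .rb file it is about to load.
  # Returning an InstructionSequence skips parsing and compiling; returning nil
  # falls back to the normal path.
  def self.load_iseq(path)
    FileUtils.mkdir_p(CACHE_DIR)
    key  = Digest::SHA256.hexdigest("#{path}:#{File.mtime(path).to_i}")
    file = File.join(CACHE_DIR, key)

    if File.exist?(file)
      load_from_binary(File.binread(file))  # cache hit: reuse stored bytecode
    else
      compile_file(path).tap do |iseq|
        File.binwrite(file, iseq.to_binary) # cache miss: compile once, store
      end
    end
  rescue StandardError
    nil # on any cache problem, let Ruby compile the file normally
  end
end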
Whereas before, the sequence of syscalls generated to require a file would look like:
open /c/foo.rb -> m
fstat64 m
close m
open /c/foo.rb -> o
fstat64 o
fstat64 o
read o
read o
...
close o
The multiplicity of fstat64 calls here is largely redundant; the three calls could be refactored into one if someone cared to modify Ruby’s file loading extensively. With bootsnap, we get:
open /c/foo.rb -> n
fstat64 n
fgetxattr n
fgetxattr n
close n
Bootsnap writes two xattrs attached to each file read:
- user.aotcc.value, the binary compilation result; and
- user.aotcc.key, a cache key to determine whether user.aotcc.value is still valid.
The key includes several fields:
- version, hardcoded in bootsnap; essentially a schema version;
- compile_option, which changes when RubyVM::InstructionSequence.compile_option does;
- data_size, the number of bytes in user.aotcc.value, which we need in order to read it into a buffer using fgetxattr(2);
- ruby_revision, the version of Ruby this was compiled with; and
- mtime, the last-modification timestamp of the source file when it was compiled.
If the key is valid, the result is loaded from the value. Otherwise, it is regenerated and clobbers the current cache.
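Assembled in Ruby, the key might look something like this (field names from the list above; the actual encoding Bootsnap writes into the xattr is a compact binary layout, so this hash is only illustrative):
# Illustrative cache key for a compiled file.
def compile_cache_key(source_path, binary)
  {
    version:        1,                                               # schema version, hardcoded in bootsnap
    compile_option: RubyVM::InstructionSequence.compile_option.hash, # invalidate when compile options change
    data_size:      binary.bytesize,                                 # buffer size for fgetxattr(2)
    ruby_revision:  RUBY_REVISION,                                   # invalidate across Ruby builds
    mtime:          File.mtime(source_path).to_i                     # invalidate when the source changes
  }
end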
This diagram may help illustrate how it works:
YAML Compilation Caching
We also noticed that we spend a lot of time loading YAML documents during our application boot, and that MessagePack and Marshal are much faster at deserialization than YAML, even compared to a fast YAML implementation. We use the same strategy of compilation caching for YAML documents, with the equivalent of Ruby's "bytecode" format being a MessagePack document (or, in the case of YAML documents with types unsupported by MessagePack, a Marshal stream).
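A stripped-down version of that strategy is sketched below; the cache location, invalidation, and error handling are all simplified, and Bootsnap’s real implementation differs.
require 'yaml'
require 'msgpack' # third-party gem; assumed to be available

# Illustrative only: cache the parsed YAML next to the source file as MessagePack.
def load_yaml_cached(path, cache_path = "#{path}.msgpack")
  if File.exist?(cache_path) && File.mtime(cache_path) >= File.mtime(path)
    return MessagePack.unpack(File.binread(cache_path)) # fast path: no YAML parsing
  end

  data = YAML.load_file(path)
  begin
    File.binwrite(cache_path, MessagePack.pack(data))
  rescue StandardError
    # Types MessagePack can't represent; Bootsnap falls back to Marshal for these.
  end
  data
end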
Putting it all together
Imagine we have this file structure:
/
├── a
├── b
└── c
└── foo.rb
And this $LOAD_PATH:
["/a", "/b", "/c"]
When we call require 'foo' without bootsnap, Ruby would generate this sequence of syscalls:
open /a/foo.rb -> -1
open /b/foo.rb -> -1
open /c/foo.rb -> n
close n
open /c/foo.rb -> m
fstat64 m
close m
open /c/foo.rb -> o
fstat64 o
fstat64 o
read o
read o
...
close o
With bootsnap, we get:
open /c/foo.rb -> n
fstat64 n
fgetxattr n
fgetxattr n
close n
If we call require 'nope' without bootsnap, we get:
open /a/nope.rb -> -1
open /b/nope.rb -> -1
open /c/nope.rb -> -1
open /a/nope.bundle -> -1
open /b/nope.bundle -> -1
open /c/nope.bundle -> -1
...
and if we call require 'nope' with bootsnap, we get...
# (nothing!)
Results
The benefit one can expect from bootsnap depends quite a bit on application size. The larger the app, the greater the relative improvement:
- One of our smaller internal apps sees a reduction of 50%, from 3.6 to 1.8 seconds
- Our core platform -- a rather large monolithic application -- boots about 75% faster, dropping from around 25s to 6.5s
- To show that Bootsnap also works for other applications, Discourse reports a boot time reduction of approximately 50%, from roughly 6 to 3 seconds on one machine
Though we use bootsnap extensively, there’s still work to do. The compilation caching strategy is only tested on macOS right now. There’s no compelling reason we couldn’t do the same thing on our production Linux servers, or on Linux development machines.