Over the past year and a bit, Shopify has been progressively rebuilding parts of our developer tooling with Nix. I initially planned to write about how we're using Nix now, and what we're going to do with it in the future (spoiler: everything?). However, I realize that most of you won't have a really clear handle on what Nix is, and I haven't found a lot of the introductory material to convey a clear impression very quickly, so this article is going to be a crash course in what Nix is, how to think about it, and why it's such a valuable and paradigm-shifting piece of technology.
There are a few places in this post where I will lie to you in subtle ways to gloss over all of the small nuances and exceptions to rules. I'm not going to call these out. I'm just trying to build a general understanding. At the end of this post, you should have the basic conceptual scaffolding you need in order to think about Nix. Let's dive in!
What is Nix?
The most basic, fundamental idea behind Nix is this:
Everything on your computer implicitly depends on a whole bunch of other things on your computer.
- All software exists in a graph of dependencies.
- Most of the time, this graph is implicit.
- Nix makes this graph explicit.
Four Building Blocks
Let's get this out of the way up front: Nix is a hard thing to explain.
There are a few components that you have to understand in order to really get it, and all of their explanations are somewhat interdependent; and, even after explaining all of these building blocks, it still takes a bit of mulling over the implications of how they compose in order for the magic of Nix to really click. Nevertheless, we'll try, one block at a time.
The major building blocks, at least in my mental model of Nix, are:
- The Nix Store
- Derivations
- Sandboxing
- The Nix Language.
The Nix Store
The easiest place to start is the Nix Store. Once you've installed Nix, you'll wind up with a directory at /nix/store
, containing a whole bunch of entries that look something like this:
3mfcmgmpcqjajpdhfh8pdazmmd4vskns-nix-2.3.3-man/
h9bvv0qpiygnqykn4bf7r3xrxmvqpsrd-nix-2.3.3/
nrb3rkvwz114053yh00r7p2dlc9igp03-nix-2.3.3.drv
This directory, /nix/store
, is a kind of Graph Database. Each entry (each file or directory directly under /nix/store
) is a Node in that Graph Database, and the relationships between them constitute Edges.
The only thing that's allowed to write directories and files into /nix/store
is Nix itself, and after Nix writes a Node into this Graph Database, it's completely immutable forever after: Nix guarantees that the contents of a Node don't change after it's been created. Further, due to magic that we'll discuss later, the contents of a given Node are guaranteed to be functionally identical to those of a Node with the same name in some other Graph, regardless of where they're built.
What, then, is a "relationship between them?" Put another way, what is an Edge? Well, the first part of a Store path (the 32-character-long alphanumeric blob) is a cryptographic hash (of what, we'll discuss later). If a file in some other Store path includes the literal text "h9bvv0qpiygnqykn4bf7r3xrxmvqpsrd-nix-2.3.3
", that constitutes a graph Edge pointing from the Node containing that text to the Node referred to by that path. Nodes in the Nix store are immutable after they're created, and the Edges they originate are scanned and cached elsewhere when they're first created.
To demonstrate this linkage, if you run otool -L
(or ldd
on Linux) on the nix
binary, you'll see a number of libraries referenced, and these look like:
/nix/store/gk9l41kp852lddrvjx9cfkgxwjs3vls8-libsodium-1.0.16/lib/libsodium.23.dylib
That's extracted by otool
or ldd
, but ultimately comes from text embedded in the binary, and Nix sees this too when it determines the Edges directed from this Node.
Highly astute readers may be skeptical that scanning for literal path references in a Node after it's created is a reliable way to determine a dependency. For now, just take it as given that this, surprisingly, works almost flawlessly in practice.
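To make the idea of "Edges by literal text" concrete, here is a minimal Python sketch. It is not Nix's actual scanner: the function name scan_references and the fake binary contents are made up for illustration, and the only thing it shows is that finding an Edge amounts to finding a store entry's name inside another entry's bytes.

```python
def scan_references(contents: bytes, candidates: list[str]) -> list[str]:
    """Return the candidate store entries whose names occur literally in contents."""
    return [name for name in candidates if name.encode() in contents]


# Hypothetical stand-in for a compiled binary: opaque bytes with an embedded path.
binary = (
    b"\x00\x01\x02"
    b"/nix/store/gk9l41kp852lddrvjx9cfkgxwjs3vls8-libsodium-1.0.16/lib/libsodium.23.dylib"
    b"\x00"
)

# Store entries we know about; only one of them is mentioned in the binary.
candidates = [
    "gk9l41kp852lddrvjx9cfkgxwjs3vls8-libsodium-1.0.16",
    "2d0ikpigmr9fi2gx3g3gb0g8mg4f6a0n-xz-5.2.4",
]

edges = scan_references(binary, candidates)
```

Here edges contains only the libsodium entry, because that is the only name that appears in the bytes.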
To put this into practice, we can demonstrate just how much of a Graph Database this actually is using nix-store --query
. nix-store
is a tool built in to Nix that interacts directly with the Nix Store, and the --query
mode has a multitude of flags for asking different questions of the Graph Database that is the Store.
Let's find all of the Nodes that <hash>-nix-2.3.3
has Edges pointing to:
$ nix-store --query --references /nix/store/h9bvv0qpiygnqykn4bf7r3xrxmvqpsrd-nix-2.3.3/
/nix/store/fxvxl64g1b336ayhzsrqdcv541zpb6lx-Libsystem-osx-10.12.6
/nix/store/2d0ikpigmr9fi2gx3g3gb0g8mg4f6a0n-xz-5.2.4
/nix/store/gk9l41kp852lddrvjx9cfkgxwjs3vls8-libsodium-1.0.16
...(and 21 more)...
Similarly, we could ask for the Edges pointing to this node using --referrers
, or we could ask for the full transitive closure of Nodes reachable from the starting Node using --requisites
.
The transitive closure is an important concept in Nix, but you don't really have to understand the graph theory: An Edge directed from a Node is logically a dependency: if a Node includes a reference to another Node, it depends on that Node. So, the transitive closure (--requisites
) also includes those dependencies' dependencies, and so on recursively, to include the total set of things depended on by a given Node.
For example, a Ruby application may depend on the result of bundling together all the rubygems specified in the Gemfile. That bundle may depend on the result of installing the Gem nokogiri
, which may depend on libxml2 (which may depend on libc or libSystem). All of these things are present in the transitive closure of the application (--requisites
), but only the gem bundle is a direct reference (--references
).
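The difference between direct references and the transitive closure can be sketched in a few lines of Python. The graph below is a hand-written stand-in for the Ruby example, with made-up node names; the function names just mirror the two nix-store flags, and this is a toy model rather than how Nix stores its Edges.

```python
# Hypothetical dependency graph: each Node maps to the Nodes it refers to directly.
REFERENCES = {
    "app": ["gem-bundle"],
    "gem-bundle": ["nokogiri"],
    "nokogiri": ["libxml2"],
    "libxml2": ["libSystem"],
    "libSystem": [],
}


def references(node: str) -> list[str]:
    """Direct Edges only, like --references."""
    return list(REFERENCES[node])


def requisites(node: str) -> set[str]:
    """Everything reachable from node (including node itself), like --requisites."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(REFERENCES[current])
    return seen


direct = references("app")
closure = requisites("app")
```

The application has a single direct reference (the gem bundle), while its closure pulls in everything down to libSystem.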
Now here's the key thing: This transitive closure of dependencies always exists, even outside of Nix: these things are always dependencies of your application, but normally, your computer is just trusted to have acceptable versions of acceptable libraries in acceptable places. Nix removes these assumptions and makes the whole graph explicit.
To really drive home the "graphiness" of software dependencies, we can install Ruby via nix (nix-env -iA nixpkgs.ruby
) and then build a graph of all of its dependencies:
nix-store --query --graph $(which ruby) \
| nix run nixpkgs.graphviz -c dot -Tsvg > ruby.svg
Graphiness of Software Dependencies
Derivations
The second building block is the Derivation. Above, I offhandedly mentioned that only Nix can write things into the Nix Store, but how does it know what to write? Derivations are the key.
A Derivation is a special Node in the Nix store, which tells Nix how to build one or more other Nodes.
If you list your /nix/store
, you'll see a whole lot of items most likely, but some of them will end in .drv
:
/nix/store/ynzfmamryf6lrybjy1zqp1x190l5yiy5-demo.drv
This is a Derivation. It's a special format written and read by Nix, which gives build instructions for anything in the Nix store. Just about everything (except Derivations) in the Nix store is put there by building a Derivation.
So what does a Derivation look like?
$ cat /nix/store/ynzfmamryf6lrybjy1zqp1x190l5yiy5-demo.drv
Derive([("out","/nix/store/76gxh82dqh6gcppm58ppbsi0h5hahj07-demo","","")],[],[],"x86_64-darwin","/nix/store/5arhyyfgnfs01n1cgaf7s82ckzys3vbg-bash-4.4-p23/bin/bash",["-c","echo 'hello world' > $out"],[("builder","/nix/store/5arhyyfgnfs01n1cgaf7s82ckzys3vbg-bash-4.4-p23/bin/bash"),("name","demo"),("out","/nix/store/76gxh82dqh6gcppm58ppbsi0h5hahj07-demo"),("system","x86_64-darwin")])
That's not especially readable, but there's a couple of important concepts to communicate here:
- Everything required to build this Derivation is explicitly listed in the file by path (you can see "bash" here, for example).
- The hash component of the Derivation's path in the Nix Store is essentially a hash of the contents of the file.
Since every direct dependency is mentioned in the contents, and the path is a hash of the contents, that means that if the dependencies and whatever other information the derivation contains don't change, the hash won't change, but if a different version of a dependency is used, the hash changes.
There are a few different ways to build Derivations. Let's use nix-build
:
$ nix-build /nix/store/ynzfmamryf6lrybjy1zqp1x190l5yiy5-demo.drv
/nix/store/76gxh82dqh6gcppm58ppbsi0h5hahj07-demo
$
This ran whatever the build instructions were and generated a new path in the Nix Store (a new Node in the Graph Database).
Take a close look at the hash in the newly-created path. You'll see the same hash in the Derivation contents above. That output path was pre-defined, but not pre-generated. The output path is also a stable hash. You can essentially think of it as being a hash of the derivation and also the name of the output (in this case: "out"; the default output).
So, if a dependency of the Derivation changes, that changes the hash of the Derivation. It also changes the hashes of all of that Derivation's outputs. This means that changing a dependency of a dependency of a dependency bubbles all the way down the tree, changing the hashes of every Derivation that depends on the changed thing, directly or indirectly, along with the hashes of all of those Derivations' outputs.
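Here is a rough Python sketch of that bubbling effect. The hashing scheme is deliberately simplified (a truncated SHA-256 over some text) and is not what Nix really computes; the package names, version strings, and the helper names drv_hash and build_chain are made up. The point is only that a hash which covers its dependencies' hashes changes whenever anything upstream changes.

```python
import hashlib


def drv_hash(name: str, recipe: str, deps: list[str]) -> str:
    """Toy hash over a recipe plus the hashes of its direct dependencies."""
    payload = "\n".join([name, recipe, *sorted(deps)])
    return hashlib.sha256(payload.encode()).hexdigest()[:32]


def build_chain(libc_version: str) -> dict[str, str]:
    """Hash a small hypothetical chain: nokogiri -> libxml2 -> libc, plus an unrelated package."""
    libc = drv_hash("libc", f"build libc {libc_version}", [])
    libxml2 = drv_hash("libxml2", "build libxml2", [libc])
    nokogiri = drv_hash("nokogiri", "build nokogiri", [libxml2])
    hello = drv_hash("hello", "build hello", [])
    return {"libc": libc, "libxml2": libxml2, "nokogiri": nokogiri, "hello": hello}


before = build_chain("2.30")
after = build_chain("2.31")

# Names whose hash differs between the two runs.
changed = sorted(name for name in before if before[name] != after[name])
```

Only the libc recipe was edited, yet libxml2 and nokogiri get new hashes too, while the unrelated hello package keeps its old one.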
Let's break apart that unreadable blob of Derivation content from above a little bit.
- outputs: What nodes can this build?
- inputDrvs: Other Derivations that must be built before this one
- inputSrcs: Things already in the store on which this build depends
- platform: Is this for macOS? Linux?
- builder: What program gets run to do the build?
- args: Arguments to pass to that program
- env: Environment variables to set for that program
Or, to dissect that Derivation:
outputs
[("out","/nix/store/76gxh82dqh6gcppm58ppbsi0h5hahj07-demo","","")]
This Derivation has one output, named "out" (the default name), with the path that would be generated if we were to build it.
inputDrvs
[ ]
This is a simple toy derivation, with no inputDrvs
. What this really means is that there are no dependencies, other than the builder. Normally, you would see something more like:
[("/nix/store/4kgf3y9sm84jzcl3k3bn8vzl7fgafpm9-openssh-8.1p1.drv",["out"])]
This indicates a dependency on the OpenSSH Derivation's default output.
inputSrcs
[ ]
Again, we have a very simple toy Derivation! Commonly, you will see:
["/nix/store/m00k69wikx3p7av28s0m40z9ipahw5ky-builder.sh"]
It's not really critical to the mental model, but Nix can also copy static files into the Nix Store in some limited ways, and these aren't really constructed by Derivations. This field just lists any static files in the Nix store on which this Derivation depends.
platform
"x86_64-darwin"
Nix runs on multiple platforms and CPU architectures, and often the output of compilers will only work on one of these, so the derivation needs to indicate which architecture it's intended for.
There's actually an important point here: Nix Store entries can be copied around between machines without concern, because all of their dependencies are explicit. The CPU details are a dependency in many cases.
builder
"/nix/store/5arhyyfgnfs01n1cgaf7s82ckzys3vbg-bash-4.4-p23/bin/bash"
This program is executed with args
and env
, and is expected to generate the output(s).
args
["-c","echo 'hello world' > $out"]
You can see that the output name ("out") is being used as a variable here. We're running, basically, bash -c "echo 'hello world' > $out"
. This should just be writing the text "hello world" into the Derivation output.
env
[("builder","/nix/store/5arhyyfgnfs01n1cgaf7s82ckzys3vbg-bash-4.4-p23/bin/bash"),
("name","demo"),
("out","/nix/store/76gxh82dqh6gcppm58ppbsi0h5hahj07-demo"),("system","x86_64-darwin")]
Each of these is set as an Environment Variable before calling the builder
, so you can see how we got that $out
variable above, and note that it's the same as the path given in outputs above.
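If you want to poke at those seven fields programmatically, here is a small Python sketch. It leans on a coincidence: for this simple example, the text after the word Derive happens to be a valid Python literal, so ast.literal_eval can read it. That is an assumption that will not hold for every .drv file, so treat this as a toy and not a real parser. The FIELDS tuple and parse_drv are names I made up, following the order of the list above.

```python
import ast

# The demo Derivation, with the builder as discussed in the field-by-field walkthrough.
DRV_TEXT = (
    'Derive([("out","/nix/store/76gxh82dqh6gcppm58ppbsi0h5hahj07-demo","","")],'
    '[],[],"x86_64-darwin",'
    '"/nix/store/5arhyyfgnfs01n1cgaf7s82ckzys3vbg-bash-4.4-p23/bin/bash",'
    '["-c","echo \'hello world\' > $out"],'
    '[("builder","/nix/store/5arhyyfgnfs01n1cgaf7s82ckzys3vbg-bash-4.4-p23/bin/bash"),'
    '("name","demo"),'
    '("out","/nix/store/76gxh82dqh6gcppm58ppbsi0h5hahj07-demo"),'
    '("system","x86_64-darwin")])'
)

FIELDS = ("outputs", "inputDrvs", "inputSrcs", "platform", "builder", "args", "env")


def parse_drv(text: str) -> dict:
    """Split a simple Derive(...) blob into named fields (toy parser, see caveats)."""
    if not text.startswith("Derive("):
        raise ValueError("not a Derive(...) blob")
    values = ast.literal_eval(text[len("Derive"):])  # a 7-element tuple
    return dict(zip(FIELDS, values))


drv = parse_drv(DRV_TEXT)
env = dict(drv["env"])  # list of (name, value) pairs -> dictionary
```

With that, drv["args"] is the two-element argument list, and env["out"] is the same path as the one in drv["outputs"].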
Derivation in Summary
So, if we build that Derivation, let's see what the output is:
$ nix-build /nix/store/ynzfmamryf6lrybjy1zqp1x190l5yiy5-demo.drv
/nix/store/76gxh82dqh6gcppm58ppbsi0h5hahj07-demo
$ cat /nix/store/76gxh82dqh6gcppm58ppbsi0h5hahj07-demo
hello world
$
As we expected, it's "hello world".
A Derivation is a recipe to build some other path in the Nix Store.
Sandboxing
After walking through that Derivation in the last section, you may be starting to develop a feel for how explicitly-declared dependencies make it into the build, and how that Graph structure comes together—but what prevents builds from referring to things at undeclared paths, or things that aren't in the Nix store at all?
Nix does a lot of work to make sure that builds can only see the Nodes in the Graph which their Derivation has declared, and also, that they don't access things outside of the store.
A Derivation build simply cannot access anything not declared by the Derivation. This is enforced in a few ways:
- For the most part, Nix uses patched versions of compilers and linkers that don't try to look in the default locations (/usr/lib, and so on).
- Nix typically builds Derivations in an actual sandbox that denies access to everything that the build isn't supposed to access.
A Sandbox is created for a Derivation build that gives filesystem read access to—and only to—the paths explicitly mentioned in the Derivation.
What this amounts to is that artifacts in the Nix Store essentially can't depend on anything outside of the Nix Store.
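As a mental model only, you can picture the sandbox's read rule as an allow-list check. The Python sketch below is not how Nix enforces anything (the real sandbox is an operating-system mechanism); can_read and the declared set are hypothetical names that just illustrate "declared paths, and nothing else".

```python
from pathlib import PurePosixPath


def can_read(path: str, declared: set[str]) -> bool:
    """True if path is a declared store entry or lies somewhere inside one."""
    target = PurePosixPath(path)
    for entry in declared:
        root = PurePosixPath(entry)
        if target == root or root in target.parents:
            return True
    return False


# The demo Derivation only mentions bash, so that is all the build may see.
declared = {"/nix/store/5arhyyfgnfs01n1cgaf7s82ckzys3vbg-bash-4.4-p23"}

bash_ok = can_read(
    "/nix/store/5arhyyfgnfs01n1cgaf7s82ckzys3vbg-bash-4.4-p23/bin/bash", declared
)
system_lib_ok = can_read("/usr/lib/libSystem.dylib", declared)
undeclared_ok = can_read(
    "/nix/store/gk9l41kp852lddrvjx9cfkgxwjs3vls8-libsodium-1.0.16/lib/libsodium.23.dylib",
    declared,
)
```

The declared bash path is readable; a system library and an undeclared store entry are both refused.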
The Nix Language
And finally, the block that brings it all together: the Nix Language.
Nix has a custom language used to construct derivations. There's a lot we could talk about here, but there are two major aspects of the language's design to draw attention to. The Nix Language is:
- lazy-evaluated
- (almost) free of side-effects.
I'll try to explain these by example.
Lazy Evaluation
Take a look at this code:
data = {
a = 1;
b = functionThatTakesMinutesToRun 1;
};
This is Nix code. You can probably figure out what's going on here: we're creating something like a hash table containing keys "a" and "b", and "b" is the result of calling an expensive function.
In Nix, this code takes approximately no time to run, because the value of "b" isn't actually evaluated until it's needed. We could even:
let
data = {
a = 1;
b = functionThatTakesMinutesToRun 1;
};
in data.a
Here, we're creating the table (technically called an Attribute Set in Nix), and extracting "a" from it.
This evaluates to "1" almost instantly, without ever running the code that generates "b".
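If laziness is unfamiliar, the same behaviour can be imitated in Python by wrapping each value in a zero-argument function (a "thunk") that only runs when called. This is a sketch of the idea, not of Nix's evaluator; the calls list exists purely so we can observe whether the expensive function ever ran.

```python
calls = []  # records every time the expensive function actually runs


def function_that_takes_minutes_to_run(x):
    calls.append(x)
    return x * 2


# Each attribute is a thunk: nothing is computed until the thunk is called.
data = {
    "a": lambda: 1,
    "b": lambda: function_that_takes_minutes_to_run(1),
}

result = data["a"]()          # force only "a"
calls_after_a = list(calls)   # snapshot: still empty, "b" was never forced

forced_b = data["b"]()        # now force "b" explicitly
```

Extracting "a" never touches the expensive function; it only runs once "b" is explicitly forced.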
Conspicuously absent in the code samples above is any sort of actual work getting done, other than just pushing data around within the Nix language. The reason for this is that the Nix language can’t actually do very much.
Free of Side Effects (almost)
The Nix language lacks a lot of features you will expect in normal programming languages. It has
- No networking
- No user input
- No file writing
- No output (except limited debug/tracing support).
It doesn't actually do anything at all in terms of interacting with the world…well, except for when you call the derivation
function.
One Side Effect
The Nix Language has precisely one function with a side effect. When you call derivation
with the right set of arguments, Nix writes out a new <hash>-<name>.drv
file into the Nix Store as a side effect of calling that function.
For example:
derivation {
name = "demo";
builder = "${bash}/bin/bash";
args = [ "-c" "echo 'hello world' > $out" ];
system = "x86_64-darwin";
}
If you evaluate this in nix repl
, it will print something like:
«derivation /nix/store/ynzfmamryf6lrybjy1zqp1x190l5yiy5-demo.drv»
That returned object is just the object you passed in (with name, builder, args, and system keys) plus a few extra fields, including drvPath, which is what got printed after the call to derivation. More importantly, that path in the Nix store was actually created.
It's worth emphasizing again: This is basically the only thing that the Nix Language can actually do. There's a whole lot of pushing data and functions around in Nix code, but it all boils down to calls to derivation
.
Note that we referred to ${bash}
in that Derivation. This is actually the Derivation from earlier in this article, and that variable substitution is actually how Derivations depend on each other. The variable bash
refers to another call to derivation
, which generates instructions to build bash
when it's evaluated.
The Nix Language doesn't ever actually build anything. It creates Derivations, and later, other Nix tools read those derivations and build the outputs. The Nix Language is just a Domain Specific Language for creating Derivations.
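To tie those pieces together, here is a toy Python model of a derivation function. Everything in it is an assumption made for illustration: STORE is an in-memory dictionary standing in for the Nix Store, the hashes are truncated SHA-256 hex digests and not Nix's real scheme, and the Drv class exists only so that interpolating a derivation into a string yields its output path. It shows the shape of the idea: calling the function records a recipe, dependencies arise from interpolation, and nothing gets built.

```python
import hashlib

STORE = {}  # hypothetical stand-in for /nix/store: drv path -> recorded attributes


class Drv(dict):
    """Attribute set returned by derivation(); interpolates as its output path."""

    def __str__(self):
        return self["outPath"]


def _toy_hash(text):
    # Real Nix hashes are computed differently; this only needs to be stable.
    return hashlib.sha256(text.encode()).hexdigest()[:32]


def derivation(attrs):
    body = repr(sorted(attrs.items()))
    drv_path = f"/nix/store/{_toy_hash(body)}-{attrs['name']}.drv"
    out_path = f"/nix/store/{_toy_hash(body + ':out')}-{attrs['name']}"
    STORE[drv_path] = dict(attrs)  # the one side effect: a recipe gets recorded
    return Drv(attrs, drvPath=drv_path, outPath=out_path)


# A made-up recipe for bash, then the demo derivation that depends on it.
bash = derivation(
    {"name": "bash-4.4-p23", "builder": "/bin/sh", "args": [], "system": "x86_64-darwin"}
)
demo = derivation({
    "name": "demo",
    "builder": f"{bash}/bin/bash",  # interpolation yields bash's output path
    "args": ["-c", "echo 'hello world' > $out"],
    "system": "x86_64-darwin",
})
```

After this runs, STORE holds two recipes and no outputs. The demo recipe mentions bash's output path, which is exactly the kind of literal reference that forms an Edge.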
Nixpkgs: Derivation and Lazy Evaluation
Nixpkgs is the global default package repository for Nix, but it's very unlike what you probably think of when you hear "package repository."
Nixpkgs is a single Nix program. It makes use of the fact that the Nix Language is Lazy Evaluated, and includes many, many calls to derivation
. The (simplified but) basic structure of Nixpkgs is something like:
{
ruby = derivation { ... };
python = derivation { ... };
nodejs = derivation { ... };
…
}
In order to build “ruby”, various tools just force Nix to evaluate the “ruby” attribute of that Attribute Set, which calls derivation
, generating the Derivation for Ruby into the Nix Store and returning the path of the .drv file that was written. Then, the tool runs something like nix-build
on that path to generate the output.
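A rough Python analogue of that structure is a dictionary of thunks, where asking for one package only evaluates that package and whatever it refers to. The package set below is invented for illustration (including the assumption that ruby and python both depend on openssl), and the stand-in derivation function just records which names were evaluated.

```python
def make_nixpkgs():
    """Build a tiny hypothetical package set plus a log of what got evaluated."""
    evaluated = []

    def derivation(name, deps=()):
        evaluated.append(name)  # stand-in for writing a .drv file
        return {"name": name, "deps": [dep["name"] for dep in deps]}

    # Every attribute is a thunk, so defining the set evaluates nothing.
    pkgs = {
        "openssl": lambda: derivation("openssl"),
        "ruby": lambda: derivation("ruby", deps=[pkgs["openssl"]()]),
        "python": lambda: derivation("python", deps=[pkgs["openssl"]()]),
        "nodejs": lambda: derivation("nodejs"),
    }
    return pkgs, evaluated


nixpkgs, evaluated = make_nixpkgs()
evaluated_before = list(evaluated)  # snapshot: nothing evaluated yet

ruby = nixpkgs["ruby"]()  # force only the "ruby" attribute
```

Forcing ruby evaluates openssl and then ruby, while python and nodejs are never touched. Unlike Nix, this sketch does not cache forced values, so forcing python afterwards would evaluate openssl a second time.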
Shipit! Presents: How Shopify Uses Nix
Well, it takes a lot more words than I can write here—and probably some amount of hands-on experimentation—to let you really, viscerally, feel the paradigm shift that Nix enables, but hopefully I’ve given you a taste.
If you’re looking for more Nix content, I’m currently re-releasing a series of screencasts I recorded for developers at Shopify to the public. Check out Nixology on YouTube.
You can also join me for a discussion about how Shopify is using Nix to rebuild our developer tooling. I’ll cover some of this content again, and show off some of the tooling we actually use on a day-to-day basis.
What: ShipIt! Presents: How Shopify Uses Nix
Date: May 25, 2020 at 1:00 pm EST
Please view the recording at
https://engineering.shopify.com/blogs/engineering/shipit-presents-how-shopify-uses-nix
If you want to work on Nix, come join my team! We're always hiring, so visit our Engineering career page to find out about our open positions.