Reproducible builds - also known as deterministic builds - are the holy grail for many organizations that care about the authenticity and security of their products. In simple terms, it is the ability to produce exactly the same binary, down to a single bit, from the same code version and parameters, regardless of where and when it is built.
The main reason why this is useful is that anyone who might doubt the authenticity of a software release - or just wants to make sure they are getting exactly what they want - can build the software themselves and expect exactly the same result.
Because of the complexity of our stack here at Status, achieving deterministic builds required something more than the three package managers we already use: Yarn, Maven, and Gradle.
While Yarn is deterministic because it uses a lock file which ensures your dependency versions will not change without you knowing, the other two are much more problematic.
Both Maven and Gradle handle Java dependencies using POM (Project Object Model) files. These XML files define project dependencies, their types, and their relations. The problem is that they often do not specify the exact version required. On top of that, they are fetched from Maven repositories at build time and can change depending on when they are downloaded. The same goes for the maven-metadata.xml files, which define the latest available versions of a package.
The result is non-deterministic builds that use a different collection of dependencies when built at different times on different machines. The local Maven cache can affect this as well.
In order to control all dependencies and versions of tools used during build a complex solution was necessary. That solution is the Nix package manager.
Nix is a tool which uses its own purely functional language, often compared to Haskell, to define the entire tree of software dependencies necessary to manage an entire Linux operating system: NixOS.
The language allows a developer to define everything that is necessary to build a piece of software in a fully deterministic manner. This includes:
Because of this, all of the variables involved in a build of your software are controlled for. This includes everything, even - for example - timestamps: files in /nix/store have their modification time reset to a fixed value at the very start of Unix time, one second past 00:00:00 UTC on 1 January 1970.
As an example let's take the simplest program we can get: Hello World written in C.
#include <stdio.h>

int main()
{
    printf("Hello World");
    return 0;
}
This code is available from the collection of Hello World programs at www.helloworld.org.
In Nix we build software using a derivation. We'll make a simple one to build our helloworld.c program:
{ pkgs ? import <nixpkgs> { } }:

let
  inherit (pkgs) stdenv gcc fetchurl;
in stdenv.mkDerivation {
  pname = "hello";
  version = "1.0";
  buildInputs = [ gcc ];
  src = fetchurl {
    url = "https://www.helloworld.org/data/helloworld.c";
    sha256 = "1syr8snddx5v71arsvv205ka82qljhjg2424yylrp5rymr049w69";
  };
  buildPhases = [ "unpackPhase" "buildPhase" "installPhase" ];
  unpackPhase = ''
    cp $src ./hello.c
  '';
  buildPhase = ''
    gcc -o hello hello.c
  '';
  installPhase = ''
    mkdir -p $out/bin
    cp hello $out/bin/
  '';
}
If we run nix-build on this file, named default.nix, we'll get the resulting hello binary in /nix/store, which is where all Nix build results and inputs are stored:
> nix-build default.nix
these derivations will be built:
/nix/store/hzd4ci09wj4bdif462hjwh8imdg6lrl6-hello-1.0.drv
building '/nix/store/hzd4ci09wj4bdif462hjwh8imdg6lrl6-hello-1.0.drv'...
unpacking sources
patching sources
configuring
no configure script, doing nothing
building
installing
post-installation fixup
shrinking RPATHs of ELF executables and libraries in /nix/store/a089mjw3ylg46d79xm0143hk8rvylpk2-hello-1.0
shrinking /nix/store/a089mjw3ylg46d79xm0143hk8rvylpk2-hello-1.0/bin/hello
strip is /nix/store/hrkc2sf2883l16d5yq3zg0y339kfw4xv-binutils-2.31.1/bin/strip
stripping (with command strip and flags -S) in /nix/store/a089mjw3ylg46d79xm0143hk8rvylpk2-hello-1.0/bin
patching script interpreter paths in /nix/store/a089mjw3ylg46d79xm0143hk8rvylpk2-hello-1.0
checking for references to /build/ in /nix/store/a089mjw3ylg46d79xm0143hk8rvylpk2-hello-1.0...
/nix/store/a089mjw3ylg46d79xm0143hk8rvylpk2-hello-1.0
> /nix/store/a089mjw3ylg46d79xm0143hk8rvylpk2-hello-1.0/bin/hello
Hello World
At the top of the default.nix file sits the single argument:
{ pkgs ? import <nixpkgs> { } }:
The argument is an attribute set with one key: pkgs. This argument has a default value, import <nixpkgs> { }. This incantation essentially means we are importing the default nixpkgs, a massive Git repository that contains derivations to build a lot of software, including the GCC compiler we used in the build.
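The default-argument syntax is easier to see in isolation. Here is a minimal sketch, unrelated to nixpkgs, of a function that takes an attribute set and falls back to a default when a key is left out. The names greet, greeting, and name are made up for illustration.

```nix
# Evaluate with: nix-instantiate --eval --strict example.nix
let
  # A function taking an attribute set. `greeting` falls back to "Hello"
  # when the caller omits it, the same way `pkgs` falls back to
  # `import <nixpkgs> { }` in default.nix. `name` has no default,
  # so the caller must always pass it.
  greet = { greeting ? "Hello", name }: "${greeting} ${name}";
in {
  # Uses the default greeting: "Hello World"
  withDefault = greet { name = "World"; };
  # Overrides the default: "Goodbye World"
  overridden = greet { greeting = "Goodbye"; name = "World"; };
}
```

In the same way, anyone calling our default.nix can pass their own pkgs instead of relying on whatever <nixpkgs> resolves to on their machine.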
The next let/in block is simply a way to prepare some things we need before the build:
let
  inherit (pkgs) stdenv gcc fetchurl;
in
The stdenv, gcc, and fetchurl variables are actually keys of the pkgs set, so if we did not do this we could simply replace our use of fetchurl later with pkgs.fetchurl.
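To make that equivalence concrete, here is a small sketch that uses a stand-in pkgs set (plain strings instead of real packages, purely an assumption for illustration) and compares the two spellings.

```nix
# Evaluate with: nix-instantiate --eval example.nix
let
  # Stand-in for the real nixpkgs set; the values are only placeholders.
  pkgs = {
    fetchurl = "stand-in for pkgs.fetchurl";
    gcc = "stand-in for pkgs.gcc";
  };

  # `inherit (pkgs) fetchurl gcc;` is shorthand for
  # `fetchurl = pkgs.fetchurl; gcc = pkgs.gcc;`
  viaInherit = let inherit (pkgs) fetchurl gcc; in [ fetchurl gcc ];

  # The same values reached through explicit attribute access.
  viaDot = [ pkgs.fetchurl pkgs.gcc ];
in
  # Evaluates to true: both lists hold the same values.
  viaInherit == viaDot
```

Which form to use is mostly a matter of taste; inherit just saves repeating the pkgs. prefix.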
The mkDerivation line is simply a call of the mkDerivation function:
stdenv.mkDerivation {
The function is called with a single set ({ ... }) whose attributes are things like pname or version. Everything within the curly braces after the mkDerivation call is passed to it as that set:
stdenv.mkDerivation {
  pname = "hello";
  version = "1.0";
  buildInputs = [ gcc ];
  ...
They define things like buildInputs, which in our case is the gcc compiler coming from nixpkgs. Because it is passed to the build via buildInputs, the tools available in /nix/store/blznzy96bwzv58v7iy9bgp1r8hmd3g1f-gcc-wrapper-9.2.0/bin will be made available via the PATH environment variable.
Next, the build needs to get the program source from somewhere. In our case we use the fetchurl function, passing it a set with url for the source file and sha256 to make sure we are getting what we expect.
src = fetchurl {
  url = "https://www.helloworld.org/data/helloworld.c";
  sha256 = "1syr8snddx5v71arsvv205ka82qljhjg2424yylrp5rymr049w69";
};
Nix provides many other functions like fetchFromGitHub or fetchFromGitLab for other possible code sources.
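As a rough sketch of what one of those fetchers looks like in use, here is a fetchFromGitHub call. The owner, repo, and rev values are hypothetical placeholders, and the hash is deliberately a dummy: pkgs.lib.fakeSha256 is the nixpkgs helper for an all-zero hash, so the first build fails with a hash mismatch and prints the real value to paste in.

```nix
{ pkgs ? import <nixpkgs> { } }:

pkgs.fetchFromGitHub {
  # Hypothetical repository coordinates; replace with a real project.
  owner = "example-owner";
  repo = "example-repo";
  # A tag or commit hash. Pinning a commit is the more reproducible choice,
  # since tags can be moved.
  rev = "v1.0.0";
  # Dummy hash on purpose: Nix reports the expected hash on mismatch.
  sha256 = pkgs.lib.fakeSha256;
}
```

As with fetchurl, the hash is what ties the derivation to one exact version of the source.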
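The buildPhases key is meant to list the phases that the build will have to run. One caveat: stdenv's generic builder reads a custom phase list from an attribute named phases, and buildPhases is not a name it recognizes. That is why the log above still shows default steps such as "patching sources" and "configuring" around our three phases.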
buildPhases = [ "unpackPhase" "buildPhase" "installPhase" ];
unpackPhase = ''
  cp $src ./hello.c
'';
buildPhase = ''
  gcc -o hello hello.c
'';
installPhase = ''
  mkdir -p $out/bin
  cp hello $out/bin/
'';
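For comparison, here is a sketch of the same derivation with the list placed under phases, which is the attribute name stdenv's generic builder consults. Treat it as an illustration rather than a recommendation: with phases set, only the listed phases run, so default steps like the fixup phase (stripping, RPATH shrinking) are skipped and the resulting binary will differ from the one built above.

```nix
{ pkgs ? import <nixpkgs> { } }:

pkgs.stdenv.mkDerivation {
  pname = "hello";
  version = "1.0";

  src = pkgs.fetchurl {
    url = "https://www.helloworld.org/data/helloworld.c";
    sha256 = "1syr8snddx5v71arsvv205ka82qljhjg2424yylrp5rymr049w69";
  };

  # `phases` replaces the default phase list entirely, so only these
  # three phases run, in this order.
  phases = [ "unpackPhase" "buildPhase" "installPhase" ];

  # The source is a single file, so "unpacking" is just a copy.
  unpackPhase = ''
    cp $src ./hello.c
  '';

  # The C compiler is already part of stdenv, so gcc is on PATH.
  buildPhase = ''
    gcc -o hello hello.c
  '';

  installPhase = ''
    mkdir -p $out/bin
    cp hello $out/bin/
  '';
}
```

In most packages it is simpler to leave the phase list alone and override only the individual phases that need changing.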
The slightly magic elements are the $src and $out environment variables.
$src - Literally the result of our call to fetchurl: /nix/store/imk4qa4k8rrfmsyckmwsvzd99dlfnp3c-helloworld.c
$out - The directory where the build result should end up: /nix/store/vrggb8dhxc9cl8330b8xrwfcgkrzsq9w-hello-1.0
This shows another special thing about Nix. All arguments passed to mkDerivation are made available as environment variables in the shell that executes the build steps. This includes ones like buildPhase or installPhase.
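A small sketch shows this in action. The attribute greeting below is an arbitrary name invented for this example and means nothing to stdenv; it simply shows up as $greeting inside the build shell. The sketch assumes a nixpkgs recent enough to support dontUnpack.

```nix
{ pkgs ? import <nixpkgs> { } }:

pkgs.stdenv.mkDerivation {
  pname = "env-demo";
  version = "1.0";

  # Arbitrary, made-up attribute: exported to the builder as $greeting.
  greeting = "Hello from an attribute";

  # There is no source to fetch or unpack in this example.
  dontUnpack = true;

  # Write the value of the environment variable into the build output.
  installPhase = ''
    mkdir -p $out
    echo "$greeting" > $out/greeting.txt
  '';
}
```

After nix-build, the greeting.txt file in the resulting store path should contain the string from the attribute.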
Calling nix-build on our derivation simply takes all of the inputs for mkDerivation and constructs a shell in which the build phases are executed with all the specified tools and environment variables available. Because packages come from the nixpkgs repository, they can be locked to specific versions, making the builds deterministic and reproducible.
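One common way to do that locking is to import nixpkgs from a tarball of one specific commit instead of relying on <nixpkgs>. The sketch below assumes that approach; the commit and hash are left as placeholders that you must fill in yourself, and the hash can be obtained with nix-prefetch-url --unpack on the same URL.

```nix
let
  # Placeholder: substitute the nixpkgs commit you want to pin to.
  rev = "<nixpkgs-commit-hash>";

  # Fetch that exact snapshot of nixpkgs. The sha256 (also a placeholder
  # here) makes Nix refuse anything that differs from what was pinned.
  pinnedNixpkgs = builtins.fetchTarball {
    url = "https://github.com/NixOS/nixpkgs/archive/${rev}.tar.gz";
    sha256 = "<sha256-of-the-unpacked-tarball>";
  };
in
# Same shape as our default.nix, but the default now points at the pin.
{ pkgs ? import pinnedNixpkgs { } }:

pkgs.hello
```

With the pin in place, every machine evaluates the same nixpkgs revision and therefore the same versions of gcc, stdenv, and everything else.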
The truth is, Nix already has a hello package defined in nixpkgs, but it's a much more elaborate one called GNU Hello.
You can find its derivation in pkgs/applications/misc/hello/default.nix, which you'll find to be much simpler than ours. That is because of something called genericBuild, which performs the most common-sense steps for packages written in C.
This of course only scratches the surface of what Nix is capable of, and to really understand what can be achieved using Nix a much deeper dive is required.
If you found this interesting, I've organized two presentations on Nix these past two weeks to make our developers more acquainted with the Nix package manager, help them use it more, and be able to debug its issues.
You can watch the video or browse the presentation PDFs. Hopefully this will help more people discover Nix and NixOS, which are very powerful tools for software developers and sysadmins alike.