Using DistCC to speed up builds

I have this old laptop. It's nothing too powerful anymore; a Sager NP5125 to be exact. Sporting an Intel Core i5 M560 CPU clocked at 2.67 GHz, the machine has 2 cores and 4 threads, and the CPU scores 2588 on Passmark's CPU Mark test. While it certainly runs Linux just fine, it isn't going to break any records. For compiling a reasonably large C++ codebase, it's simply not ideal.

Just how slow is this processor? Well, I want to use this machine for Dolphin-emu development, and I clocked a full rebuild from scratch at just over 10 minutes of wall-clock time: 10 minutes and 25.373 seconds. This is clearly not great, although certainly workable. Can we do better?

Setting up DistCC

There's a neat piece of software called DistCC that does pretty much what you might imagine: it takes builds and throws them over to other machines. In its most basic configuration, it simply handles the preprocessing on the 'Master' machine, and then passes over full translation units to the 'Slave' machines.

Unfortunately, it's a bit rough around the edges.

To start, I created an n1-standard-4 on Google Cloud Platform. This is a modest VM with 4 CPU cores and 15 GiB of RAM. Why Google Cloud Platform? Simple: it had Arch Linux images that were recent enough. I just got one running, then did a quick update to bring it to the latest version of Arch. Arch Linux wasn't strictly required, but it was preferable here because my laptop also runs Arch Linux, so I could easily run the same versions of distcc and GCC on both machines.

For reference, I was using GCC version 8.2.1 20180831 and distcc 3.3.

My first stumbling block was SSH. I didn't really want to set up DistCC's own authentication; I much prefer SSH, a system that I trust. But DistCC's SSH mode forks a new SSH client for every connection, and this incurred a pretty bad performance penalty. In the end, I decided to use daemon mode and port-forward it over SSH.

So, firstly, we need to install DistCC on the server.

sudo pacman -S distcc gcc
echo 'DISTCC_ARGS="--allow 127.0.0.1/32 --jobs 20"' | sudo tee /etc/distcc/distccd
sudo systemctl enable distcc
sudo systemctl restart distcc
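
The --allow value is an address with a CIDR mask, and the two parts interact: a /32 mask matches exactly one address, while a /8 covers the whole loopback range. Connections arriving through an SSH port forward come from 127.0.0.1, so that is the address the mask has to cover. The following is a minimal sketch of conventional CIDR matching in POSIX shell, not distccd's actual code; the helper names ip_to_int and in_cidr are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers illustrating CIDR matching; not part of distcc.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# in_cidr ADDR NET/BITS: succeed if ADDR falls inside the block.
in_cidr() (
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    if [ "$bits" -eq 0 ]; then
        mask=0
    else
        mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
    fi
    [ $(( $addr & $mask )) -eq $(( $net & $mask )) ]
)

in_cidr 127.0.0.1 127.0.0.1/32 && echo "127.0.0.1/32 matches 127.0.0.1"
in_cidr 127.0.0.1 127.0.0.0/8  && echo "127.0.0.0/8 matches 127.0.0.1"
in_cidr 127.0.0.1 127.0.0.0/32 || echo "127.0.0.0/32 does not match 127.0.0.1"
```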

With that out of the way, we need to set up DistCC on the client. Small wrinkle here: localhost is special-cased to simply exec locally, but we want to use SSH port forwarding, which listens on loopback, so we'll specify 127.0.0.1 instead.

sudo pacman -S distcc
echo "127.0.0.1/8,lzo,cpp" | sudo tee /etc/distcc/hosts
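
The hosts entry packs several things into one string, and the /8 is easy to misread. As I read the distcc man page, the number after the slash is a job limit for that host (not a netmask), lzo turns on compression, and cpp marks the host as usable for pump mode. The sketch below just splits such an entry apart to make that structure visible; parse_host_spec is a hypothetical helper, and it only handles the plain HOST/LIMIT,OPTIONS form, not ports or SSH-style entries.

```shell
#!/bin/sh
# Hypothetical helper: split a simple distcc host entry into its parts.
parse_host_spec() {
    spec=$1
    hostpart=${spec%%,*}              # e.g. "127.0.0.1/8"
    case $spec in
        *,*) opts=${spec#*,} ;;       # e.g. "lzo,cpp"
        *)   opts= ;;
    esac
    host=${hostpart%%/*}              # e.g. "127.0.0.1"
    case $hostpart in
        */*) limit=${hostpart#*/} ;;  # job limit, e.g. "8"
        *)   limit=default ;;
    esac
    printf 'host=%s limit=%s options=%s\n' "$host" "$limit" "$opts"
}

parse_host_spec "127.0.0.1/8,lzo,cpp"
```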

Next, we need to forward the port. This is relatively simple. The DistCC daemon listens on TCP port 3632 by default.

ssh -L 3632:127.0.0.1:3632 [host]
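
If you don't want an interactive shell hanging around just for the tunnel, ssh can background itself. Here is a small sketch that assembles such a command: -f backgrounds ssh after authentication, -N skips running a remote command, and ExitOnForwardFailure makes ssh give up if the local port can't be bound. The function name tunnel_cmd and the host name build-box are placeholders of mine, and the function only prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: print an ssh command that forwards the distcc port.
# $1 = remote host (placeholder), $2 = port (defaults to distcc's 3632)
tunnel_cmd() {
    port=${2:-3632}
    printf 'ssh -f -N -o ExitOnForwardFailure=yes -L %s:127.0.0.1:%s %s\n' \
        "$port" "$port" "$1"
}

tunnel_cmd build-box
```

Printing the command first makes it easy to check before pasting it into a terminal.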

Now for an aside. CMake loves absolute paths. DistCC hates absolute paths. If you run DistCC and pass it the absolute path of a compiler, it will fail violently and in a way that tells you literally nothing about the actual problem. This unfortunately leads us to a situation where we need some glue to get things working together. Luckily this is very simple. In my case, I just made two very simple wrapper scripts that call DistCC with the bare compiler name, then pointed CMake at those. This is less elegant than CMake's compiler wrapper directives, but it will do.

distcc-gcc:

#!/bin/sh
exec distcc gcc "$@"

distcc-g++:

#!/bin/sh
exec distcc g++ "$@"
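
If you'd rather not create the files by hand, a short script can write both wrappers and mark them executable. This is only a sketch: the function name make_wrappers is hypothetical, and the target directory is whatever you pass in ($HOME/bin would match the CMake invocation used in this post).

```shell
#!/bin/sh
# Hypothetical helper: write distcc-gcc and distcc-g++ into directory $1.
make_wrappers() {
    dir=$1
    mkdir -p "$dir"
    for cc in gcc g++; do
        # Each wrapper calls distcc with the bare compiler name.
        printf '#!/bin/sh\nexec distcc %s "$@"\n' "$cc" > "$dir/distcc-$cc"
        chmod +x "$dir/distcc-$cc"
    done
}

# Example (not run here): make_wrappers "$HOME/bin"
```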

Make sure these are made executable. Next, we need to start our build.

cmake -DCMAKE_C_COMPILER=$HOME/bin/distcc-gcc -DCMAKE_CXX_COMPILER=$HOME/bin/distcc-g++ ../dolphin
time DISTCC_BACKOFF_PERIOD=0 make -j8

After all of this work, my builds were now down to 6 minutes and 35.231 seconds. I suspect a large amount of time is spent sending data over the network, and it may possibly make sense to try using DistCC's Pump mode to offload preprocessing, since this is some reasonably heavy C++ code.
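
For a rough sense of the gain, the two wall-clock timings can be compared directly. The snippet below is just that arithmetic, using the timings from this post converted to seconds (10m25.373s is 625.373 s, 6m35.231s is 395.231 s); the function name speedup is my own.

```shell
#!/bin/sh
# speedup BEFORE_SECONDS AFTER_SECONDS
speedup() {
    awk -v before="$1" -v after="$2" 'BEGIN {
        printf "speedup: %.2fx, time saved: %.1f%%\n",
            before / after, (before - after) / before * 100
    }'
}

speedup 625.373 395.231   # prints "speedup: 1.58x, time saved: 36.8%"
```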

Conclusion

After all is said and done, I have a working build setup that builds roughly 1.6 times as fast as a local build. Unfortunately, it's not cheap. Running an n1-standard-4 for an entire month would rack up about $100. You can obviously optimize this by shutting off the box when it isn't in use, but this requires more work. Therefore, I don't think I would recommend this particular setup, and I probably will not be using it.
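
To put a number on the "shut it off when idle" option, here is a back-of-the-envelope estimate. It assumes the roughly $100 figure scales linearly with uptime over a 730-hour month, which ignores any usage-based discounts and disk charges, and the 176 hours (8 hours a day for 22 working days) is purely an illustrative guess. The function name monthly_cost is hypothetical.

```shell
#!/bin/sh
# monthly_cost FULL_MONTH_PRICE HOURS_RUNNING
# Assumes linear pricing over a 730-hour month (an approximation).
monthly_cost() {
    awk -v full="$1" -v hours="$2" 'BEGIN {
        printf "%.2f\n", full * hours / 730
    }'
}

monthly_cost 100 176   # prints "24.11"
```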

There are a few things left to ponder. Google Cloud Platform cores are lower frequency than off-the-shelf consumer CPU cores; my VM claimed to be running at 2.00 GHz. This suggests that it may be worth it to either use a local machine or a different cloud vendor for this purpose. Also, it would probably make a hell of a lot more sense to create ephemeral machines using either Google Cloud's preemptible VMs or Amazon Web Services' spot instances, as these are significantly cheaper, and it's really not a big deal if our DistCC gets cut off (it will just fall back to local gracefully).

So, I have an idea of what I might want in a distributed build setup, but I might need to try more options to find out what setup would work best. In the future, I will potentially investigate CCache, DistCC's pump mode, wrapping builds in Docker to make keeping versions in sync easier, and rigging a setup that configures ephemeral cloud machines as needed in order to run builds. For now, I'm going to leave it at this, and probably just run my slow builds locally until I have a better setup.