Deploying distcc in a kubernetes cluster to transform it into a big, fast build machine.
What is distcc?
Distcc acts like a simple, distributed C/C++ compiler. It has a client program, distcc, and a server application, distccd. When you compile with distcc, it uses your compiler (typically gcc, g++ or clang) to preprocess your source files, and then sends the output to a server for compilation. Since you preprocess the sources locally, all header files, macro expansions and compile definitions are correct - just like when you compile locally. On the server, all that is required is the same compiler you use locally. I use Kubuntu 18.04 for development, so I just install distccd, gcc, g++ and clang into an Ubuntu 18.04 Docker image. That's all that is required to compile the code.
You tell distcc what servers to use, and how many simultaneous jobs each can handle, in the DISTCC_HOSTS environment variable on the client side, or in the ~/.distcc/hosts or /etc/distcc/hosts config files. Distcc hands jobs to the first servers in the list first, so it makes sense to start with localhost, followed by the remaining nodes sorted by CPU performance.
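For example, a hosts file reflecting that ordering might look like this (the host names and slot counts are illustrative, matching the setup described below):

```shell
# ~/.distcc/hosts - local machine first, then remote nodes by CPU power.
# The number after the slash is the max concurrent jobs for that host.
localhost/2 k8vm0/16 k8n0/12 k8n1/12
```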
The only thing to be aware of is that since the local machine must preprocess all the source files, it has less CPU to contribute to the compile pool. On my setup, with 40 cores combined on the k8 nodes, I use only 2 of my 6 (12 with hyper-threading) cores for compiling, and the rest for preprocessing and running the distcc client. When I also use ccache, my client machine needs all its cores to preprocess and cache the object files - so all the compilations are done on the kubernetes cluster.
Local k8 cluster?
Why would somebody have a local multi-node kubernetes cluster? I have one, partially deployed on bare-metal servers and partially in virtual machines, because I like to play with technology - and you know - the cloud is really just somebody else's computer. For me it's cheaper to have my own cluster built with normal PC hardware than to rent 16-, 32- or 48-core cloud servers whenever I build a large C++ project or stress-test some kubernetes-ready distributed application.
A local cluster is also required for distributed builds from a local PC or laptop. If I used cloud servers, I would need to run the full build from there - and that means I could not just build & debug with KDevelop or Qt Creator.
Making the distcc Docker image
For my setup I wanted a simple Docker container that uses all cores on the local machine when building, but virtually no resources when idle. This is simple to achieve if you just start distccd from the command line.
Dockerfile
FROM ubuntu:18.04
LABEL maintainer="jarle.lastviking.eu"
RUN DEBIAN_FRONTEND="noninteractive" apt-get -q update &&\
DEBIAN_FRONTEND="noninteractive" apt-get -y -q --no-install-recommends upgrade &&\
DEBIAN_FRONTEND="noninteractive" apt-get install -y -q g++ gcc clang distcc &&\
DEBIAN_FRONTEND="noninteractive" apt-get -y -q autoremove &&\
DEBIAN_FRONTEND="noninteractive" apt-get -y -q clean
ENV ALLOW 192.168.0.0/16
RUN useradd distcc
USER distcc
EXPOSE 3632
CMD distccd --jobs $(nproc) --log-stderr --no-detach --daemon --allow ${ALLOW} --log-level info
I have already built this image and pushed it to Docker Hub.
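If you want to build the image yourself, or smoke-test it locally before deploying, something like this should work (the image tag and allowed network range here are assumptions - adjust them to your setup):

```shell
# Build the image from the Dockerfile above
docker build -t my-distcc .

# Run it locally, accepting clients from the 10.0.0.0/8 range
docker run -d -p 3632:3632 -e ALLOW=10.0.0.0/8 my-distcc
```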
Deploying the DaemonSet
My local kubernetes cluster is for software development, so unless I use it for something (like testing some software) it's idle. That makes it perfectly suited as a build machine when I need some extra horsepower.
I deploy the distcc pod as a DaemonSet, which means that kubernetes will deploy it on each node in the cluster. I don't specify any resource constraints or limits - so it won't reserve any resources when it's unused, and it can use all the CPU and memory it needs when it's building.
The deployment file distcc.yml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: distcc
  namespace: devtools
  labels:
    app: distcc-devtool
spec:
  selector:
    matchLabels:
      name: distcc-devtool
  template:
    metadata:
      labels:
        name: distcc-devtool
    spec:
      containers:
      - name: distcc
        image: jgaafromnorth/distcc
        ports:
        - containerPort: 3632
          hostPort: 3632
          protocol: TCP
        env:
        - name: ALLOW
          value: 10.0.0.0/8
As you may notice, I use a hostPort with distcc's default port, so from the outside world it appears that distcc is running as a normal daemon on the server. To send jobs to a node, I just need to know its IP address and its number of cores.
To deploy it, just run:
$ kubectl apply -f distcc.yml
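Note that the devtools namespace must exist before the apply. Afterwards, you can verify that one distcc pod is running per node (pod names will of course differ on your cluster):

```shell
# Create the namespace first, if it doesn't exist yet
kubectl create namespace devtools

# After applying, expect one distcc pod per node, each with its node's IP
kubectl get pods -n devtools -o wide
```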
Using with cmake and ccache
There are several ways to use distcc with cmake. The simplest is to set -DCMAKE_CXX_COMPILER_LAUNCHER=distcc.
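A configure step using that launcher could look like this (the build directory layout and -j value are illustrative; setting the C launcher as well covers mixed C/C++ projects):

```shell
# Configure a build where every compiler invocation is prefixed with distcc
cmake -DCMAKE_C_COMPILER_LAUNCHER=distcc \
      -DCMAKE_CXX_COMPILER_LAUNCHER=distcc ..

# Use a -j value roughly matching the total slots across your distcc hosts
make -j40
```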
But I want builds, not just compilations, to be fast. ccache is a compiler cache that avoids repeating the same compilation by caching the resulting object files. Ccache can be configured to call distcc to perform the actual compilation.
On my system (Ubuntu 18.04), cmake or make seems to find ccache automatically, so all I have to do is install ccache and configure it to call distcc when it needs to compile a source file.
What's the difference?
I'll use arangodb as an example here. It's a medium-sized C++ project using cmake. On my 6-core (12 with hyper-threading) laptop, it takes about 14 minutes to compile without ccache and distcc.
Normal compilation
time make -j 12
...
[100%] Built target arangodbtests
real 14m5,931s
user 113m21,600s
sys 13m29,895s
Ccache and distcc, using cmake's ccache detection. In this case, parts of the compilations for third-party libraries were not sent to ccache/distcc, which slowed down the build significantly and consumed > 12 GB of memory (running 40 simultaneous compilations on 6 cores).
time make -j 40
...
[100%] Built target arangodbtests
real 10m21,424s
user 59m14,562s
sys 10m23,702s
Ccache and distcc, with cc/c++ symlinked to ccache. This is supported out of the box on Debian and Ubuntu if you prefix your PATH variable with /usr/lib/ccache: export PATH=/usr/lib/ccache:$PATH. However, this doesn't work with arangodb without a small tweak. Instead of calling distcc directly from ccache, we call a little script:
distcc-wrap.sh
#!/bin/sh
# Called by ccache as: distcc-wrap.sh <compiler> <args...>.
# The compiler path ccache hands us may point at its own symlink,
# so strip the directory and invoke the real compiler in /usr/bin.
compiler=$(basename "$1")
shift
exec distcc "/usr/bin/$compiler" "$@"
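To see what the wrapper does, here is a sketch of the transformation for a hypothetical invocation, with distcc's exec replaced by building the command line as a string:

```shell
# ccache runs the prefix command with the compiler path first, e.g.:
#   distcc-wrap.sh /usr/lib/ccache/g++ -c foo.cpp -o foo.o
# The wrapper strips the directory from the compiler path...
compiler=$(basename "/usr/lib/ccache/g++")

# ...and re-points the invocation at the real compiler, wrapped in distcc:
cmdline="distcc /usr/bin/$compiler -c foo.cpp -o foo.o"
echo "$cmdline"   # distcc /usr/bin/g++ -c foo.cpp -o foo.o
```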
With this change in place, a full build with a clear ccache took about 8 minutes:
time make -j40
...
[100%] Built target arangodbtests
real 8m31,189s
user 22m24,886s
sys 7m45,816s
Without ccache, using two cores on the client machine for the build, it went down to about 5 minutes. But we don't want to avoid ccache. Ccache is your friend.
Second full build with ccache and distcc. Now ccache will detect that no compilations are required, so most of the time is spent linking the cached object files.
make clean
time make -j40
...
[100%] Built target arangodbtests
real 1m38,137s
user 2m24,563s
sys 0m47,227s
So the real performance gain is from ccache. But if you need to clear the cache for some reason, or change a header file or declaration that requires many files to be recompiled, distcc cuts the build time significantly.
Configuration
My ccache Configuration
~/.ccache/ccache.conf
max_size = 25.0G
compression = true
prefix_command = distcc-wrap.sh
My distcc configuration
~/.distcc/hosts
k8vm0/16 k8n0/12 k8n1/12
The hostnames here (k8vm0, and so on) are present in /etc/hosts; I could just as well have used the nodes' IP addresses. The number after the slash specifies how many concurrent compilations that node can handle.
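The slot counts are also how I pick a -j value for make: it should roughly match the total number of slots across all hosts. A small sketch that sums them from a hosts line like the one above:

```shell
# Slots per host, as in ~/.distcc/hosts (host/slots pairs)
hosts="k8vm0/16 k8n0/12 k8n1/12"

jobs=0
for h in $hosts; do
  jobs=$((jobs + ${h#*/}))   # strip "host/" and keep the slot count
done
echo "$jobs"   # 40
```

which matches the -j 40 used in the builds above.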