Nsblast - engineering blog

Thoughts, ideas and lessons learned while creating a new massively scalable authoritative DNS server in C++

This series is mostly written for myself to digest challenges faced and choices made during the development of Nsblast.

But I hope it can also be useful for other people who are facing similar challenges in their work or pet projects. Nsblast is after all a pet project.

The first DNS server I wrote was part of a web server, designed to run 50% of the distinct sites on the Internet on my ThinkPad laptop. The DNS server took me about 16 hours to write, spread over the early hours of one week when I worked at one of the largest software companies in the world. At work that week, I spent most of my time waiting for a bug fix of a few lines to go through the code review process and automated tests. The test infrastructure was broken (the CTO followed the "eat your own dog food" mantra), and some days were spent just monitoring the test dashboard and restarting the tests when the broken "dog-shit" infrastructure timed out. Such a waste of time! Anyway, the lack of progress at work led me to long, intense coding sessions in my spare time. Hosting 1 billion distinct websites on my laptop, with its HTTP server (70k HTTP requests/sec, maxing out the network card during tests) and DNS server, was one thing I accomplished.

Then at some point I decided to write a real DNS server. In January 2023 I started from scratch with Nsblast. It's a pure authoritative DNS server. It supports the DNS protocol over UDP and TCP. It can be a primary or a follower in a mix with other standard DNS servers, syncing the zones via the DNS protocol. It can also run in a "simple cluster" mode, where replication is done at the database layer via gRPC streams. In addition to the DNS protocol and gRPC, it also has an HTTP REST interface where admins and users can CRUD zones and resource records.

Nsblast uses RocksDB as its database. RocksDB is a C++ library created and maintained by Facebook. That means it runs inside the application itself (like SQLite), and not as an external database server like MySQL or PostgreSQL. This makes Nsblast's database lookups really fast, since a lookup is a function call inside the process rather than a round trip to another server.

Since the DNS infrastructure was designed as a distributed system (in fact one of the first distributed "databases" on the Internet) - a single instance of Nsblast doesn't have to deal with fail-over or redundancy. Any DNS-compliant domain needs to point to at least two DNS servers - which ensures redundancy (as long as those servers are more or less in sync). By architecture, Nsblast is a monolith - an application with all the "stuff" stuffed into itself. But it is deployable much like a micro-service, where a cluster of instances can scale horizontally and load-balance any number of DNS lookups from the Internet. If you think of it as a DNS database (which it actually is) - it is a 1 primary, any number of replicas design. All the data is replicated to all the replicas, so when a replica handles a DNS request, it just needs to do a local database lookup, and it can reply immediately.
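To make the "1 primary, any number of replicas" idea a bit more concrete, here is a minimal sketch of it. This is not Nsblast's actual code: the names `Primary`, `Replica`, `attach`, `upsert`, `apply` and `lookup` are made up for the illustration, a plain `std::map` stands in for RocksDB, and a direct function call stands in for the replication stream. It only shows the shape of the design: every write goes through the primary and is copied to every replica, and a replica answers a lookup from its own local copy.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the embedded database. Nsblast uses RocksDB;
// a std::map keeps this sketch free of external dependencies.
using Store = std::map<std::string, std::string>;

// A replica holds a full local copy of the data and answers from it alone.
class Replica {
public:
    // Apply one replicated write. In a real cluster this would arrive over
    // the network; here the primary just calls it directly.
    void apply(const std::string& fqdn, const std::string& rdata) {
        store_[fqdn] = rdata;
    }

    // Purely local read: no call to the primary or to any other replica.
    // Returns nullptr if the name is unknown.
    const std::string* lookup(const std::string& fqdn) const {
        Store::const_iterator it = store_.find(fqdn);
        return it == store_.end() ? nullptr : &it->second;
    }

private:
    Store store_;
};

// The single primary accepts all writes and fans them out to the replicas.
class Primary {
public:
    // Register a replica and bring it up to date with the existing data.
    void attach(Replica& replica) {
        for (const auto& kv : store_) {
            replica.apply(kv.first, kv.second);
        }
        replicas_.push_back(&replica);
    }

    // Store a record locally, then replicate it to every attached replica.
    void upsert(const std::string& fqdn, const std::string& rdata) {
        store_[fqdn] = rdata;
        for (Replica* r : replicas_) {
            r->apply(fqdn, rdata);
        }
    }

private:
    Store store_;
    std::vector<Replica*> replicas_;  // non-owning; replicas must outlive the primary
};

// Small usage example: a replica attached late still ends up with all the data.
inline void demo() {
    Primary primary;
    Replica early, late;
    primary.attach(early);
    primary.upsert("www.example.com", "192.0.2.1");
    primary.attach(late);
    assert(early.lookup("www.example.com") != nullptr);
    assert(late.lookup("www.example.com") != nullptr);
}
```

The sketch leaves out everything that makes this hard in practice (network failures, ordering, catching up after downtime), but it shows why a replica can reply immediately: the answer is already in its own copy of the data.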

In this blog series, I invite you into the thought process behind various engineering decisions for the project.