
Jarle Aase

My first mini-project in 2026: Deldupes


Many, many years ago, in a country far to the north, I sat on the right side of the table in a job interview. I led a small team of software developers in a little city, and I was interviewing candidates for a new position. The man on the wrong side of the table was old but, according to his CV, very experienced, with more than 20 years as a software developer. I had a good feeling about the guy. He was social, had a sense of humor, showed signs of high intelligence, and had been in the game much longer than I had.

About 20 minutes in, I realized that his knowledge was totally obsolete. He had worked for a large company for most of his career, maintaining a legacy system. He had never done anything to update his knowledge. He didn’t have 20 years of experience. He was 20 years behind everybody else in the industry.

The realization was a shock to me. I swore I would never be that guy. That’s one of the reasons I try to keep an open mind to new technology and spend significant time experimenting with new stuff.

I looked at the Rust programming language a few years ago. After reading The Rust Book and experimenting a little, I was unimpressed. Rust seemed fine for single-threaded command-line programs, but little else.

Early in January 2026, a friend of mine suggested that I take a new look. I read another Rust book, Rust for C++ Programmers, and used ChatGPT actively to explain everything that was unclear to me. That worked surprisingly well. After finishing the book, I decided to use Rust to build the next product for my company. But before that, I wanted to get my hands dirty with a small, simple project.

I have long thought about writing a new CLI tool to find and remove duplicate files. I wrote one in C++ many years ago, long before “Modern C++”. I don’t even want to look at that source code today. So I decided to spend two days making this project in Rust. As the scope grew a bit along the way, I ended up spending around 25 hours on the project.

I used ChatGPT actively to suggest code and solutions. It did not produce very good code on its own. I ended up doing a lot of refactoring when I added features. But the program became impressive.

One nice thing about Rust, compared to C++, is how trivial it is to start a new project and get all the “libraries” (crates) you need. Command-line parsing — which is possible in C++ with libraries like Boost.Program_options — is really streamlined in Rust, especially if you need different subcommands, each with their own command-line options. I’ve done the latter in C++, and it is not trivial.

It was also very, very simple to use threads efficiently. When the app scans the file system, the main thread walks the provided directories recursively. If it decides to add a file to the database, the file metadata is put into a queue. A pool of worker threads pulls from that queue until it is drained and closed.
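
A minimal sketch of that scan pattern, using only the standard library. The names `FileMeta`, `walk`, and `scan` are hypothetical and not taken from the deldupes source, and the worker's "work" is a placeholder where the hashing would go. Because std's `Receiver` cannot be cloned, this sketch shares it between workers behind a mutex; the real tool may well do this differently.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::sync::mpsc::{self, Sender};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical metadata record; the real tool's fields are not shown in this post.
#[derive(Debug)]
struct FileMeta {
    path: PathBuf,
    size: u64,
}

/// Runs on the main thread: walk `dir` recursively and queue every regular file.
fn walk(dir: &Path, tx: &Sender<FileMeta>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let meta = entry.metadata()?; // does not follow symlinks
        if meta.is_dir() {
            walk(&entry.path(), tx)?;
        } else if meta.is_file() {
            // `send` only fails if every receiver is gone; ignored in this sketch.
            let _ = tx.send(FileMeta { path: entry.path(), size: meta.len() });
        }
    }
    Ok(())
}

/// Start `workers` threads, feed them from the walker, and collect what they processed.
fn scan(root: &Path, workers: usize) -> io::Result<Vec<FileMeta>> {
    let (tx, rx) = mpsc::channel::<FileMeta>();
    // std's Receiver cannot be cloned, so the workers share it behind a mutex.
    let rx = Arc::new(Mutex::new(rx));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || {
                let mut done = Vec::new();
                loop {
                    // The lock guard is dropped at the end of this statement,
                    // so other workers can receive while this one is busy.
                    let job = rx.lock().unwrap().recv();
                    match job {
                        Ok(meta) => done.push(meta), // real work (hashing) would go here
                        Err(_) => break,             // channel closed and drained
                    }
                }
                done
            })
        })
        .collect();

    let walked = walk(root, &tx);
    drop(tx); // close the channel so workers exit once the queue is empty

    let mut all = Vec::new();
    for handle in handles {
        all.extend(handle.join().expect("worker panicked"));
    }
    walked.map(|_| all)
}

fn main() -> io::Result<()> {
    let root = std::env::args().nth(1).unwrap_or_else(|| ".".to_string());
    let files = scan(Path::new(&root), 4)?;
    let bytes: u64 = files.iter().map(|f| f.size).sum();
    println!("{} files, {} bytes under {}", files.len(), bytes, root);
    if let Some(first) = files.first() {
        println!("first processed: {}", first.path.display());
    }
    Ok(())
}
```

Dropping the sender is what "closes" the queue: each worker's `recv` returns an error once the channel is both empty and closed, and the worker exits.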

First, the workers may calculate a SHA-1 prefix hash over the first 32 KB of the file. This is used to identify potential incomplete file versions if I ask for it later. Then they calculate a full 256-bit BLAKE3 hash of the entire file. I chose this algorithm because it is extremely fast — and that matters when you hash many terabytes of data.
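
The two-stage hashing can be sketched in a single pass over the file. To keep the example free of external crates, it uses the standard library's `DefaultHasher` as a stand-in for both SHA-1 and BLAKE3, so it shows only the structure (a prefix hash plus a streaming full hash), not the real algorithms. The function name `hash_prefix_and_full` is hypothetical, and "32 KB" is assumed to mean 32 × 1024 bytes.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs::File;
use std::hash::Hasher;
use std::io::{self, Read};

/// Assumption: "32 KB" means 32 * 1024 bytes.
const PREFIX_LEN: usize = 32 * 1024;

/// Returns (prefix_hash, full_hash) for everything readable from `reader`.
/// NOTE: DefaultHasher is only a stand-in here; it is neither SHA-1 nor BLAKE3.
fn hash_prefix_and_full<R: Read>(mut reader: R) -> io::Result<(u64, u64)> {
    let mut prefix = DefaultHasher::new();
    let mut full = DefaultHasher::new();
    let mut prefix_left = PREFIX_LEN;
    let mut buf = [0u8; 8192];

    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            break; // end of input
        }
        let chunk = &buf[..n];

        // Feed the prefix hasher until it has seen PREFIX_LEN bytes.
        let take = prefix_left.min(n);
        if take > 0 {
            prefix.write(&chunk[..take]);
            prefix_left -= take;
        }

        // The full hasher sees every byte.
        full.write(chunk);
    }
    Ok((prefix.finish(), full.finish()))
}

fn main() -> io::Result<()> {
    // Hash every file named on the command line.
    for path in std::env::args().skip(1) {
        let (prefix, full) = hash_prefix_and_full(File::open(&path)?)?;
        println!("{path}: prefix={prefix:016x} full={full:016x}");
    }
    Ok(())
}
```

Two files that share their first 32 KB but differ later get the same prefix hash and different full hashes, which is the property that makes a prefix hash useful for spotting a possibly incomplete copy.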

Once the hashes for a file are ready, the data is put into another queue, this time for the database. A single thread pulls data from that queue and updates the database, committing changes in large batches.
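
The single-writer stage might look roughly like this. `FakeDb` is a hypothetical in-memory stand-in for the real database, and `HashedFile`, `run_writer`, and the batch size of 4 are illustrative choices, not the tool's actual types or settings.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical record produced by a hashing worker.
#[derive(Debug, Clone, PartialEq)]
struct HashedFile {
    path: String,
    full_hash: u64,
}

/// In-memory stand-in for the database; it records the size of each commit.
#[derive(Default)]
struct FakeDb {
    commits: Vec<usize>,
    rows: Vec<HashedFile>,
}

impl FakeDb {
    /// "Commit" a batch: move its rows into the store and leave `batch` empty.
    fn commit_batch(&mut self, batch: &mut Vec<HashedFile>) {
        if batch.is_empty() {
            return;
        }
        self.commits.push(batch.len());
        self.rows.append(batch);
    }
}

/// Runs on the single writer thread until every sender has been dropped.
fn run_writer(rx: mpsc::Receiver<HashedFile>, batch_size: usize) -> FakeDb {
    let mut db = FakeDb::default();
    let mut batch = Vec::with_capacity(batch_size);
    for record in rx {
        batch.push(record);
        if batch.len() >= batch_size {
            db.commit_batch(&mut batch);
        }
    }
    db.commit_batch(&mut batch); // flush whatever is left
    db
}

fn main() {
    let (tx, rx) = mpsc::channel();
    let writer = thread::spawn(move || run_writer(rx, 4));

    // Two hypothetical hashing workers, each sending five records.
    let producers: Vec<_> = (0..2)
        .map(|w| {
            let tx = tx.clone();
            thread::spawn(move || {
                for i in 0..5 {
                    let record = HashedFile { path: format!("worker{w}/file{i}"), full_hash: i };
                    tx.send(record).unwrap();
                }
            })
        })
        .collect();
    drop(tx); // the writer stops once the workers' clones are dropped too

    for producer in producers {
        producer.join().unwrap();
    }
    let db = writer.join().unwrap();
    println!("{} rows in {} commits", db.rows.len(), db.commits.len());
}
```

Keeping all database access on one thread means the hashing workers never wait on a database lock, and batching keeps the number of commits small compared to the number of files.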

Doing this in Rust with an array of threads feels similar to doing it manually in C++ (threads plus a work queue), or to using goroutines in Go. QVocalWriter, the C++ AI app I wrote in November 2024, uses the same basic pattern to handle audio data in near real time — but there I implemented the queue myself. In deldupes, Rust’s mpsc channels take care of that.

[Figure: workflow]

The biggest difference I experienced in the workflow with Rust versus C++ was the lack of friction. With C++, you must spend energy getting CMake to do what you want, and suddenly you get weird errors that can only be fixed by deleting the build directory and rebuilding everything. Then there are the 200-page error messages if you get a letter wrong when writing or using a template.

What I really like — and the main reason I will probably write many of my future CLI programs in Rust — is that it produces a mostly static binary that runs on a wide range of Linux systems (for the target architecture). Finalizing a C++ project and producing binaries that run reliably across Linux distributions is a real pain for anything but the most trivial “hello world” apps.

So, making this little tool in Rust was a fun exercise.

Deldupes is currently released as beta. For this project, beta means that the design is finished and the tool is useful, but I want some real-world usage to catch bugs and edge cases before I declare it stable.

Still, nothing keeps me calm and happy in the zone quite like C++ coding ;)