Jarle Aase

How to make std::shared_ptr<> objects singleton instances

I am currently implementing a crypto-wallet for iOS and Android. The responsibilities of the wallet are to maintain the state of a user's different accounts, make REST calls to a service that has access to the blockchain, and respond to user interface requests.

At the bottom is a data store, modeled like a very simple database with key/value pairs. Besides accounts, the wallet has a number of other objects that the user may request. There is no application limit on how many objects the user may eventually create. It's also for mobile, so the app may get killed at any moment. And even though mobile phones today are more powerful than yesterday's PCs, I still prefer to keep resource usage at a minimum. So just reading all objects into memory at startup is not an option. Data is stored on disk and loaded on demand, at least from the wallet's perspective.

The UI may request an object at any time, based on its name or cryptographic location in the universe. The UI is made by a team in another country, in languages I don't know (Swift/Kotlin). The wallet library is written in C++14. So I cannot make many assumptions regarding how the library will be used. They may ask for the same object again and again, and they may cache it, and then ask for it again, effectively having two or more references to an object with the same key. The library may also have internal references to objects, for example during a REST request. Both the UI code and library code may want to change some properties and save the object, potentially simultaneously from different threads.

Consider a Consumer --> Data Factory --> Database pipeline, where the Consumer wants an object and the Data Factory is responsible for producing an instance of it (in our case a C++ object with properties and methods) from data in a database. The simple way to implement it is to just create the object on demand, fill it with data from the database, and return it to the Consumer.

Something like this.

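#include <iostream>
#include <memory>
#include <stdexcept>
#include <string>

using namespace std::string_literals; // enables the "..."s suffix

// db() and Account are placeholders for the real database layer and model.
// get_account is defined before consume(), because a function with a
// deduced (auto) return type must be defined before it is used.
auto get_account(const std::string& name) {

    // Never, ever send data from a user directly into a database query like this.
    // My code is strictly to show application logic, not implementation!
    if (auto db_record = db("SELECT * FROM accounts WHERE name = '"s + name + "'")) {

        auto account = std::make_shared<Account>();
        // Shuffle data from the database to the account instance here...
        return account;
    }

    throw std::runtime_error("No such account: "s + name);
}

void consume() {

    auto bitcoin_account = get_account("my precious money");

    std::cout << bitcoin_account->get_balance() << std::endl;
}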

This works great as long as the data is read-only, or we know for a fact that only one instance will make any changes at any given time. (Let's assume that the Account object has some setter methods that can propagate the changes back to the database.)

But what happens if two instances of the Account exist (both created from get_account with the same name), and both want to update some data in the object? Well, we get a race condition. One of them succeeds, the other may believe it updated the data but actually did not. Like in the illustration below. (I used cars and fruits to represent data properties, as that's more pleasant to look at in an image than text.)

[Illustration: Race condition]

In the illustration we have the data stored on the disk in the middle, and the data stored in the instances on the left and right. We start by loading the red car and a lemon from the disk to both instances. Then both instances make changes in their copy of the data. The data saved by the left instance gets overwritten by the data saved by the right instance. To make things even worse, in the last row, the left and the right instances work with completely different data.
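
To make the lost update concrete, here is a minimal single-threaded sketch that replays the steps in the illustration. Record, Disk, Instance and replay_lost_update are hypothetical names made up for this example and are not part of the wallet; the "disk" is just a std::map, and each instance writes its whole private copy back on save.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-ins: a record with two properties and a "disk"
// keyed by account name.
struct Record {
    std::string vehicle;
    std::string fruit;
};

using Disk = std::map<std::string, Record>;

// Each instance holds a private copy of the record, like the objects
// produced by the naive factory.
class Instance {
public:
    Instance(Disk& disk, std::string key)
        : disk_(disk), key_(std::move(key)), data_(disk_.at(key_)) {}

    void set_vehicle(const std::string& v) { data_.vehicle = v; }
    void set_fruit(const std::string& f) { data_.fruit = f; }

    // Writes the whole private copy back, replacing whatever is on disk.
    void save() { disk_[key_] = data_; }

    const Record& data() const { return data_; }

private:
    Disk& disk_;
    std::string key_;
    Record data_;
};

// Replays the illustration and returns what ends up on disk.
Record replay_lost_update() {
    Disk disk;
    disk["account"] = Record{"red car", "lemon"};

    Instance left(disk, "account");
    Instance right(disk, "account");

    left.set_vehicle("blue car");  // left changes one property
    right.set_fruit("apple");      // right changes the other

    left.save();   // disk now holds: blue car + lemon
    right.save();  // disk now holds: red car + apple, the left update is gone

    // The left instance still believes the vehicle is a blue car.
    assert(left.data().vehicle == "blue car");
    return disk.at("account");
}
```

The record returned by replay_lost_update() has the red car again, even though the left instance saved a blue one, and the two instances now disagree with each other about both properties.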

This is a classic problem that has been solved in many ways over the years.

In my wallet I solved it in a few minutes using the C++ standard library.

My original code was basically as above. Each time get_account was called, I created a new instance of the object. I did it that way, because it's often better to find the simplest solution when a problem is crystal clear. In my wallet there were many unknowns during the initial development, so I made very few assumptions and designed the library to do exactly what it had to do. The simple factory worked until yesterday, when I added a feature that broke it.

So what is the simple solution for this?

My solution was to add a registry of live Accounts. I added code to get_account to check if the object with the given key was already instantiated and in use somewhere, and if so, I just returned another std::shared_ptr to it. If no such object existed, the existing code constructed a new one.

Now, you may think that I added something like std::map<std::string, std::shared_ptr<Account>>, which was tempting, since caching all used instances would make the wallet faster. However, I did not want a cache. Just a registry of live objects, without affecting their life-cycle. So I used std::map<std::string, std::weak_ptr<Account>> instead. This way get_account could check the registry, make another shared pointer if the object existed, and not waste memory if it did not.
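
The difference between the two pointer types is easy to check in isolation. The sketch below is not wallet code; Account is a minimal stand-in, and Observation and observe_weak_ptr are names invented for the example. It records what a std::weak_ptr does while the object is alive and after the last owner lets go.

```cpp
#include <memory>
#include <string>

struct Account {  // hypothetical minimal stand-in
    std::string name;
};

struct Observation {
    long owners_with_weak_only;  // use_count() with one owner plus a weak_ptr
    bool same_instance;          // lock() handed back the original object
    long owners_while_locked;    // use_count() while the locked pointer lives
    bool expired_after_release;  // weak_ptr state once the owner is gone
};

Observation observe_weak_ptr() {
    auto owner = std::make_shared<Account>(Account{"savings"});
    std::weak_ptr<Account> entry = owner;  // what the registry would store

    Observation result{};
    // A weak_ptr is not an owner, so it does not keep the object alive.
    result.owners_with_weak_only = owner.use_count();
    {
        // lock() creates a new shared_ptr to the same object.
        auto again = entry.lock();
        result.same_instance = (again == owner);
        result.owners_while_locked = owner.use_count();
    }
    owner.reset();  // the last owner goes away, the Account is destroyed
    result.expired_after_release = entry.expired();
    return result;
}
```

With a std::shared_ptr in the map, the last step would never destroy the Account, because the map itself would be an owner. That is the cache behavior I wanted to avoid.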

All in all, the code needed to convert get_account from a dumb factory to a smart provider was just a few lines.

A generalized version of the registry can be coded like:

#include <map>
#include <memory>
#include <string>

template <typename keyT, typename valueT>
class Registry {
public:
    Registry() = default;

    std::shared_ptr<valueT> fetch(const keyT& key) const {
        auto it = registry_.find(key);
        if (it != registry_.end()) {
            if (auto ptr = it->second.lock()) {
                return ptr;
            }

            // Expired.
            registry_.erase(it);
        }

        return {};
    }

    void add(const keyT& key, const std::shared_ptr<valueT>& value) {
        registry_[key] = value;
    }

    void clean() {
        for(auto it = registry_.begin(); it != registry_.end();) {
            if (it->second.expired()) {
                static_assert(__cplusplus >= 201103L, "Require at least C++ 11");
                it = registry_.erase(it);
            } else {
                ++it;
            }
        }
    }

    size_t size() const {
        return registry_.size();
    }

private:
    mutable std::map<keyT, std::weak_ptr<valueT>> registry_;
};

The interesting methods in Registry are fetch and add. You may want to call clean from time to time to free the memory held by stale entries. The size method is useful for unit tests, to assert that the thing actually works as expected (it does!). If you have lots of objects, you may want to use std::unordered_map instead of std::map, as its lookups take constant time on average rather than logarithmic time. For my use, with relatively few objects, I assume that std::map will be more memory-efficient, and because of that, potentially even faster as well (I have not verified that assumption).
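
A unit test along those lines could look like the sketch below. To keep it compilable on its own, it carries a condensed copy of Registry with one addition that is my assumption and not part of the original class: the container type is a third template parameter (mapT), so the same check can run against std::map and std::unordered_map. Account and registry_behaves are hypothetical names for the example.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <unordered_map>

// Condensed copy of Registry; the mapT parameter is an assumed extension.
template <typename keyT, typename valueT,
          typename mapT = std::map<keyT, std::weak_ptr<valueT>>>
class Registry {
public:
    std::shared_ptr<valueT> fetch(const keyT& key) const {
        auto it = registry_.find(key);
        if (it != registry_.end()) {
            if (auto ptr = it->second.lock()) {
                return ptr;
            }
            registry_.erase(it);  // expired entry
        }
        return {};
    }

    void add(const keyT& key, const std::shared_ptr<valueT>& value) {
        registry_[key] = value;
    }

    void clean() {
        for (auto it = registry_.begin(); it != registry_.end();) {
            if (it->second.expired()) {
                it = registry_.erase(it);
            } else {
                ++it;
            }
        }
    }

    std::size_t size() const { return registry_.size(); }

private:
    mutable mapT registry_;
};

struct Account {};  // hypothetical stand-in

// Walks through add, fetch, expiry and clean; returns true if every
// step behaved as the text describes.
template <typename registryT>
bool registry_behaves() {
    registryT registry;
    auto a = std::make_shared<Account>();
    auto b = std::make_shared<Account>();
    registry.add("a", a);
    registry.add("b", b);

    bool ok = registry.size() == 2 && registry.fetch("a") == a;

    b.reset();                        // the last owner of "b" goes away
    ok = ok && registry.size() == 2;  // the stale entry is still counted
    registry.clean();
    ok = ok && registry.size() == 1;  // clean() removed it
    ok = ok && !registry.fetch("b");  // a miss yields an empty pointer
    return ok;
}
```

Instantiating registry_behaves with Registry<std::string, Account> exercises the std::map version. Passing std::unordered_map<std::string, std::weak_ptr<Account>> as the third argument exercises the hashed one.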

In my case, the object creation is guaranteed to happen in only one thread, so I had no need for locking.
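
If objects could be requested from several threads, the lookup and the insert would have to happen under one lock. The following is a sketch of how that might look, not code from the wallet: LockedRegistry, fetch_or_create and constructions_for_same_key are hypothetical names, and holding the mutex while the object is created is a design choice I am assuming for simplicity.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

template <typename keyT, typename valueT>
class LockedRegistry {
public:
    // Returns the live instance for key, or calls create() and registers
    // the result. The mutex is held across create(), so two threads can
    // never both construct an object for the same key.
    std::shared_ptr<valueT> fetch_or_create(
            const keyT& key,
            const std::function<std::shared_ptr<valueT>()>& create) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto& entry = registry_[key];
        if (auto ptr = entry.lock()) {
            return ptr;
        }
        auto created = create();
        entry = created;
        return created;
    }

private:
    std::mutex mutex_;
    std::map<keyT, std::weak_ptr<valueT>> registry_;
};

struct Account {};  // hypothetical stand-in

// Spawns several threads that all ask for the same key, and returns how
// many Account objects were actually constructed.
int constructions_for_same_key(int thread_count) {
    LockedRegistry<std::string, Account> registry;
    int constructed = 0;  // only modified inside create(), under the lock
    std::vector<std::shared_ptr<Account>> results(thread_count);
    std::vector<std::thread> threads;

    for (int i = 0; i < thread_count; ++i) {
        threads.emplace_back([&, i] {
            results[i] = registry.fetch_or_create(
                "savings", [&]() -> std::shared_ptr<Account> {
                    ++constructed;
                    return std::make_shared<Account>();
                });
        });
    }
    for (auto& t : threads) {
        t.join();
    }

    // Every thread received a pointer to the same instance.
    for (const auto& r : results) {
        assert(r == results.front());
    }
    return constructed;
}
```

The trade-off is that a slow create(), such as a database read, blocks every other lookup while it runs, and create() must not call back into the same registry, since std::mutex is not recursive. Note also that this only guarantees a single instance per key. Access to the Account itself from several threads still needs its own synchronization.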

After adding the registry class, my factory code looked something like this:

auto get_account(const std::string& name, Registry<std::string, Account>& registry) {

    if (auto ptr = registry.fetch(name)) {
        return ptr;
    }

    if (auto db_record = db("SELECT * FROM accounts WHERE name = '"s + name + "'")) {

        auto account = std::make_shared<Account>();
        // Shuffle data from the database to the account instance here...

        registry.add(name, account);
        return account;
    }

    throw std::runtime_error("No such account: "s + name);
}

And that's all. It's very simple to accomplish something like this in C++, using nothing but the standard library.