Technique: Proof types to ensure preconditions

30 Nov 2022 by Jonathan

Consider a library using hidden global state that needs to be initialized by calling an initialization function. If you don’t call the function before you start using the library, it crashes.

How do you design the library in such a way that it is impossible to use it before initialization?

One idea is to use a technique where you create a special proof type, which needs to be passed as an additional parameter. Let’s look at it in more detail.

The Problem

Let’s say the library consists of two functions: lib::init() and lib::do_sth().

namespace lib
{
    // Initialize internal global state.
    void init()
    {
        …
    }

    // Does something.
    // Precondition: library has been initialized.
    int do_sth(int arg)
    {
        …
    }
}

Correct usage of the library would look like this:

int main()
{
    lib::init();
    std::print("{}\n", lib::do_sth(11));
}

We want to prevent incorrect usage like that:

// Initialization happens before main(),
// so before the library is initialized.
const int the_answer = lib::do_sth(11);

int main()
{
    lib::init();
    std::print("{}\n", the_answer);
}

One solution would be to ensure global state is initialized properly by controlling the initialization carefully. For more details, watch my talk on the topic.

Here we want to get a compile error instead: We want to ensure that we cannot call lib::do_sth() without having called lib::init() first.

Proof types

The trick is simple: Return something from lib::init() that needs to be passed to lib::do_sth(). If the only way to get the value is by calling lib::init(), it necessarily needs to be called before lib::do_sth().

As such, we create a proof type: an empty type that can only be constructed by lib::init().

namespace lib
{
    class init_proof
    {
        init_proof() {} // private

        // Only init() is allowed to construct a new instance.
        friend init_proof init();
    };

    init_proof init()
    {
        …
        return {};
    }

    int do_sth(init_proof, int arg)
    {
        …
    }
}

Note that only the default constructor of init_proof is private: Everybody can freely copy it because that is safe – there already exists an instance at that point.

Correct usage of the library now requires storing the init_proof object and passing it around.

int main()
{
    auto proof = lib::init();
    std::print("{}\n", lib::do_sth(proof, 11));
}

Incorrect usage is now longer possible.

At least it’s no longer possible by accident. Given that it’s C++, there are various ways to construct an object without a public constructor. Some techniques are even entirely legal.

Affine proof types

Let’s suppose the library is extended further with a destroy() function that destroys the global state. Naturally, destroy() requires init(), so it also takes the init_proof object.

namespace lib
{
    class init_proof { … }; // as before

    init_proof init();
    int do_sth(init_proof, int arg);
    void destroy(init_proof);
}

While it is not possible to call destroy() before init(), nothing is stopping us from calling do_sth() after destroy()!

int main()
{
    auto proof = lib::init();
    lib::destroy(proof);
    std::print("{}\n", lib::do_sth(proof, 11)); // ups
}

The fix is straightforward: destroy() needs to consume the init_proof object. That way it can no longer be used as an argument to lib::do_sth(). This means init_proof needs to become a move-only type. Sadly, unlike Rust, C++ lacks destructive move – there is nothing stopping us from using an object in the moved-from state.

This means that we can’t get a full guarantee at compile-time and need to add some runtime checks to guard against moved-from use.

namespace lib
{
    class init_proof
    {
        bool _is_moved_from;

        init_proof() : _is_moved_from(false) {}

    public:
        // Move constructor copies the moved-from state,
        // and definitely marks other as moved-from.
        init_proof(init_proof&& other) noexcept
        : _is_moved_from(std::exchange(other._is_moved_from, true))
        {}

        // We don't need assignment.
        init_proof& operator=(init_proof&&) = delete;
        // Destroy does nothing.
        ~init_proof() = default;

        // All functions that use the proof object need to assert it.
        explicit operator bool() const
        {
            return !_is_moved_from;
        }

        // Only init() is allowed to construct a new instance.
        friend init_proof init();
    };

    init_proof init()
    {
        …
        return {};
    }

    int do_sth(const init_proof& proof, int arg)
    {
        assert(proof);
        …
    }

    void destroy(init_proof proof)
    {
        assert(proof);
        …
    }
}

Now we need to move the proof object to destroy to consume it.

int main()
{
    auto proof = lib::init();
    std::print("{}\n", lib::do_sth(proof, 11));
    lib::destroy(std::move(proof));
}

Objects of type init_proof can only be used (as-in “passed by value”) at most once. They are called affine types.

Linear proof types

Now we got enforcement that we can destroy a library only after it has been initialized, and at least runtime enforcement that we don’t use it after destruction. However, nothing is stopping us from never destroying the library at all!

A naive attempt to tackle that using proof types might be to make the destructor of init_proof private and only allow lib::destroy() to access it. This nicely mirrors the construction.

class init_proof
{
    bool _is_moved_from;

    init_proof() : _is_moved_from(false) {}
    ~init_proof() = default;

public:
    init_proof(init_proof&& other) noexcept
    : _is_moved_from(std::exchange(other._is_moved_from, true))
    {}
    init_proof& operator=(init_proof&&) = delete;

    explicit operator bool() const
    {
        return !_is_moved_from;
    }

    // Only init() is allowed to construct a new instance;
    // only destroy() is allowed to destroy it.
    friend init_proof init();
    friend void destroy(init_proof);
};

Unfortunately, this doesn’t work because, again, C++ lacks destructive move: The compiler is going to call the destructor of a moved-from object.

int main()
{
    auto proof = lib::init();
    std::print("{}\n", lib::do_sth(proof, 11));
    lib::destroy(std::move(proof));
} // error: call to deleted destructor of moved-from `proof`

So sadly, we still need to use runtime enforcement. The idea is to abort if the destructor is called on a non-consumed proof.

class init_proof
{
    bool _is_moved_from;

    init_proof() : _is_moved_from(false) {}

public:
    init_proof(init_proof&& other) noexcept
    : _is_moved_from(std::exchange(other._is_moved_from, true))
    {}
    init_proof& operator=(init_proof&&) = delete;

    ~init_proof()
    {
        // We are only allowed to destroy moved-from proof objects.
        assert(_is_moved_from);
    }

    explicit operator bool() const
    {
        return !_is_moved_from;
    }

    friend init_proof init();
    friend void destroy(init_proof);
};

Then lib::destroy() needs to mark it as moved-from since we want to be able to destroy it there:

void destroy(init_proof proof)
{
    assert(proof);
    proof._is_moved_from = true;
    …
}

In Rust, this idiom is called a drop bomp.

Now it is at least a runtime-error if we forget to destroy the library by not consuming the proof. Objects of type init_proof can now only be used exactly once. They are called linear types.

Conclusion

Using proof types is a simple yet powerful technique to ensure certain preconditions are fulfilled. By requiring the user to pass an object that can only come from a function that ensures the precondition, it is impossible to call the function with a failed precondition – the compiler will complain.

It can be applied whenever there is some hidden or implicit global state that is being manipulated: a library that needs to be initialized before use, a global mutex that needs to be locked, interrupts that need to be disabled before executing some code, etc. The general pattern is always the same.

Create a type that can only be constructed by the function that ensures the precondition.
Have the function return an object of that type.
Require it as an additional argument for functions that require the precondition.

The technique can even be extended to ensure teardown of the global state by consuming the proof object. Unfortunately, C++’s lack of destructive move makes it impossible to check that completely at compile-time; we have to restort to runtime assertions to prevent access in the moved-from state.