foonathan::blog()

Thoughts from a C++ library developer.

optional<T> in Containers Ⅱ — Not All std::vector Usages Are The Same

Okay, so in the previous post I talked about putting optional<T> in container. I came to conclusions which I though were reasonable at the time, however, people — rightfully — pointed out some flaws in my argumentation.

As I was at ACCU last week, I wasn’t able to respond to them earlier (note to self: don’t publish and then fly away to a conference), so I’m doing that now. Let’s revisit my argumentations and see where I was wrong.

std::optional<T> vs. std::variant<T, std::monostate>

I argued that std::optional<T> and std::variant<T, std::monostate> fulfill the same purpose: Both represent a type that either stores a value of type T or none at all.

I still think this is valid. Of course — as someone on reddit pointed out — you wouldn’t want to actually use std::variant<T, std::monostate> in place of std::optional<T>: the interface is clumsier and it is simply more to type. But conceptually they are the same type.

I also argued that you shouldn’t use std::optional<T> (or std::variant<T, std::monostate>) if the empty type has a special semantic meaning like “id invalid”. Instead you should use std::variant<T, special_meaning>. I still think that following this advice can lead to cleaner code.

std::optional<T> in Sets

I said that you shouldn’t put std::optional<T> in a set, simply because it is somewhat pointless: You can only put a single empty optional in there anyway, and then you could also simply put nothing in there. So don’t use std::optional<T> in a set (or as a key type in a map).

If your algorithm works differently whether or not std::nullopt is in the set, you don’t actually mean std::nullopt, you mean special_meaning and want to store a std::variant.

Nobody seems to argue against that, so that advice is fine.

std::optional<T> in Maps

std::optional<T> as a key type in a map doesn’t make sense as argued above, so the only thing to look at is using std::optional<T> as a mapped type.

I said that a std::map<T, std::optional<U>> is a partial map: a key may or may not have a value. And if you need that, this is a fine abstraction.

However, a map of optionals is somewhat unwieldy: A potential lookup() function that returns an optional<mapped_type> leads to a nested optional, which is a little bit weird to use. A std::map<T, std::variant<U, no_value>> is a somewhat cleaner abstraction in my opinion.

But the best solution would be a partial_map<T, U> that supports it natively.

Not much objection there either, so let’s move to the main point of controversy:

std::optional<T> in Sequence Containers

I said that you don’t need to put std::nullopt in a sequence container: just put nothing in there instead.

And this is where many think I’m wrong. And I am — but my advice is still valid, just not for a “sequence container” per se.

Let me elaborate.

In a recent project I’m working on (just something fun for personal use) I’m using a lot of std::vector<T>. However, I’m not using them like you might want to use a std::vector<T>. In particular, I’m just using them as a place to stuff things in, and then I later need to do a range-based for over them:

std::vector<int> my_ints;
// fill container up with some integers

for (auto i : my_ints)
    do_sth(i);

// fill container up with some more ints

for (auto i : my_ints)
    do_sth_else(i);

I don’t really care about the interface that makes std::vector<T> special: I don’t need random access because asking for the i-th element doesn’t make any sense with my usage!

I don’t really care about order either: All I care about is whether or not I will process the element eventually if it is in there. This means that I would remove an element by swapping it with the last one and doing a pop_back(), which is O(1) compared to the usual O(n) of std::vector<T>::erase.

And for this kind of usage of std::vector<T> my advice is correct: I don’t need to store std::optional<T> in the container because I don’t need to process the std::nullopts. It leads to faster and more efficient code if I just store the Ts directly and nothing in case of a std::nullopt.

However, this is not the usual usage of std::vector<T>: Order usually matters — after all it is a sequence container. But I didn’t realize that my usage of std::vector<T> doesn’t match that usage, so I wrote that advice.

Bag of T

There’s something we can learn about this mistake: The need for a new container. A container that is like std::vector<T> but doesn’t provide ordering or an array access operator, it just has insert(element) and erase(iter), both are O(1).

Let’s call it bag<T> because it is just that: a bag where you put elements. A simple implementation on top of std::vector<T> can look like this:

template <typename T>
class bag
{
    std::vector<T> container_;

public:
    using value_type    = T;
    using iterator       = typename std::vector<T>::iterator;
    using const_iterator = typename std::vector<T>::const_iterator;

    //=== constructors/destructors ===//
    bag() = default;

    // other constructors, assignment if needed

    //=== access ===//
    iterator begin() noexcept
    {
        return container_.begin();
    }
    const_iterator begin() const noexcept
    {
        return container_.begin();
    }
    const_iterator cbegin() const noexcept
    {
        return container_.begin();
    }

    iterator end() noexcept
    {
        return container_.end();
    }
    const_iterator end() const noexcept
    {
        return container_.end();
    }
    const_iterator cend() const noexcept
    {
        return container_.end();
    }
    
    // note: no array access, front, back
    // maybe data() if necessary

    //=== capacity ===//
    bool empty() const noexcept
    {
        return container_.empty();
    }

    size_type size() const noexcept
    {
        return container_.size();
    }

    size_type capacity() const noexcept
    {
        return container_.capacity();
    }

    void reserve(size_type new_capacity)
    {
        container_.reserve(new_capacity);
    }

    void shrink_to_fit()
    {
        container_.shrink_to_fit();
    }

    //=== modifiers ===//
    template <typename... Args>
    void emplace(Args&&... args)
    {
        container_.emplace_back(std::forward<Args>(args)...);
    }

    void insert(const T& value)
    {
        emplace(value);
    }
    void insert(T&& value)
    {
        emplace(std::move(value));
    }

    // range insert if needed

    void clear() noexcept
    {
        container_.clear();
    }

    void erase(iterator iter) 
    {
        if (iter != std::prev(container_.end())
        {
            // swap with last element
            using std::swap;
            swap(*iter, container_.back());
        }
        container_.pop_back();
    }
    
    // range erase if needed
};

This is just a proof-of-concept implementation programmed in the editor. I am working on a library that contains a real implementation, expect it soon™.

Now, for this container it definitely doesn’t make sense to store optionals in there.

In the previous post I’ve also mentioned an optimization for std::vector<std::variant<T...>> that unwraps it into multiple std::vector<T>... internally. This is better for branch prediction and uses less memory. Of course, this optimization doesn’t make sense if you use std::vector<T> as a sequence container. But for bag it makes sense, and is in fact the main datastructure in my side project.

Why Bother At All?

Some of you also questioned why I was on such a crusade against std::optional<T> inside a container. The reason is simple: I had a similar design originally, realized its flaws and wanted to prevent others from doing the same. So I generalized and thought about other containers as well. What I didn’t realize at the time was that my use of std::vector was different than the normal use.

But I think this still lead to an interesting discovery: the need for a new container type, bag<T>.

This post was made possible by my Patreon supporters. If you'd like to support me as well, please head over to my Patreon and do so! One dollar per month can make all the difference.