Should You Put optional in a Container?

09 Apr 2018 by Jonathan

Title says it all: should you put std::optional<T> in a container?

To answer that we have to take a slight detour first.

Update: A follow-up post is now available, read that as well.

`std::optional<T>` vs. `std::variant<T, std::monostate>`

What’s the difference between an std::optional<T> and a std::variant<T, std::monostate>?

Well, easy:

std::optional<T> is class that either stores a value of type T or nothing.

std::variant<T, std::monostate> is a class that either stores a value of type T or a value of type std::monostate.

What’s std::monostate?

Well, it’s a class whose primary purpose is to allow a std::variant that either stores one of the types or none at all.

Technically, it is to make the variant default-constructible if none of the types are default constructible. The standard library really likes default-constructible for some reason.

So, std::variant<T, std::monostate> is a class that either stores a value of type T or nothing.

Ergo:

template <typename T>
using optional = std::variant<T, std::monostate>;

The only difference is in the interface.

But let’s take a look at a different example:

// the id of something
struct id { … }; // not really empty

// tag type to mark an invalid id
struct invalid_id {}; // really empty

// parses an id giving a str
std::variant<id, invalid_id> parse(std::string_view str);

As not every string is a valid id, the result either returns a valid id or a tag type to mark an invalid id.

Now, what’s the difference between std::variant<id, invalid_id> and std::variant<id, std::monostate>?

The name of the empty state.

std::monostate is also comparable with all instances being equal.

However, in my opinion the name of the empty state is important for the semantics: std::variant<id, invalid_id> has a special empty state — an invalid id, whereas std::variant<id, std::monostate> just a generic one.

This difference can be even bigger if we add another empty state:

std::variant<id, invalid_id, empty_string> parse(std::string_view str);

Either we get an id, or the string was invalid or the string was empty.

I’m not saying that this is an ideal design of the parse function, better would probably a std::variant<id, error_code> — i.e. expected<id>, this is just a motivational example.

So using std::variant<T, std::monostate> and std::optional<T> have the same semantic meaning: Either there is an object or there is none. Because std::optional has a somewhat nicer interface I’d recommend using that one instead.

In my type_safe library std::variant<T, std::monostate> is spelled type_safe::variant<type_safe::nullopt_t, T>, you put the empty optional tag type into the variant which makes the semantics clearer.

However, there is a difference between std::variant<T, std::monostate> and std::variant<T, U> where U is an empty type: The latter gives the empty state a special semantic meaning and not just “empty state”.

I’d recommend using variant instead of optional whenever you can give the state a special name and/or it isn’t clear what it means.

`std::optional<T>` in Sequence Containers

What does this have to do with containers, you’d ask.

Well, consider std::vector<std::optional<int>>:

std::vector<std::optional<int>> vec;
vec.push_back(42);
vec.push_back(std::nullopt);

This creates a container containing two elements — a 42 and a std::nullopt.

But if you put an empty optional in a container, why put it in there at all?

std::vector<int> vec;
vec.push_back(42);

This creates a container containing one element - a 42. I’d argue that this is identical to the previous example, just nicer to work with. So don’t put empty optionals into sequence containers, put nothing in there instead.

Note that this doesn’t apply if you use std::vector<T> as a map from integers to T, see below for that.

Now if you say that the empty optional has a special meaning for your algorithm or something like that — read the first part: You don’t want std::optional<T> you want std::variant<T, special_meaning>.

`std::optional<T>` in Sets

The same applies to std::set and the variants. However, here it is especially stupid as you can only put the empty state into it once:

std::set<std::optional<int>> set;
set.insert(42);
set.insert(std::nullopt);
set.insert(std::nullopt); // won't insert it again

So don’t use std::optional<T> as the key type in sets.

Again, if you want an “empty key” pick a std::variant<T, empty_key>. This also allows multiple empty keys (they just need different types).

`std::optional<T>` in Maps

A map like std::map has two places where you can put an optional: as a key or as a value. As key doesn’t make sense as already discussed.

However, as value is interesting:

std::map<int, std::optional<int>> map;
map[42] = 42; // map 42 to 42
map[3] = 5; // map 3 to 5
map[9] = std::nullopt; // map 9 to empty optional

Here, we can either map an int to an int, or an int to nothing. This is useful if you want to model a set of keys, where some have associated values and some don’t.

But consider a map designed with std::optional<T> in mind. It would probably have a lookup function:

template <typename Key, typename Value>
std::optional<Value> map<Key, Value>::lookup(const Key& key) const;

A real implementation would probably have a different return type to prevent copies of the return value. An optional reference, if you will. The question is: is that a pointer?

Come to my ACCU or C++Now talk to find out.

But consider a call to it with our given map:

std::optional<std::optional<int>> result = map.lookup(i);

The result is an optional optional int which can have three states:

empty optional — key isn’t in the map at all
optional containing an empty optional — key is in the map but without associated value
optional containing an optional containing an int — key is in the map with this associated value

if (!result)
{
    // key is not in the map
}
else if (!result.value())
{
    // key is in the map but without value
}
else
{
    // key is in the map with this value
    auto value = result.value().value();
}

This is sort of ugly, it would be nice if they had names:

std::map<int, std::variant<int, no_value>> map;

std::optional<std::variant<int, no_value>> result = map.lookup(42);
if (!result)
{
    // key not in the map
}
else if (auto value = std::get_if<int>(&result.value()))
{
    // key has this associated value
}
else
{
    // key doesn't have an associated value
}

Ignoring the fact that dealing with variants in C++ is horrendously ugly, this is more readable than the std::optional<std::optional<int>> it was before.

However, the perfect solution would be a special partial_map container:

// only some int's are mapped to others
partial_map<int, int> map;

std::variant<int, no_value, unknown_key> result = map.lookup(42);
if (std::holds_alternative<unknown_key>(result))
{
    // key not in the map
}
else if (std::holds_alternative<no_value>(result))
{
    // key doesn't have a value
}
else
{
    // key has this associated value
    auto value = std::get<int>(result);
}

If you want a fun meta programming exercise, try writing a flatten function that takes a nested optional and unpacks it into a variant:

std::optional<std::optional<int>> nested_opt;
std::variant<outer_empty, inner_empty, int> variant = flatten(nested_opt, outer_empty{}, inner_empty{});

Solution at the end of the post.

`std::optional<T>` in Containers — Performance

Even if you don’t care about the semantic and readability argument, you might care about the performance argument.

If you have a std::optional<T> in a container, the iteration looks like this:

std::vector<std::optional<T>> container;

…

for (auto& el : container)
{
    if (el)
    {
        // handle element
    }
    else
    {
        // handle no element
    }
}

You have a branch in a — potentially — hot loop. As it is unlikely that the existing and non-existing elements are in any particular order, the branch predictor can’t help you much.

Now, if you need the non-existing elements to be processed relative to the existing elements in the correct order, you’re out of luck. But if you don’t need to do that, this can be optimized:

Better would be to do something similar to struct of arrays:

std::vector<T> t_container;
std::vector<std::nullopt> null_container;

…

for (auto& el : container)
{
    // handle element
}

for (auto& null : null_container)
{
    // handle no element
}

Here there isn’t a branch at all. Furthermore, T is smaller than std::optional<T> so you’re even saving memory.

Now you might reasonable see that it is silly to store std::nullopt at all:

std::vector<T> t_container;
std::size_t null_container_size;

…

for (auto& el : container)
{
    // handle element
}

for (auto i = 0u; i != null_container_size; ++i)
{
    // handle no element
}

This also applies to std::vector<std::variant<Ts...>> in general: Consider multiple vectors, one for each variant. A possible variant_vector<Ts...> that does this automatically is left as an exercise for the reader.

Conclusion

If you put an empty optional into a container, just put nothing in there instead. This makes it easier to deal with the container.

If the empty state has a special semantic meaning, don’t use std::optional<T>, use std::variant<T, special_meaning>. This makes it easier to reason about the code.

One possible exception is std::map<Key, std::optional<Value>> for mapping only some keys to values. However, there are possible better implementations out there.

Update: A follow-up post is now available, read that as well.

Appendix: `flatten()`

Here is a quick sample implementation of the flatten() function.

First, let’s calculate the type:

// helper trait to check whether a type is an optional
template <typename T>
struct is_optional : std::false_type {};
template <typename T>
struct is_optional<std::optional<T>> : std::true_type {};

// helper trait to convert a `std::variant<...>` to `std::variant<T, ...>`
template <typename T, class Variant>
struct append_variant;
template <typename T, typename ... Types>
struct append_variant<T, std::variant<std::variant<Types...>>>
{
    using type = std::variant<T, Types...>;
};


template <class NestedOpt, class ... Empty>
struct flatten_type_impl;

// base case: optional not further nested
template <typename T, class ... Empty>
struct flatten_type_impl<std::enable_if_t<!is_optional<T>{}>, std::optional<T>, Empty...>
{
    static_assert(sizeof...(Empty) == 1);

    // result is the empty type or T
    using type = std::variant<Empty..., T>;
};

// recursive case: nested optional
template <class Opt, typename Head, class ... Empty>
struct flatten_type_impl<std::enable_if_t<is_optional<Opt>{}>, std::optional<Opt>, Head, Empty...>
{
    // variant for the value of the nested optional
    using recursive_type = typename flatten_type_impl<void, Opt, Empty...>::type;
    // put Head empty type in front
    using type = typename append_variant<Head, recursive_type>::type;
};

// convenience typedef
template <class NestedOpt, class ... Empty>
using flatten_type = typename flatten_type_impl<void, NestedOpt, Empty...>::type;

Then we can write the function by recursively unpacking:

// helper function to recursively fill the variant
template <class Result, typename T, typename Empty, typename ... Rest>
void flatten_impl(Result& result, const std::optional<T>& opt, Empty empty, Rest... rest)
{
    if (opt)
    {
        // optional has a value, store the corresponding inner value
        if constexpr (is_optional<T>{})
            // nested optional, recurse
            flatten_impl(result, opt.value(), rest...);
        else
            // not a nested optional, store value directly
            result = opt.value();
    }
    else
        result = empty;
}

// actual flatten function
template <class NestedOpt, class ... Empty>
auto flatten(const NestedOpt& opt, Empty... empty)
{
    // create the variant
    // it is always default constructible, as the first type is an empty type
    flatten_type<NestedOpt, Empty...> result;
    // fill it recursively
    flatten_impl(result, opt, empty...);
    return result;
}

This blog post was written for my old blog design and ported over. If there are any issues, please let me know.

std::optional<T> vs. std::variant<T, std::monostate>

std::optional<T> in Sequence Containers

std::optional<T> in Sets

std::optional<T> in Maps

std::optional<T> in Containers — Performance