Should You Put optional in a Container?
Title says it all: should you put std::optional<T>
in a container?
To answer that we have to take a slight detour first.
Update: A follow-up post is now available, read that as well.
std::optional<T>
vs. std::variant<T, std::monostate>
What’s the difference between an std::optional<T>
and a std::variant<T, std::monostate>
?
Well, easy:
std::optional<T>
is class that either stores a value of type T
or nothing.
std::variant<T, std::monostate>
is a class that either stores a value of type T
or a value of type std::monostate
.
What’s std::monostate
?
Well, it’s a class whose primary purpose is to allow a std::variant
that either stores one of the types or none at all.
Technically, it is to make the
variant
default-constructible if none of the types are default constructible. The standard library really likes default-constructible for some reason.
So, std::variant<T, std::monostate>
is a class that either stores a value of type T
or nothing.
Ergo:
template <typename T>
using optional = std::variant<T, std::monostate>;
The only difference is in the interface.
But let’s take a look at a different example:
// the id of something
struct id { … }; // not really empty
// tag type to mark an invalid id
struct invalid_id {}; // really empty
// parses an id giving a str
std::variant<id, invalid_id> parse(std::string_view str);
As not every string is a valid id, the result either returns a valid id or a tag type to mark an invalid id.
Now, what’s the difference between std::variant<id, invalid_id>
and std::variant<id, std::monostate>
?
The name of the empty state.
std::monostate
is also comparable with all instances being equal.
However, in my opinion the name of the empty state is important for the semantics:
std::variant<id, invalid_id>
has a special empty state — an invalid id, whereas std::variant<id, std::monostate>
just a generic one.
This difference can be even bigger if we add another empty state:
std::variant<id, invalid_id, empty_string> parse(std::string_view str);
Either we get an id, or the string was invalid or the string was empty.
I’m not saying that this is an ideal design of the parse function, better would probably a
std::variant<id, error_code>
— i.e.expected<id>
, this is just a motivational example.
So using std::variant<T, std::monostate>
and std::optional<T>
have the same semantic meaning:
Either there is an object or there is none.
Because std::optional
has a somewhat nicer interface I’d recommend using that one instead.
In my type_safe library
std::variant<T, std::monostate>
is spelledtype_safe::variant<type_safe::nullopt_t, T>
, you put the empty optional tag type into the variant which makes the semantics clearer.
However, there is a difference between std::variant<T, std::monostate>
and std::variant<T, U>
where U
is an empty type:
The latter gives the empty state a special semantic meaning and not just “empty state”.
I’d recommend using variant
instead of optional whenever you can give the state a special name and/or it isn’t clear what it means.
std::optional<T>
in Sequence Containers
What does this have to do with containers, you’d ask.
Well, consider std::vector<std::optional<int>>
:
std::vector<std::optional<int>> vec;
vec.push_back(42);
vec.push_back(std::nullopt);
This creates a container containing two elements — a 42
and a std::nullopt
.
But if you put an empty optional in a container, why put it in there at all?
std::vector<int> vec;
vec.push_back(42);
This creates a container containing one element - a 42
.
I’d argue that this is identical to the previous example, just nicer to work with.
So don’t put empty optionals into sequence containers, put nothing in there instead.
Note that this doesn’t apply if you use
std::vector<T>
as a map from integers toT
, see below for that.
Now if you say that the empty optional has a special meaning for your algorithm or something like that — read the first part:
You don’t want std::optional<T>
you want std::variant<T, special_meaning>
.
std::optional<T>
in Sets
The same applies to std::set
and the variants.
However, here it is especially stupid as you can only put the empty state into it once:
std::set<std::optional<int>> set;
set.insert(42);
set.insert(std::nullopt);
set.insert(std::nullopt); // won't insert it again
So don’t use std::optional<T>
as the key type in sets.
Again, if you want an “empty key” pick a std::variant<T, empty_key>
.
This also allows multiple empty keys (they just need different types).
std::optional<T>
in Maps
A map like std::map
has two places where you can put an optional: as a key or as a value.
As key doesn’t make sense as already discussed.
However, as value is interesting:
std::map<int, std::optional<int>> map;
map[42] = 42; // map 42 to 42
map[3] = 5; // map 3 to 5
map[9] = std::nullopt; // map 9 to empty optional
Here, we can either map an int
to an int
, or an int
to nothing.
This is useful if you want to model a set of keys, where some have associated values and some don’t.
But consider a map designed with std::optional<T>
in mind.
It would probably have a lookup function:
template <typename Key, typename Value>
std::optional<Value> map<Key, Value>::lookup(const Key& key) const;
A real implementation would probably have a different return type to prevent copies of the return value. An optional reference, if you will. The question is: is that a pointer?
Come to my ACCU or C++Now talk to find out.
But consider a call to it with our given map:
std::optional<std::optional<int>> result = map.lookup(i);
The result is an optional optional int
which can have three states:
- empty optional — key isn’t in the map at all
- optional containing an empty optional — key is in the map but without associated value
- optional containing an optional containing an
int
— key is in the map with this associated value
if (!result)
{
// key is not in the map
}
else if (!result.value())
{
// key is in the map but without value
}
else
{
// key is in the map with this value
auto value = result.value().value();
}
This is sort of ugly, it would be nice if they had names:
std::map<int, std::variant<int, no_value>> map;
std::optional<std::variant<int, no_value>> result = map.lookup(42);
if (!result)
{
// key not in the map
}
else if (auto value = std::get_if<int>(&result.value()))
{
// key has this associated value
}
else
{
// key doesn't have an associated value
}
Ignoring the fact that dealing with variants in C++ is horrendously ugly, this is more readable than the std::optional<std::optional<int>>
it was before.
However, the perfect solution would be a special partial_map
container:
// only some int's are mapped to others
partial_map<int, int> map;
std::variant<int, no_value, unknown_key> result = map.lookup(42);
if (std::holds_alternative<unknown_key>(result))
{
// key not in the map
}
else if (std::holds_alternative<no_value>(result))
{
// key doesn't have a value
}
else
{
// key has this associated value
auto value = std::get<int>(result);
}
If you want a fun meta programming exercise, try writing a flatten
function that takes a nested optional and unpacks it into a variant:
std::optional<std::optional<int>> nested_opt;
std::variant<outer_empty, inner_empty, int> variant = flatten(nested_opt, outer_empty{}, inner_empty{});
Solution at the end of the post.
std::optional<T>
in Containers — Performance
Even if you don’t care about the semantic and readability argument, you might care about the performance argument.
If you have a std::optional<T>
in a container, the iteration looks like this:
std::vector<std::optional<T>> container;
…
for (auto& el : container)
{
if (el)
{
// handle element
}
else
{
// handle no element
}
}
You have a branch in a — potentially — hot loop. As it is unlikely that the existing and non-existing elements are in any particular order, the branch predictor can’t help you much.
Now, if you need the non-existing elements to be processed relative to the existing elements in the correct order, you’re out of luck. But if you don’t need to do that, this can be optimized:
Better would be to do something similar to struct of arrays:
std::vector<T> t_container;
std::vector<std::nullopt> null_container;
…
for (auto& el : container)
{
// handle element
}
for (auto& null : null_container)
{
// handle no element
}
Here there isn’t a branch at all.
Furthermore, T
is smaller than std::optional<T>
so you’re even saving memory.
Now you might reasonable see that it is silly to store std::nullopt
at all:
std::vector<T> t_container;
std::size_t null_container_size;
…
for (auto& el : container)
{
// handle element
}
for (auto i = 0u; i != null_container_size; ++i)
{
// handle no element
}
This also applies to std::vector<std::variant<Ts...>>
in general:
Consider multiple vectors, one for each variant.
A possible variant_vector<Ts...>
that does this automatically is left as an exercise for the reader.
Conclusion
If you put an empty optional into a container, just put nothing in there instead. This makes it easier to deal with the container.
If the empty state has a special semantic meaning, don’t use std::optional<T>
, use std::variant<T, special_meaning>
.
This makes it easier to reason about the code.
One possible exception is std::map<Key, std::optional<Value>>
for mapping only some keys to values.
However, there are possible better implementations out there.
Update: A follow-up post is now available, read that as well.
Appendix: flatten()
Here is a quick sample implementation of the flatten()
function.
First, let’s calculate the type:
// helper trait to check whether a type is an optional
template <typename T>
struct is_optional : std::false_type {};
template <typename T>
struct is_optional<std::optional<T>> : std::true_type {};
// helper trait to convert a `std::variant<...>` to `std::variant<T, ...>`
template <typename T, class Variant>
struct append_variant;
template <typename T, typename ... Types>
struct append_variant<T, std::variant<std::variant<Types...>>>
{
using type = std::variant<T, Types...>;
};
template <class NestedOpt, class ... Empty>
struct flatten_type_impl;
// base case: optional not further nested
template <typename T, class ... Empty>
struct flatten_type_impl<std::enable_if_t<!is_optional<T>{}>, std::optional<T>, Empty...>
{
static_assert(sizeof...(Empty) == 1);
// result is the empty type or T
using type = std::variant<Empty..., T>;
};
// recursive case: nested optional
template <class Opt, typename Head, class ... Empty>
struct flatten_type_impl<std::enable_if_t<is_optional<Opt>{}>, std::optional<Opt>, Head, Empty...>
{
// variant for the value of the nested optional
using recursive_type = typename flatten_type_impl<void, Opt, Empty...>::type;
// put Head empty type in front
using type = typename append_variant<Head, recursive_type>::type;
};
// convenience typedef
template <class NestedOpt, class ... Empty>
using flatten_type = typename flatten_type_impl<void, NestedOpt, Empty...>::type;
Then we can write the function by recursively unpacking:
// helper function to recursively fill the variant
template <class Result, typename T, typename Empty, typename ... Rest>
void flatten_impl(Result& result, const std::optional<T>& opt, Empty empty, Rest... rest)
{
if (opt)
{
// optional has a value, store the corresponding inner value
if constexpr (is_optional<T>{})
// nested optional, recurse
flatten_impl(result, opt.value(), rest...);
else
// not a nested optional, store value directly
result = opt.value();
}
else
result = empty;
}
// actual flatten function
template <class NestedOpt, class ... Empty>
auto flatten(const NestedOpt& opt, Empty... empty)
{
// create the variant
// it is always default constructible, as the first type is an empty type
flatten_type<NestedOpt, Empty...> result;
// fill it recursively
flatten_impl(result, opt, empty...);
return result;
}
This blog post was written for my old blog design and ported over. If there are any issues, please let me know.