optional in Containers Ⅱ — Not All std::vector Usages Are The Same
Okay, so in the previous post I talked about putting optional<T>
in container.
I came to conclusions which I though were reasonable at the time, however, people — rightfully — pointed out some flaws in my argumentation.
As I was at ACCU last week, I wasn’t able to respond to them earlier (note to self: don’t publish and then fly away to a conference), so I’m doing that now. Let’s revisit my argumentations and see where I was wrong.
std::optional<T>
vs. std::variant<T, std::monostate>
I argued that std::optional<T>
and std::variant<T, std::monostate>
fulfill the same purpose:
Both represent a type that either stores a value of type T
or none at all.
I still think this is valid.
Of course — as someone on reddit pointed out — you wouldn’t want to actually use std::variant<T, std::monostate>
in place of std::optional<T>
:
the interface is clumsier and it is simply more to type.
But conceptually they are the same type.
I also argued that you shouldn’t use std::optional<T>
(or std::variant<T, std::monostate>
) if the empty type has a special semantic meaning like “id invalid”.
Instead you should use std::variant<T, special_meaning>
.
I still think that following this advice can lead to cleaner code.
std::optional<T>
in Sets
I said that you shouldn’t put std::optional<T>
in a set, simply because it is somewhat pointless:
You can only put a single empty optional in there anyway, and then you could also simply put nothing in there.
So don’t use std::optional<T>
in a set (or as a key type in a map).
If your algorithm works differently whether or not std::nullopt
is in the set, you don’t actually mean std::nullopt
, you mean special_meaning
and want to store a std::variant
.
Nobody seems to argue against that, so that advice is fine.
std::optional<T>
in Maps
std::optional<T>
as a key type in a map doesn’t make sense as argued above, so the only thing to look at is using std::optional<T>
as a mapped type.
I said that a std::map<T, std::optional<U>>
is a partial map: a key may or may not have a value.
And if you need that, this is a fine abstraction.
However, a map of optionals is somewhat unwieldy:
A potential lookup()
function that returns an optional<mapped_type>
leads to a nested optional, which is a little bit weird to use.
A std::map<T, std::variant<U, no_value>>
is a somewhat cleaner abstraction in my opinion.
But the best solution would be a partial_map<T, U>
that supports it natively.
Not much objection there either, so let’s move to the main point of controversy:
std::optional<T>
in Sequence Containers
I said that you don’t need to put std::nullopt
in a sequence container: just put nothing in there instead.
And this is where many think I’m wrong. And I am — but my advice is still valid, just not for a “sequence container” per se.
Let me elaborate.
In a recent project I’m working on (just something fun for personal use) I’m using a lot of std::vector<T>
.
However, I’m not using them like you might want to use a std::vector<T>
.
In particular, I’m just using them as a place to stuff things in, and then I later need to do a range-based for over them:
std::vector<int> my_ints;
// fill container up with some integers
for (auto i : my_ints)
do_sth(i);
// fill container up with some more ints
for (auto i : my_ints)
do_sth_else(i);
I don’t really care about the interface that makes std::vector<T>
special:
I don’t need random access because asking for the i
-th element doesn’t make any sense with my usage!
I don’t really care about order either:
All I care about is whether or not I will process the element eventually if it is in there.
This means that I would remove an element by swapping it with the last one and doing a pop_back()
,
which is O(1)
compared to the usual O(n)
of std::vector<T>::erase
.
And for this kind of usage of std::vector<T>
my advice is correct:
I don’t need to store std::optional<T>
in the container because I don’t need to process the std::nullopt
s.
It leads to faster and more efficient code if I just store the T
s directly and nothing in case of a std::nullopt
.
However, this is not the usual usage of std::vector<T>
:
Order usually matters — after all it is a sequence container.
But I didn’t realize that my usage of std::vector<T>
doesn’t match that usage, so I wrote that advice.
Bag of T
There’s something we can learn about this mistake: The need for a new container.
A container that is like std::vector<T>
but doesn’t provide ordering or an array access operator, it just has insert(element)
and erase(iter)
,
both are O(1)
.
Let’s call it bag<T>
because it is just that: a bag where you put elements.
A simple implementation on top of std::vector<T>
can look like this:
template <typename T>
class bag
{
std::vector<T> container_;
public:
using value_type = T;
using iterator = typename std::vector<T>::iterator;
using const_iterator = typename std::vector<T>::const_iterator;
//=== constructors/destructors ===//
bag() = default;
// other constructors, assignment if needed
//=== access ===//
iterator begin() noexcept
{
return container_.begin();
}
const_iterator begin() const noexcept
{
return container_.begin();
}
const_iterator cbegin() const noexcept
{
return container_.begin();
}
iterator end() noexcept
{
return container_.end();
}
const_iterator end() const noexcept
{
return container_.end();
}
const_iterator cend() const noexcept
{
return container_.end();
}
// note: no array access, front, back
// maybe data() if necessary
//=== capacity ===//
bool empty() const noexcept
{
return container_.empty();
}
size_type size() const noexcept
{
return container_.size();
}
size_type capacity() const noexcept
{
return container_.capacity();
}
void reserve(size_type new_capacity)
{
container_.reserve(new_capacity);
}
void shrink_to_fit()
{
container_.shrink_to_fit();
}
//=== modifiers ===//
template <typename... Args>
void emplace(Args&&... args)
{
container_.emplace_back(std::forward<Args>(args)...);
}
void insert(const T& value)
{
emplace(value);
}
void insert(T&& value)
{
emplace(std::move(value));
}
// range insert if needed
void clear() noexcept
{
container_.clear();
}
void erase(iterator iter)
{
if (iter != std::prev(container_.end())
{
// swap with last element
using std::swap;
swap(*iter, container_.back());
}
container_.pop_back();
}
// range erase if needed
};
This is just a proof-of-concept implementation programmed in the editor. I am working on a library that contains a real implementation, expect it soon™.
Now, for this container it definitely doesn’t make sense to store optionals in there.
In the previous post I’ve also mentioned an optimization for std::vector<std::variant<T...>>
that unwraps it into multiple std::vector<T>...
internally.
This is better for branch prediction and uses less memory.
Of course, this optimization doesn’t make sense if you use std::vector<T>
as a sequence container.
But for bag
it makes sense, and is in fact the main datastructure in my side project.
Why Bother At All?
Some of you also questioned why I was on such a crusade against std::optional<T>
inside a container.
The reason is simple: I had a similar design originally, realized its flaws and wanted to prevent others from doing the same.
So I generalized and thought about other containers as well.
What I didn’t realize at the time was that my use of std::vector
was different than the normal use.
But I think this still lead to an interesting discovery: the need for a new container type, bag<T>
.
This blog post was written for my old blog design and ported over. If there are any issues, please let me know.