My take on variant
C++17 is going to add std::variant.
To quote the linked documentation, it is a “type-safe union”.
A union
is like a struct
, but can only store one member at a time.
This has many applications, but sadly it doesn’t mix well with non-trivial types,
you have to call the destructor yourself etc.
Furthermore, nothing prevents you from accessing a union member that isn’t active.
std::variant
fixes that. It correctly calls the destructor when switching the active member,
it prevents invalid access, etc.
However, I’m not quite happy with it and I needed an implementation now.
So I’ve decided to implement my own variant as part of my type_safe library.
It was a fun challenge and since my previous attempt was two years ago, I could improve it a lot. Let’s go through some of my design decisions.
Building block: tagged_union<Types...>
The heart of a variant
is a tagged union.
A tagged union is like a union
but also remembers the currently stored type.
It stores some type_id
which uniquely represents one of the types.
As many variant
operations like copy-constructing have some overhead due to necessary type-erasure,
I’ve opted for making a separate tagged_union
class that has absolutely no overhead compared to a C union
- except the necessary space for the type_id
tag.
tagged_union<Types...>
stores one of the given types or no type.
It constructor puts it in the empty state and the destructor does nothing - it is the users responsibility to clean up,
and copy/move operations are deleted, so you can’t accidentally do a memcpy()
equivalent of the stored object.
You can do the following operations:
-
emplace<T>()
- creates a new object of the given type in the union. -
destroy<T>()
- destroys the currently stored object of given type (type must match). -
type()
- returns a type identifier of the currently stored type - the “tag”. -
value<T>()
- returns the stored value of given type (type must match).
While this interface is very primitive - you have to know the currently stored type and pass in a template parameter,
this is necessary due to the zero overhead implementation.
But this interface is also type-safe: You can’t switch the active types “accidentally” like in a C union
.
Whenever you emplace or destroy an object, the tag is updated automatically,
and value()
has a debug assertion that checks the tag.
The tag itself - the type_id
returned by type()
, is a strong typedef to std::size_t
, i.e. the index of the currently active type in the variadic type list.
It only provides comparison.
The strong typedef is also dependent on the tagged_union
type.
This means that you can’t compare type_id
s from different tagged_union
instantiations, as the uniqueness of the id depends on the type list.
The implementation of tagged_union
itself is pretty straightforward thanks to std::aligned_union.
However, there is one problem still left to solve.
emplace()
/destroy()
and value()
all require that you pass in the type you want to create.
This means that they are templates where you have to pass in an explicit template parameter.
However explicitly passing template parameters have some problems, in particular:
- If you have a dependent name, you need
.template
disambiguation. If you know what I mean, I pity you.
That’s also the reason
std::tuple
has non-memberget
, likestd::variant
.
- I dislike it for reasons outlined in this post.
But there is an even bigger problem:
In order to get the value of a tagged_union
, you’d write code like this:
tagged_union<int, float, char> u;
…
if (u.type() == type_id_for_int)
do_sth_with_int(u.value<int>());
But how do you spell type_id_for_int
?
tagged_union
could provide a get_type_id<T>()
function but that’s kind of awkward.
It would be more intuitive to use the constructor of type_id
.
However you can’t pass template parameters to a constructor!
Luckily, there is a solution. An elegant solution that solves all this problem. We use the trick I’ve shown in my function template parameter post I’ve already linked above.
If I wouldn’t know better I’d think that I have all these posts planned in advance…
The trick is to create a tag type that we use to allow template instantiations:
template <typename T>
struct union_type {};
This little struct
solves all of the problems.
With it, the signature of destroy()
, for example, looks like this:
template <typename T>
void destroy(union_type<T>)
{
…
}
And the example from above like this:
if (u.type() == union_t::type_id(union_type<int>{}))
do_sth_with_int(u.value(union_type<int>{}));
If you think that’s too verbose, I agree.
variant
has far better access operations, buttagged_union
is just a low-level building block.
You can find all details about tagged_union
in the documentation.
Building block: visitation
Using tagged_union
like this is pretty awkward.
For example, let’s say you want to destroy the currently stored type of a tagged_union<int, float, char>
:
Ignore the fact that these are all trivial types.
if (u.type() == union_t::type_id(union_type<int>{}))
u.destroy(union_type<int>{});
else if (u.type() == union_t::type_id(union_type<float>{}))
u.destroy(union_type<float>{});
else if (u.type() == union_t::type_id(union_type<char>{}))
u.destroy(union_type<char>{});
else
// no value stored - or maybe I forgot a type?
Every time, you don’t know statically which type is stored, you’d need this kind of type switch. It is verbose and error-prone.
So let’s implement it once in a generic way.
A couple types in type_safe provide a (non-member) with()
function.
It takes an object and a functor and invokes it with some form of stored/underlying type.
For tagged_union
, with()
can look like this:
template <typename ... Types, typename Func, typename ... Args>
void with(tagged_union<Types>& u, Func&& f, Args&&... additional_args);
// also overloads for `const&`, `&&` and `const&&`.
It basically calls std::forward<Func>(f)(u.value(union_type<T>{}), std::forward<Args>(additional_args))
,
where T
is the type currently stored in the union.
If the call isn’t well-formed or there is no type stored, with()
does nothing.
There is also a
visit()
function forvariant
s similar tostd::visit()
, butwith()
is simpler and more flexible with the additional arguments.
With with()
- sorry - you can implement a destroy()
function that destroys is without statically knowing the type:
template <typename ... Types>
void destroy(tagged_union<Types...>& u)
{
with(u, [&](auto& value)
{
// we don't actually need the stored object
// remember, never called if no object stored
using type = std::decay_t<decltype(value)>;
u.destroy(union_type<T>{});
});
}
But it can also implement copy()
, which would be used in variant
s copy constructor:
template <typename ... Types>
void copy(tagged_union<Types...>& dest, const tagged_union<Types...>& other)
{
// assume dest is empty
with(other, [&](const auto& value)
{
using type = std::decay_t<decltype(value)>;
dest.emplace(union_type<T>{}, value);
});
}
with()
is needed every time the stored type isn’t statically known and makes dealing with it quite elegant.
The variant
problem
tagged_union
has been very carefully crafted, so that it avoids a fundamental implementation and design problem of variant
s:
exception safety.
emplace()
requires that the previous value has been destroyed, copy()
requires that the destination is empty.
Consider a tagged_union
that contains an object of type T
and you want to change it to a new object of type U
.
This happens in assignment,
emplace()
etc.
You have to do two things:
-
Destroy the object of type
T
. -
Create a new object of type
U
in the same storage.
You have to destroy it before you can create the new one, but what happens when the constructor of U
throws an exception?
Then the variant will not contain any object anymore,
which does not provide the strong exception safety and further prevents a variant
that will always contain a value.
But if we use a temporary to create the new U
object and then move it in?
This could work:
-
Create temporary
U
object. -
Destroy the object of type
T
. -
Move the temporary
U
into the union storage.
This provides the strong exception safety unless the move constructor throws, in which case we have the same problem as before.
But maybe we always a variant where one type is no-throw default constructible - a fallback, then we can do this:
-
Destroy the object of type
T
. -
Create a new object of type
U
in the same storage. -
If 2) throws, create an object of the fallback type in the variant.
This still doesn’t provide the strong exception safety, but at least the variant
is not going to be empty.
But let’s sacrifice the never-empty variant guarantee.
A variant already has to provide a way to check whether it contains an object of a given type,
so it is an optional type anyway - either it stores an object of type T
, or it doesn’t.
The only difference is: variant
can store one of many types, optional
just one.
So just embrace the empty state in the interface.
While this is my favorite solution, it doesn’t work for many people.
There are some additional tricks but those require additional storage and thus overhead.
That’s why std::variant
is going to be “rarely empty”.
The empty state is “invalid” and happens, for example, when the move constructor in the “create-with-temporary” algorithm described above throws.
I personally think that’s a bad solution, but arguing that would go into the move constructor rabbit hole again.
So what’s a better solution?
Well, it depends on the usage of the variant. Sometimes you want a guaranteed never-empty and are able to provide no-throw move constructors. Sometimes you have a fallback type, sometimes you want the standard semantics.
That’s why my variant is a basic_variant
.
It uses policy based design to customize this behavior.
The variant policy only controls two things:
-
whether or not the variant has an “embraced” empty state, or whether empty is just an invalid state
-
the
change_value()
behavior, i.e. what to do when the type needs to be changed
And I’ve also implemented the algorithm I’ve described above.
There is optional_variant_policy
, fallback_variant_policy
, rarely_empty_variant_policy
- what std::variant
does - and never_empty_variant_policy
which requires no-throw move constructors.
It also provides convenience typedefs: fallback_variant
, where the first type is the fallback and variant
.
variant
uses the rarely_empty_variant_policy
mimicking std::variant
unless the first type is nullvar_t
,
in which case it uses the optional_variant_policy
.
Policy based design here really pays off.
basic_variant
interface design
But the interface of basic_variant
is very different from std::variant
and - I argue - better.
For starters, all access functions are member functions.
Like tagged_union
, they use a tag type - variant_type<T>
, which is just an alias for union_type<T>
.
This is like std::variant
does with std::in_place_type_t,
but consistent throughout the entire interface.
As you saw in tagged_union
, it is very cumbersome to query whether a variant contains a type and then do something with it:
if (u.type() == union_t::type_id(union_type<int>{}))
do_sth_with_int(u.value(union_type<int>{}));
This also works with basic_variant
,
but it requires accessing a nested typedef in order to create the type_id
.
A first simplification provides the has_value()
function:
if (variant.has_value(variant_type<int>{})
do_sth_with_int(variant.value(variant_type<int>{}));
But there are more advanced functions like value_or()
:
do_sth_with_int(variant.value_or(variant_type<int>{}, fallback_value));
As I’ve said above, a variant
is just an optional
: either there is a value of type T
or there isn’t.
So you can also get an optional
from a variant
.
An optional_ref<T>
to be precise.
This is an optional reference to a T
.
That’s right, an optional reference, not a pointer.
While optional_ref<T>
is basically a pointer after an even minor optimization level, it also provides all the advanced optional functions.
Simply use the optional_value()
function and you’ll get all the safe access functions you’d want.
If you call it on an rvalue
basic_variant
object, you’ll also get anoptional_xvalue_ref<T>
, i.e. an optional rvalue reference. Yet another thing that differentiates an optional reference from a pointer.
optional_value()
is a much better solution than std::variant
’s get_if()
.
basic_variant
also provides a member function map(functor)
.
map()
returns a new basic_variant
that will contain the result of functor(value(variant_type<T>{})
or value(variant_type<T>{})
, if that’s ill-formed.
This allows a transformation of a basic_variant
.
Note that basic_variant
fully embraces a possible empty state.
It has a default constructor putting it there - unlike std::variant
s which default constructs the first type,
special has_value()
, operator=()
and value()
for nullvar_t
as well as a reset()
functions.
All of those are of course statically disabled if the policy does not allow the empty state.
If you want to know how, check out my previous post. I admit it: now I had it planned in advanced.
It also provides with()
and visit()
.
The latter is like the std
version.
Conclusion
My ts::basic_variant
is a more flexible and improved variant compared to std::variant
.
Policy based design gives the user a way to choose how the variant should behave,
instead of forcing one decision.
If you want more control, you can easily use the ts::tagged_union
building block.
This post showed a lot less code than my usual posts. If you want to see code, take a look at the implementation. The relevant files are tagged_union.hpp, variant_impl.hpp and variant.hpp. And if you’re really crazy, look how you have to do visit in C++11, i.e. without return type deduction.
For all others, do check out type_safe, it does a lot more, and take a look at the documentation of my variant.
This blog post was written for my old blog design and ported over. If there are any issues, please let me know.