Nested Optionals, Expected and Composition
Andrzej wrote about problems with CTAD and nested optionals, then Barry wrote about problems with comparison and nested optionals.
What do both problems have in common?
Nested optionals.
So let’s talk about them: What do they actually mean?
std::optional<T>: a T That Might Not Be There
Suppose you’re dealing with functions that might or might not be able to give you an object in return.
With std::optional that’s easy to model:
/// Does a database lookup, returns `std::nullopt` if it wasn't found.
template <typename T>
std::optional<T> lookup(const database& db, std::string name);
/// Calls the function if the condition is `true` and returns the result,
/// `std::nullopt` if the condition was false.
template <typename T>
std::optional<T> call_if(bool condition, std::function<T()> func);
std::optional<T> means “either a T or nothing”.
In that sense it is like std::variant<T, std::monostate>.
That also means “either a T or nothing”.
Yet std::optional<T> is preferred as it has a more convenient interface.
But note that both just mean “or nothing”.
Not “or not found” or “or function wasn’t called”.
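To make the comparison concrete, here is a minimal C++17 sketch that converts between the two representations. The helper names to_variant and to_optional are made up for this illustration and are not part of the standard library.

```cpp
#include <cassert>
#include <optional>
#include <utility>
#include <variant>

// Hypothetical helper: alternative 0 holds the T, alternative 1 is "nothing".
template <typename T>
std::variant<T, std::monostate> to_variant(std::optional<T> opt)
{
    using result_type = std::variant<T, std::monostate>;
    if (opt)
        return result_type(std::in_place_index<0>, std::move(*opt));
    return result_type(std::in_place_index<1>);
}

// Hypothetical helper: the inverse conversion.
template <typename T>
std::optional<T> to_optional(std::variant<T, std::monostate> var)
{
    if (var.index() == 0)
        return std::optional<T>(std::in_place, std::move(std::get<0>(var)));
    return std::nullopt;
}
```

Converting back and forth does not lose anything, which is the sense in which both types carry the same information.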
The std::nullopt has no inherent semantic meaning; the meaning is provided by context:
auto value = lookup<my_type>(db, "foo");
if (!value)
// optional is empty, this means the value wasn't there
…
auto result = call_if(condition, some_function);
if (!result)
// optional is empty, this means the condition was false
Here an empty optional means something different depending on the source of that optional.
By themselves all std::nullopts are equal; only context gives them different meaning:
template <typename T>
void process(std::optional<T> value)
{
if (!value)
// we don't know *why* the `T` isn't there, it just isn't
}
std::expected<T, E>: a T or an Error
If you want to provide additional information why the T isn’t there, you can use the proposed std::expected<T, E>.
It means “either a T or the error E that prevented its existence”.
The canonical example would be something like this:
/// Opens the file or returns an error code if it was unable to do so.
std::expected<file, std::error_code> open_file(const fs::path& p);
If the function could not return a file, it returns a std::error_code instead.
As such std::expected<T, E> is like std::variant<T, E>, just with a nicer interface and more defined semantics.
std::variant<T, E> just means T or E, std::expected<T, E> gives the E a special meaning.
But something interesting happens when E is an empty type with a single state:
struct value_not_found {};
template <typename T>
std::expected<T, value_not_found> lookup(const database& db, std::string name);
This lookup() implementation also returns a T or nothing if it wasn’t found.
But “nothing” has a well-defined meaning encoded in the type: value_not_found.
This is different from std::optional<T>:
In that case the meaning was only present given the context/origin of the optional.
Now the meaning is encoded into the type itself:
template <typename T>
void process(std::expected<T, value_not_found> value)
{
if (!value)
// ah, the `T` wasn't found in the database
}
This is an important distinction as we’ll see later.
Recap: std::optional<T>, std::expected<T, E> and std::variant<T, E>
So to recap:
- std::optional<T> is a nicer std::variant<T, std::monostate>
- std::expected<T, E> is a nicer std::variant<T, E>
- std::nullopt_t and std::monostate are both generic types meaning “empty”, special meaning is only imbued by context
- other empty types such as value_not_found are specialised with meaning without any context, just by themselves
- std::optional<T> and std::expected<T, std::monostate> both mean the same thing: either a T is there or it isn’t; if it isn’t, there is no meaning why
- std::expected<T, empty_type> has more semantic meaning than std::optional<T>: the empty_type gives the error more information
Note that I’m making an important assumption here:
std::optional<T> and std::expected<T, E> should be used in the same places.
You’d use std::optional<T> if the reason why you didn’t have the T isn’t important enough, you’d use std::expected<T, E> if the reason is.
Both types are fine for different APIs.
I repeat the assumption again, because if you don’t agree with that, you won’t agree with the rest of the post:
std::optional<T> and std::expected<T, E> both model the same thing: “a T that might not be there”.
std::expected just stores additional information why it isn’t there.
There are other situations where you might want to use std::optional<T>, but I consider those more or less problematic.
I’ll elaborate in more detail in a follow-up post, for now, just consider the situations where my assumption holds.
You might initially be irritated by the use of the “error” terminology in std::expected<T, E>. Is it really an “error” if the key isn’t found in some dictionary? But don’t confuse “error” with “exception”. It is not some unexpected, fatal problem. Just some failure to produce a proper value.
Nesting Optional and Expected
Let’s consider our two APIs again:
/// Does a database lookup, returns `std::nullopt` if it wasn't found.
template <typename T>
std::optional<T> lookup(const database& db, std::string name);
/// Calls the function if the condition is `true` and returns the result,
/// `std::nullopt` if the condition was false.
template <typename T>
std::optional<T> call_if(bool condition, std::function<T()> func);
There are two interesting situations with those APIs.
The first happens when we want to do a database lookup of a value that might be null in itself.
auto result = lookup<std::optional<my_type>>(db, name);
if (!result)
// not found in database
else if (!result.value())
// found in database, but `null`
else
{
// found and not null
auto value = result.value().value();
}
We end up with a std::optional<std::optional<my_type>>.
If the outer optional is empty that means the value was not stored in the database.
If the inner optional is empty that means the value was stored in the database but it was null.
If both are non-empty the value was stored and non-null.
The second situation happens when we simply combine the two functions:
auto lambda = [&] { return lookup<my_type>(db, name); };
auto result = call_if(condition, lambda);
if (!result)
// condition was false
else if (!result.value())
// condition was true, but the lookup failed
else
{
// condition was true and the lookup succeeded
auto actual_value = result.value().value();
}
Again, we have a nested optional. And again it means something different depending on which optional is empty.
But just a std::optional<std::optional<T>> by itself doesn’t have that information!
An empty optional carries no meaning, and neither does an optional containing an empty optional.
void process(std::optional<std::optional<my_type>> result)
{
if (!result)
// ah, the result was not found in the database
// or the condition was false
// or the value was null?
else if (!result.value())
// was found, but `null`
// or the condition was true but not found?
else
…
}
Context, and now even the order of operations, gives it the meaning.
With a std::expected API on the other hand, the information is clear:
void process(std::expected<std::expected<my_type, value_not_found>, func_not_called> result)
{
if (!result)
// function wasn't called
else if (!result.value())
// value not found
}
Note that I am not saying that the std::expected API is better:
It is awkward to have call_if() return a std::expected, std::optional is clearly the better choice for that function.
And I’d also argue that lookup() should use std::optional unless there are multiple reasons why a value isn’t there.
I’m merely demonstrating that std::expected preserves information about the empty state while std::optional does not.
Flattening Optional and Expected
We can hopefully all agree that both situations above are not ideal.
Working with nested std::optional or std::expected is weird.
If you want to process a value you would probably do it like so:
auto result = lookup<std::optional<my_type>>(db, name);
if (!result)
process(std::nullopt);
else if (!result.value())
process(std::nullopt);
else
process(result.value().value());
void process(const std::optional<my_type>& result)
{
if (!result)
// wasn't there — for whatever reason
else
// it was there, go further
}
That is, you’d combine the two different empty states of the std::optional into just one.
You flatten the std::optional<std::optional<T>> into a std::optional<T>.
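As a generic helper, that flattening could be sketched like this. The free function flatten() is an assumption of this example, the standard library does not provide one.

```cpp
#include <cassert>
#include <optional>

// Hypothetical helper: collapses both empty states into a single one.
template <typename T>
std::optional<T> flatten(const std::optional<std::optional<T>>& nested)
{
    if (!nested)
        return std::nullopt; // outer optional empty
    return *nested;          // inner optional, empty or not
}
```

Both "outer empty" and "inner empty" come out as the same empty std::optional<T>, while a contained value is passed through unchanged.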
Flattening a std::optional<std::optional<T>> loses information:
We’re squashing two distinct empty states into one.
But without additional context the two empty states are the same anyway: a process() called from multiple places can’t distinguish between them.
All it cares about is whether or not it actually has a value.
If it does care about the reason, the std::expected API might be better.
auto result = lookup<std::optional<my_type>>(db, name);
if (!result)
process(name_not_found);
else if (!result.value())
process(value_null);
else
process(result.value().value());
Now we’re passing distinct error information to process(), which is actually usable information.
In a sense, that is also a flattening.
But a flattening that preserves information.
Such a preserving flattening needs the context, the meaning of std::nullopt, so it can’t be done in a generic way.
With a combination of std::expected based APIs we can also end up with a nested std::expected<std::expected<T, E1>, E2>.
How would we flatten that?
Well, we either have a T or failed to get one.
When we failed, we either failed because of E1 or because of E2.
That is: std::expected<std::expected<T, E1>, E2> flattens to std::expected<T, std::variant<E1, E2>>.
This flattening preserves all information.
Note that if E1 and E2 are empty types, std::variant<E1, E2> is analogous to an error code enum with two possible values.
It is important to point out that this is not actually the flattening from the M-word.
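A rough sketch of that rule, written against a tiny stand-in type so it compiles as C++17. The class result<T, E> below is hypothetical and only mimics the small part of std::expected needed here; flatten() is likewise an assumption of this example.

```cpp
#include <cassert>
#include <utility>
#include <variant>

// Hypothetical stand-in for std::expected<T, E>, kept deliberately small.
template <typename T, typename E>
class result
{
    using storage = std::variant<T, E>;

public:
    static result success(T value) { return result(storage(std::in_place_index<0>, std::move(value))); }
    static result failure(E error) { return result(storage(std::in_place_index<1>, std::move(error))); }

    bool has_value() const { return data_.index() == 0; }
    const T& value() const { return std::get<0>(data_); }
    const E& error() const { return std::get<1>(data_); }

private:
    explicit result(storage data) : data_(std::move(data)) {}
    storage data_;
};

struct value_not_found {};
struct func_not_called {};

// result<result<T, E1>, E2> becomes result<T, std::variant<E1, E2>>.
template <typename T, typename E1, typename E2>
result<T, std::variant<E1, E2>> flatten(const result<result<T, E1>, E2>& nested)
{
    using error_type = std::variant<E1, E2>;
    using flat = result<T, error_type>;

    if (!nested.has_value()) // outer failure: keep E2
        return flat::failure(error_type(std::in_place_index<1>, nested.error()));

    const result<T, E1>& inner = nested.value();
    if (!inner.has_value()) // inner failure: keep E1
        return flat::failure(error_type(std::in_place_index<0>, inner.error()));

    return flat::success(inner.value());
}
```

The index of the resulting std::variant tells you which layer failed, so nothing is lost compared to the nested form.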
Just for the sake of completeness, what happens when we mix std::expected and std::optional?
If we remember that std::optional<T> is std::expected<T, std::monostate>, the flattening rules follow naturally:
std::optional<std::expected<T, E>> is std::expected<std::expected<T, E>, std::monostate>, which is std::expected<T, std::variant<E, std::monostate>>, which is std::expected<T, std::optional<E>>.
And std::expected<std::optional<T>, E> is std::expected<std::expected<T, std::monostate>, E>, which is std::expected<T, std::variant<std::monostate, E>>, which is std::expected<T, std::optional<E>>.
If you think about them, this makes sense.
In both cases we have three states: a T, a failure to produce one because of E, or a failure to produce one for generic reasons.
You might argue that we’re losing information because the generic failure happens in a different order, but that isn’t really usable information anyway. It is just a “generic failure”.
We know that the std::expected flattening rules are well-formed because std::optional<std::optional<T>> is std::expected<std::expected<T, std::monostate>, std::monostate>, which is std::expected<T, std::variant<std::monostate, std::monostate>>, which is std::expected<T, std::monostate>, which is std::optional<T>.
The optional flattening rules simply follow!
So to recap:
- std::expected<std::expected<T, E1>, E2> flattens to std::expected<T, std::variant<E1, E2>>, preserving all information
- std::optional<std::optional<T>> flattens to std::optional<T>, losing some information, but that information wasn’t really there in the first place
- other flattening rules follow from treating std::optional<T> as std::expected<T, std::monostate>
You Don’t Want Nested Optionals or Expecteds
Dealing with nested optionals and expecteds is awkward: you have to check multiple layers, write .value().value().value(), etc.
So in real code you would avoid them: as soon as you have them, you’d flatten them, possibly manually.
And again, flattening nested optionals does not lose you any usable information by itself. The empty states only gain semantic meaning from context. If the context isn’t there, they’re equivalent.
So if you are writing a user-facing, high-level API you would never return a nested optional or expected on purpose!
Note that I said “on purpose”:
template <typename T>
std::optional<T> lookup(const database& db, std::string name);
Just looking at it, this API doesn’t return a nested optional.
But as we’ve seen, a nested optional appears if T is an optional itself.
Yet this API has done nothing wrong.
For its intents and purposes, T is just some opaque generic type.
It doesn’t really concern itself with the exact details.
All generic code using that API will never realize that it is in fact a nested optional, it just deals with a std::optional<T> where T is “something”.
Only the final user that explicitly passed a std::optional<T> to it will end up with a nested optional.
But the API itself didn’t create one “on purpose”, it happened “accidentally”, so to speak.
Once you write std::optional<std::optional<T>> you should flatten it.
If you just write std::optional<U> where U might be a std::optional<T> but you don’t care, you’re good.
Automatic Flattening?
So if we immediately flatten nested optionals once we get them, why not do that automatically?
Why not make std::optional<std::optional<T>> and std::optional<T> the same type?
I proposed that on twitter without thinking too much about the consequences and without this 2800 word essay to back up my justifications, so it just seemed harmful and weird to do.
Of course a std::optional<std::optional<T>> and std::optional<T> are different things:
One is a T that might not be there, the other is a std::optional<T> that might not be there.
But as I might have convinced you, the distinction, without any context, isn’t really usable.
Both just model a T that might not be there.
So I think I’m justified in wanting to do that, but sadly it is still impractical.
We expect the following test to hold for all T:
T some_value = …;
std::optional<T> opt1;
assert(!opt1.has_value());
std::optional<T> opt2(some_value);
assert(opt2.has_value());
assert(opt2.value() == some_value);
But if T is a std::optional<U> and we flatten automatically, opt2.value() will not give you a T object back, it will give you a U!
You can imagine that this might cause some issues in generic code, even though this exact example still works with the standard library comparison implementation.
So automatically flattening everything is a bad idea.
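The type-level consequence can be shown with a small sketch. The alias flattened_optional_t is hypothetical and only models what an automatically flattening optional would collapse to; it is not how std::optional behaves.

```cpp
#include <cassert>
#include <optional>
#include <type_traits>

// Hypothetical model of an auto-flattening optional: nested layers collapse.
template <typename T>
struct flattened_optional
{
    using type = std::optional<T>;
};
template <typename U>
struct flattened_optional<std::optional<U>>
{
    using type = typename flattened_optional<U>::type;
};
template <typename T>
using flattened_optional_t = typename flattened_optional<T>::type;

// True when value() would no longer hand back the T that generic code asked for.
template <typename T>
constexpr bool value_type_changed_v =
    !std::is_same_v<typename flattened_optional_t<T>::value_type, T>;

static_assert(!value_type_changed_v<int>);               // plain T: nothing surprising
static_assert(value_type_changed_v<std::optional<int>>); // T = optional<U>: value() yields U
```

Generic code that stores a T and later expects to get a T back would silently receive a different type in the second case.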
Composing Optionals
At this point in the blog post, I’ll have to introduce monads.
For our purposes, a monad is a container of T, C<T>, with the following operations:
- Flatten C<C<T>> into C<T>
- Apply a std::function<U(T)> on a C<T> yielding a C<U>, called map()
- Apply a std::function<C<U>(T)> on a C<T> yielding a C<U>, called bind() or and_then()
This is how you’d implement it for std::vector<T>:
template <typename T>
std::vector<T> flatten(const std::vector<std::vector<T>>& vec)
{
std::vector<T> result;
for (auto& outer : vec)
for (auto& inner : outer)
result.push_back(inner);
return result;
}
template <typename T, typename U>
std::vector<U> map(const std::vector<T>& vec, const std::function<U(T)>& func)
{
std::vector<U> result;
// just std::transform, really
for (auto& value : vec)
result.push_back(func(value));
return result;
}
template <typename T, typename U>
std::vector<U> and_then(const std::vector<T>& vec, const std::function<std::vector<U>(T)>& func)
{
std::vector<U> result;
for (auto& value : vec)
for (auto& transformed : func(value))
result.push_back(transformed);
return result;
}
Implementation for std::optional or std::expected is left as an exercise for the reader.
Note that for std::expected there are two implementations: one on the value and one on the error.
And the flatten I’ve described doesn’t really match the flatten expected here (no pun intended).
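For readers who want to compare notes on the std::optional part of the exercise, here is one possible C++17 sketch. It takes any callable instead of a std::function so that lambdas work directly, and it assumes the callable returns by value; both choices are mine and not part of the definitions above.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <type_traits>
#include <utility>

template <typename T>
struct is_optional : std::false_type {};
template <typename T>
struct is_optional<std::optional<T>> : std::true_type {};

// map(): wraps whatever func returns in a new optional layer.
template <typename T, typename F>
auto map(const std::optional<T>& opt, F&& func)
{
    using U = std::decay_t<std::invoke_result_t<F, const T&>>;
    if (opt)
        return std::optional<U>(std::in_place, std::invoke(std::forward<F>(func), *opt));
    return std::optional<U>();
}

// and_then(): func already returns an optional, which is passed on as-is.
template <typename T, typename F>
auto and_then(const std::optional<T>& opt, F&& func)
{
    using result_type = std::decay_t<std::invoke_result_t<F, const T&>>;
    static_assert(is_optional<result_type>::value, "func must return a std::optional");
    if (opt)
        return result_type(std::invoke(std::forward<F>(func), *opt));
    return result_type();
}
```

Passing a function that returns an optional to map() compiles fine and yields a nested optional, which is exactly the situation discussed in this post.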
Note that map() and and_then() are really similar.
In one case the function transforms every element individually, yielding a single element.
In the other case the function transforms every element into a container again.
You can even implement and_then() by calling map() and then flatten() on the result.
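That last relationship can be written down directly for std::vector. This is a sketch that accepts any callable rather than a std::function, which is an assumption made here so that lambdas can be passed without conversion.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

template <typename T>
std::vector<T> flatten(const std::vector<std::vector<T>>& vec)
{
    std::vector<T> result;
    for (const auto& outer : vec)
        for (const auto& inner : outer)
            result.push_back(inner);
    return result;
}

template <typename T, typename F>
auto map(const std::vector<T>& vec, F func)
{
    using U = std::decay_t<std::invoke_result_t<F&, const T&>>;
    std::vector<U> result;
    result.reserve(vec.size());
    for (const auto& value : vec)
        result.push_back(func(value));
    return result;
}

// and_then() is nothing more than map() followed by flatten().
template <typename T, typename F>
auto and_then(const std::vector<T>& vec, F func)
{
    return flatten(map(vec, std::move(func)));
}
```

The intermediate std::vector<std::vector<U>> costs an extra allocation per element, so a hand-written and_then() is usually the better choice in practice.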
And clearly for std::vector there is a huge difference between a std::vector<T> and std::vector<std::vector<T>>.
But for std::optional?
I’ve argued, not really. Yet still you’d have to think about which one you do:
std::optional<int> opt = …;
opt = map(opt, [](int i) { return 2 * i; } );
opt = and_then(opt, [](int i) { return i ? std::make_optional(4 / i) : std::nullopt; } );
The first lambda returns an int, so you use map().
The second returns a std::optional<int>, so you use and_then().
If you accidentally use map() you have a std::optional<std::optional<int>>.
Thinking about that distinction is annoying: Composing optionals is awkward enough already in C++, such differences shouldn’t matter.
A single function should just do the right thing, no matter what you throw at it.
Yes, this is mathematically impure and doesn’t really implement a monad for std::optional.
But C++ isn’t category theory, it’s fine to be pragmatic.
You wouldn’t really have templates taking “monads” anyway: while they are mathematically similar, the actual usages and performance characteristics are too different.
Note that I am not saying that monads should automatically flatten in general.
Just std::optional.
And you can still have the proper monadic functions if you want. They just shouldn’t be the default.
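One way such a "do the right thing" function might look in C++17 is sketched below. The name transform_flat is invented for this example, and the callable is assumed to return by value.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <type_traits>
#include <utility>

template <typename T>
struct is_optional : std::false_type {};
template <typename T>
struct is_optional<std::optional<T>> : std::true_type {};

// Hypothetical single entry point: acts like and_then() when func returns an
// optional and like map() otherwise, so the result is never nested.
template <typename T, typename F>
auto transform_flat(const std::optional<T>& opt, F&& func)
{
    using R = std::decay_t<std::invoke_result_t<F, const T&>>;
    if constexpr (is_optional<R>::value)
    {
        // func already produces an optional: do not wrap it again.
        if (opt)
            return R(std::invoke(std::forward<F>(func), *opt));
        return R();
    }
    else
    {
        if (opt)
            return std::optional<R>(std::in_place, std::invoke(std::forward<F>(func), *opt));
        return std::optional<R>();
    }
}
```

With this, both lambdas from the example above can go through the same function and the result stays a std::optional<int> either way.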
Similarly, composing multiple functions returning expecteds should flatten in a similar way.
You wouldn’t want a nested std::expected, you want a single std::expected combining all errors.
Note that this automatic flattening on composition has precedent:
Rust’s expected, Result<T, E>, will flatten in a similar way to what I’ve described.
If you’re composing functions returning Result<T, E1> in a function returning Result<T, E2>, the ? operator will automatically convert the error, provided E2 can be created from E1 through the From trait.
Conclusion
The empty state of std::optional<T> does not have any inherent meaning.
It just means “empty”.
Only the origin gives it meaning such as “not found”.
As such a std::optional<std::optional<T>> only means T or empty or really empty.
Without additional context that is the same as std::optional<T>.
Flattening a nested optional does lose information, but not usable information.
If you want to give special meaning to the empty state use std::expected<T, E> where E is that special meaning.
Flattening a nested expected preserves all information.
As working with nested optionals or expecteds is awkward, they want to be flattened. Flattening automatically every time breaks generic code; flattening on composition is a bit mathematically impure, but it works.
With that information we can also answer the comparison problem outlined in Barry’s blog post.
What should f6(std::nullopt, std::nullopt) return?
As std::nullopt doesn’t have any special meaning on its own, all instances are equal.
It does not matter how many nested optionals we have.