Mathematics behind Comparison #4: Three-Way Comparison
In order to sort a collection of elements you need to provide a sorting predicate that determines when one element is less than the other. This predicate must “induce a strict total ordering on the equivalence classes” according to cppreference. Wait, what?
The upcoming C++ spaceship operator implements a three-way comparison,
i.e. it is a single function that can return the results of <
, ==
and >
combined.
But related to it are terms like “strong equality” and “weak ordering” which are somewhat confusing if you don’t have the mathematical background.
So let’s untangle it: This series will explain both the mathematics behind equality and ordering, as well as give concrete guidelines for implementing the comparison operators and the spaceship operator.
Now that we’ve covered both equivalence and ordering relations we can finally talk about the spaceship operator and three-way comparisons.
Note: The C++ language rules for <=>
have changed since writing this post. See https://jonathanmueller.dev/talk/cppcon2019/ for the current rules.
This blog post is outdated.
Three-Way Comparison
As described in the second part, two elements can be in one of these ordering relationships:
- They are both equal.
- They are both equivalent.
- One is strictly less/greater than the other.
- They are incomparable.
But mathematically a relation is just a set, meaning it can only give a boolean result.
So mathematicians had to pick one relationship, resulting in the theory behind ≤
and <
orderings.
But a three-way comparison is a function that will give the entire relationship in one query.
Traditionally, strcmp()
is such a function.
Given two strings it will return an integer where < 0
means the first string is less, == 0
if both are equal and > 0
if the first string is greater.
It can give one of three results, hence it’s a three-way comparison.
Other languages — and C++20 — have a comparison operator that does a three-way comparison.
It is commonly spelled <=>
as it gives the result of <
, ==
and >
simultaneously.
And as
<=>
sort of looks like a spaceship, it is called the “spaceship operator”. Note that it is neither a TIE fighter (that would be|=|
) nor a TIE bomber ({oo}
) but a TIE interceptor. (credit to reddit).
The advantage of a three-way comparison over the mathematical relation is simple:
Instead of doing the whole !(a < b) && !(b < a)
or a <= b && b <= a
dance to figure out whether two elements are equal, you can just ask that directly.
And the user still needs to write only one predicate.
Comparison Categories for Ordering
The ≤
and <
orderings are categorized based on two dimensions:
- Is the order partial or total?
- Does equality actually mean equality or just equivalence?
Read part 2 again if you don’t know what those terms mean, or at least read the summary at the end.
Three-way comparisons can also be classified based on those dimensions.
For two elements a
and b
they can give the following results:
| Total | Partial |
---|---|---|
Equivalence | less, equivalent, greater | less, equivalent, greater, unordered |
Equality | less, equal, greater | less, equal, greater, unordered |
So strictly speaking, only on a total order do we truly have a three-way comparison, otherwise it’s a four-way comparison.
Because of those semantic differences, the return type of the C++ TIE interceptor overload is not simply an int
,
but instead different types based on those dimensions — the ordering categories:
| Total | Partial |
---|---|---|
Equivalence | std::weak_ordering |
std::partial_ordering |
Equality | std::total_ordering |
n/a |
There is no type for a partial ordering that provides true equality, e.g. ⊆
on sets.
Instead the weaker std::partial_ordering
has to be used.
This is not a big problems as actual algorithms on orderings don’t care about equivalence vs equality but only about total vs partial orderings (more on that in the next part).
Note that those types have the intuitive conversion between them, and are comparable with 0
in the same way you’d use the result of std::strcmp
.
But — and I really like this part — they are only comparable with the literal number 0
, not 1
, 42
or some integer variable!
The comparison operators take
std::nullptr_t
notint
. And as the literal0
is the only integer value that is a null pointer literal…Yes, that means you can write
a <=> b < nullptr
. Don’t.
And the best thing about three-way comparisons:
Once you have an operator<=>
overload returning one of the ordering types, the compiler will also support all comparison operators!
Note that it will just rewrite a < b
to a <=> b < 0
, it does not actually synthesize an operator<
overload.
Comparison Categories for Equality
But what about types that don’t have an order but only equality, like std::complex
?
There are special categories for those.
As we’ve learned in part one there are two kinds of equivalence relations: true equality and equivalence. And each one of those can give one of two results:
Kind | Values |
---|---|
Equivalence | equivalent, nonequivalent |
Equality | equal, nonequal |
The matching categories are:
Kind | Category |
---|---|
Equivalence | std::weak_equality |
Equality | std::strong_equality |
But otherwise they behave like the ordering categories.
When you have an overloaded operator<=>
returning an equality type, the compiler will support operator==
and operator!=
as well.
It does that by mapping a == b
to a <=> b == 0
.
Designing Orderings and Equalities using <=>
The proposal for <=>
provides the following design guide for choosing the correct category for your type:
Substitutability? | Equality Only | Full Ordering |
---|---|---|
Yes | std::strong_equality |
std::strong_ordering |
No | std::weak_equality |
std::weak_ordering |
Here substitutability means whether a == b
implies f(a) == f(b)
.
Strictly speaking that is incorrect, as
a == b
does not necessarily imply that for any useful equivalence relation wheref
isstd::addressof
. The correct formulation would be whethera == b
impliesf(a) == f(b)
wheref
only observers salient properties, but okay.
Note that this table leaves out std::partial_ordering
, which is good:
As explained in part three, the comparison operators should always implement a total ordering.
However, I disagree that you would ever want an operator<=>
that returns a weak_*
type:
Such a comparison operator would mean that a == b
would be true for objects that are not necessarily equal in terms of their values.
I’ve talked more about that in the first part, as it is a rather complex question that touches the topics of regular types and more.
Let me just give another argument here:
The proposal uses the CaseInsensitiveString
as an example of a type that has a weak equality.
This is the standard example and, quite frankly, the only one I can come up with.
You don’t really need weak orderings and equalities for your type as the default comparison.
So I give this guideline for choosing the return type of operator<=>
:
Guideline: If your type should have full ordering, return std::strong_ordering
from operator<=>
.
Otherwise, if your type should only have equality, return std::strong_equality
.
Otherwise, don’t overload operator<=>
.
Does this mean that the other category types are useless and there is no way to have a case insensitive string comparison?
No, of course not.
It just shouldn’t be used as an operator<=>
!
Instead you should implement a std::weak_ordering case_insensitive_compare(const std::string& lhs, const std::string& rhs)
function,
maybe coupled with a compare function for the other Unicode equivalences you can have.
This is a superior approach, in my opinion.
Guideline: If you need one of the other ordering types, implement them in a named function, not operator<=>
.
More on using such functions in algorithms in the next and final part of the series.
Implementing Ordering Relations in C++20
Thanks to the compiler magic, you only need to overload operator<=>
and get the other ones for free.
In the previous post I’ve used a pair
types as an example of a total ordering and we needed to implement operator==
and operator<
by chaining the member comparisons, and then doing the mindless implementation of the other operators in terms of those two.
But now we just need an operator<=>
that does a member chaining:
template <typename T, typename U>
struct pair
{
T first;
U second;
// it's a total order with true equality, so std::strong_ordering
std::strong_ordering operator<=>(const pair& other) const
{
if (auto first_comp = first <=> other.first;
first_comp != 0)
// sort by first member if they're not equal
return first_comp;
else
// sort by second member
return second <=> other.second;
}
};
Note how I’ve used C++17’s if with initializer there.
Yes, you’ve noticed it correctly: that is a member function. There is no need to make it a free function, the compiler will automatically do the right thing.
However, there are a couple of problems with this implementation:
1. What happens if T
or U
don’t support <=>
but only the “older” operators?
Sadly the compiler will not synthesize an <=>
based of ==
and <
, only the other way around.
But there is a helper function std::compare_3way()
which does exactly that.
A possible implementation looks like this:
// types that only have an `operator==`
struct equal_only {};
template <typename T, typename U>
constexpr auto compare_3way_impl(equal_only, const T& lhs, const U& rhs)
-> decltype(lhs == rhs, std::strong_equality::equal)
{
if (lhs == rhs)
return std::strong_equality::equal;
else
return std::strong_equality::nonequal;
}
// types that have an `operator==` and `operator<`
struct equal_and_less : equal_only {};
template <typename T, typename U>
constexpr auto compare_3way_impl(equal_and_less, const T& lhs, const U& rhs)
-> decltype(lhs == rhs, lhs < rhs, std::strong_ordering::equal)
{
if (lhs == rhs)
return std::strong_ordering::equal;
else if (lhs < rhs)
return std::strong_ordering::less;
else
return std::strong_ordering::greater;
}
// types that have an `operator<=>`
struct spaceship : equal_and_less {};
template <typename T, typename U>
constexpr auto compare_3way_impl(spaceship, const T& lhs, const U& rhs)
-> decltype(lhs <=> rhs)
{
return lhs <=> rhs;
}
// the generic function dispatching to the others
template <typename T, typename U>
constexpr auto compare_3way(const T& lhs, const U& rhs)
{
return compare_3way_impl(spaceship{}, lhs, rhs);
}
Each of the
_impl
function is disabled using SFINAE if the type does not have the necessary operators. The inheritance hierarchy is used to try them in a given order, each base class requires one less conversion, so they are a worse and worse match.Read my SFINAE, tag dispatching and SFINAE + tag dispatching blog post for more information.
Note that the implementation in terms of the “normal” comparison operators will always deduce a std::strong_ordering
,
and never one of the other types.
This is following my guideline that the overloaded comparison operators should always implement a total order with true equality.
Also note that the implementation of operator==
and operator<
need to match, otherwise the results are inconsistent.
This is another guideline I gave in part three.
So our operator<=>
should look like this:
std::strong_ordering operator<=>(const pair& other) const
{
if (auto first_comp = std::compare_3way(first, other.first);
first_comp != 0)
// sort by first member if they're not equal
return first_comp;
else
// sort by second member
return std::compare_3way(second, other.second);
}
All generic code has to use (std::)compare_3way()
instead of using <=>
directly, which is unfortunate.
2. What happens if T
or U
don’t have a std::strong_ordering
?
The standard library also provides a helper for that:
a type trait std::common_comparison_category
, which will calculate the correct category based on the categories for T
and U
.
This can then be returned.
And while the standard library certainly has to care about such types, I will not do that in my code.
Just follow my guideline and only return std::strong_ordering
from operator<=>
, never another ordering type.
3. What happens if T
or U
only have a std::strong_equality
?
Ah, but I do have to care about that as this is following my own guideline.
We certainly want to have pair<int, std::complex<double>>
comparison:
it’s just not an ordering, but only equality.
And because I don’t want to have an operator<=>
returning something other than std::strong_ordering
or std::strong_equality
,
I can’t use std::common_comparison_category
directly.
Instead I have to define my own helper:
template <typename ... CompCategories>
struct common_strong_comparison_category
{
using type = std::conditional_t<(std::is_same_v<CompCategories, std::strong_equality> || ...), std::strong_equality, std::strong_ordering>;
};
If any of the categories is std::strong_equality
, the ordering is only equality.
Otherwise, the ordering is std::strong_ordering
.
(We assume that the categories are either one of those)
This means the final std::pair
operator<=>
looks like this:
auto ordering operator<=>(const pair& other) const
-> common_strong_comparison_category_t<decltype(std::compare_3way(first, other.first)), decltype(std::compare_3way(second, other.second))>
{
if (auto first_comp = std::compare_3way(first, other.first);
first_comp != 0)
// sort by first member if they're not equal
return first_comp;
else
// sort by second member
return std::compare_3way(second, other.second);
}
Note that we only needed to change the return type!
Thanks to the logic and conversion of the comparison categories, everything else works out fine.
This is the true power of returning proper types and not just int
s.
Default Ordering and Equality
This is all good, but I haven’t told you the best part: You could simply do this:
auto operator<=>(const pair& other) = default;
The compiler will then generate an implementation that does the member-wise comparison chaining, and deduce the proper return type automatically.
There’s a catch, however:
As before, a <=> b
will not try and use ==
or <
the way std::compare_3way()
does.
This is also the case here.
So you can only default
it if all members have an operator<=>
overload.
But as built-in types have one and there is a proposal for standard library types, most types in the future will get one.
This is another unfortunate consequence that the generic spelling of “three-way comparison” is std::compare_3way()
and not operator<=>
.
Note that the = default
implementation would also deduce a weak ordering, for example.
Preventing that is left as an exercise for the reader.
Hint: You can put a type instead of
auto
and if you putvoid
the= default
means= delete
.
But otherwise this is the ordering you want most of the time, but don’t just blindly put it for all your types! You should still only provide an ordering or equality if it is actually sensible, see the previous parts.
Custom Ordering and Equality
In cases where you can’t use the default ordering, you have to implement it manually as shown.
For reference, this is the ordering for std::optional
, the same example I’ve used before:
auto operator<=>(const optional& other) const
-> decltype(std::compare_3way(value(), other.value())) // again, should really constrain that
{
if (!*this && !other)
// both empty
// ::equal implicitly converts to std::strong_equality::equal as well
return std::strong_ordering::equal;
else if (!*this)
// empty optional less than non-empty
// ::less converts to std::strong_equality::unequal
return std::strong_ordering::less;
else if (!other)
// non-empty optional greater than empty
// ::greater converts to std::strong_equality::unequal
return std::strong_ordering::greater;
else
// forward to value
return std::compare_3way(value(), other.value());
}
Notice the power of those implicit conversions! It will always do the right thing, it doesn’t matter whether it implements an equality comparison or an ordering.
It would also do the right thing for
std::weak_ordering
andstd::weak_equality
, but …
And as before, implementing a named comparison predicate that maybe does some weaker comparison, is the same in principal:
You write a function with the appropriate category as return type and use the members to implement your comparison.
The algorithm std::lexicographical_compare_3way()
) can be used to compare arrays using operator<=>
.
But be careful that you’ve actually implemented a proper ordering.
Implementing Ordering Relations in the C++20 Standard Library
I’ve mentioned multiple times that the operator<=>
should really only return std::strong_ordering
or std::strong_equality
.
This is consistent with the behavior of operator==
and operator<
as determined by std::compare_3way()
.
But is it also consistent with the behavior of all operator<=>
that are proposed for the standard library!
Ignoring the types that wrap the comparison of other types (like std::pair
or std::vector
), they all either provide a std::strong_ordering
or a std::strong_equality
.
The comparison concepts like EqualityComparable
or LessThanComparable
can work with either operator==
/operator<
or a suitable operator<=>
.
They require only a weak ordering or equality.
More on that in the final part.
Conclusion
With the introduction of operator<=>
both the design and implementation of ordering and equivalence relations is simplified.
There is now a good way to describe the kind of ordering/equivalence your type supports and often the implementation is just = default
.
Just remember to only use std::strong_ordering
and std::strong_equality
as the comparison category for operator<=>
,
other orderings should be implemented in a named function.
Generic code needs to be careful using operator<=>
directly.
It should either continue using <
and ==
or std::compare_3way()
if a three-way comparison is required.
For more information, check out:
- the original proposal
- the new <compare> header (which really should have been
#include <=>
..) - Simon’s high level introduction
The next and final part of this series will take a look at algorithms that require orderings, like finding maximums or searching.
This blog post was written for my old blog design and ported over. If there are any issues, please let me know.