foonathan::blog()

Thoughts from a C++ library developer.

Mathematics behind Comparison #4: Three-Way Comparison

In order to sort a collection of elements you need to provide a sorting predicate that determines when one element is less than the other. This predicate must “induce a strict total ordering on the equivalence classes” according to cppreference. Wait, what?

The upcoming C++ spaceship operator implements a three-way comparison, i.e. it is a single function that can return the results of <, == and > combined. But related to it are terms like “strong equality” and “weak ordering” which are somewhat confusing if you don’t have the mathematical background.

So let’s untangle it: This series will explain both the mathematics behind equality and ordering, as well as give concrete guidelines for implementing the comparison operators and the spaceship operator.

Now that we’ve covered both equivalence and ordering relations we can finally talk about the spaceship operator and three-way comparisons.

Three-Way Comparison

As described in the second part, two elements can be in one of these ordering relationships:

  • They are both equal.
  • They are both equivalent.
  • One is strictly less/greater than the other.
  • They are incomparable.

But mathematically a relation is just a set, meaning it can only give a boolean result. So mathematicians had to pick one relationship, resulting in the theory behind and < orderings.

But a three-way comparison is a function that will give the entire relationship in one query. Traditionally, strcmp() is such a function. Given two strings it will return an integer where < 0 means the first string is less, == 0 if both are equal and > 0 if the first string is greater. It can give one of three results, hence it’s a three-way comparison.

Other languages — and C++20 — have a comparison operator that does a three-way comparison. It is commonly spelled <=> as it gives the result of <, == and > simultaneously.

And as <=> sort of looks like a spaceship, it is called the “spaceship operator”. Note that it is neither a TIE fighter (that would be |=|) nor a TIE bomber ({oo}) but a TIE interceptor. (credit to reddit).

The advantage of a three-way comparison over the mathematical relation is simple: Instead of doing the whole !(a < b) && !(b < a) or a <= b && b <= a dance to figure out whether two elements are equal, you can just ask that directly. And the user still needs to write only one predicate.

Comparison Categories for Ordering

The and < orderings are categorized based on two dimensions:

  • Is the order partial or total?
  • Does equality actually mean equality or just equivalence?

Read part 2 again if you don’t know what those terms mean, or at least read the summary at the end.

Three-way comparisons can also be classified based on those dimensions. For two elements a and b they can give the following results:

Total Partial
Equivalence less, equivalent, greater less, equivalent, greater, unordered
Equality less, equal, greater less, equal, greater, unordered

So strictly speaking, only on a total order do we truly have a three-way comparison, otherwise it’s a four-way comparison.

Because of those semantic differences, the return type of the C++ TIE interceptor overload is not simply an int, but instead different types based on those dimensions — the ordering categories:

Total Partial
Equivalence std::weak_ordering std::partial_ordering
Equality std::total_ordering n/a

There is no type for a partial ordering that provides true equality, e.g. on sets. Instead the weaker std::partial_ordering has to be used. This is not a big problems as actual algorithms on orderings don’t care about equivalence vs equality but only about total vs partial orderings (more on that in the next part).

Note that those types have the intuitive conversion between them, and are comparable with 0 in the same way you’d use the result of std::strcmp. But — and I really like this part — they are only comparable with the literal number 0, not 1, 42 or some integer variable!

The comparison operators take std::nullptr_t not int. And as the literal 0 is the only integer value that is a null pointer literal…

Yes, that means you can write a <=> b < nullptr. Don’t.

And the best thing about three-way comparisons: Once you have an operator<=> overload returning one of the ordering types, the compiler will also support all comparison operators! Note that it will just rewrite a < b to a <=> b < 0, it does not actually synthesize an operator< overload.

Comparison Categories for Equality

But what about types that don’t have an order but only equality, like std::complex? There are special categories for those.

As we’ve learned in part one there are two kinds of equivalence relations: true equality and equivalence. And each one of those can give one of two results:

Equivalence equivalent, nonequivalent
Equality equal, nonequal

The matching categories are:

Equivalence std::weak_equality
Equality std::strong_equality

But otherwise they behave like the ordering categories.

When you have an overloaded operator<=> returning an equality type, the compiler will support operator== and operator!= as well. It does that by mapping a == b to a <=> b == 0.

Designing Orderings and Equalities using <=>

The proposal for <=> provides the following design guide for choosing the correct category for your type:

Substitutability? Equality Only Full Ordering
Yes std::strong_equality std::strong_ordering
No std::weak_equality std::weak_ordering

Here substitutability means whether a == b implies f(a) == f(b).

Strictly speaking that is incorrect, as a == b does not necessarily imply that for any useful equivalence relation where f is std::addressof. The correct formulation would be whether a == b implies f(a) == f(b) where f only observers salient properties, but okay.

Note that this table leaves out std::partial_ordering, which is good: As explained in part three, the comparison operators should always implement a total ordering.

However, I disagree that you would ever want an operator<=> that returns a weak_* type: Such a comparison operator would mean that a == b would be true for objects that are not necessarily equal in terms of their values. I’ve talked more about that in the first part, as it is a rather complex question that touches the topics of regular types and more.

Let me just give another argument here: The proposal uses the CaseInsensitiveString as an example of a type that has a weak equality. This is the standard example and, quite frankly, the only one I can come up with. You don’t really need weak orderings and equalities for your type as the default comparison.

So I give this guideline for choosing the return type of operator<=>:

Guideline: If your type should have full ordering, return std::strong_ordering from operator<=>. Otherwise, if your type should only have equality, return std::strong_equality. Otherwise, don’t overload operator<=>.

Does this mean that the other category types are useless and there is no way to have a case insensitive string comparison?

No, of course not. It just shouldn’t be used as an operator<=>! Instead you should implement a std::weak_ordering case_insensitive_compare(const std::string& lhs, const std::string& rhs) function, maybe coupled with a compare function for the other Unicode equivalences you can have. This is a superior approach, in my opinion.

Guideline: If you need one of the other ordering types, implement them in a named function, not operator<=>.

More on using such functions in algorithms in the next and final part of the series.

Implementing Ordering Relations in C++20

Thanks to the compiler magic, you only need to overload operator<=> and get the other ones for free.

In the previous post I’ve used a pair types as an example of a total ordering and we needed to implement operator== and operator< by chaining the member comparisons, and then doing the mindless implementation of the other operators in terms of those two. But now we just need an operator<=> that does a member chaining:

template <typename T, typename U>
struct pair
{
    T first;
    U second;

    // it's a total order with true equality, so std::strong_ordering
    std::strong_ordering operator<=>(const pair& other) const
    {
        if (auto first_comp = first <=> other.first;
            first_comp != 0)
            // sort by first member if they're not equal
            return first_comp;
        else
            // sort by second member
            return second <=> other.second; 
    }
};

Note how I’ve used C++17’s if with initializer there.

Yes, you’ve noticed it correctly: that is a member function. There is no need to make it a free function, the compiler will automatically do the right thing.

However, there are a couple of problems with this implementation:

1. What happens if T or U don’t support <=> but only the “older” operators?

Sadly the compiler will not synthesize an <=> based of == and <, only the other way around.

But there is a helper function std::compare_3way() which does exactly that. A possible implementation looks like this:

// types that only have an `operator==`
struct equal_only {};

template <typename T, typename U>
constexpr auto compare_3way_impl(equal_only, const T& lhs, const U& rhs)
-> decltype(lhs == rhs, std::strong_equality::equal)
{
    if (lhs == rhs)    
        return std::strong_equality::equal;
    else
        return std::strong_equality::nonequal;
}

// types that have an `operator==` and `operator<`
struct equal_and_less : equal_only {};

template <typename T, typename U>
constexpr auto compare_3way_impl(equal_and_less, const T& lhs, const U& rhs)
-> decltype(lhs == rhs, lhs < rhs, std::strong_ordering::equal)
{
    if (lhs == rhs)    
        return std::strong_ordering::equal;
    else if (lhs < rhs)
        return std::strong_ordering::less;
    else
        return std::strong_ordering::greater;
}

// types that have an `operator<=>`
struct spaceship : equal_and_less {};

template <typename T, typename U>
constexpr auto compare_3way_impl(spaceship, const T& lhs, const U& rhs)
-> decltype(lhs <=> rhs)
{
    return lhs <=> rhs;
}

// the generic function dispatching to the others
template <typename T, typename U>
constexpr auto compare_3way(const T& lhs, const U& rhs)
{
    return compare_3way_impl(spaceship{}, lhs, rhs);
}

Each of the _impl function is disabled using SFINAE if the type does not have the necessary operators. The inheritance hierarchy is used to try them in a given order, each base class requires one less conversion, so they are a worse and worse match.

Read my SFINAE, tag dispatching and SFINAE + tag dispatching blog post for more information.

Note that the implementation in terms of the “normal” comparison operators will always deduce a std::strong_ordering, and never one of the other types. This is following my guideline that the overloaded comparison operators should always implement a total order with true equality.

Also note that the implementation of operator== and operator< need to match, otherwise the results are inconsistent. This is another guideline I gave in part three.

So our operator<=> should look like this:

std::strong_ordering operator<=>(const pair& other) const
{
    if (auto first_comp = std::compare_3way(first, other.first);
        first_comp != 0)
        // sort by first member if they're not equal
        return first_comp;
    else
        // sort by second member
        return std::compare_3way(second, other.second); 
}

All generic code has to use (std::)compare_3way() instead of using <=> directly, which is unfortunate.

2. What happens if T or U don’t have a std::strong_ordering?

The standard library also provides a helper for that: a type trait std::common_comparison_category, which will calculate the correct category based on the categories for T and U. This can then be returned.

And while the standard library certainly has to care about such types, I will not do that in my code. Just follow my guideline and only return std::strong_ordering from operator<=>, never another ordering type.

3. What happens if T or U only have a std::strong_equality?

Ah, but I do have to care about that as this is following my own guideline. We certainly want to have pair<int, std::complex<double>> comparison: it’s just not an ordering, but only equality.

And because I don’t want to have an operator<=> returning something other than std::strong_ordering or std::strong_equality, I can’t use std::common_comparison_category directly.

Instead I have to define my own helper:

template <typename ... CompCategories>
struct common_strong_comparison_category
{
    using type = std::conditional_t<(std::is_same_v<CompCategories, std::strong_equality> || ...), std::strong_equality, std::strong_ordering>;
};

If any of the categories is std::strong_equality, the ordering is only equality. Otherwise, the ordering is std::strong_ordering. (We assume that the categories are either one of those)

This means the final std::pair operator<=> looks like this:

auto ordering operator<=>(const pair& other) const
-> common_strong_comparison_category_t<decltype(std::compare_3way(first, other.first)), decltype(std::compare_3way(second, other.second))>
{
    if (auto first_comp = std::compare_3way(first, other.first);
        first_comp != 0)
        // sort by first member if they're not equal
        return first_comp;
    else
        // sort by second member
        return std::compare_3way(second, other.second); 
}

Note that we only needed to change the return type! Thanks to the logic and conversion of the comparison categories, everything else works out fine. This is the true power of returning proper types and not just ints.

Default Ordering and Equality

This is all good, but I haven’t told you the best part: You could simply do this:

auto operator<=>(const pair& other) = default;

The compiler will then generate an implementation that does the member-wise comparison chaining, and deduce the proper return type automatically.

There’s a catch, however: As before, a <=> b will not try and use == or < the way std::compare_3way() does. This is also the case here.

So you can only default it if all members have an operator<=> overload. But as built-in types have one and there is a proposal for standard library types, most types in the future will get one. This is another unfortunate consequence that the generic spelling of “three-way comparison” is std::compare_3way() and not operator<=>.

Note that the = default implementation would also deduce a weak ordering, for example. Preventing that is left as an exercise for the reader.

Hint: You can put a type instead of auto and if you put void the = default means = delete.

But otherwise this is the ordering you want most of the time, but don’t just blindly put it for all your types! You should still only provide an ordering or equality if it is actually sensible, see the previous parts.

Custom Ordering and Equality

In cases where you can’t use the default ordering, you have to implement it manually as shown. For reference, this is the ordering for std::optional, the same example I’ve used before:

auto operator<=>(const optional& other) const
-> decltype(std::compare_3way(value(), other.value())) // again, should really constrain that
{
    if (!*this && !other)
        // both empty
        // ::equal implicitly converts to std::strong_equality::equal as well
        return std::strong_ordering::equal;
    else if (!*this)
        // empty optional less than non-empty
        // ::less converts to std::strong_equality::unequal
        return std::strong_ordering::less;
    else if (!other)
        // non-empty optional greater than empty
        // ::greater converts to std::strong_equality::unequal
        return std::strong_ordering::greater;
    else
        // forward to value
        return std::compare_3way(value(), other.value());
}

Notice the power of those implicit conversions! It will always do the right thing, it doesn’t matter whether it implements an equality comparison or an ordering.

It would also do the right thing for std::weak_ordering and std::weak_equality, but …

And as before, implementing a named comparison predicate that maybe does some weaker comparison, is the same in principal: You write a function with the appropriate category as return type and use the members to implement your comparison. The algorithm std::lexicographical_compare_3way()) can be used to compare arrays using operator<=>. But be careful that you’ve actually implemented a proper ordering.

Implementing Ordering Relations in the C++20 Standard Library

I’ve mentioned multiple times that the operator<=> should really only return std::strong_ordering or std::strong_equality. This is consistent with the behavior of operator== and operator< as determined by std::compare_3way().

But is it also consistent with the behavior of all operator<=> that are proposed for the standard library! Ignoring the types that wrap the comparison of other types (like std::pair or std::vector), they all either provide a std::strong_ordering or a std::strong_equality.

The comparison concepts like EqualityComparable or LessThanComparable can work with either operator==/operator< or a suitable operator<=>. They require only a weak ordering or equality. More on that in the final part.

Conclusion

With the introduction of operator<=> both the design and implementation of ordering and equivalence relations is simplified. There is now a good way to describe the kind of ordering/equivalence your type supports and often the implementation is just = default. Just remember to only use std::strong_ordering and std::strong_equality as the comparison category for operator<=>, other orderings should be implemented in a named function.

Generic code needs to be careful using operator<=> directly. It should either continue using < and == or std::compare_3way() if a three-way comparison is required.

For more information, check out:

The next and final part of this series will take a look at algorithms that require orderings, like finding maximums or searching.

This post was made possible by my Patreon supporters. If you'd like to support me as well, please head over to my Patreon and do so! One dollar per month can make all the difference.