foonathan::blog()

Thoughts from a C++ library developer.

Proposals to Fix the Spaceship Operator

I did a series about comparisons recently where I gave some guidelines about using the upcoming spaceship operator for three-way comparison. In particular, I pointed out a couple of flaws with the design as it is currently.

Well, now the proposals for the next C++ standardization meeting are here — almost 300 of them. And I’ve counted eleven of them that deal with the spaceship operator.

So let’s take a look and them and see whether they’ll fix any of the issues I’ve pointed out.

Performance Impacts on Using <=> for Equality

The wonderfully named P1190 — “I did not order this!” — goes into more detail about the impact of using <=> if you just want equality. I did mention it briefly in the final part but the basic issue is this:

template <typename T>
auto operator<=>(const std::vector<T>& lhs, const std::vector<T>& rhs)
{
    auto lhs_cur = lhs.begin();
    auto lhs_end = lhs.end();
    auto rhs_cur = rhs.begin();
    auto rhs_end = rhs.end();

    for (; lhs_cur != lhs_end && rhs_cur != rhs_end; ++lhs_cur, ++rhs_cur)
    {       
        // compare each member
        auto cmp = *lhs_cur <=> *rhs_cur;
        if (cmp != 0)
            // they aren't equal, so return that as the result
            return cmp;
        // otherwise continue
    }

    // at this point all members in the common prefix are equal
    if (lhs_cur != lhs_end)
        // lhs is bigger, so it's greater
        return std::strong_ordering::greater;
    else if (rhs_cur != rhs_end)
        // lhs is smaller, so it's less
        return std::strong_ordering::less;
    else
        // both are completely equal
        return std::strong_ordering::equal.
}

The above is a possible implementation of the spaceship operator for std::vector: It simply does a lexicographical three-way comparison, as std::lexicographical_compare_3way would.

With that definition you can do vec_a < vec_b and the compiler rewrites it to vec_a <=> vec_b < 0.

Ignoring the fact that std::vector has an operator<

But you can also do vec_a == vec_b and the compiler rewrites it to vec_a <=> vec_b == 0. And this is not ideal!

If you just want to compare the containers for equality, you check the sizes first, not at the end: If the two containers have different sizes they can’t be equal, so there’s no need for the loop.

This means that writing operator<=> for containers isn’t enough, you also need operator== for performance reasons. And as vec_a != vec_b would defer to vec_a <=> vec_b != 0, you also need operator!=. So you still need three operators, not just one — which is better, but still not ideal.

The proposal points out a couple of solutions, but doesn’t suggest one explicitly.

Fixing the Performance Impact

This is where P1185 comes in. It proposes a good solution to the problem that comes in three parts:

  1. Change the lookup of a == b and a != b: a == b will only search for an operator== overload, not operator<=>. But it will still do that symmetrical, so you only need bool operator==(const std::string& lhs, const char* rhs), not an additional version with the types reversed. Likewise, a != b will try !(a == b) or !(b == a) and not a <=> b != 0. This allows writing operator<=> and operator== for maximum efficiency.

  2. Generate operator== when generating operator<=>: The above fix has an unfortunate consequence, however. When you just do auto operator<=>(const T& other) const = default, you will only get ordering, not equality. So the paper has an optional proposal that a defaulted spaceship operator will also generate a defaulted operator==, to have the full ordering and equality with just one default declaration again.

  3. Fix the defaulted comparison operator implementations: A defaulted operator== doesn’t help us if it just dispatched to operator<=> again! While the defaulted operator<=> will do a lexicographical comparisons of all members using <=>, the defaulted operator== will compare all members with == and return that result chained with &&. That way, it can actually pick up the more efficient of operator== of container types!

With this proposal the author of a container type would need to do two things:

  1. Write a lexicographical operator<=>.
  2. Write an optimized operator==.

Then all comparison operators work and are as fast as possible.

And the author of a simple class can just default the spaceship operator as before and will get the faster equality operators automagically!

The Generic Spelling of <=> Isn’t <=>

Look at the operator<=> implementation of std::vector<T> given above again: It does a lexicographical comparison of each member by calling their <=>.

As I’ve mentioned before: that is wrong.

If you do a <=> b it will not compile if the type doesn’t have an operator<=> but only operator== and operator<. And as of right now, no type has an operator<=>!

So in generic code you can’t use <=> directly, you have to try it and fallback to using operator== and operator< for a three-way comparison. At least there is std::compare_3way() that will do it for you.

But it is really unfortunate that the generic spelling of <=> is std::compare_3way().

P1186 agrees and proposes that a <=> b should automatically do the fallback to operator== and operator<. That way you can just always use <=> and everything is fine.

As then the name std::compare_3way is available again, it proposes that it should instead become a function object: Where std::less does a < comparison, std::compare_3way would do a <=> comparison.

In part 5 of my comparison series I implemented it as well, just called it default_ordering.

A Default Ordering

P0891 would like to take a similar name for something else, however.

There are types that cannot provide a sound ordering, like std::complex. It just doesn’t make sense that they have an operator< as the ordering wouldn’t be compatible with the mathematical properties.

Read my comparison series if you want to know why.

Yet it would be totally reasonable to use std::complex as a key in a map. For that you just need some ordering, not a sensible one.

And likewise using std::vector as a key in a map would also allow a more efficient ordering: First, order by length, then order each elements. As long as you have lots of containers with different lengths, the comparison is still fast. The resulting ordering is not very useful, but it doesn’t have to be — it just needs to be a valid one.

So std::map shouldn’t actually use operator< (or operator<=>) directly, it should use a different customization point.

This is what the paper proposes. The new customization point is called std::default_order() and it returns the default ordering of a type. It can be provided for types that don’t have an operator< but allows to use them inside containers anyway.

In part 5 of my comparison series I called it key_ordering.

If both of the previous proposals are accepted it would mean the following:

  • If you want to check something for equality in generic code, use a == b. It’ll be as fast as possible and not rewritten to three-way comparison.

  • If you want to do a three-way comparison, use a <=> b. There is no need for a manual fallback to a < b or a == b.

  • If you need to do a three-way comparison but as a function object, use std::compare_3way. It is just like std::less for operator< or std::plus for operator+.

  • If you need to have some ordering for a type, use std::default_order(). It implements an arbitrary ordering if you just need to sort and do a binary search.

Standard Library Types Don’t Have <=>

While the spaceship proposal added operator<=> to the built-in types like int, it didn’t add them to the standard library. With the current semantics of operator<=> this is bad as they cannot be used in a three-way comparison!

So P0790 proposes the addition of an operator<=> overload to all types that currently have operator< or operator==.

Remember that std::strong_ordering operator<=>(const T& other) const = default; will just allow equality.

If the automatic fallback is accepted, this addition might not be necessary.

What is still necessary is P1191, however. It proposes the addition of three-way comparison (and thus normal comparison) to a couple of types that don’t have any comparison at all currently. To be precise, it only proposes equality to types like filesystem::file_status or the very important and often used std::slice.

Before you get excited: std::slice isn’t what you think it is.

Other Library Improvements

To quote P1310, if you want to compare two strings, you have:

  • char_traits::eq (returns bool)
  • char_traits::eq_int_type (returns bool)
  • char_traits::lt (returns bool)
  • char_traits::compare (returns int)
  • basic_string::compare (returns int)
  • basic_string_view::compare (returns int)
  • sub_match::compare (returns int)
  • istreambuf_iterator::equal (returns bool)
  • filesystem::path::compare (returns int)
  • filesystem::equivalent (returns bool, provides the weak equality of whether two paths resolve to the same file)

That’s a bit of a mess with the different return types and what not.

So instead there should be a single unifying char_traits::cmp and deprecate some of the others in favor of that. Note that I do not agree to deprecate filesystem::equivalent in favor of std::weak_equality operator==! Read my comparison series or P1307 for more details.

This is actually incorrect, the proposal meant to specialize the std::weak_equal() algorithm instead, which follows that advice.

The current standard library has concepts like BinaryPredicate or Compare that work in terms of bool operator(). P1312 proposes that they also work with std::weak_equality operator() and std::weak_ordering operator(), respectively. This is a really important change as it allows you to follow my guideline about implementing weak orderings as named comparison functions like case_insensitive_compare(). Then you can just pass them to std::find_if() or std::sort() without wrapping them manually!

Note that it does not propose changing concepts like LessThanComparable to use operator<=> as a < b also works for types that have only <=>.

When I implemented some ordering algorithms, I wrote a trait ordering_category that returns the ordering category of two types. P1187 proposes it under the name compare_3way_type.

And finally, P0863 discusses fixes for a potential bug in std::partial_order(a, b). Quick recap from the maths behind orderings: In a partial order, two types can either be less/greater/equivalent or unordered. But std::partial_order() will never return std::partial_ordering::unordered!

Conclusion

Do quote me on this:

Without P1186 operator<=> is completely useless in generic code. And P1185 is essential for fast generic code. With concepts, generic code is supposed to be made easier and more approachable for beginners. We don’t need yet another pitfall.

While the other proposals are useful as well, those two are critical to really polish <=>. I sincerely hope they will make it into C++20.

This post was made possible by my Patreon supporters. If you'd like to support me as well, please head over to my Patreon and do so! One dollar per month can make all the difference.