At C++Now 2018 I gave a talk about rethinking pointers: foonathan.net/cppnow2018.html.
I highly recommend you check it out, even if you watched the similar talk I gave at ACCU, as that version is a lot better. It rediscovers and discusses the common guidelines about when to use references over pointers, when smart pointers, etc.
If you’re an expert, you might get a deeper meaning from the structured analysis. And if you’re a beginner, you get the condensed guidelines.
However, I think the most valuable thing is the taxonomy of pointer types.
It gives new vocabulary when talking about std::optional<T&>, which gives the whole discussion an obvious answer.
And here is also the big problem: Naming is hard.
In particular, my naming of the taxonomy in the talk is bad, so let’s introduce new names.
What do I mean by “taxonomy of pointer types”?
There are a lot of types you can use to refer to other objects:
It would be tedious to talk about every single possible implementation when giving guidelines.
It would also be unnecessary! A lot of the types are very similar.
So in the talk I looked at the properties they have.
Core Property Ⅰ: Object Access Syntax
The object access syntax answers the really obvious question: Given some pointer, how do I get the object it points to, i.e. the pointee?
There are a couple of options:
- Dot Access: Think T&. You can just write ref.member. No need to jump through additional hoops.
- Arrow Access: Think T*. You have to explicitly dereference them, so ptr->member.
- (Member) Function Access: You have to call some (member) function to get the pointee, so do_sth_with_pointee(ptr.get()), for example.
- Cast Access: You have to do something like static_cast<T&>(ptr) to get the pointee.
For the purposes of the guidelines it doesn’t actually matter which exact syntax is required to get the pointee. All that matters is whether or not you need any special syntax to get the pointee.
So for the object access syntax the real question is between implicit access syntax (think T&) and explicit access syntax (think T*).
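As a rough sketch of the access options listed above (the widget type and the function names are made up for illustration, and std::reference_wrapper is just one standard type that happens to offer both a get() function and a conversion to T&):

```cpp
#include <cassert>
#include <functional>

struct widget { int value = 0; };

// Implicit access: a reference is used exactly like the object itself.
int read_via_reference(widget& ref) { return ref.value; }

// Explicit access, arrow flavour: a raw pointer has to be dereferenced.
int read_via_pointer(widget* ptr) { return ptr->value; }

// Explicit access, member function flavour: call get() to reach the pointee.
int read_via_get(std::reference_wrapper<widget> ref) { return ref.get().value; }

// Explicit access, cast flavour: convert to T& before touching members.
int read_via_cast(std::reference_wrapper<widget> ref)
{
    return static_cast<widget&>(ref).value;
}
```

Only the first function gets away without any extra syntax; the other three all fall on the explicit side of the distinction.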
Core Property Ⅱ: Pointer Creation Syntax
This is the inverse of the object access syntax: Given an object, how do I get a pointer to that object?
The list of options is similar:
- Implicit Creation: Think T&. You can just write T& ref = obj, no need to do anything.
- Address-of Creation: Think T*. You have to explicitly create a pointer using &obj.
- Function Creation: You have to call some function to get a pointer, like
- Function and Address-of Creation: You have to call some function passing it a pointer, like
And again, the exact syntax doesn’t really matter. What matters is whether you need any syntax.
So again the distinction is implicit creation syntax (think T&) and explicit creation syntax (think T*).
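A minimal sketch of the implicit vs. explicit creation distinction. The widget type and creation_demo are hypothetical, and std::ref is only my pick for a "call some function" example, not necessarily the one used in the talk:

```cpp
#include <cassert>
#include <functional>

struct widget { int value = 0; };

int creation_demo()
{
    widget obj;
    obj.value = 1;

    // Implicit creation: nothing at the point of creation marks the indirection.
    widget& ref = obj;

    // Explicit creation, address-of flavour.
    widget* ptr = &obj;

    // Explicit creation, function flavour: std::ref(obj) builds a
    // std::reference_wrapper<widget> from the object.
    std::reference_wrapper<widget> wrapped = std::ref(obj);

    // All three refer to the same object, so every increment lands in obj.
    ref.value += 1;
    ptr->value += 1;
    wrapped.get().value += 1;
    return obj.value;
}
```

Whether the explicit form is spelled with & or with a function call makes no difference for the guidelines; what counts is that the reader of the call site can see a pointer being created.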
The Core Concepts
So we’ve got two important properties each with two values. That means we have four possible combinations:
- implicit creation syntax and implicit access syntax: T&
- implicit creation syntax and explicit access syntax: boost::optional<T&>
- explicit creation syntax and implicit access syntax: no built-in type
- explicit creation syntax and explicit access syntax: T*
I’ll get back to case two; it is really a special version of core concept one.
And due to a lack of operator dot, you can’t really write a user-defined type with implicit access syntax.
The closest you can get is std::reference_wrapper, and this requires a cast for accessing members, which is annoying.
You have to write a forwarding function for every member function, which makes it impossible to do generically.
Although in the talk I show one situation where this is possible.
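To make the operator dot problem concrete, here is a small sketch. string_alias and size_via_wrapper are hypothetical names invented for this example; only std::reference_wrapper and std::string come from the standard library:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical wrapper that fakes implicit access for std::string
// by forwarding members by hand.
class string_alias
{
public:
    string_alias(std::string& target) : target_(&target) {}

    // One forwarding function per member function of the pointee.
    // This has to be repeated for every member and every pointee type.
    std::size_t size() const { return target_->size(); }
    void push_back(char c) { target_->push_back(c); }

private:
    std::string* target_;
};

std::size_t size_via_wrapper(std::reference_wrapper<std::string> ref)
{
    // ref.size() would not compile: std::reference_wrapper has no such
    // member, so the pointee has to be reached explicitly first.
    return ref.get().size();
}
```

The forwarding approach works for a single known type, but a generic alias<T> would need to know the members of T in advance, which is exactly what the language does not let you express.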
And as there is no built-in type with explicit creation and implicit access, there is no real generic type for case three. So I didn’t bother providing a name for this concept.
That leaves case one and four.
A type with implicit creation syntax and implicit access syntax is what I called an Alias in the talk.
And I think that name is reasonable: T&, for example, behaves as if it were a T.
The problem is the name for case four.
I called a type with explicit creation syntax and explicit access syntax … a Reference.
Yes, this means that T* is a Reference and T& is not, which is unfortunate.
To be fair, there are some arguments for picking that name:
- You have to “dereference” a Reference before accessing it, but don’t need to “dereference” an Alias.
- Other languages like Rust have reference types that behave like a T*, so they model Reference.
- My type_safe library has object_ref<T> that behaves like a T*, so it models Reference.
- I couldn’t use Pointer because I used the term “pointer” to refer to any type that can point to other objects, i.e. both Reference and Alias.
So I do think that in a perfect world, a T& would be called an Alias, not a Reference, as that naming is more natural.
Sadly, C++ set a different precedent, so I’m adapting my names now.
Better Names for the Core Concepts
There are names that are pretty obvious in hindsight that work a lot better:
A type with implicit creation syntax and implicit access syntax, so something similar to a T&, is a reference-like type.
A type with explicit creation syntax and explicit access syntax, so something similar to a T*, is a pointer-like type.
The only downside of this naming scheme is that it might mean additional properties are tied to the concept as well.
For example, a pointer can be nullptr, but there are non-null pointer-like types (like my type_safe::object_ref<T>).
Or you can do pointer arithmetic on a pointer, but may not be able to do it on a pointer-like type.
However, this is a relatively small downside.
Note that in my talk I used “pointer-like type” to mean any type that can point to something else (and used “pointer” as a short-hand for a type that can point to another object). So as an alternative for that meaning I propose zeiger, which is just the German word for “pointer”. A zeiger is any type that can point to a different location, so a reference-like type is a zeiger but a pointer-like type is as well.
I’m still open to alternatives, but I don’t think you need the term that often.
This means in my talk I had this Venn diagram:
But instead I now propose this one:
Technically, those are Euler diagrams, not Venn diagrams.
And yes, I drew those using paint, sorry.
Reference-like type vs pointer-like type is the most important distinction you need to make when picking a zeiger.
However, there are still huge differences between types that fall into the same category.
For example, a const T& is different from a T&, and a T* has one more value than a non-null pointer-like type.
Those are the secondary properties:
- Mutability: Can I read the pointee? Can I write to the pointee? Or can I do both?
- Nullability: Does the zeiger have a special null value?
- Ownership: If the zeiger gets destroyed does it destroy the pointee as well?
Based on those properties we can talk about, for example, a nullable read-write pointer-like type or a non-null read-only reference-like type. If we don’t mention one of the secondary properties we don’t impose any requirements there. So in the example above it doesn’t matter whether the zeiger is owning or non-owning.
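As a hedged illustration of how the adjectives attach to familiar types, here is a small sketch. The widget type and the function name are invented, and the labels in the comments are my reading of the taxonomy rather than an official classification:

```cpp
#include <cassert>
#include <memory>

struct widget { int value = 0; };

int secondary_properties_demo()
{
    widget obj;
    obj.value = 7;

    // Non-null, read-only, non-owning reference-like type.
    const widget& read_only = obj;

    // Nullable, read-write, non-owning pointer-like type.
    widget* nullable = nullptr;
    assert(nullable == nullptr); // the extra null value is a legal state
    nullable = &obj;
    nullable->value = 8;         // writes through to obj

    // Nullable, read-write, owning pointer-like type:
    // the widget is destroyed together with the std::unique_ptr.
    std::unique_ptr<widget> owner = std::make_unique<widget>();
    owner->value = 9;

    return read_only.value + owner->value;
}
```

Note that the noun stays the same when an adjective changes: swapping widget* for std::unique_ptr<widget> changes ownership, but both still need explicit access syntax.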
Note that for implementation reasons a nullable reference-like type cannot have implicit access syntax.
So case two from above — implicit creation and explicit access — is a nullable reference-like type.
boost::optional<T&> has these exact semantics, for example.
The core properties define the “nouns” while the secondary properties define additional “adjectives”.
And I repeat it again: the nouns are way more important than the adjectives.
If you want a non-null pointer-like type (so something like gsl::not_null<T*> or my type_safe::object_ref<T>) but don’t have access to those types, don’t use a T&!
While it is non-null, it is not a pointer-like type; it is a reference-like type.
And this difference is more important than the nullability difference.
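If neither library is available, a non-null pointer-like type is small enough to sketch by hand. The non_null_ptr class below is hypothetical and deliberately minimal; it is not the interface of gsl or type_safe, and it only checks for null with an assert:

```cpp
#include <cassert>

// Hypothetical minimal non-null pointer-like type.
template <typename T>
class non_null_ptr
{
public:
    // Explicit creation syntax: callers have to write non_null_ptr<T>(&obj).
    explicit non_null_ptr(T* ptr) : ptr_(ptr) { assert(ptr_ != nullptr); }

    // Explicit access syntax, just like T*.
    T& operator*() const { return *ptr_; }
    T* operator->() const { return ptr_; }

private:
    T* ptr_;
};

struct counter { int hits = 0; };

// Takes a pointer-like parameter, so the call site has to spell out
// that it hands over the address of an object.
void bump(non_null_ptr<counter> target) { ++target->hits; }
```

A call then reads bump(non_null_ptr<counter>(&c)), which keeps both the creation and the access explicit. A counter& parameter would hide both, even though it is equally non-null.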
Guidelines for Choosing the Correct Zeiger
Now that we have a vocabulary to talk about zeiger types, we can look at the situations requiring them, and analyse which noun is required and which adjectives. Then we can just pick any type that has those properties.
However, this is exactly what I did in the talk, so I won’t repeat it all here. I encourage you to watch the video or just look at the slides.
Just keep in mind that I used different concept names there:
- “pointer-like type” → “zeiger”
- Alias → “reference-like type”
- Reference → “pointer-like type”