Write explicit constructors - but what about assignment?

Implicit conversions considered harmful.

Okay, this might be a little harsh:

Potentially dangerous and/or expensive implicit conversions considered harmful.

Better.

Implicit conversions will happen “accidentally” by their very nature, so if they happen, they should always do the right thing.

And how to prevent implicit conversions? Simple: use an explicit constructor.

But that’s only half of the problem: What about assignment? Is there explicit assignment? If so, when do I use it?

The Rules of explicit

First, let’s talk about explicit constructors in more detail.

You’ll probably know that if you mark a single-argument constructor as explicit, it cannot be used in implicit conversions:

struct foo
{
  // explicit construction from int
  explicit foo(int i);

  // implicit construction from const char*
  foo(const char* p);
};

void take_foo(foo f);



take_foo(0);         // error: no implicit construction
take_foo(foo(0));    // okay
take_foo("hi");      // okay, implicit allowed
take_foo(foo("hi")); // allowed as well

As with most defaults, this default is wrong. Constructors should be explicit by default and have an implicit keyword for the opposite. But that’s a different story.

What you might not know is that you can mark any constructor as explicit, for any number of parameters:

struct foo
{
  explicit foo(int a, int b); // okay
  
  template <typename ... Args>
  explicit foo(Args... args); // okay

  explicit foo(); // okay

  explicit foo(const foo& other); // okay, but you really shouldn't do that
};

Obviously, those constructors can’t be used for implicit conversions, so explicit must mean something else as well. And it does: an explicit constructor cannot be used for copy initialization.

Now what is copy initialization?

I won’t even try to explain the umptillion ways of initialization in C++, so what follows is just a simplified excerpt of copy initialization. Copy initialization happens when initializing variables with = (as in T a = b) but it is also used for function calls, return statements, and throw and catch (but the last two don’t really matter for explicit - except when they do). All those things must not call an explicit constructor.

This allows a generalized rule of explicit: If a constructor is marked explicit, the type must be mentioned in order to use that constructor. An explicit constructor cannot be used in a context where a type is not explicitly mentioned “nearby”:

struct foo
{
    explicit foo(int) {}
};

foo a(0); // type nearby
foo b{0}; // type nearby
foo c = foo(0); // type nearby

foo d = 0; // type not nearby enough
foo e = {0}; // type not nearby enough

foo function()
{
    return 0; // type far away
}

This also applies in reverse for an explicit conversion operator, but I’m not going to talk about them here.
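To make the "in reverse" remark a little more concrete, here is a minimal sketch of an explicit conversion operator. The type handle and its member fd are made up for illustration. The idea is the same as for constructors: the target type has to be mentioned, with conditions such as if being the one exception the language grants to bool.

```cpp
#include <type_traits>

// Hypothetical type, only for illustration.
struct handle
{
    int fd = -1;

    // explicit conversion to bool: the target type must be mentioned
    explicit operator bool() const { return fd >= 0; }
};

// bool ok = h;   // error: copy initialization needs an implicit conversion
static_assert(!std::is_convertible<handle, bool>::value,
              "no implicit conversion to bool");

// bool ok(h); and static_cast<bool>(h) are fine: the type is nearby
static_assert(std::is_constructible<bool, handle>::value,
              "explicit conversion to bool works");

inline bool is_open(const handle& h)
{
    // a condition is a contextual conversion to bool,
    // which is allowed to use the explicit operator
    if (h)
        return true;
    return false;
}
```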

When to use an explicit constructor?

Based on the generalization above, the answer is surprisingly simple: Use an explicit constructor whenever you want users to write the name of the type when creating an object of that type.

And in particular for single-argument constructors: Mark a single-argument constructor as explicit unless it has no preconditions and no high runtime overhead, or an implicit construction seems desirable for some other reason (the last one is for experts only).

The second rule is important to prevent implicit conversions, but the first one is also useful to prevent “multiple argument implicit conversions”.

For example, you might have a rational class with the following constructor:

rational(int num, int den);

You might want to mark it as explicit if you feel that a call like foo({1, 2}) shouldn’t be allowed when the parameter of foo is a rational.

However, I haven’t seen anyone use explicit for a constructor that always needs more than one argument, so there isn’t really enough data about its usefulness.
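As a rough sketch of what that looks like in practice, assume a bare-bones rational with public members and a hypothetical function to_double() that takes one by value (neither is from a real library):

```cpp
struct rational
{
    // explicit even though it always needs two arguments
    explicit rational(int n, int d) : num(n), den(d) {}

    int num;
    int den;
};

// hypothetical function taking a rational by value
inline double to_double(rational r)
{
    return static_cast<double>(r.num) / r.den;
}

// to_double({1, 2});      // error: would need the explicit constructor
// rational half = {1, 2}; // error, same reason

inline double half()
{
    return to_double(rational{1, 2}); // okay: the type is mentioned
}
```

Without the explicit, both commented-out lines would compile and silently create a rational from a pair of integers.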

But note that you run into issues if you have a constructor with default parameters:

foo(int i, float f = 3.14);

As that constructor can be used for implicit conversions, you’d want it explicit. But marking this as explicit also applies to the two argument case, so you prevent return {0, 1};, for example. This is probably not desired.
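One possible workaround, which is an assumption of this sketch and not something the rules above prescribe, is to replace the default argument with two overloads. Only the single-argument overload is then marked explicit. The member names i_ and f_ are made up.

```cpp
#include <type_traits>

struct foo
{
    // single-argument case: explicit, delegates with the former default value
    explicit foo(int i) : foo(i, 3.14f) {}

    // two-argument case: stays implicit
    foo(int i, float f) : i_(i), f_(f) {}

    int   i_;
    float f_;
};

// foo a = 0; // error: the single-argument constructor is explicit
static_assert(!std::is_convertible<int, foo>::value,
              "no implicit conversion from int");

inline foo make_foo()
{
    return {0, 1.0f}; // okay: uses the non-explicit two-argument constructor
}
```

The price is a second constructor to maintain, so whether this is worth it depends on how much you care about return {0, 1};.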

Non-standard operator=

Okay, so let’s talk about operator=.

For copy/move assignment operators, there should be a symmetry between them and the copy/move constructor. In particular, given other_obj of type T, this

T obj(other_obj);

should be equivalent to

T obj; // assume default constructor here
obj = other_obj;

But what if other_obj has type U - should the behavior be equivalent then?

It depends on the constructor that is used to create a T given a U, i.e. whether or not that constructor is explicit.

Non-explicit constructor and operator=

If there is a non-explicit constructor taking a U, then there should be equivalent behavior. After all, you can even write:

T obj = other_obj;

So it would just be silly if plain obj = other_obj was not allowed.

And this is already guaranteed by the language without any additional work. The assignment will create a temporary T object using the implicit conversion and then invoke the move assignment operator.

Quiz: Is there a way to allow T obj = other_obj, but prevent obj = other_obj while keeping a copy and move assignment operator? Answer is at the end of the post if you’re curious.

The cost of that operation is an extra move assignment, which might have a non-zero cost, and - more importantly - a more efficient assignment implementation might be possible.
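The temporary plus move assignment can be made visible with a small sketch. The two global counters and the function assign_from_u() are hypothetical helpers that exist only to observe which functions get called.

```cpp
struct U { int value; };

// hypothetical counters so the calls can be observed
int conversions = 0;
int move_assignments = 0;

struct T
{
    T() = default;
    T(const T&) = default;
    T& operator=(const T&) = default;

    // non-explicit converting constructor
    T(const U& u) : value(u.value) { ++conversions; }

    T& operator=(T&& other) noexcept
    {
        value = other.value;
        ++move_assignments;
        return *this;
    }

    int value = 0;
};

inline void assign_from_u(T& obj, const U& other_obj)
{
    // there is no operator=(const U&):
    // a temporary T is created first, then it is move assigned
    obj = other_obj;
}
```

After one call to assign_from_u(), each counter has been incremented exactly once.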

Consider std::string, for example. Suppose it doesn’t have an operator= taking a const char* and just the implicit constructor. Then you write the following code:

std::string str = "abcde";
str = "12345";

Ignoring small string optimization, the first line invokes the implicit constructor, allocates memory for five characters and copies "abcde" into that memory. Then the second line wants to assign another string. As there is no directly applicable operator=, a temporary std::string is created using the implicit constructor. This will again allocate memory. Then the move assignment operator is invoked, so str takes ownership of the newly allocated memory and frees its own.

But the second memory allocation was unnecessary! "12345" would fit into the already allocated buffer, so a more efficient assignment would simply copy the string. Luckily, std::string provides such a more efficient assignment - an operator= taking const char*!

If the same applies to your type, write an operator= that takes a U.
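Here is a minimal sketch of such an operator=, using a toy string class. simple_string and the allocations counter are invented for this example and are far simpler than a real std::string; the only point is that the dedicated assignment reuses the existing buffer when it is large enough.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>

// hypothetical counter to observe memory allocations
int allocations = 0;

class simple_string
{
public:
    // implicit construction from const char*
    simple_string(const char* s)
    : size_(std::strlen(s)), capacity_(size_), data_(allocate(capacity_))
    {
        std::memcpy(data_.get(), s, size_ + 1);
    }

    // dedicated assignment: only allocates if the buffer is too small
    simple_string& operator=(const char* s)
    {
        const std::size_t n = std::strlen(s);
        if (n > capacity_)
        {
            data_     = allocate(n);
            capacity_ = n;
        }
        // memmove, in case s points into our own buffer
        std::memmove(data_.get(), s, n + 1);
        size_ = n;
        return *this;
    }

    const char* c_str() const { return data_.get(); }

private:
    static std::unique_ptr<char[]> allocate(std::size_t n)
    {
        ++allocations;
        return std::unique_ptr<char[]>(new char[n + 1]); // +1 for '\0'
    }

    std::size_t             size_;
    std::size_t             capacity_;
    std::unique_ptr<char[]> data_;
};
```

With this class, the two lines from the example above allocate once instead of twice.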

explicit constructor and operator=

So let’s suppose the constructor taking U is explicit. Should you allow assignment?

The answer is no.

If you write an assignment operator taking U, you’ll allow obj = other_obj. But T obj = other_obj is illegal! The = there has nothing to do with assignment, just with C++ having too many weird forms of initialization. This inconsistency is - well - inconsistent, so it should not happen.

How do you assign a U object to T then? You follow the rules of explicit and mention the type: obj = T(other_obj).

However, that has the same problem as the implicit constructor. The code is just more … explicit. You still have to pay for the temporary + move and can’t use a more efficient assignment implementation.

It would be nice if explicit assignment were supported directly. An explicit assignment operator would be called when writing obj = T(other_obj) - instead of a constructor - but not when writing obj = other_obj, so we could have a more efficient assignment while still being explicit. But that feature isn’t there.

So if overloading operator= leads to inconsistency and not overloading it to overhead: What should you do?

Well, there are multiple ways to implement assignment - you don’t need an operator=: Write a member function assign that takes a U and assign using obj.assign(other_obj). This is ugly, but it is the best solution available.
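A minimal sketch of that combination, with placeholder types T and U: the constructor is explicit, there is no operator= taking a U, and assign() does the direct assignment.

```cpp
#include <type_traits>

struct U { int value; };

class T
{
public:
    T() = default;

    // explicit: T obj = other_obj; does not compile
    explicit T(const U& u) : value_(u.value) {}

    // direct assignment from a U without a temporary T
    void assign(const U& u) { value_ = u.value; }

    int value() const { return value_; }

private:
    int value_ = 0;
};

// consistent: neither T obj = u; nor obj = u; is allowed
static_assert(!std::is_convertible<U, T>::value, "no implicit conversion");
static_assert(!std::is_assignable<T&, U>::value, "no assignment from U");
```

Callers can then choose between obj = T(other_obj), which pays for the temporary, and obj.assign(other_obj), which does not.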

Multi-argument constructor and operator=

What about multi-argument constructors and operator=? Well, obviously there is no syntax for a multi-argument assignment: operator= only takes a single argument on the right-hand side.

But there is no restriction on multi-argument function calls, so you could write an assign() function that takes more than one argument. Should you though?

It again depends on the cost of the temporary plus move assignment alternative. If assign() could do it cheaper, implement it. Again, std::string provides assign() functions matching the constructors for that very reason.
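For illustration, here are two of the std::string::assign() overloads next to the constructors they mirror. The wrapper functions refill() and take_prefix() are made up for this sketch; whether the existing buffer is actually reused depends on the implementation and the current capacity.

```cpp
#include <cstddef>
#include <string>

// Each assign() overload used here mirrors a std::string constructor:
//   std::string(count, ch)   <->  assign(count, ch)
//   std::string(ptr, count)  <->  assign(ptr, count)

inline void refill(std::string& s)
{
    s.assign(5, 'a'); // same result as s = std::string(5, 'a');
}

inline void take_prefix(std::string& s, const char* text, std::size_t count)
{
    s.assign(text, count); // same result as s = std::string(text, count);
}
```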

Conclusion

To summarize:

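Should I mark this constructor as explicit? Yes, whenever users should have to write the name of the type to use it. For single-argument constructors that is the default choice, unless the constructor has no preconditions and no high runtime overhead, or an implicit construction is desirable for some other reason.

Should I write a T::operator= taking a U? Only if the constructor taking a U is not explicit and a dedicated assignment can be more efficient than creating a temporary and move assigning it. If the constructor is explicit, don’t.

Should I write an assign() member function taking Args...? Yes, if it can be cheaper than constructing a temporary from Args... and move assigning it. This covers both explicit single-argument constructors and multi-argument constructors.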

Answer to Quiz: Is there a way to allow T obj = other_obj, but prevent obj = other_obj while keeping a copy and move assignment operator?

Yes.

We need to provide a non-explicit constructor to allow T obj = other_obj, but can prevent assignment by simply defining a deleted assignment operator:

T& operator=(const U&) = delete;

The compiler will prefer this assignment operator over the constructor in obj = other_obj and issue an error because that operator is deleted. Read more about deleting arbitrary functions if you are interested in real use cases of = delete besides preventing copy/move assignment/construction.
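Put together as a small sketch, with placeholder types T and U and the standard type traits used to check the behavior at compile time:

```cpp
#include <type_traits>

struct U {};

struct T
{
    T() = default;

    // non-explicit: allows T obj = other_obj;
    T(const U&) {}

    // deleted: rejects obj = other_obj;
    T& operator=(const U&) = delete;
};

// T obj = other_obj; compiles
static_assert(std::is_convertible<U, T>::value, "implicit construction allowed");

// obj = other_obj; does not compile, the deleted overload is the best match
static_assert(!std::is_assignable<T&, U>::value, "assignment from U rejected");

// copy and move assignment are still generated and usable
static_assert(std::is_assignable<T&, const T&>::value, "copy assignment kept");
static_assert(std::is_assignable<T&, T>::value, "move assignment kept");
```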

This blog post was written for my old blog design and ported over. If there are any issues, please let me know.