(Awesome?) Allocator Additions - Thoughts regarding allocator proposals

04 Apr 2016 by Jonathan

The C++ Standards Committee Papers of the post-Jacksonville mailing were recently published. There are few quite interesting ones that deal with the STL’s allocator model: P0177R1 - Cleaning up allocator_traits, P0178R0 - Allocators and swap(actually from February) and P0310R0 - Splitting node and array allocation in allocators.

In this post, I’d like to discuss those with you and explain why I really hope some of them they will accepted. The first parts are also a follow-up to AllocatorAwareContainer: Introduction and pitfalls of propagate_on_container_XXX defaults.

Note that I am not involved in this proposals or the standardization process at all. I’m simply a guy that is working on a memory library himself and has designed a new allocator model in the process. As such my view is likely biased and I might drop a bit of self-advertisement in the process.

P0177R1 - Cleaning up allocator_traits

One of my earlier posts was AllocatorAwareContainer: Introduction and pitfalls of propagate_on_container_XXX defaults. In this post I’ve explained the C++11 AllocatorAwareContainer concept and looked at the propagate_on_container_XXX “boolean typedefs” you can set.

There are three of those:

propagate_on_container_copy_assignment: Only if this is std::true_type copy assigning an AllocatorAwareContainer will also copy-assign the allocator.
propagate_on_container_move_assignment: Same as the copy assignment typedef but for the move assignment. Note that there is a performance disadvantage if this is std::false_type (the default!). Because then move assigning cannot just change some pointers, it needs to allocate memory using its own allocator (if the allocators aren’t “equal”).
propagate_on_container_swap: Only if this is std::true_type swappping two AllocatorAwareContainer will also swap the allocator. Note that if this is std::false_type (the default), swapping two containers with “unequal” allocators will result in undefined behavior. Because then swap can’t do the pointer swap but needs to allocate the memory again using their own allocator. But swap can’t do that due to a rule forbidding iterator invalidation.

I highly recommend checking out the original post for more details, then come back and read on. Last link to it you’ll get!

The old post was kind of a rant regarding these design choices and especially the “wrong” defaults.

I mean, they can lead to performance pessimization and undefined behavior!

After it got popular back in October Alisdair Meredith contacted me. He is a large proponent of the STL Allocator model and also the author of this paper. In a long mail he explained why the defaults are what they are.

I intended to write a follow-up blog post but sort of … forgot. My apologies for that, please see this as the follow-up I never wrote!

Propagation is only important for stateful allocators and there are two different models of them. They are also explained in the proposal, so I just quote from there, starting with the second model.

A second model is that the allocator move with the allocated memory, so every move-assignment and swap should propagate the allocator to maintain a non-throwing wide contract, and not risk an allocation or undefined behavior when the allocators do not match, but without any guarantee that a given data structure will have a consistent allocation strategy using such allocators.

This is basically what I’ve said in the original blog-post in a more formal, non-ranty way. Allocators belong to the memory so they should always move - in a sense of travelling, not move assignment - with them. Why should you ever have allocators that don’t stay with their memory?!

Because of the first model, that’s way:

The first model, that drives the current default behavior that allocators do not propagate, is that all elements of a data structure should use the same allocator. For example, a container would pass its own allocator down to its elements, and they in turn would pass that allocator down to their bases and members. Once this invariant is established, we do not want to lose it by swapping with elements for elsewhere that use a different allocator, or change allocator when assigned-to from an external source.

STL Allocators can control construction and destruction of their allocated objects. With this functionality they can also control the allocators the objects in their memory use. This allows an allocator to pass itself down to the objects.

It is used by the Bloomberg allocator model and Boost.Interprocess, for example. In the latter case all allocations by the container’s value type should be in the same memory segment.

The allocator should also stay with its objects in this model. Otherwise we may get lifetime issues.

This is also the case for the polymorphic memory resource TS.

I might write another blog post about its design choices.

There the containers just have a pointer to their resource. When allocators are freely transferred between containers lifetime reasoning is more difficult. But if the allocators stay with one container, it is easy: the resource must life as long as the container object.

And that’s why the defaults are chosen the way they are.

If I haven’t quite convinced you that is because I’m not quite convinced either. foonathan/memory uses the second model with everything set to std::true_type and I’ve never run into any issues regarding lifetime. Just create the allocators inside main() or similar and you’re good. But on the other hand I haven’t yet implemented the model where you pass down the allocator so I wait with a final statement for that.

Okay, back to the paper itself. Got a little bit carried away there.

Note that in both models either all propagate_on_container_XXX is set either to std::true_type, i.e. full propagation, or std::false_type, i.e. no propagation. There is no model that uses propagate on swap but not on assignment or similar.

Supporting customizing all three makes implementing AllocatorAwareContainer classes just unnecessarily much harder. The paper gives an example and I gave one to motivate people using my allocator model here.

So the paper proposes that you must set all three to the same value. This makes implementation simpler and your own models easier to reason about. And since it is very unlikely that no one has actually implemented a sane model that requires those values to differ, this will be a non-breaking change.

It also proposes fixing some LWG issues in the process.

P0178R0 - Allocators and swap

P0178R0 addresses the issue of the undefined behavior introduced by swapping to unequal allocators.

The motivation is clear: undefined behavior is bad[citation needed]. It also makes generic code less generic because swap then sometimes has a narrow contract.

The solution is to keep the member swap as is (with the UB), but to modify the namespace version to look like this (taken from the paper):

void swap(CONTAINER_TYPE & left, CONTAINER_TYPE & right) {
   if (allocators are compatible) {
      left.swap(right);
   }
   else if (allocator propagation traits are sane) {
      std::swap<TYPE>(left, right);
   }
   else {
      CONTAINER_TYPE tempLeft {std::move(right), left.get_allocator() };
      CONTAINER_TYPE tempRight{std::move(left ), right.get_allocator()};
      swap(left,  tempLeft );
      swap(right, tempRight);
   }
}

My apologies for the awful placement of the braces, I’m just quoting.

“allocator are compatible” means that they compare equal, i.e. they can be used to deallocate memory allocated from the other, or propagate on swap. In this case the fast swap with the narrow contract is called (since the contract is fulfilled).

“allocator propagation traits are sane” means that the swap trait (or any, if the above proposal gets accepted) are the same. In this case, the ~~manual~~ ~~generic~~ more expensive swap with the temporary variable is used.

Note that as “last resort” a copy of the container is made via the move constructor and the other allocator. Then the allocators are swapped.

The last two cases were previously undefined, now they are just slower.

Yeah?

Also note that those cases also invalidate iterators.

But swap must not invalidate iterators!

Yes, swap must not invalidate iterators - “except when the allocator compare unequal” is what the proposal says. This is not a breaking change since previously the code was UB.

I think this proposal only deals with half of the problem. All swaps have now a wide contract but different post-conditions. Now totally generic code cannot rely on the fact that swap does not invalidate iterators.

This simply trades one undefined behavior with another.

I wouldn’t recommend to disable propagation on containers either. But as I’ve said, I’ll wait until I’ve made experience with the scoped allocators before making any “proper” guidelines.

P0310R0 - Splitting node and array allocation in allocators

On a less technical topic P0310R0 proposes a split between node and array allocations.

If you are anything like me, your initial reaction will most likely be “yes, please!”. Not necessarily on the split but on a different thing it proposes “by the way”.

The allocation function for STL allocators looks like this:

pointer allocate(size_type n, const void* hint = 0);

Ignore the hint, everybody does anyway.

This function shall allocate memory for n elements, i.e. calling std::allocator<int>::allocate(5) will allocate memory for 5 ints, i.e. 5 * sizeof(int) bytes of memory.

But this function actually must do two very different things!

For n = 1 it allocates memory for a single element. It is called node in this context, so it is a node allocation.
For n > 1 it allocates memory for an array of elements. It is thus an array allocation.

I don’t know what happens for n < 1.

Depending on the use-case of the allocator often it deals either only with node allocations or with array allocations. For example, using an allocator inside std::list and other node-based STL containers will result to calls to allocate(1) only because those containers are based upon single, interlinked nodes. On the other hand, using it inside std::vector will result into array allocations because std::vector requires continuous storage.

And using it inside the hashed containers will usually result into array allocations for the table and node allocations for the objects itself. This is partly against this argumentation, so please ignore it. Thanks.

As a matter of fact, node allocation is much simpler than array allocations in most allocators. For example, memory pools are designed for node allocation, putting array allocations into them massively affects performance.

So naturally when I designed my new allocator model one of the first things I did was to split up node and array allocations.

And to remove the dependency of the type being allocated and to give alignment support. But I don’t want to rant about the STL allocator model here.

This paper does that as well by proposing three additions to std::allocator_traits:

allocate_node()/deallocate_node() functions: Those are optimized for allocating one single node. They forward to member functions or allocate(1). This isn’t the big change. Allocators can do branching on the n argument to do node-specific or array-specific allocation. This is what I’m doing as well.
A boolean typedef node_allocation_only: It defaults to std::false_type, can be overriden to std::true_type. If it is std::true_type, the allocator only provides node allocations and it is an error to try and allocate arrays with it (I assume you must not call allocate()). This isn’t the big change either, it merely transforms a runtime assertion failure into a compile-time error.

So what is the big change if not the split between node and array allocations?

A memory pool is optimized for really fast allocations of nodes of a given size. But there is a certain problem, consider my library as an example:

#include <foonathan/memory/container.hpp>
#include <foonathan/memory/memory_pool.hpp>

namespace memory = foonathan::memory;

...

memory::memory_pool<> pool(???, 4096u);
memory::list<int, memory::memory_pool<>> list(pool);
// ^^^^ equivalent to: std::list<int, memory::std_allocator<int, memory::memory_pool<>>> list(pool);
// just a convenience typedef

The above code snippet creates a std::list using my memory pool implementation. The constructor of memory_pool takes two arguments: the first one is the size of each node in the pool, the second one the initial capacity it has.

We set the second one to 4KiB, but what is the node size?

sizeof(int)? No, each list node has the overhead of the pointers.

So sizeof(int) + 2 * sizeof(void*)? Maybe, depends on alignment and other things.

So just use 2 * (sizeof(int) + 2 * sizeof(void*) to be safe?

But what about the node of a tree structure? Two children + one parent?

Or the node of a hash map? Single linked list? Double linked list? Tree?

The answer is: We don’t know the node size. It is implementation-defined. But we need at least its size in order to properly use pool allocators!

To address this fundamental problem of the STL the paper proposes a nested node_type typedef. This is the node used by the node containers.

This was also one of the changes EASTL made.

With it we can substitute ??? with sizeof(memory::list<int, memory::memory_pool<>>::node_type).

And that is the big change of this proposal!

Just for the sake of completeness:

And to promote my library.

You can also get its size using my nodesize debugging facility. Upon building the library it runs a code generator that obtains the node size and generates constants you can use. In the case above it is memory::list_node_size<int>::value. But although it works™, it is very ugly and will break if the Allocator used has any effect upon the node type.

So I cannot wait for the moment to replace it!

Conclusion

Especially the node proposal is something I really want. Getting access to the container node types will make my life so much easier.

Cleaning up the allocator traits is also nice like trying to get rid of the UB associated with swap. If those changes have been in C++11 I wouldn’t have needed to write a blog post about it and less explaining of pitfalls is something every language needs.

On the other hand, the post was well-received and I liked writing it, so…

There are also a few other proposals that deal with allocations:

P00211 - Allocator-aware library wrappers for dynamic allocation: Simple paper that proposes the equivalent of allocate_shared() for std::unique_ptr and raw pointers. Also an allocator_deleter for std::unique_ptr and a deallocation function for raw pointers. foonathan/memory provides equivalent smart pointer functionality for my RawAllocators but not for the raw pointers. Getting a raw pointer from an allocator should be hard to encourage using smart pointers.
P0035R1 - Dynamic memory allocation for over-aligned data: This paper wants to add support for over-aligned data in new by providing a new ::operator new taking an alignment value. Not much to say - make it happen!
Most likely others I’ve missed or didn’t want to cover.

C++ future regarding allocator will be really great if the right papers get accepted.

This blog post was written for my old blog design and ported over. If there are any issues, please let me know.