How I have beaten Boost.Pool #3: Branches are bad

Branches and conditional jumps are essential for every program; you cannot write anything but the most trivial code without them. Yet they sometimes carry a certain overhead and can cause problems in performance-critical code paths.

Code is often faster without them. But how can you get rid of them?

In this series, I’ll explain my changes and share some lessons about optimization I’ve learned in the process of beating Boost.Pool. This time it’s all about branches, along with more detailed information about detail::small_free_memory_list.

Author: Jonathan

(Awesome?) Allocator Additions - Thoughts regarding allocator proposals

The C++ Standards Committee papers of the post-Jacksonville mailing were recently published. There are a few quite interesting ones that deal with the STL’s allocator model: P0177R1 - Cleaning up allocator_traits, P0178R0 - Allocators and swap (actually from February) and P0310R0 - Splitting node and array allocation in allocators.

In this post, I’d like to discuss those with you and explain why I really hope some of them will be accepted. The first parts are also a follow-up to AllocatorAwareContainer: Introduction and pitfalls of propagate_on_container_XXX defaults.


How I have beaten Boost.Pool #2: Inlining is key

Calling a function has a certain overhead. Registers must be saved, a new stack frame pushed, and so on. For small functions this overhead can be more than the actual implementation of the function!

For those, it is much better if the compiler copy-pastes the implementation directly into the call site. This is what inlining does.

Luckily, the compiler is usually able to do this optimization. Or can it?

In this series, I’ll explain my changes and share some lessons about optimization I’ve learned in the process of beating Boost.Pool. This time I’m going to cover inlining. I’ll share some of the guidelines I’ve learned and also give you a look into some of memory’s internal code and design.


How I have beaten Boost.Pool #1: Introduction and profiling results

When I released memory 0.5, a guy on reddit asked how my library compared to Boost.Pool. I provided a feature comparison and also quickly profiled both Boost’s and my implementation. Sadly, Boost.Pool beat my library in most cases.

So over the last weeks, I took care of my performance problems and rewrote my implementations. In version 0.5-1 they are basically still using the same algorithm, but now my library is as fast as or faster than Boost.Pool.

In this series, I’ll explain my changes and share some lessons about optimization I’ve learned while making them. The first part is an introduction to the different allocation algorithms used here and gives an overview of the profiling results.


Performing arbitrary calculations with the Concept TS

Last Tuesday I took a closer look at the Concept TS. This followed a discussion about the power and usefulness of concepts as a replacement for TMP (shout-out to @irrequietus and @Manu343726). So after compiling the GCC trunk, which has concept support, I specifically looked for a way to use concepts alone to perform arbitrary calculations.

