saturating_add vs. saturating_int – new function vs. new type?

Suppose you want to do integer arithmetic that saturates instead of overflowing. The built-in operator+ doesn’t behave that way, so you need to roll something yourself. Do you write a saturating_add() function or a new saturating_int type with overloaded operator+? What about atomic_load(x) vs. atomic<int> x? Or volatile_store(ptr, value) vs. volatile int*?

When should you provide functions that implement new behavior and when should you write a wrapper type? Let’s look at the pro and cons.

» read more »
Jonathan

Technique: Compile Time Code Generation and Optimization

C++ constexpr is really powerful. In this blog post, we’ll write a compiler that can parse a Brainfuck program given as string literal, and generate optimized assembly instructions that can then be executed at runtime. The best part: we neither have to actually generate assembly nor optimize anything ourselves! Instead we trick the compiler into doing all the hard work for us.

The same technique can be used whenever you want to specify some sort of “program” in a different way and translate it at runtime: regexes, routing tables, etc.

» read more »
Jonathan

I accidentally wrote a Turing-complete parsing library

I’m currently working on lexy, a C++ parsing DSL library: you describe how input should be parsed, and lexy generates code for it, taking care of error recovery, parse tree generation, and parse values. Such parser generators are classified based on the expressiveness of the corresponding formal language. For example, a strict regular expression can only parse regular languages, which is a strict subset of a deterministic context free language, and so on.

lexy, being essentially syntax sugar for a recursive descent parser with (manually specified!) arbitrary lookahead but no other state, falls in the latter category. Parsers in that category are unable to parse context-sensitive languages such as XML with matching tags. To handle them, I’ve added support for “context variables”: state that can be modified during parsing.

However, in a recent refactoring of the context variables implementation, I’ve accidentally removed a big limitation, which makes lexy Turing-complete: the parser is thus able to do arbitrary computation while parsing the input.

TL;DR: I’ve written a lexy grammar that is able to execute, not just parse, a simple Turing-complete language.

» read more »
Jonathan

Tutorial: the CRTP Interface Technique

Generic code expects that your types model certain concepts. Sometimes, the concept requires many redundant member functions in your type. A big culprit here are iterators: they require many operator overloads, most of which are trivially implemented in terms of other overloads.

CRTP, the curiously recurring template pattern, can help here and automate the boilerplate away. Let’s look at the CRTP interface technique and explore how it works.

» read more »
Jonathan

C++20 concepts are structural: What, why, and how to change it?

C++20 added concepts as a language feature. They’re often compared to Haskell’s type classes, Rust’s traits or Swift’s protocols.

Yet there is one feature that sets them apart: types model C++ concepts automatically. In Haskell, you need an instance, in Rust, you need an impl, and in Swift, you need an extension. But in C++? In C++, concepts are just fancy boolean predicates that check for well-formed syntax: every type that makes the syntax well-formed passes the predicate and thus models the concepts.

This was the correct choice, but is sometimes not what you want. Let’s explore it further.

» read more »
Jonathan