Efficiently updating implicit in-order forests

You may find yourself looking at an ordered list of values:

ordered list of values


And you may ask yourself: how do I efficiently compute a range aggregation on this data, e.g. finding the max value for indices 1 through 4?

Simple iteration works, but is O(N) in terms of range size.

A more efficient strategy is an "implicit in-order forest", which interleaves the raw data with a pre-computed reduction tree. This is the kind of data structure that's been independently re-invented dozens of times, but I first learned about it through Tristan Hume's blog post.

(You may want to familiarize yourself with that post before continuing; I'll give a brief summary, but he does a great job presenting this data structure.)

For the values above and the max operator, we'd build the following tree:

max-reduction tree

Then, to query it, we find a set of tree nodes that exactly covers our desired range. For example, to compute the aggregate value across indices 1-4 (inclusive, zero-indexed), we'd combine the tree nodes marked in red:

max-reduction tree with values selected

We only have to combine three values instead of four, because one node (7) is already an aggregate of two leaf nodes! I'll refer to such a node in the tree as a "range node", because it covers a range of the original values. As we query larger spans of data, we get an O(log(N)) runtime for computing aggregate values.

(In these diagrams, I'm going to focus on a single tree, but rest assured that you can build a forest and pack it into a flat array just like before.)
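To make the covering-node query concrete, here's a minimal sketch in Rust. For clarity I've used the common "leaves in the back half of the array" layout rather than the in-order layout from the post; `SegTree` and its methods are illustrative names of my own, not from any of the implementations discussed here.

```rust
/// A max-reduction tree in a flat array, 1-indexed: leaves live at
/// `tree[n..2n]`, and node `i`'s children are at `2i` and `2i + 1`.
/// Assumes a non-empty data slice.
struct SegTree {
    n: usize,
    tree: Vec<i64>,
}

impl SegTree {
    fn new(data: &[i64]) -> Self {
        let n = data.len();
        let mut tree = vec![i64::MIN; 2 * n];
        tree[n..].copy_from_slice(data);
        // Pre-compute the reduction tree, bottom-up
        for i in (1..n).rev() {
            tree[i] = tree[2 * i].max(tree[2 * i + 1]);
        }
        SegTree { n, tree }
    }

    /// Max over `lo..hi` (half-open), combining O(log N) covering nodes
    fn query(&self, mut lo: usize, mut hi: usize) -> i64 {
        let mut acc = i64::MIN;
        lo += self.n;
        hi += self.n;
        while lo < hi {
            // If `lo` is a right child, its parent covers values outside
            // the range, so take this node's value and step past it
            if lo & 1 == 1 {
                acc = acc.max(self.tree[lo]);
                lo += 1;
            }
            if hi & 1 == 1 {
                hi -= 1;
                acc = acc.max(self.tree[hi]);
            }
            lo /= 2;
            hi /= 2;
        }
        acc
    }
}
```

For example, `SegTree::new(&[5, 2, 7, 3, 1]).query(1, 5)` computes the max over indices 1 through 4 while touching only O(log N) nodes.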

All of this is already covered by previous discussions of the data structure; now, let's push the boundaries.

In my use case, I need to update ranges, not just query them. For example, given our data above, how would we inject a new value of 9 across the range 0-4?

The naive approach is to merge the new value with every leaf node, then bubble up the results through the range nodes:

max-reduction tree values updated

(This is how it's done in Michael Rojas's excellent TypeScript implementation.)

Unfortunately, this update operation has an O(N) running time in the size of the range. If we're updating infrequently, then that's probably fine, but it offends my aesthetic sensibilities (and could easily lead to Accidentally Quadratic behavior).
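Here's a sketch of that naive update, again on the flat "leaves in the back half" layout (helper names are mine): merge the new value into every covered leaf, then recompute each affected ancestor once, bottom-up.

```rust
/// Build a max-reduction tree: leaves at `tree[n..2n]`, 1-indexed internals
fn build(data: &[i64]) -> Vec<i64> {
    let n = data.len();
    let mut tree = vec![i64::MIN; 2 * n];
    tree[n..].copy_from_slice(data);
    for i in (1..n).rev() {
        tree[i] = tree[2 * i].max(tree[2 * i + 1]);
    }
    tree
}

/// Merge `v` into every element of the non-empty range `lo..hi`:
/// O(range) leaf writes, plus bubbling the results up the tree
fn naive_update(tree: &mut [i64], lo: usize, hi: usize, v: i64) {
    let n = tree.len() / 2;
    for i in (n + lo)..(n + hi) {
        tree[i] = tree[i].max(v); // touch every leaf in the range
    }
    // Ancestors of a contiguous leaf block are contiguous at each level,
    // so we can recompute each affected internal node exactly once
    let (mut a, mut b) = (n + lo, n + hi - 1);
    while a > 1 {
        a /= 2;
        b /= 2;
        for i in a..=b {
            tree[i] = tree[2 * i].max(tree[2 * i + 1]);
        }
    }
}
```

Even with the bubbling phase shared between leaves, the leaf writes alone make this O(N) in the range size.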

Let me tell you the trick right away: we double the node size. In our modified tree, each node stores two distinct values:

- An aggregate value A, which summarizes the subtree rooted at this node
- A regular value R, which is merged into every element below this node

The rules are as follows:

- A node's R value applies to everything at or below it, so the effective value of a leaf is its own R combined with the R values of all of its ancestors
- A node's A value must always equal the aggregate of its children's A values combined with its own R, so that A summarizes the entire subtree

Here's our original tree with both aggregate A values (first, in red) and regular R values (second, in blue). To start, the leaf nodes have regular R values, and all other nodes only have aggregate A values.

max-reduction tree with A and R nodes

When updating a range, we write the new value to the R field, then update A fields recursively up the tree to maintain our invariants. Here's our previous example of writing the value 9 to the range 0-4 (inclusive):

max-reduction showing an AR update

This is much more efficient than touching every leaf node: we can write to range nodes that cover wide swaths of the tree! (In fact, we updated the exact same nodes that we'd read for the equivalent range query)

Computing the aggregate value of indices 1-4 (inclusive) now reads the following values from the tree:

max-reduction showing an AR range query

As you can see, this is a little more expensive: each covering node now walks up the tree to accumulate the regular values above it. More formally, the runtime for both queries and updates is O(log²(N)). We've also doubled our memory usage, since each node in the tree now stores two values.
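To make the two-value scheme concrete, here's a sketch in Rust. I've swapped the flat in-order forest for a recursive, power-of-two-padded heap layout, and `ArTree` and its fields are illustrative names of my own, not Fidget's. Note that in this recursive form each ancestor's R is folded in once as the recursion unwinds, rather than by a separate upward walk from every covering node.

```rust
/// Max-reduction tree where each node stores an aggregate value `a`
/// (summarizing its subtree) and a regular value `r` (applying to every
/// element below it). `i64::MIN` is the identity for max.
struct ArTree {
    n: usize,     // padded size, a power of two
    a: Vec<i64>,  // A: aggregate of the subtree, including R values within it
    r: Vec<i64>,  // R: merged into everything below this node
}

impl ArTree {
    fn new(data: &[i64]) -> Self {
        let n = data.len().next_power_of_two();
        let mut t = ArTree { n, a: vec![i64::MIN; 2 * n], r: vec![i64::MIN; 2 * n] };
        for (i, &v) in data.iter().enumerate() {
            t.a[n + i] = v;
            t.r[n + i] = v; // leaves start with their raw value in R
        }
        for i in (1..n).rev() {
            t.a[i] = t.a[2 * i].max(t.a[2 * i + 1]);
        }
        t
    }

    /// Merge `v` into every element of `lo..hi` (half-open)
    fn update(&mut self, lo: usize, hi: usize, v: i64) {
        self.update_rec(1, 0, self.n, lo, hi, v);
    }

    fn update_rec(&mut self, node: usize, nlo: usize, nhi: usize,
                  lo: usize, hi: usize, v: i64) {
        if hi <= nlo || nhi <= lo {
            return; // no overlap
        }
        if lo <= nlo && nhi <= hi {
            // Fully covered: record `v` here instead of touching leaves
            self.r[node] = self.r[node].max(v);
            self.a[node] = self.a[node].max(v);
            return;
        }
        let mid = (nlo + nhi) / 2;
        self.update_rec(2 * node, nlo, mid, lo, hi, v);
        self.update_rec(2 * node + 1, mid, nhi, lo, hi, v);
        // Restore the invariant: A = max(children's A values, own R)
        self.a[node] = self.a[2 * node].max(self.a[2 * node + 1]).max(self.r[node]);
    }

    /// Max over `lo..hi` (half-open)
    fn query(&self, lo: usize, hi: usize) -> i64 {
        self.query_rec(1, 0, self.n, lo, hi)
    }

    fn query_rec(&self, node: usize, nlo: usize, nhi: usize,
                 lo: usize, hi: usize) -> i64 {
        if hi <= nlo || nhi <= lo {
            return i64::MIN;
        }
        if lo <= nlo && nhi <= hi {
            return self.a[node];
        }
        let mid = (nlo + nhi) / 2;
        self.query_rec(2 * node, nlo, mid, lo, hi)
            .max(self.query_rec(2 * node + 1, mid, nhi, lo, hi))
            .max(self.r[node]) // R applies to every leaf below, queried ones included
    }
}
```

With this sketch, writing 9 across a range touches only the covering nodes (updating their R and A) plus their ancestors' A values, and a subsequent query picks up the 9 from those R fields.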

If you're paying close attention, you may have noticed one more subtle limitation: we can no longer store "any associative operator" in this data structure. Instead, the operation must be both commutative (f(a, b) = f(b, a)) and idempotent (f(a, b) = f(a, f(a, b))).

(For those familiar with category theory, my friend Martin has informed me that this is a bounded meet-semilattice)

I used max in the examples above, which meets these definitions, but I'm actually using this to track set membership with fixed-size bitfields.
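For illustration, here's the shape of a bitfield merge operator (this is not Fidget's actual representation): a `u64` tracks membership for up to 64 items, and bitwise OR serves as the set union.

```rust
/// Set union over u64 bitfields: associative, commutative, and idempotent,
/// so it satisfies the requirements above (just like max)
fn merge(a: u64, b: u64) -> u64 {
    a | b
}

/// Check whether `item` (0..64) is a member of the bitfield
fn contains(set: u64, item: u32) -> bool {
    set & (1u64 << item) != 0
}
```

Merging `{3, 5}` with `{5, 9}` yields `{3, 5, 9}`, and merging the same set twice changes nothing, which is exactly the idempotence the data structure relies on.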

Here's how the original data structure compares to the new updatable variant:

|              | Original    | Updatable                            |
|--------------|-------------|--------------------------------------|
| Query range  | O(log(N))   | O(log²(N))                           |
| Update range | O(N)        | O(log²(N))                           |
| Memory usage | 2*N         | 4*N                                  |
| Operation    | Associative | Associative, idempotent, commutative |

We could bring memory usage down to 3*N by storing the R values in a separate tree, since the leaf nodes don't need both R and A. I also suspect that there's a better way to do range queries and updates that would bring us back to O(log(N)); in the current implementation, parent nodes are often visited more than once (e.g. every node checks the R value at the root of the tree).

I've written a relatively un-optimized implementation as part of Fidget (which is an ongoing project that I'm not quite ready to blog about). This code will be handling much smaller sequences than the Gigatrace project – 10³–10⁴ values, versus 10⁹ – so it's written for clarity and doesn't do as much clever bitshifting.

Even so, it turns out that this code is hard to get right: it's an absolute nesting ground for off-by-one errors. After hearing co-workers rave about the power of property-based testing, I wrote a proptest suite to automatically check my implementation.

There are two property-based tests: one checks range queries against a naive implementation, and the other does the same for range updates.

All of the parameters for these tests are chosen by proptest, meaning a wide variety of values are checked.

The proptest experience was great. I didn't have to think about writing dozens of test cases; instead, I wrote a single checking function, let the computer run it on hundreds of cases, and glared at my bit shifting operations until it passed.
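To give a flavor of the approach without pulling in the `proptest` crate, here's a std-only sketch: generate many pseudo-random inputs (a toy xorshift generator stands in for proptest's strategies) and assert a property over all of them. The real suite exercises the tree implementation itself; this sketch only checks the operator laws discussed earlier.

```rust
/// The merge operator under test (bitwise OR, i.e. set union)
fn merge(a: u64, b: u64) -> u64 {
    a | b
}

/// xorshift64: a tiny pseudo-random generator standing in for proptest's
/// input generation. `state` must start nonzero.
fn xorshift(state: &mut u64) -> u64 {
    *state ^= *state << 13;
    *state ^= *state >> 7;
    *state ^= *state << 17;
    *state
}

/// Run `n` randomized checks of the operator laws, returning whether all held
fn check_operator_laws(seed: u64, n: usize) -> bool {
    let mut state = seed;
    for _ in 0..n {
        let a = xorshift(&mut state);
        let b = xorshift(&mut state);
        let c = xorshift(&mut state);
        let commutative = merge(a, b) == merge(b, a);
        let idempotent = merge(a, merge(a, b)) == merge(a, b);
        let associative = merge(merge(a, b), c) == merge(a, merge(b, c));
        if !(commutative && idempotent && associative) {
            return false;
        }
    }
    true
}
```

The pattern is the same as the real tests: one checking function, many machine-chosen inputs, and a failure immediately pinpoints a counterexample.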

I'm not sure how many people out there need efficient range queries and updates on time-series bitsets, but if you do, I hope this has been helpful!

Like the original data structure, I'm sure this variant has been independently re-invented dozens of times; if you've got any pointers, then let me know.


On the Geometry Processing World-Wide Discord, Nathan suggests that this is a segment tree with range updates.

Rémy on Mastodon points me to the Algorithmica write-up on segment trees, which has a ton of detail.