Gradients are the new intervals
At the New England Symposium on Graphics, James Tompkin compared graphics researchers to magpies: they're easily distracted by shiny objects and pretty renderings.
While this is true, the analogy also holds from a different angle: when I'm reading graphics papers, I'm constantly looking for ideas to steal and bring back to my nest.
Researchers at IRIT and Adobe Research recently published a paper that's full of interesting ideas, and I'd like to talk about it.
This blog post assumes a vague understanding of implicit surface rasterization, and how interval arithmetic is used to both skip regions of space and simplify complex expressions. See this section of the Fidget writeup for a short introduction, or my colloquium talk for a longer explanation.
Here's the key sentence from the paper's abstract:
We introduce an efficient hierarchical tree pruning method based on the Lipschitz property of SDFs, which is compatible with hard and smooth CSG operators.
In this case, "the Lipschitz property" means that the gradient of the distance value is bounded, e.g. $||\nabla f(x, y, z)|| \lt 1$. You'll also see it referred to as Lipschitz continuity.
Lipschitz continuity provides a bound on the maximum change in the distance value between two points: if you have points $\vec{p}$ and $\vec{q}$ in 3D space, then
$$|f(\vec{p}) - f(\vec{q})| \le ||\vec{p} - \vec{q}||$$
Given this requirement on the distance fields, the authors present two core tricks that can be applied to CSG-tree-flavored models:
- Pruning identifies primitives which are inactive within a particular region of space, and builds a simplified shape with only active primitives
- Far-field culling finds primitives which are far from a particular region (and therefore not relevant), and replaces them with a simpler expression.
These optimizations make rendering dramatically cheaper, allowing for interactive visualizations of complex models.
The first optimization also sounds familiar – I presented something similar back in 2020 (which itself builds on work from 1992):
To render a model, its underlying expression is evaluated in a shallow hierarchy of spatial regions, using a high branching factor for efficient parallelization. Interval arithmetic is used to both skip empty regions and construct reduced versions of the expression. The latter is the optimization that makes our algorithm practical: in one benchmark, expression complexity decreases by two orders of magnitude between the original and reduced expressions.
Here's the big difference: my implementation used interval arithmetic on arbitrary math expressions, while this new paper uses single-point evaluation on Lipschitz-continuous primitives.
Single-point evaluation is cheaper, and also doesn't suffer from the conservative nature of interval arithmetic (where intervals tend to grow over time). However, their shapes are limited to a set of well-behaved primitives and transforms; something as simple as scaling a model can break Lipschitz continuity if not handled carefully.
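To make that caveat concrete (this is my own example, not one from the paper): uniformly scaling a shape up by a factor of $s$ means evaluating the field at $\vec{p}/s$, which scales the gradient by $1/s$:

$$ g(\vec{p}) = f\left(\frac{\vec{p}}{s}\right) \quad\Rightarrow\quad ||\nabla g(\vec{p})|| = \frac{1}{s}\,||\nabla f(\vec{p}/s)|| $$

For $s < 1$, the bound of 1 no longer holds. The usual fix is to rescale the output as well, evaluating $s \cdot f(\vec{p}/s)$, which restores the bound (and keeps an exact distance field exact).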
After thinking about it for a few days, I ended up taking home a slightly different conclusion than the paper presented:
If your distance field is Lipschitz-continuous, then you can use single-point evaluation to get interval-ish results, then apply all the usual tricks which normally require interval arithmetic.
I've now gone far too long without showing a pretty picture (the magpies in the audience are growing rowdy), so let's dig into some examples. I'm going to take this idea — of using single-point evaluations to get pseudo-intervals — and see how it applies to my usual box of tools.
Note that I'm not hewing to the implementation from the paper (which you should go read, it's great!); I'm taking this idea and running in a slightly different direction based on my background knowledge.
Basic rendering
Here's a circle of radius 1, $f(x, y) = \sqrt{x^2 + y^2} - 1$:
It's rendered with orange / blue indicating whether we're inside or outside the shape, and field lines showing the distance field values. I'm not sure who developed the original coloring strategy, but it's pervasive on Shadertoy; I borrowed it from Inigo Quilez.
My typical strategy for rendering is to do several rounds of evaluation using interval arithmetic, then pixel-by-pixel evaluation of the remaining ambiguous regions. The goal is to produce a binary image: each pixel is either inside or outside, depending on sign.
Interval evaluation takes intervals for $x$ and $y$, and returns an interval
output.
For example, we can evaluate a region in the top-left quadrant:
$$f([0.5, 1.5], [0.5, 1.5]) = [-0.3, 1.12]$$
Because the interval result is ambiguous (contains 0), we can't prove this region inside or outside the shape, so we subdivide and recurse. If it was unambiguously less than or greater than zero, we could stop processing right away!
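To make this concrete, here's a minimal sketch (my own throwaway Rust, not Fidget's interval evaluator) with just enough interval arithmetic to reproduce the numbers above:

```rust
// Minimal interval arithmetic: just enough operations for
// f(x, y) = sqrt(x^2 + y^2) - 1
#[derive(Copy, Clone, Debug)]
struct Interval {
    lo: f64,
    hi: f64,
}

impl Interval {
    fn new(lo: f64, hi: f64) -> Self {
        Self { lo, hi }
    }
    /// Squaring knows its result is non-negative, even if the input straddles 0
    fn square(self) -> Self {
        let (a, b) = (self.lo * self.lo, self.hi * self.hi);
        if self.lo <= 0.0 && self.hi >= 0.0 {
            Self::new(0.0, a.max(b))
        } else {
            Self::new(a.min(b), a.max(b))
        }
    }
    fn add(self, rhs: Self) -> Self {
        Self::new(self.lo + rhs.lo, self.hi + rhs.hi)
    }
    fn sqrt(self) -> Self {
        Self::new(self.lo.max(0.0).sqrt(), self.hi.max(0.0).sqrt())
    }
    fn sub(self, c: f64) -> Self {
        Self::new(self.lo - c, self.hi - c)
    }
}

/// Interval evaluation of the circle SDF
fn circle(x: Interval, y: Interval) -> Interval {
    x.square().add(y.square()).sqrt().sub(1.0)
}

fn main() {
    let i = circle(Interval::new(0.5, 1.5), Interval::new(0.5, 1.5));
    println!("[{:.2}, {:.2}]", i.lo, i.hi); // roughly [-0.29, 1.12]
    if i.lo > 0.0 {
        println!("entirely outside");
    } else if i.hi < 0.0 {
        println!("entirely inside");
    } else {
        println!("ambiguous: subdivide and recurse");
    }
}
```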
The full algorithm produces a set of colored regions (proven inside / outside), and pixel-level results for ambiguous regions below a certain size:
Intervals are shown as large uniformly-shaded squares; pixel-by-pixel evaluation is concentrated at the edge of the circle.
Using gradients instead
Our circle model has a gradient of 1 everywhere (proof is left as an exercise to the reader). If we want to evaluate an interval region $[x_\text{min}, x_\text{max}], [y_\text{min}, y_\text{max}]$, we can sample a single point in the center:
$$v = f\left((x_\text{min} + x_\text{max}) / 2, (y_\text{min} + y_\text{max}) / 2\right)$$
From this point, the maximum distance to a corner is given by $$d = \frac{\sqrt{(x_\text{max} - x_\text{min})^2 + (y_\text{max} - y_\text{min})^2}}{2}$$
Because our gradient is bounded, we know that the value anywhere in the region must be in the range $[v - d, v + d]$.
In other words, we can evaluate a single point in the center of the region, then construct a pseudo-interval result by adding the distance to a corner.
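Here's the same region evaluated with a pseudo-interval, again as a hand-rolled sketch rather than anything from the paper or Fidget:

```rust
/// Circle SDF of radius 1, which is Lipschitz-continuous with bound 1
fn circle(x: f64, y: f64) -> f64 {
    (x * x + y * y).sqrt() - 1.0
}

/// Pseudo-interval for a Lipschitz-1 field over [x_min, x_max] x [y_min, y_max]:
/// one sample at the center, widened by the distance from center to corner
fn pseudo_interval(
    f: impl Fn(f64, f64) -> f64,
    x_min: f64,
    x_max: f64,
    y_min: f64,
    y_max: f64,
) -> (f64, f64) {
    let v = f((x_min + x_max) / 2.0, (y_min + y_max) / 2.0);
    let d = ((x_max - x_min).powi(2) + (y_max - y_min).powi(2)).sqrt() / 2.0;
    (v - d, v + d)
}

fn main() {
    // Same region as the interval example above
    let (lo, hi) = pseudo_interval(circle, 0.5, 1.5, 0.5, 1.5);
    // Center sample f(1, 1) ≈ 0.414, corner distance sqrt(2)/2 ≈ 0.707, so this
    // prints roughly [-0.29, 1.12]; for this particular region (which sits on a
    // diagonal through the origin), the bounds happen to match the interval
    // result exactly
    println!("[{lo:.2}, {hi:.2}]");
}
```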
Visually, this is like drawing a circle over the rectangular region:
The red box is our target region (i.e. intervals in X and Y); the green elements show our center point, radius, and effective circle.
This is a drop-in replacement for our previous interval evaluator; we'll call it a "pseudo-interval evaluator". We can run the exact same algorithm to produce an image, swapping in this new evaluator. For the original circle, it produces the same picture:
By carefully positioning the circle, we can see a case where pseudo-interval evaluation requires more subdivision (interval evaluation on the left, pseudo-interval evaluation on the right):
Because the pseudo-interval considers the green circle (rather than the red square of the interval evaluator), it can't prove this entire region outside of the shape, and therefore has to subdivide it.
Downsides to interval evaluation
There are also cases where the pseudo-interval evaluator has the upper hand!
Consider rotating a model by 45°: this is equivalent to a new evaluation
$$ f_\text{rot}(x, y) = f\left(\frac{x + y}{\sqrt{2}}, \frac{y - x}{\sqrt{2}}\right) $$
The "natural" axes of the model are now at 45° angles from our image axes. Unfortunately, our intervals are in terms of the image's $x$ and $y$ coordinates, which remain horizontal and vertical. During evaluation, this transform enlarges the sampled region by a factor of $\sqrt{2}$:
You can now imagine situations where pseudo-interval evaluation is more precise, because its circle circumscribes the original region:
Sure enough, there are cases where interval arithmetic recurses down to individual pixels, while pseudo-interval evaluation succeeds in proving regions entirely inside the shape:
More generally, interval evaluation gets worse and worse as you stack up more transforms. One way to think about it is that interval arithmetic "forgets" any correlations between its inputs. For example, if you evaluate $x^2$ as $x \times x$ (instead of with a dedicated power operation), interval arithmetic produces a much wider result:
$$[-1, 1]^2 = [0, 1]$$ $$[-1, 1] \times [-1, 1] = [-1, 1]$$
We know that the value must be the same on each side of the multiplication, so it's impossible to get -1 as a result; however, the implementation of multiplication assumes that the two sides are totally uncorrelated.
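As a tiny demonstration of that lost correlation (my own snippet, using bare `(lo, hi)` tuples for intervals):

```rust
/// Interval multiplication assumes its arguments are uncorrelated:
/// it takes the min and max over all four corner products
fn mul(a: (f64, f64), b: (f64, f64)) -> (f64, f64) {
    let p = [a.0 * b.0, a.0 * b.1, a.1 * b.0, a.1 * b.1];
    (
        p.iter().copied().fold(f64::INFINITY, f64::min),
        p.iter().copied().fold(f64::NEG_INFINITY, f64::max),
    )
}

/// A dedicated square operation knows both sides are the same value,
/// so the result can never be negative
fn square(a: (f64, f64)) -> (f64, f64) {
    let (lo, hi) = (a.0 * a.0, a.1 * a.1);
    if a.0 <= 0.0 && a.1 >= 0.0 {
        (0.0, lo.max(hi))
    } else {
        (lo.min(hi), lo.max(hi))
    }
}

fn main() {
    let x = (-1.0, 1.0);
    println!("{:?}", mul(x, x)); // (-1.0, 1.0): overly wide
    println!("{:?}", square(x)); // (0.0, 1.0): tight
}
```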
You can work around this failure mode (to some extent) by automatically detecting and collapsing affine transforms, but you certainly don't get it for free.
Using point samples on a Lipschitz-continuous function means that this problem disappears: we know that the stack of transforms hasn't changed the bounds on the gradient, so any transforms of the model don't affect sampling!
Normalization
Here's a more complex example, built from about 400 math expressions:
Looking at the field lines, it's clear that this is a less well-behaved distance field! Specifically, the gradient at the `w` is very low; the field lines are extremely stretched out.
Interval evaluation shows the effects of the badly-behaved field: there's a bunch of regions around the `w` where the renderer recurses farther than necessary.
Having a badly-behaved field means that we can't assume Lipschitz continuity. In this model, most of the field has a gradient magnitude $\ge 1$:
Our evaluation is using Fidget, which provides a forward-mode automatic differentiation evaluator. This suggests a dumb potential solution: find the `(value, dx, dy, dz)` tuple using automatic differentiation, then normalize by dividing the value by the magnitude of the partial derivatives:
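Here's that idea as a sketch, with a hand-rolled dual-number type standing in for Fidget's gradient evaluator (the names and the example shape are mine, purely for illustration):

```rust
/// A 2D dual number: a value plus its partial derivatives,
/// propagated with forward-mode automatic differentiation
#[derive(Copy, Clone, Debug)]
struct Grad {
    v: f64,
    dx: f64,
    dy: f64,
}

impl Grad {
    fn var_x(x: f64) -> Self { Self { v: x, dx: 1.0, dy: 0.0 } }
    fn var_y(y: f64) -> Self { Self { v: y, dx: 0.0, dy: 1.0 } }
    fn add(self, rhs: Self) -> Self {
        Self { v: self.v + rhs.v, dx: self.dx + rhs.dx, dy: self.dy + rhs.dy }
    }
    fn mul(self, rhs: Self) -> Self {
        Self {
            v: self.v * rhs.v,
            dx: self.dx * rhs.v + self.v * rhs.dx,
            dy: self.dy * rhs.v + self.v * rhs.dy,
        }
    }
    fn sqrt(self) -> Self {
        let s = self.v.sqrt();
        Self { v: s, dx: self.dx / (2.0 * s), dy: self.dy / (2.0 * s) }
    }
    fn sub(self, c: f64) -> Self {
        Self { v: self.v - c, ..self }
    }
    /// Divide the value by the gradient magnitude, so the result's
    /// local gradient has magnitude 1
    fn normalize(self) -> f64 {
        self.v / (self.dx * self.dx + self.dy * self.dy).sqrt().max(1e-12)
    }
}

/// A squashed circle with a non-unit gradient: f(x, y) = sqrt((2x)^2 + y^2) - 1
fn squashed(x: Grad, y: Grad) -> Grad {
    let two_x = x.add(x);
    two_x.mul(two_x).add(y.mul(y)).sqrt().sub(1.0)
}

fn main() {
    let f = squashed(Grad::var_x(1.0), Grad::var_y(0.0));
    // The raw value is 1.0, but the gradient magnitude is 2, so the point is
    // really only 0.5 units from the surface; normalization recovers that
    println!("raw = {}, normalized = {}", f.v, f.normalize());
}
```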
The gradient looks vaguely correct – field lines advance at the same rate everywhere – but there's a subtle issue here. Looking above the `e`, you can spot a dramatic shift in brightness, indicating that the distance field believes it's suddenly much further away from the model. This clearly won't work: a sample taken above the `e` could falsely report that it's very far from the model.
So, how did this go wrong?
Above the `e`, two fields are combined with `min`. We'll call these the $e$ and $w$ fields, based on the letter that generated each field. At the point where the fields are exactly equal, they have the same value but different gradient magnitudes:
$$ \begin{aligned} f(x, y) &= \min(e(x, y), w(x, y)) \\ e(x, y) &= 0.1,\hspace{0.1in}||\nabla e(x, y)|| = 1.2 \\ w(x, y) &= 0.1,\hspace{0.1in}||\nabla w(x, y)|| = 0.5 \\ \end{aligned} $$
Now consider the modified $f_\text{norm}(x, y) = f(x, y) / ||\nabla f(x, y)||$, slightly above and below the $\min$ transition:
$$ \begin{aligned} f_\text{norm}(x, y - \epsilon) &= 0.1 / 1.2 = 0.083 \\ f_\text{norm}(x, y + \epsilon) &= 0.1 / 0.5 = 0.2 \end{aligned} $$
Normalization introduces a discontinuity in the value here! Even though our gradient is guaranteed to be 1 everywhere (because of normalization), there are instantaneous transitions in the value itself. This means that the Lipschitz property doesn't hold, so we can't use pseudo-interval evaluation.
Normalization, once more
What's to be done?
Non-rigorously, it seems like discontinuities in the gradient magnitude lead to discontinuities in the value after normalization. The only expressions which introduce discontinuities in the gradient magnitude are `min` and `max`, which suggests a simple solution: normalize before those operations.
Of course, this changes the values, but it doesn't change their sign. We're using `min` and `max` to perform `union` and `intersection` operations, so the sign is the only thing that matters.
min(a, b) => union(a, b) = min(normalize(a), normalize(b))
max(a, b) => intersect(a, b) = max(normalize(a), normalize(b))
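Sketching that out in code, with a `Sample` standing for the `(value, dx, dy)` tuple from forward-mode automatic differentiation (again my own illustration, which returns just the normalized value rather than a full composable field):

```rust
/// A field sample with its gradient, as produced by forward-mode
/// automatic differentiation: (value, df/dx, df/dy)
type Sample = (f64, f64, f64);

/// Divide the value by the gradient magnitude
fn normalize(s: Sample) -> f64 {
    let (v, dx, dy) = s;
    v / (dx * dx + dy * dy).sqrt().max(1e-12)
}

/// Union: normalize each argument *before* taking the min, so the combined
/// field has no jump in value at the min transition
fn union(a: Sample, b: Sample) -> f64 {
    normalize(a).min(normalize(b))
}

/// Intersection: the same idea, with max
fn intersect(a: Sample, b: Sample) -> f64 {
    normalize(a).max(normalize(b))
}

fn main() {
    // The example from above: equal values, different gradient magnitudes
    let e: Sample = (0.1, 1.2, 0.0);
    let w: Sample = (0.1, 0.5, 0.0);
    // Normalizing *after* the min would jump between 0.083 and 0.2 depending on
    // which branch wins; normalizing first gives a continuous, sign-correct field
    println!("union = {:.3}", union(e, w)); // 0.083
    println!("intersect = {:.3}", intersect(e, w)); // 0.200
}
```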
Here's the new normalization:
Everything looks good: the field lines are evenly spaced, and there are no discontinuities.
Plugging this into our pseudo-interval evaluator, we get a perfectly reasonable result. It ends up evaluating slightly more tiles than the interval evaluator, but still produces a correct image:
Expression simplification
Interval evaluation has a second benefit: at each `min` and `max` node, we may be able to prove that one branch is always taken within a particular spatial region. If this happens, then we can simplify the shape, generating a shorter expression which is valid within that spatial region (and less expensive to evaluate).
With intervals, we check that they are not overlapping: `min([0, 1], [2, 3])` will always pick the left-hand (`[0, 1]`) argument.
A similar strategy works for Lipschitz-bounded point samples: within a region of radius $r$, the value can only differ by $\pm r$ from a point sample in the center. The equivalent condition for selecting the left-hand argument of a `min` operation is

$$ \min(a, b) \text{ where } a + r \lt b - r$$

Similar rules apply for the right-hand argument and `max` expressions.
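Here's that check as a sketch; the `simplify_min` helper is hypothetical (it's not Fidget's actual tape-simplification API), but it shows the branch-selection logic:

```rust
/// Which branch of a min node is active within a region
#[derive(Debug)]
enum Choice {
    Left,
    Right,
    Both,
}

/// Given point samples `a` and `b` of two Lipschitz-1 subtrees at the center
/// of a region, where `r` is the distance from the center to a corner, decide
/// whether min(a, b) always takes one branch anywhere in the region.
fn simplify_min(a: f64, b: f64, r: f64) -> Choice {
    if a + r < b - r {
        Choice::Left // a's largest possible value is still below b's smallest
    } else if b + r < a - r {
        Choice::Right
    } else {
        Choice::Both // ambiguous: keep both subtrees in the shortened tape
    }
}

fn main() {
    // With r = 0.5: a stays within [0, 1] and b within [2.5, 3.5], so min picks a
    println!("{:?}", simplify_min(0.5, 3.0, 0.5)); // Left
    println!("{:?}", simplify_min(3.0, 0.5, 0.5)); // Right
    // With r = 1.5 the possible ranges overlap, so neither branch can be dropped
    println!("{:?}", simplify_min(0.5, 3.0, 1.5)); // Both
}
```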
I find it helpful to visualize how much the expression is simplified at each tile. For our `hello, world` model, here's a comparison of tape lengths at the final tile, with the original tape being 390 clauses:
It's definitely working in both cases! There are a few tiles where they differ, but there's not a clear winner; average tape lengths are 120.2 (for interval evaluation) versus 121.5 (for pseudo-intervals).
Performance impacts
We've presented three different evaluation strategies, which can be used to both prove entire regions inside / outside and provide data for expression simplification.
Interval arithmetic is conservative and works on any kind of distance field. The evaluator has to do extra work for each arithmetic operation to track intervals, compared to floating-point (value-only) evaluation. If many transforms are stacked up, it tends to provide results that are overly broad.
For Lipschitz-continuous distance fields, you can sample at the center of the region, then add the radius to calculate a pseudo-interval. This is a single floating-point evaluation, which is cheaper than interval arithmetic. In addition, it doesn't have issues with stacked transforms.
Finally, for distance fields that are not Lipschitz-continuous, you can evaluate them using forward-mode automatic differentiation, then normalize by dividing by the magnitude of the gradient before `min` and `max` nodes, creating a new Lipschitz-continuous field. Like single-point sampling, this doesn't have issues with stacked transforms; however, automatic differentiation is more expensive than floating-point evaluation. Normalization also produces a different numerical result, so this strategy is only valid if you just care about the sign.
It's pretty obvious that single-point sampling will beat interval arithmetic: it's doing less work, and doesn't have issues with overly conservative estimates. (Indeed, the paper shows this in Figure 13)
However, it's not immediately obvious whether interval arithmetic or normalized single-point samples (using forward-mode automatic differentiation) would be more expensive! Obviously, the answer is going to be implementation-dependent, but I can provide one data point.
The Prospero Challenge presents a large expression (7668 clauses), and challenges readers to write an efficient renderer. The underlying expression is not Lipschitz-continuous, so we have to use the normalized evaluator.
Testing these two evaluators on this model, I see the following results:
| Image size | Interval | Normalized pseudo-interval |
|---|---|---|
| 512×512 | 12 ms | 18 ms |
| 1024×1024 | 20 ms | 28 ms |
| 1536×1536 | 29 ms | 38 ms |
| 2048×2048 | 37 ms | 48 ms |
It looks like interval arithmetic is slightly faster in this case, but they're relatively close to each other – there's certainly no order-of-magnitude difference here.
Wrapping up
This has been a fun exercise, and I'd encourage you to go read Lipschitz Pruning: Hierarchical Simplification of Primitive-Based SDFs yourself and see their full system laid out on paper.
Suggested further reading:
- Blake Courter is building up a theory of uniform gradient fields, which would be amenable to these techniques (since by definition they have a gradient of 1 everywhere)
- Monte Carlo geometry processing and subsequent papers discuss using walk-on-spheres to solve geometry problems without discretization, using the same strategy of point samples on a Lipschitz-continuous distance field
I've published the code used to generate figures in this blog post at mkeeter/gradients-and-intervals, along with `run.py` for post-processing. This is Research Quality Code™, so take it with a grain of salt!