Affine Structure in Ao

In designing models, we often end up stacking transforms on top of each other.

Considering a function that rotates a 2D shape:

(define (rotate shape angle)
    (let ((ca (cos angle))
          (sa (sin angle)))
    (lambda (x y) (shape (+ (*    ca  x) (* sa y))
                         (+ (* (- sa) x) (* ca y))))))

Applying this transform creates a tree structure that remaps the X and Y coordinates:

Rotate once

Calling it a second time makes the tree twice as deep, as the top-level X' and X' values are plugged into the bottom of the same structure again.

Here's the tree produced for X'', the twice-transformed X coordinate:

Rotate twice

This is computationally inefficient: the tree could be flattened out into a single multiple-and-add structure that rotates by the angle a + b.

In addition, chaining these operations makes our use of interval arithmetic less accurate.

Let's say that the bounds of our shape are given by $$ \left\{x, y \mid 0 \lt x \lt 1, 0 \lt y \lt 1\right\} $$

Original shape

Rotating once by 45º brings the bounding box to $$ \left\{x, y \mid -\sqrt 2 / 2 \lt x \lt \sqrt 2 / 2, 0 \lt y \lt \sqrt 2 \right\} $$

Rotated shape

Rotating again by 45º scales the bounding box by \(\sqrt 2\) again:
Twice-rotated shape

After these two rotations, our calculated bounds are much larger than the actual bounds of the shape. If we collapse the two rotations into a single operation that rotates by 90º, the resulting bounds stay tight to the original shape.

These two motivations (smaller computation trees and tigher bound tracking) push for a smarter way to represent transforms. For rotations, there's an intuitive strategy for accumulating a series of operations: just sum the angles. Can we generalize further?


Rotations are an example of an affine transformation.

A rotation about the origin is the same as multiplying by the matrix $$ \left[ \begin{array}{cc} \cos a && \sin a \\ -\sin a && \cos a \end{array} \right] $$

Applying a pair of rotations to a coordinate looks like the following: $$ \left[ \begin{array}{cc} \cos b && \sin b \\ -\sin b && \cos b \end{array} \right] \left[ \begin{array}{cc} \cos a && \sin a \\ -\sin a && \cos a \end{array} \right] \left[ \begin{array}{c} x \\ y \end{array} \right] $$

This structure hints at our plan: instead of grouping the operations as $$ \left[ \begin{array}{cc} \cos b && \sin b \\ -\sin b && \cos b \end{array} \right] \times \left[ \begin{array}{cc} \cos a && \sin a \\ -\sin a && \cos a \end{array} \right] \left[ \begin{array}{c} x \\ y \end{array} \right] $$ we can group them as $$ \left[ \begin{array}{cc} \cos b && \sin b \\ -\sin b && \cos b \end{array} \right] \left[ \begin{array}{cc} \cos a && \sin a \\ -\sin a && \cos a \end{array} \right] \times \left[ \begin{array}{c} x \\ y \end{array} \right] $$

Since all the terms in the matrix are constant, this calculation can be done in advance and applied as a single matrix to the target coordinates.

This gives us generality: rotation, translation, and scaling can all be represented as affine transforms with a 4x4 matrix.

In Ao, we introduce a new opcode, OP_AFFINE, which is guaranteed to have the following tree structure:

Affine tree

The initial coordinates X, Y, and Z are constructed as OP_AFFINE with appropriate coefficients (e.g. X has a as 1 and and all other constants as 0).

Then, arithmetic obeys a simple set of rules:

Each of these special-cased operations returns a new OP_AFFINE tree, so we avoid building deep trees when multiple transformations are applied.

If none of these rules apply, then we build up the arithmetic tree as usual.


So, what does this accomplish?

Consider the following recursive function, which builds up a stack of twisted cubes:

(define (stack i scale rot)
    (let ((base (cube '(-1 -1 -1) '(1 1 1))))
    (if (= i 0)
        base
        (union base
            (rotate-z (move
            (scale-xyz (stack (- i 1) scale rot) scale)
                (list 0 0 (* 2 scale)))
                rot)))))

Let's compare the tree depth with and without the affine representation:

Affine graph

Using the affine representation, the tree stays much smaller, which means intervals will be tighter. The slope of the affine line is one level per cube, due to the stacking of union operations; the non-affine slope is four levels per cube.


Back to Ao homepage
Back to blog