## A Neat XOR Trick

There's a neat trick to solve Advent of Code, day 6 in a single pass.

Looking over the solutions mega-thread, it seems like not many people discovered this trick; when I posted about it, people found it to be noteworthy.

Let's start with a brief summary of the problem: you're given an input string, e.g.

``````nznrnfrfntjfmvfwmzdfjlvtqnbhcprsg
``````

Within this string, you want to find the first `W`-character window in which every character is unique.

For this example, with a 4-character window, this occurs here:

``````nznrnf [rfnt] jfmvfwmzdfjlvtqnbhcprsg
``````

Naively, you can do this with three nested loops:

``````fn run(s: &[char], window_size: usize) -> usize {
for i in 0..s.len() - window_size {
let mut unique = true;
for j in 0..window_size {
for k in (j + 1)..window_size {
if s[i + j] == s[i + k] {
unique = false;
}
}
}
if unique {
return i + window_size;
}
}
panic!("No unique window found");
}
``````

With an input string of length `N` and a window size `W`, the running time of this implementation is `O(N * W * W)`.

There's an obvious algorithmic optimization: instead of comparing each element within the window, we could instead build a `HashSet` and check its size:

``````fn run(s: &[char], window_size: usize) -> usize {
for i in 0..s.len() - window_size {
let mut set = HashSet::new();
for j in 0..window_size {
set.insert(s[i + j]);
}
if set.len() == window_size {
return i + window_size;
}
}
panic!("No unique window found");
}
``````

Insertion into the `HashSet` is `O(1)`, so our running time is down to `O(N * W)`.

(Whether this is faster in practice is left as an exercise to the reader, since it adds allocation into the mix)

However, there's still room to improve, with a little trickery.

First, let's eliminate the `HashSet`. We can consider each character as a bitmask:

``````a = 10000000000000000000000000000000
b = 01000000000000000000000000000000
c = 00100000000000000000000000000000
d = 00010000000000000000000000000000
...etc
``````

Given a set of characters, we can see which ones are present by combining these bitmasks with or (which is written as `|`):

``````a | b | c = 11100000000000000000000000000000
``````

If we have a set of `W` characters, the or of their bitmasks will have `W` bits set if and only if those characters are unique.

``````a | b | c = 11100000000000000000000000000000
a | b | b = 11000000000000000000000000000000
``````

Hopefully, this is intuitively obvious: each character can only activate one bit in the result, so if we have `W` bits set, then we have `W` unique characters.

In Rust, we check how many bits are set with `count_ones`. With `-C target-cpu=native` passed to the compiler, it should compile down to 1-2 instructions:

We can use this bitmask to replace our `HashSet`:

``````fn run(s: &[char], window_size: usize) -> usize {
for i in 0..s.len() - window_size {
let mut set = 0u32;
for j in 0..window_size {
set |= 1 << (s[i + j] as u32 - 'a' as u32);
}
if set.count_ones() as usize == window_size {
return i + window_size;
}
}
panic!("No unique window found");
}
``````

(and now there are zero allocations!)

The code is still `O(N * W)` – because we have an outer and inner loop – but it's prepared us for the final trick.

Consider a slight modification to our previous axiom:

If we have a set of `W` characters, the xor of their bitmasks will have `W` bits set if and only if those characters are unique.

``````a ^ b ^ c = 11100000000000000000000000000000
a ^ b ^ b = 10000000000000000000000000000000
``````

This is less intuitively obvious: everyone that I've presented it to, including myself, has to convince themself that it's correct.

One reply on Reddit has a particularly good explanation (edited to match the convention of using `W` for window size):

This algorithm is just counting how many characters appear an odd number of times in a window of size `W`. That number will be `W` if-and-only-if they're all unique!

Using `xor` is more powerful than `or`, because of one particular property:

``````a ^ b ^ a = b
``````

Applying xor twice will turn a bit on, then back off.

In other words, we can maintain the bitset for a sliding window in a single `u32`, turning bits on as they enter the window and off as they leave.

This eliminates the inner loop from our previous function:

``````fn run(s: &[char], window_size: usize) -> usize {
let mut set = 0u32;
for i in 0..s.len() {
// Turn on bits as they enter the window
set ^= 1 << (s[i] as u32 - 'a' as u32);

// Turn off bits as they leave the window
if i >= window_size {
set ^= 1 << (s[i - window_size] as u32 - 'a' as u32);
}

// Check the current window and see if we're done
if set.count_ones() as usize == window_size {
return i + 1;
}
}
panic!("No unique window found");
}
``````

With this last trick, we're down to `O(N)` running time, with no dependence on the window size! I didn't do rigorous benchmarking, but one reply said that this trick sped up their code by almost 3x.

There's no moral to the story, other than "xor is cool". I hope everyone is enjoying Advent of Code!