## A Neat XOR Trick

There's a neat trick to solve Advent of Code, day 6 in a single pass.

Looking over the solutions mega-thread, it seems like not many people discovered this trick; when I posted about it, people found it to be noteworthy.

Let's start with a brief summary of the problem: you're given an input string, e.g.

```
nznrnfrfntjfmvfwmzdfjlvtqnbhcprsg
```

Within this string, you want to find the first `W`

-character window in which
every character is unique.

For this example, with a 4-character window, this occurs here:

```
nznrnf [rfnt] jfmvfwmzdfjlvtqnbhcprsg
```

Naively, you can do this with three nested loops:

```
fn run(s: &[char], window_size: usize) -> usize {
for i in 0..s.len() - window_size {
let mut unique = true;
for j in 0..window_size {
for k in (j + 1)..window_size {
if s[i + j] == s[i + k] {
unique = false;
}
}
}
if unique {
return i + window_size;
}
}
panic!("No unique window found");
}
```

With an input string of length `N`

and a window size `W`

, the
running time
of this implementation is `O(N * W * W)`

.

There's an obvious algorithmic optimization: instead of comparing each element
within the window, we could instead build a `HashSet`

and check its size:

```
fn run(s: &[char], window_size: usize) -> usize {
for i in 0..s.len() - window_size {
let mut set = HashSet::new();
for j in 0..window_size {
set.insert(s[i + j]);
}
if set.len() == window_size {
return i + window_size;
}
}
panic!("No unique window found");
}
```

Insertion into the `HashSet`

is `O(1)`

, so our running time is down to `O(N * W)`

.

(Whether this is faster *in practice* is left as an exercise to the reader,
since it adds allocation into the mix)

However, there's *still* room to improve, with a little trickery.

First, let's eliminate the `HashSet`

. We can consider each character as a
bitmask:

```
a = 10000000000000000000000000000000
b = 01000000000000000000000000000000
c = 00100000000000000000000000000000
d = 00010000000000000000000000000000
...etc
```

Given a set of characters, we can see which ones are present by combining
these bitmasks with *or* (which is written as `|`

):

```
a | b | c = 11100000000000000000000000000000
```

If we have a set of `W`

characters, the *or* of their bitmasks will have `W`

bits set **if and only if** those characters are unique.

```
a | b | c = 11100000000000000000000000000000
a | b | b = 11000000000000000000000000000000
```

Hopefully, this is intuitively obvious: each character can only activate one bit
in the result, so if we have `W`

bits set, then we have `W`

unique characters.

In Rust, we check how many bits are set with `count_ones`

.
With `-C target-cpu=native`

passed to the compiler,
it should compile down to 1-2 instructions:

We can use this bitmask to replace our `HashSet`

:

```
fn run(s: &[char], window_size: usize) -> usize {
for i in 0..s.len() - window_size {
let mut set = 0u32;
for j in 0..window_size {
set |= 1 << (s[i + j] as u32 - 'a' as u32);
}
if set.count_ones() as usize == window_size {
return i + window_size;
}
}
panic!("No unique window found");
}
```

(and now there are zero allocations!)

The code is still `O(N * W)`

– because we have an outer and inner loop –
but it's prepared us for the final trick.

Consider a slight modification to our previous axiom:

If we have a set of `W`

characters, the * xor*
of their bitmasks will have

`W`

bits set **if and only if**those characters are unique.

```
a ^ b ^ c = 11100000000000000000000000000000
a ^ b ^ b = 10000000000000000000000000000000
```

This is *less* intuitively obvious: everyone that I've presented it to,
including myself, has to convince themself that it's correct.

One reply on Reddit has a particularly good explanation
(edited to match the convention of using `W`

for window size):

This algorithm is just counting how many characters appear an odd number of times in a window of size

`W`

. That number will be`W`

if-and-only-if they're all unique!

Using `xor`

is more powerful than `or`

, because of one particular property:

```
a ^ b ^ a = b
```

Applying *xor* twice will turn a bit on, then back off.

In other words, we can maintain the bitset for a sliding window in a single
`u32`

, turning bits on as they enter the window and off as they leave.

This eliminates the inner loop from our previous function:

```
fn run(s: &[char], window_size: usize) -> usize {
let mut set = 0u32;
for i in 0..s.len() {
// Turn on bits as they enter the window
set ^= 1 << (s[i] as u32 - 'a' as u32);
// Turn off bits as they leave the window
if i >= window_size {
set ^= 1 << (s[i - window_size] as u32 - 'a' as u32);
}
// Check the current window and see if we're done
if set.count_ones() as usize == window_size {
return i + 1;
}
}
panic!("No unique window found");
}
```

With this last trick, we're down to `O(N)`

running time, with no dependence on
the window size!
I didn't do rigorous benchmarking, but
one reply
said that this trick sped up their code by almost 3x.

There's no moral to the story, other than "xor is cool". I hope everyone is enjoying Advent of Code!

## Further reading

If you want to read about overengineered solutions to Advent of Code problems, I've got a few other blog posts in this vein: