I’ve been interviewing job candidates lately, so I’ve been thinking about interview questions. I always ask people to write code in a technical interview, and I usually start with a fairly easy warm-up question — some sort of a toy problem like an operation on lists or trees, often the kind of thing that has a neat recursive solution.
There’s a particular sinking feeling that comes when the candidate starts on one of these problems
and immediately writes a for loop. I used to take that as an indicator of how good
a coder the candidate was, but more recently I’ve realised that it’s a much better indicator of how
long it is since they went to college. Even mediocre graduate recruits usually rattle off a recursive
solution pretty quickly, where some senior engineers with ten years’ experience in embedded systems
have filed recursion away in a dusty corner of the mind, along with LISP and PROLOG.
The truth is that in most embedded/real-time/kernel work, we almost never see actual recursion. When there are strict bounds on stack sizes, recursive algorithms can be downright dangerous, especially when operating on user-supplied data. Datastructures like trees will usually be implemented once and then left alone, and even then many of the actual operations are often iterative. Maybe it’s different further up the stack — with a proper language runtime to support it I’d like to think that you can just write the elegant version and skip over the details.
Another thing about these toy problems is that they usually do have an iterative solution as well, and
I’m always secretly hoping that the candidate who’s reached for the for loop is just trolling, knowing
that the recursive solution is expected and providing an iterative one instead. (Sadly this has not
yet happened in a real interview.)
In that spirit, today I’m going to look at one of the canonical
recursive toy problems, the Towers of Hanoi. You know the one: you have three pegs on which to stack
disks of different sizes. You can’t move a disk that has another disk on top of it, and you can’t put a disk
on top of a smaller disk. Your task is to move a tower of disks of sizes (N ... 1) from the leftmost peg to
the rightmost one. The standard solution looks like this:
define hanoi(size, from, to):
if size is not 0:
call hanoi(size - 1, from, spare)
move a disk from peg 'from' to peg 'to'
call hanoi(size - 1, spare, to)
But there are also several iterative solutions to it. Let’s start by looking at the actual move pattern
for N=3:
1 . . -2- . . --3-- . . . . . -2- . . --3-- . 1 . . . . . . --3-- -2- 1 . . . . 1 . --3-- -2- . . . . . 1 . . -2- --3-- . . . . . . 1 -2- --3-- . . . . . -2- 1 . --3-- . . 1 . . -2- . . --3--
The largest disk moves exactly once, to the left (wrapping around). The second-largest disk moves twice,
both times to the right, which matches the algorithm: it moves right to allow the largest disk
to move left, and then right again to land on top of it. The third-largest disk moves four times, to the
left each time, and the pattern continues for bigger towers. So we know two things: each disk has a preferred
direction, and the full solution for N disks has
(1 + 2 + ... + 2^(N-1)) == (2^N) - 1 total moves. The iterative algorithm looks like this:
define hanoi(N):
repeat 2^N - 1 times:
move the largest disk that can go in its preferred direction
You can prove by induction that this always produces the same moves as the other algorithm. (Proving the move count and directions is easy; proving that it’s always the largest disk that moves is a little trickier.)
And now, a warning. When I actually implemented this in C, the iterative version was not only messier, but also nearly four times slower than the recursive one. True, it saves a function call per move, but a good compiler will get rid of half of those with tail recursion, and a modern CPU will predict all the return addresses correctly. And the work of figuring out which of the two candidate disks to move next and where to is quite a bit more than just finding the spare peg. Since any Towers of Hanoi problem that will complete in less than a day should fit entirely in L1 cache, the extra instructions and (potentially mispredicted) branches to find the next move are worse than the extra memory operations for the recursion. Sometimes, even in C, the elegant solution is still the best!