Solar wrote:
The problem is that this is for the average case. As is quite well-known, Quicksort has a weakness that can result in O(n^2) behaviour - which a 20-element stack cannot handle.
It can result in O(n^2) time used, that's for sure, but I think it can be done in O(log n) stack space.
First of all, let's work out how to do it using a compiler with tail call elimination instead of using an explicit stack. The call to quicksort will partition the array into less-than-pivot and greater-than-pivot parts, and then recursively call quicksort twice. The first call will be recursive (will use CPU stack space) and the second call will be tail-recursive (a loop).
The maximum stack use will be the maximum nesting depth of the first (non-tail-recursive) call.
It doesn't matter which order you sort the partitions in. If you recursively sort the smaller partition, and tail-recursively sort the bigger partition, the size of the array will (at worst) halve on each recursion. So the recursion depth will be at most log n.
Now, a lot of C compilers don't do tail call elimination. So that's no good. Tail recursion is just a loop, though, so
Code: Select all
f(x) = if b then return f(y) else return z
is equivalent to
Code: Select all
f(x) = while not b do x := y done; return z
. That will get you an O(log n) stack space, explicitly recursive version of qsort.
Or we can introduce tail recursion elimination by using an explicit call stack instead of using the CPU's one. Then:
- Popping a value off the stack corresponds to a function return.
- Pushing a value on the stack corresponds to a function call. When the function returns, it will try to sort the array again that produce that function call. That's not so useful. So...
- Replacing the value on the top of the stack and pushing another value corresponds to a recursive call followed by a tail-call - it will sort the top array, and when that returns, it will sort the next array. Then it will return. The tail call doesn't use any extra stack space, which is what we need. It's a call with the return address to something else, I suppose.
With that, it's easy enough to translate a recursive version of qsort() into an iterative one with an explicit stack (just push things instead of calling functions). To get O(log n) stack use, when we recurse we need to push the bigger partition then the smaller one - that corresponds to recursively sorting the smaller array and then tail-recursively sorting the bigger one.
PDCLib's qsort does keep a call stack, more or less. The stack isn't just [tt]char * stack[/tt], though - the top of the stack is kept in [tt](base_, limit)[/tt] and rest of the stack is [tt]stack[/tt]. With that in mind, we can translate qsort back to a recursive function and see what it does:
- To begin with, the call stack will contain one entry - the whole array (in base_ and limit).
- It goes in a loop, examining the top of the stack (base_ and limit). So it uses the call stack properly.
- If the array is small enough, it sorts it and returns.
- Otherwise, it partitions the array and causes a recursive call to sort the smaller partition then a tail recursion to sort the bigger partition.
- It returns once the call stack is empty.
So your version of qsort will use at most O(log n) call stack entries already
Actually, here's a shorter argument off the top of my head: I'll prove that (for all k) the kth element of the stack (counting from the bottom of the stack) will have size < 2^(STACKSIZE - k). That's counting (base_, limit) as a member of the stack, too. This is an invariant of your qsort(), so we have to check that it holds to begin with and that the loop makes sure it's true.
To begin with, there's one element. By making STACKSIZE big enough, the invariant will be true.
In the insertion sort case, POP(base_, limit) will just remove the top element of the stack (the base_, limit one). So that will preserve the invariant.
In the recursive case, let's say the top-of-stack was originally element k. It will partition element k, set element k to the bigger partition and set element (k+1) to the smaller partition. The new element k can't be bigger than the old element k (it's a partition of it), and element (k+1) will be at most half the size of element k (i.e. it will have size < 2^(STACKSIZE - (k+1))). So that will preserve the invariant too.
So the invariant will always be true. That means that the entry at STACKSIZE - 1 will have size < 2, so size 0 or 1. An array of size 0 or 1 doesn't use the recursive case, so you'll never push anything when the stack is full.
Anyway, I seem to have gone on a bit
I hope that makes you a bit more confident of the qsort() in PDCLib's SVN...