Page 1 of 3

qsort or alternative sorting function

Posted: Tue May 23, 2006 1:12 pm
by paulbarker
I'm putting this in the OS development forum as it relates to sorting the memory map given by the BIOS or GRUB.

My plan so far is to remove extra bytes from the entries so they are all the minimum size (just base, limit and type), then run a qsort on the array.

Looking at the qsort implementation from PDCLib, it uses a stack without any checks for overflow (from what I can see).

1) Does anyone know of a qsort implementation which either does not use a stack or has proper checking on the stack?

2) Would a different type of sort suit this problem better?

3) @Solar: I can't see any way the PDCLib implementation of qsort can handle a stack overflow other than erroring out. Would you like a patch which adds this functionality?

Re:qsort or alternative sorting function

Posted: Tue May 23, 2006 1:30 pm
by dc0d32
afaik, the map is quite well sorted. anyways, i would suggest shell sort (non-recursive). works well for already sorted data.

Re:qsort or alternative sorting function

Posted: Tue May 23, 2006 1:43 pm
by paulbarker
The map provided by most (if not all) real BIOSs is completely unsorted.

I might have a go at implementing my own version of qsort, mainly to teach myself how it works.

Re:qsort or alternative sorting function

Posted: Tue May 23, 2006 2:13 pm
by ineo
I agree with prashant: just implement a basic sort. You don't need a qsort for this few records, unless you need to implement it for something else, you'd rather use something easier to implement, and faster in the already sorted case (thats the case for 3 of my computers).

Re:qsort or alternative sorting function

Posted: Tue May 23, 2006 2:59 pm
by Colonel Kernel
I'm putting this in the OS development forum as it relates to sorting the memory map given by the BIOS or GRUB.
Actually, if you're using the "sliding window" bootmem allocator that we just discussed, you shouldn't need to sort the memory map. If you still think you need to sort it, maybe we don't agree on how the "sliding window" actually works.

Re:qsort or alternative sorting function

Posted: Tue May 23, 2006 3:11 pm
by paulbarker
It's not necessary to sort the list really, it just got me thinking about sorting methods.

Maybe my brain's fried after implementing long long divison today.

Re:qsort or alternative sorting function

Posted: Tue May 23, 2006 5:41 pm
by mystran
Proper checking for qsort stack is trivial: give the qsort routine an extra parameter, max recursion, which you decrement for every recursive call. Then in beginning of the routine, check that, and return error if it's zero.

In-array quicksort would use constant stackspace per recursion so the above is sufficient. For large arrays you can then restart it with a larger stack.

The normal method for avoiding recursion at all, is to simulate the recursive calls with an explicit stack in an array. This way you can realloc() a larger array for the recursion stack easily, if you run out of space.

As for totally stackless (including no explicit stack), I can neither come up with nor find any totally stackless implementation right now, and I suspect there is nil. Those that claim to be, just simulate function calls with explicit stack. I won't promise to be correct, though.

Re:qsort or alternative sorting function

Posted: Tue May 23, 2006 10:56 pm
by Solar
paulbarker wrote: 3) @Solar: I can't see any way the PDCLib implementation of qsort can handle a stack overflow other than erroring out. Would you like a patch which adds this functionality?
I admit that the qsort implementation in PDCLib is taken from Paul Edward's work (as it states in the comments), and I haven't checked the implementation beyond the basic functionality tests that can be found in the test driver.

I'll check this, and come back to you if required.

Edit: Do you mean the STACKSIZE 40 part? I.e., what happens when it recurses more than 40 times? Unless I did my maths wrongly, the stack is sufficient for sorting an array of 2**32 elements. If I did my maths wrongly, please do correct me.

Re:qsort or alternative sorting function

Posted: Wed May 24, 2006 3:50 am
by paulbarker
That makes some sense now, I'll admit not understanding a single thing from the implementation in PDCLib. I am incredibly pedantic though, and even though in theory the stack can never overflow I get the same horrible feeling that I get from doing things like strcpy to a buffer on the stack:

If this implementation of qsort is used throughout the kernel, even the simplest bug in qsort could lead to a stack-smashing attack based on feeding qsort bad data.

EDIT: Sorry, I wasn't very clear there - I do mean the STACKSIZE=40 part, and the fact that neither PUSH nor POP checks to see if it has gone past the end of the stack.

Re:qsort or alternative sorting function

Posted: Wed May 24, 2006 4:02 am
by Solar
paulbarker wrote: I am incredibly pedantic though...
I am, too.

I admit not having a complete understanding of the qsort implementation myself. Some parts of PDCLib - most prominently qsort and malloc/free - are "works for now" placeholders, and will be reviewed before a production release is made.

Please understand that PDCLib is pre-alpha. Many things aren't there yet. Other things are there, but might be broken. The idea, of course, is that you get the issue of a C library out of your head and let me do the improvements.

Rest assured that I do not accept anything less than 100% proven stability and correctness before I will label PDCLib as "v1.0 - Production / Stable". I am well aware of the sensitivity of a C lib.

I will check the current qsort implementation for possible loopholes once I get the current workdir back to stable.

Re:qsort or alternative sorting function

Posted: Wed May 24, 2006 4:11 am
by paulbarker
Cheers Solar, I really appreciate the care you put into your code :).

I'm gunna leave qsort for now and concentrate on the 2 memory manager types I will be using: buddy-allocator and stack (or list, just that the methods used will be push/pop not append/pop).

Re:qsort or alternative sorting function

Posted: Wed May 24, 2006 5:10 am
by mystran
Remember that unless you deal with degenerate cases somehow in quicksort, your upper bound for stack use is O(n). So an optimistic qsort with a recursion limit of 40, would in worst case hit the limit with an array of 41 entries.

So, to be able to reason about maximum sortable arrays (with a given recursion limit), you also need to prove that you can avoid degenerate cases somehow.

Re:qsort or alternative sorting function

Posted: Wed May 24, 2006 6:22 am
by Solar
OK, let's talk turkey.

The PDCLib qsort() is iterative, not recursive, and uses a static stack capable of holding 20 PUSH()es (40 elements, and each PUSH() puts the lower / upper bounds of the subarray on the stack).

It uses two different algorithms - quicksort for subarrays larger than 7 elements, insertion sort for smaller subarrays (faster).

Only the quicksort part does PUSH() to the stack, and only for the larger subarray (the smaller one is sorted in the next loop).

The smallest subarray that could get PUSH()ed is thus 8 (2**3) elements large. The smallest possible array that could get an 8-element subarray sliced off and PUSH()ed is thus 16 (2**4) elements. That would give us stability up to 2**(3+20) elements... not enough.

Wait a minute... not true. The worst case is that a 9-element array gets 8 elements sliced off... that would mean that the PDCLib qsort() as of v0.4 gets unstable if the starting array is larger than 28 elements...

OK, that's not only unacceptable, that's really bad. I'll come up with a stack-realloc() in v0.5.

Thanks for pointing this out!

Edit: Filed as SourceForge bug #1494254. This'll get an honorable mention, Paul!

Re:qsort or alternative sorting function

Posted: Wed May 24, 2006 6:57 am
by Solar
The funny thing is, the code is originally by Raymond Gardner. You can google for it. It's been copied dozens of times all over the net. Either we're seeing ghosts, or this is a quite well-distributed bug...

Re:qsort or alternative sorting function

Posted: Wed May 24, 2006 7:03 am
by paulbarker
Personally, I don't care if its a bug or not. It's like using strcpy to an array on the stack and saying "This function can only return a string of 40 chars so I'll allocate a 50-char buffer". I wouldnt even trust a formal proof that the stack can never overflow. I like to assert absolutely everything.