
Added SSE + OptLib and updated FPU

Posted: Sat Mar 28, 2009 6:02 pm
by 01000101
http://wiki.osdev.org/SSE
http://wiki.osdev.org/FPU
http://wiki.osdev.org/User:01000101/optlib/

I'll be adding more soon, just wanted to give a heads up in case anyone wanted to expand upon these.

Re: Added SSE + OptLib and updated FPU

Posted: Tue Mar 31, 2009 10:12 pm
by 01000101
I just ran tests in Bochs, QEMU, and VirtualBox on my memclr(). Each run was 10 sets of 1000 clears of a 4K, 16-byte-aligned memory region, timed with RDTSC. In Bochs, the SSE2 version averaged 881,000 ticks per set, while the non-SSE version averaged 1,055,000 ticks per set over the same region and the same sets. For QEMU / VirtualBox I got 45,190,986 / 65,456,891 ticks for the non-SSE runs and 6,021,188 / 20,601,164 ticks for the SSE2 runs.

This will probably get shunned, as there are a ton of factors around performance monitoring, but 3 separate emulators all reporting speed increases of ~30% is still a good result.

Those functions can be found in the wiki about OptLib.

I'll post more findings in more environments later.

Re: Added SSE + OptLib and updated FPU

Posted: Wed Apr 01, 2009 2:24 am
by AJ
Good stuff. I'm a little way off adding anything like that to my OS, but the library is mentally bookmarked :)

Cheers,
Adam

Re: Added SSE + OptLib and updated FPU

Posted: Wed Apr 01, 2009 2:18 pm
by cyr1x
Great stuff :)
One point:
Instead of the

Code:

if(sse) ... else ...
you could use a function pointer to those functions, like so:

Code:

void * (*memcpy) (...);

void init()
{
    memcpy = memcpy_sse;
}
So you set the pointer once at runtime. This saves you from the checks.
OK, it maybe sounds like a bit too "much" optimization, but ... :D

Re: Added SSE + OptLib and updated FPU

Posted: Wed Apr 01, 2009 4:01 pm
by 01000101
That's a great idea. It may be too much of a micro-optimization, but then again, this is a micro-optimized library. :D