Added SSE + OptLib and updated FPU
Posted: Sat Mar 28, 2009 6:02 pm
by 01000101
http://wiki.osdev.org/SSE
http://wiki.osdev.org/FPU
http://wiki.osdev.org/User:01000101/optlib/
I'll be adding more soon, just wanted to give a heads up in case anyone wanted to expand upon these.
Re: Added SSE + OptLib and updated FPU
Posted: Tue Mar 31, 2009 10:12 pm
by 01000101
I just ran tests in Bochs, QEMU, and VirtualBox on my memclr(). Each test was 10 sets of 1000 clears of a 4K, 16-byte-aligned memory region, timed with RDTSC. In Bochs, the SSE2 version averaged 881,000 ticks per set, while the non-SSE version averaged 1,055,000 ticks over the same region and the same sets. In QEMU / VirtualBox I got 45,190,986 / 65,456,891 ticks for the non-SSE runs and 6,021,188 / 20,601,164 ticks for the SSE2 runs.
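To make the comparison above concrete, here is a minimal sketch of an SSE2 clear next to a plain loop, plus an RDTSC timing helper. This is not the OptLib code: the names memclr_plain, memclr_sse2, read_ticks and time_clears are made up for this example, and it assumes a hosted GCC or Clang build (x86intrin.h, emmintrin.h) rather than a kernel, with fallbacks so it still compiles on non-x86 targets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>   /* __rdtsc() on GCC/Clang */
#endif
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Hypothetical baseline: a plain byte-wise loop
 * (an optimizing compiler may still vectorize it). */
void memclr_plain(void *dst, size_t n)
{
    unsigned char *p = dst;
    for (size_t i = 0; i < n; i++)
        p[i] = 0;
}

/* Hypothetical SSE2 clear. Assumes dst is 16-byte aligned and n is a
 * multiple of 16, like the 4K aligned test region in the post. */
void memclr_sse2(void *dst, size_t n)
{
    assert(((uintptr_t)dst & 15u) == 0 && (n & 15u) == 0);
#if defined(__SSE2__)
    __m128i zero = _mm_setzero_si128();
    __m128i *p = (__m128i *)dst;
    for (size_t i = 0; i < n / 16; i++)
        _mm_store_si128(p + i, zero);   /* aligned 16-byte store */
#else
    memset(dst, 0, n);                  /* fallback for non-SSE2 builds */
#endif
}

/* Time-stamp counter on x86, clock() elsewhere (the units differ). */
uint64_t read_ticks(void)
{
#if defined(__x86_64__) || defined(__i386__)
    return __rdtsc();
#else
    return (uint64_t)clock();
#endif
}

/* Returns the ticks taken by one set of `reps` clears of buf. */
uint64_t time_clears(void (*clear)(void *, size_t), void *buf, size_t n, int reps)
{
    uint64_t start = read_ticks();
    for (int i = 0; i < reps; i++)
        clear(buf, n);
    return read_ticks() - start;
}
```

One set from the post would then be time_clears(memclr_sse2, buf, 4096, 1000), repeated 10 times and averaged. Tick counts from a sketch like this depend heavily on compiler flags and on how the emulator models the TSC, so treat them as relative numbers only.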
This will probably get shunned, as there are a ton of factors around performance monitoring, but 3 separate emulators all reporting increased speeds of ~30% is still a good increase.
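For anyone who wants to turn the tick counts into percentages, a small sketch of the arithmetic follows. The function names tick_reduction_percent and speedup_factor are hypothetical helpers for this example, not part of OptLib.

```c
#include <assert.h>
#include <stdint.h>

/* Percentage of ticks saved by the optimized run relative to the baseline. */
double tick_reduction_percent(uint64_t baseline, uint64_t optimized)
{
    assert(baseline != 0);
    return 100.0 * ((double)baseline - (double)optimized) / (double)baseline;
}

/* How many times faster the optimized run is than the baseline. */
double speedup_factor(uint64_t baseline, uint64_t optimized)
{
    assert(optimized != 0);
    return (double)baseline / (double)optimized;
}
```

Fed the averages posted above (non-SSE first, SSE2 second), this gives roughly 16% fewer ticks in Bochs (about 1.2x), 87% fewer in QEMU (about 7.5x), and 69% fewer in VirtualBox (about 3.2x).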
Those functions can be found in the wiki about OptLib.
I'll post more findings in more environments later.
Re: Added SSE + OptLib and updated FPU
Posted: Wed Apr 01, 2009 2:24 am
by AJ
Good stuff. I'm a little way off adding anything like that to my OS, but the library is mentally bookmarked.
Cheers,
Adam
Re: Added SSE + OptLib and updated FPU
Posted: Wed Apr 01, 2009 2:18 pm
by cyr1x
Great stuff
One point:
Instead of the
you could use a function pointer to those functions, like so:
Code:
    void * (*memcpy) (...);

    void init()
    {
        memcpy = memcpy_sse;
    }
So you set the pointer once at runtime. This saves you from the checks.
OK, maybe it sounds like a bit too "much" optimization, but ...
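A slightly fuller sketch of the same idea, with the CPU check done once in the init routine. Everything named here (memcpy_fn, memcpy_plain, memcpy_sse2, cpu_has_sse2, opt_memcpy, opt_init) is hypothetical and not taken from OptLib. It assumes GCC or Clang for cpuid.h and emmintrin.h, and in a kernel it also assumes SSE has already been enabled before the SSE2 path is called.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>       /* __get_cpuid() on GCC/Clang */
#endif
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

typedef void *(*memcpy_fn)(void *dest, const void *src, size_t n);

/* Hypothetical fallback: plain byte-wise copy. */
void *memcpy_plain(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dest;
}

#if defined(__SSE2__)
/* Hypothetical SSE2 copy: unaligned 16-byte chunks, then a byte tail. */
void *memcpy_sse2(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    for (; n >= 16; n -= 16, d += 16, s += 16)
        _mm_storeu_si128((__m128i *)d, _mm_loadu_si128((const __m128i *)s));
    while (n--)
        *d++ = *s++;
    return dest;
}
#endif

/* Returns 1 if CPUID leaf 1 reports SSE2 (EDX bit 26), else 0. */
int cpu_has_sse2(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return (int)((edx >> 26) & 1u);
#endif
    return 0;
}

/* Safe default so calls made before opt_init() still work. */
memcpy_fn opt_memcpy = memcpy_plain;

/* Pick the implementation once; later calls go straight through the pointer. */
void opt_init(void)
{
#if defined(__SSE2__)
    if (cpu_has_sse2())
        opt_memcpy = memcpy_sse2;
#endif
    assert(opt_memcpy != NULL);
}
```

After opt_init() has run, callers just write opt_memcpy(dst, src, n). The trade-off is an indirect call on every copy in exchange for dropping the per-call feature check.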
Re: Added SSE + OptLib and updated FPU
Posted: Wed Apr 01, 2009 4:01 pm
by 01000101
That's a great idea. It may be too much of a micro-optimization, but then again, this is a micro-optimized library.