Controlling CPU speed
Posted: Wed Aug 05, 2015 2:11 pm
by hgoel
I've been looking for information on changing the CPU speed on recent x86_64 processors in protected mode. My OS runs really well in qemu but extremely slowly on bare metal (the same CPU). I can only think of two possible causes: either the CPU is underclocked at power on (this is a laptop, so I think that's possible as a power saving measure), or the caches are disabled at startup. It doesn't seem like the caches are disabled, though, since otherwise I should see a speedup once all the code is doing is a single multiply in a loop (which should obviously fit in cache), but I don't.
Re: Controlling CPU speed
Posted: Wed Aug 05, 2015 3:02 pm
by Brendan
Hi,
hgoel0974 wrote:I've been looking for information on changing the CPU speed on recent x86_64 processors in protected mode. I've got this issue where the OS runs really well in qemu but runs extremely slowly on bare metal (the same CPU), I can only think of two possible issues, either the CPU is under clocked at power on (this is a laptop so I think that's possible as a power saving measure) and maybe the caches are disabled at startup.
Unfortunately ACPI killed sanity. By providing a (hideously ugly and over-complicated) standard interface (ACPI's AML) for software to use, ACPI made it possible for CPU manufacturers to avoid doing it in a simple/clean/architectural way, which mostly resulted in CPU manufacturers all doing "random CPU specific idiocy" with no usable documentation.
hgoel0974 wrote:It doesn't seem like it could be that the caches are disabled since otherwise I should see a speed up once all the code is doing is a single multiply in a loop (which should obviously fit in cache) but I don't.
When caches are disabled, nothing fits in cache and no loop sees a speedup.
Fortunately it's relatively easy to check if caches are enabled - starting with the 2 flags in CR0, then the MSRs for MTRRs.
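As a rough sketch of that check (not anything from the poster's kernel): the two CR0 flags are CD (bit 30) and NW (bit 29), and the MTRR master enable is bit 11 of IA32_MTRR_DEF_TYPE (MSR 0x2FF). The helper names below are made up for illustration, and the two read functions only work in ring 0 with a GCC-style compiler.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_NW              (1ULL << 29)  /* Not Write-through */
#define CR0_CD              (1ULL << 30)  /* Cache Disable */

#define IA32_MTRR_DEF_TYPE  0x2FFu        /* MSR holding the MTRR default type */
#define MTRR_FIXED_ENABLE   (1ULL << 10)  /* FE: fixed-range MTRRs enabled */
#define MTRR_ENABLE         (1ULL << 11)  /* E: MTRRs enabled at all */
#define MTRR_TYPE_UC        0u
#define MTRR_TYPE_WB        6u

/* Normal caching requires both CD and NW to be clear. */
static inline bool cr0_caches_enabled(uint64_t cr0)
{
    return (cr0 & (CR0_CD | CR0_NW)) == 0;
}

/* With E = 0 the CPU treats all physical memory as uncacheable. */
static inline bool mtrrs_enabled(uint64_t def_type)
{
    return (def_type & MTRR_ENABLE) != 0;
}

/* Memory type used for addresses not covered by any MTRR range. */
static inline unsigned mtrr_default_type(uint64_t def_type)
{
    return (unsigned)(def_type & 0xFF);
}

#if defined(__GNUC__) && defined(__x86_64__)
/* Hypothetical ring-0 helpers; both instructions fault in user mode. */
static inline uint64_t read_cr0(void)
{
    uint64_t value;
    __asm__ volatile("mov %%cr0, %0" : "=r"(value));
    return value;
}

static inline uint64_t read_msr(uint32_t msr)
{
    uint32_t lo, hi;
    __asm__ volatile("rdmsr" : "=a"(lo), "=d"(hi) : "c"(msr));
    return ((uint64_t)hi << 32) | lo;
}
#endif
```

In a kernel you would call something like cr0_caches_enabled(read_cr0()) and mtrrs_enabled(read_msr(IA32_MTRR_DEF_TYPE)) early in boot and print the results.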
Cheers,
Brendan
Re: Controlling CPU speed
Posted: Wed Aug 05, 2015 8:59 pm
by hgoel
I've read about ACPI; it seems like overkill to have an entire bytecode language to parse just for power management. I decided to actually read the IA32 manual and found information on both controlling CPU speed in an Intel-specific way and enabling caches. I should have read the manual before asking. Thanks for your answer though!
Re: Controlling CPU speed
Posted: Wed Aug 05, 2015 11:01 pm
by jnc100
Out of interest, how do you know it is running slow? I presume you dump something to the screen/serial port etc. at regular intervals and it is the rate of this which is slower on the actual machine. Could it be that it is your display routine (i.e. interfacing with hardware) which is slower on the physical machine rather than the CPU itself?
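One way to separate the two cases is to time the compute part and the display part on their own. Below is a minimal sketch with invented names (time_ticks, output_share_percent); the tick source is passed in as a callback, and the fake_now counter is only a stand-in so the example runs anywhere. In a kernel it would presumably wrap the TSC or the HPET main counter.

```c
#include <stdint.h>

typedef uint64_t (*tick_fn)(void);      /* returns a monotonically increasing count */
typedef void (*work_fn)(void *arg);     /* the routine being measured */

/* Run work(arg) once and return how many ticks it took. */
static inline uint64_t time_ticks(tick_fn now, work_fn work, void *arg)
{
    uint64_t start = now();
    work(arg);
    return now() - start;
}

/* Share of the total time spent on output, in percent (0 if nothing was measured). */
static inline unsigned output_share_percent(uint64_t compute_ticks, uint64_t output_ticks)
{
    uint64_t total = compute_ticks + output_ticks;
    return total ? (unsigned)(output_ticks * 100 / total) : 0;
}

/* Stand-in tick source: advances by 7 on every read. Not a real clock. */
static uint64_t fake_counter;
static uint64_t fake_now(void)
{
    return fake_counter += 7;
}

/* Example workload: the "single multiply in a loop" from the first post. */
static void multiply_loop(void *arg)
{
    volatile uint32_t *x = arg;
    for (int i = 0; i < 1000; i++)
        *x *= 3;
}
```

Timing multiply_loop and the framebuffer copy separately on real hardware, then comparing the two numbers, shows whether the time goes into the CPU work or into the writes to video memory.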
Regards,
John.
Re: Controlling CPU speed
Posted: Sat Aug 08, 2015 5:44 pm
by hgoel
Yes, that's exactly it: the display is updated extremely slowly on hardware compared to the VM.
Re: Controlling CPU speed
Posted: Mon Aug 10, 2015 11:53 am
by jnc100
In that case, is the slow down in the cpu or in the writing to video memory? In other words, do you do lots of calculations on the cpu then output a character very occasionally, or do you try and output them really quickly (with few calculations in between)?
If the latter (i.e. your task is I/O bound), there are a number of possible explanations for why it is slower on real hardware. How are you updating the display? If you are in text mode, are you writing directly to video memory or using BIOS interrupts? Are you attempting to read from video memory before writing back (e.g. to implement scrolling)? Have you configured the paging settings appropriately for the video memory? MTRRs should be properly set up for you by the BIOS, but it doesn't hurt to check.
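For the paging question, here is a sketch of how the cache type of a 4 KiB page can be looked up from its page table entry. The PWT, PCD and PAT bits form an index into the IA32_PAT MSR (0x277), and each PAT entry holds a memory type. The function names are made up for illustration, and the effective type on real hardware also depends on the MTRRs.

```c
#include <stdint.h>

#define PTE_PWT          (1ULL << 3)   /* page-level write-through */
#define PTE_PCD          (1ULL << 4)   /* page-level cache disable */
#define PTE_PAT          (1ULL << 7)   /* PAT bit for 4 KiB pages (large pages use bit 12) */

#define IA32_PAT         0x277u
#define PAT_RESET_VALUE  0x0007040600070406ULL  /* power-on contents of IA32_PAT */

/* Memory type encodings used in PAT entries. */
enum mem_type {
    MT_UC = 0, MT_WC = 1, MT_WT = 4, MT_WP = 5, MT_WB = 6, MT_UC_MINUS = 7
};

/* Index 0..7 built from the PAT, PCD and PWT bits of the entry. */
static inline unsigned pte_pat_index(uint64_t pte)
{
    return ((pte & PTE_PAT) ? 4u : 0u) |
           ((pte & PTE_PCD) ? 2u : 0u) |
           ((pte & PTE_PWT) ? 1u : 0u);
}

/* Memory type stored in PAT entry `index`. */
static inline unsigned pat_entry(uint64_t pat, unsigned index)
{
    return (unsigned)((pat >> (index * 8)) & 0x7);
}

/* Memory type the page tables request for this entry. */
static inline unsigned pte_mem_type(uint64_t pte, uint64_t pat)
{
    return pat_entry(pat, pte_pat_index(pte));
}

/* Return a copy of `pat` with entry `index` replaced by `type`. */
static inline uint64_t pat_set_entry(uint64_t pat, unsigned index, unsigned type)
{
    unsigned shift = index * 8;
    return (pat & ~(0xFFULL << shift)) | ((uint64_t)type << shift);
}
```

With the reset value none of the eight entries is write-combining, so a framebuffer mapping can only get WC through the page tables after one PAT entry has been reprogrammed (for example entry 1, selected by PWT alone) and the new value written back to the MSR.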
Regards,
John.
Re: Controlling CPU speed
Posted: Mon Aug 10, 2015 10:48 pm
by hgoel
jnc100 wrote:In that case, is the slow down in the cpu or in the writing to video memory? In other words, do you do lots of calculations on the cpu then output a character very occasionally, or do you try and output them really quickly (with few calculations in between)?
For now I just try to output stuff really quickly, so it's possible you're right and I'm I/O bound (I'm having GRUB set up a 1920x1080x32 display mode for me).
jnc100 wrote:
If the latter (i.e. your task is io bound), there are a number of explanations why it is slower on real hardware. How are you updating the display? If text mode are you writing directly to video memory or using BIOS interrupts? Are you attempting to read from video memory before writing back (e.g. to implement scrolling)? Have you configured the paging settings appropriately for the video memory? MTRRs should be properly set up for you by the BIOS, but it doesn't hurt to check.
Regards,
John.
At the moment I'm updating the display by writing everything to local non-framebuffer memory and then, once I'm done, swapping things over with a memcpy. I haven't touched the MTRRs yet as I've been trying to switch over to using the APIC, IO APIC and HPET instead of the PIC and PIT, which I just finished doing, so I'll set up the MTRRs now. I think I have the paging stuff set up correctly for video memory; however, the OP was made when I hadn't set up paging.
I'm not sure how much work it is to write drivers for the Intel HD Graphics BLT engine. Looking through the docs, it doesn't seem as complicated as I expected, but it's still probably a little far off since I'm still just setting up the basic stuff like multitasking and minimal ACPI.
Re: Controlling CPU speed
Posted: Tue Aug 11, 2015 2:05 pm
by jnc100
You may wish to optimize your screen updating routines first before trying to implement hardware blitting. Firstly, you probably want to support the concept of only updating the video memory with those parts of the back buffer which have actually changed, by keeping track of the updated areas as 'dirty rectangles'. Secondly, when copying large areas of memory, your memcpy routine can probably be improved with the use of MMX/SSE stores/loads if support for those is available.
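A minimal sketch of the dirty rectangle idea, assuming a 32 bpp back buffer with the same pitch as the framebuffer. It keeps a single bounding rectangle rather than a list, which is the simplest variant, and it expects the caller to clip coordinates to the screen. All names are invented for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bounding box of everything drawn since the last flush: [x0,x1) x [y0,y1). */
typedef struct {
    uint32_t x0, y0, x1, y1;
    bool valid;                 /* false means nothing is dirty */
} dirty_rect;

/* Grow the bounding box to include a w x h area at (x, y). */
static inline void dirty_add(dirty_rect *d, uint32_t x, uint32_t y,
                             uint32_t w, uint32_t h)
{
    if (w == 0 || h == 0)
        return;
    uint32_t x1 = x + w, y1 = y + h;
    if (!d->valid) {
        d->x0 = x; d->y0 = y; d->x1 = x1; d->y1 = y1;
        d->valid = true;
        return;
    }
    if (x < d->x0) d->x0 = x;
    if (y < d->y0) d->y0 = y;
    if (x1 > d->x1) d->x1 = x1;
    if (y1 > d->y1) d->y1 = y1;
}

/* Copy only the dirty span of each dirty row from back to front.
 * pitch is the row length in bytes; returns the number of bytes copied. */
static inline size_t dirty_flush(dirty_rect *d, uint8_t *front,
                                 const uint8_t *back, size_t pitch)
{
    if (!d->valid)
        return 0;
    size_t row_bytes = (size_t)(d->x1 - d->x0) * 4;   /* 4 bytes per pixel */
    size_t copied = 0;
    for (uint32_t y = d->y0; y < d->y1; y++) {
        size_t off = (size_t)y * pitch + (size_t)d->x0 * 4;
        memcpy(front + off, back + off, row_bytes);
        copied += row_bytes;
    }
    d->valid = false;           /* everything is in sync again */
    return copied;
}
```

Drawing routines call dirty_add for each area they touch, and the swap step calls dirty_flush instead of copying the whole 1920x1080 buffer. A single bounding box degrades to a full copy when opposite corners of the screen change, so a small list of rectangles may be worth it later.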
Regards,
John.
Re: Controlling CPU speed
Posted: Tue Aug 11, 2015 10:54 pm
by hgoel
jnc100 wrote:You may wish to optimize your screen updating routines first before trying to implement hardware blitting. Firstly, you probably want to support the concept of only updating the video memory with those parts of the back buffer which have actually changed by keeping track of the updated areas as 'dirty rectangles'. Secondly, when copying large areas of memory, your memcpy routine can probably be improved with the use of mmx/sse stores/loads if support for those are available.
Regards,
John.
I spent some time looking through the Intel HD Graphics docs and it seems very clear that it's going to be a while before implementing hardware acceleration is a good idea. For now I'm going to follow your suggestion and optimize my drawing routines. I do have the VFPU enabled, so I should really be putting it to use where I can. Currently my memcpy uses rep movsb from glibc, but I can create a version optimized for the graphics code.
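A graphics-oriented copy could look something like the sketch below, which uses SSE2 non-temporal stores and falls back to plain memcpy when SSE2 is not available at compile time. The name fb_copy is hypothetical, and the sketch assumes SSE is already enabled in the kernel and that SSE state is saved across task switches.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Copy len bytes from src to dst using 16-byte streaming stores where possible. */
static inline void fb_copy(void *dst, const void *src, size_t len)
{
#if defined(__SSE2__)
    uint8_t *d = dst;
    const uint8_t *s = src;

    /* Copy the few bytes needed to bring dst to a 16-byte boundary. */
    size_t head = (16 - ((uintptr_t)d & 15)) & 15;
    if (head > len)
        head = len;
    memcpy(d, s, head);
    d += head; s += head; len -= head;

    /* Main loop: unaligned load, aligned non-temporal store. */
    while (len >= 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)s);
        _mm_stream_si128((__m128i *)d, v);
        d += 16; s += 16; len -= 16;
    }
    _mm_sfence();               /* make the streaming stores globally visible */

    /* Remaining tail of fewer than 16 bytes. */
    memcpy(d, s, len);
#else
    memcpy(dst, src, len);
#endif
}
```

Streaming stores bypass the cache, which suits a framebuffer that is written once per frame and never read back. Whether this actually beats rep movsb depends on the CPU and on how the framebuffer is mapped, so it is worth measuring both.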