Controlling CPU speed
- hgoel
- Member
- Posts: 89
- Joined: Sun Feb 09, 2014 7:11 pm
- Libera.chat IRC: hgoel
- Location: Within a meter of a computer
Controlling CPU speed
I've been looking for information on changing the CPU speed on recent x86_64 processors in protected mode. My OS runs really well in QEMU but extremely slowly on bare metal (the same CPU). I can only think of two possible causes: either the CPU is underclocked at power-on (this is a laptop, so that seems plausible as a power-saving measure), or the caches are disabled at startup. Disabled caches seem unlikely, though: if that were the problem, I would expect a speedup once the code is reduced to a single multiply in a loop (which should obviously fit in cache), and I don't see one.
"If the truth is a cruel mistress, than a lie must be a nice girl"
Working on Cardinal
Find me at [url=irc://chat.freenode.net:6697/Cardinal-OS]#Cardinal-OS[/url] on freenode!
Re: Controlling CPU speed
Hi,
hgoel0974 wrote: I've been looking for information on changing the CPU speed on recent x86_64 processors in protected mode. I've got this issue where the OS runs really well in qemu but runs extremely slowly on bare metal (the same CPU), I can only think of two possible issues, either the CPU is under clocked at power on (this is a laptop so I think that's possible as a power saving measure) and maybe the caches are disabled at startup.
Unfortunately ACPI killed sanity. By providing a (hideously ugly and over-complicated) standard interface (ACPI's AML) for software to use, ACPI made it possible for CPU manufacturers to avoid doing it in a simple/clean/architectural way, which mostly resulted in CPU manufacturers all doing "random CPU specific idiocy" with no usable documentation.
hgoel0974 wrote: It doesn't seem like it could be that the caches are disabled since otherwise I should see a speed up once all the code is doing is a single multiply in a loop (which should obviously fit in cache) but I don't.
When caches are disabled, nothing fits in cache and no loop sees a speedup.
Fortunately it's relatively easy to check if caches are enabled - starting with the 2 flags in CR0, then the MSRs for MTRRs.
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
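The check Brendan describes can be sketched roughly as below. This is a minimal illustration, not code from the thread: the helper names (`cr0_caches_enabled`, `mtrrs_enabled`, `mtrr_default_type`, `read_cr0`, `read_msr`) are made up for the example, and the bit positions are the architectural ones from the Intel SDM (CR0.CD is bit 30, CR0.NW is bit 29, and IA32_MTRR_DEF_TYPE is MSR 0x2FF with its enable flag in bit 11). The decoding is kept separate from the privileged reads so it can be tried on plain values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit positions (Intel SDM). */
#define CR0_NW (1u << 29) /* Not Write-through */
#define CR0_CD (1u << 30) /* Cache Disable */

#define IA32_MTRR_DEF_TYPE 0x2FFu      /* MSR index */
#define MTRR_DEF_TYPE_FE   (1u << 10)  /* fixed-range MTRRs enabled */
#define MTRR_DEF_TYPE_E    (1u << 11)  /* MTRRs enabled at all */

/* Normal caching requires both CD and NW to be clear. */
static inline bool cr0_caches_enabled(uint64_t cr0)
{
    return (cr0 & (CR0_CD | CR0_NW)) == 0;
}

/* If the E flag is clear, all of physical memory is treated as uncacheable. */
static inline bool mtrrs_enabled(uint64_t def_type)
{
    return (def_type & MTRR_DEF_TYPE_E) != 0;
}

/* Default memory type for addresses no MTRR covers (0 = UC, 6 = WB). */
static inline uint8_t mtrr_default_type(uint64_t def_type)
{
    return (uint8_t)(def_type & 0xFF);
}

#if defined(__x86_64__) && defined(__GNUC__)
/* Hypothetical kernel-side readers; both instructions fault outside ring 0. */
static inline uint64_t read_cr0(void)
{
    uint64_t value;
    __asm__ volatile("mov %%cr0, %0" : "=r"(value));
    return value;
}

static inline uint64_t read_msr(uint32_t index)
{
    uint32_t lo, hi;
    __asm__ volatile("rdmsr" : "=a"(lo), "=d"(hi) : "c"(index));
    return ((uint64_t)hi << 32) | lo;
}
#endif
```

In a kernel, the checks would be something like `cr0_caches_enabled(read_cr0())` followed by `mtrrs_enabled(read_msr(IA32_MTRR_DEF_TYPE))`. The variable-range MTRRs would still need to be walked to see which type covers a particular address, which this sketch leaves out.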
- hgoel
Re: Controlling CPU speed
Brendan wrote: Unfortunately ACPI killed sanity. By providing a (hideously ugly and over-complicated) standard interface (ACPI's AML) for software to use, ACPI made it possible for CPU manufacturers to avoid doing it in a simple/clean/architectural way, which mostly resulted in CPU manufacturers all doing "random CPU specific idiocy" with no usable documentation. When caches are disabled, nothing fits in cache and no loop sees a speedup. Fortunately it's relatively easy to check if caches are enabled - starting with the 2 flags in CR0, then the MSRs for MTRRs.
I've read about ACPI; it seems like overkill to have an entire bytecode language to parse just for power management. I decided to actually read the IA32 manual and found information on both controlling CPU speed in an Intel-specific way and enabling caches. I should have read the manual before asking. Thanks for your answer though!
"If the truth is a cruel mistress, than a lie must be a nice girl"
Re: Controlling CPU speed
Out of interest, how do you know it is running slowly? I presume you dump something to the screen, serial port, etc. at regular intervals, and it is the rate of this output that is slower on the actual machine. Could it be that your display routine (i.e. interfacing with hardware) is what is slower on the physical machine, rather than the CPU itself?
Regards,
John.
- hgoel
Re: Controlling CPU speed
Yes, that's exactly it: the display is updated extremely slowly on hardware compared to the VM.
"If the truth is a cruel mistress, than a lie must be a nice girl"
Re: Controlling CPU speed
In that case, is the slowdown in the CPU or in the writing to video memory? In other words, do you do lots of calculations on the CPU and then output a character very occasionally, or do you try to output characters really quickly (with few calculations in between)?
If the latter (i.e. your task is I/O bound), there are a number of possible explanations for why it is slower on real hardware. How are you updating the display? If you are in text mode, are you writing directly to video memory or using BIOS interrupts? Are you reading from video memory before writing back (e.g. to implement scrolling)? Have you configured the paging settings appropriately for the video memory? The MTRRs should be properly set up for you by the BIOS, but it doesn't hurt to check.
Regards,
John.
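As one illustration of what "paging settings for the video memory" can involve: a common approach, and only an assumption about this case since the thread does not confirm the cause, is to map the framebuffer as write-combining through the Page Attribute Table. The sketch below builds a new IA32_PAT value and the matching page-table flags. The helper names are invented for the example; the MSR index (0x277), the memory-type encodings, the reset value, and the PWT/PCD/PAT bit positions for 4 KiB pages are taken from the Intel SDM.

```c
#include <assert.h>
#include <stdint.h>

#define IA32_PAT_MSR    0x277u
#define PAT_RESET_VALUE 0x0007040600070406ULL /* WB, WT, UC-, UC, repeated */

/* Memory type encodings used in PAT entries. */
enum mem_type {
    MT_UC = 0x00,       /* uncacheable */
    MT_WC = 0x01,       /* write-combining */
    MT_WT = 0x04,       /* write-through */
    MT_WP = 0x05,       /* write-protected */
    MT_WB = 0x06,       /* write-back */
    MT_UC_MINUS = 0x07  /* uncacheable, overridable by MTRRs */
};

/* Flag bits in a 4 KiB page-table entry (large pages keep PAT in bit 12). */
#define PTE_PWT (1ull << 3)
#define PTE_PCD (1ull << 4)
#define PTE_PAT (1ull << 7)

/* Return 'pat' with entry 'index' (0..7) replaced by 'type'. */
static inline uint64_t pat_set_entry(uint64_t pat, unsigned index, enum mem_type type)
{
    assert(index < 8);
    unsigned shift = index * 8;
    pat &= ~(0xFFull << shift);
    return pat | ((uint64_t)type << shift);
}

/* PTE flag bits that select PAT entry 'index': index = PAT*4 + PCD*2 + PWT. */
static inline uint64_t pte_flags_for_pat_index(unsigned index)
{
    assert(index < 8);
    uint64_t flags = 0;
    if (index & 1) flags |= PTE_PWT;
    if (index & 2) flags |= PTE_PCD;
    if (index & 4) flags |= PTE_PAT;
    return flags;
}
```

A kernel would write the value from `pat_set_entry(PAT_RESET_VALUE, 4, MT_WC)` to the MSR (with the cache and TLB flushing sequence the SDM prescribes, on every CPU), then OR `pte_flags_for_pat_index(4)` into the entries that map the framebuffer. Using entry 4 is just a choice for the example; it leaves entries 0 to 3 at their reset meanings.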
- hgoel
- Member
Re: Controlling CPU speed
jnc100 wrote: In that case, is the slow down in the cpu or in the writing to video memory? In other words, do you do lots of calculations on the cpu then output a character very occasionally, or do you try and output them really quickly (with few calculations in between)?
For now I just try to output stuff really quickly, so it's possible you're right and I'm I/O bound (I'm having GRUB set up a 1920x1080x32 display mode for me).
jnc100 wrote: If the latter (i.e. your task is io bound), there are a number of explanations why it is slower on real hardware. How are you updating the display? If text mode are you writing directly to video memory or using BIOS interrupts? Are you attempting to read from video memory before writing back (e.g. to implement scrolling)? Have you configured the paging settings appropriately for the video memory? MTRRs should be properly set up for you by the BIOS, but it doesn't hurt to check.
At the moment I'm updating the display by writing everything to a buffer in ordinary (non-framebuffer) memory and then, once I'm done, copying it over with a memcpy. I haven't touched the MTRRs yet because I've been switching over to the APIC, I/O APIC and HPET instead of the PIC and PIT. I just finished that, so I'll set up the MTRRs now. I think I have paging set up correctly for video memory; however, the original post was written before I had set up paging.
I'm not sure how much work it would be to write a driver for the Intel HD Graphics BLT engine. Looking through the docs, it doesn't seem as complicated as I expected, but it's probably still some way off, since I'm still setting up basic things like multitasking and minimal ACPI.
"If the truth is a cruel mistress, than a lie must be a nice girl"
Re: Controlling CPU speed
You may wish to optimize your screen updating routines before trying to implement hardware blitting. Firstly, you probably want to update video memory with only those parts of the back buffer that have actually changed, by keeping track of the updated areas as 'dirty rectangles'. Secondly, when copying large areas of memory, your memcpy routine can probably be improved by using MMX/SSE loads and stores, if support for those is available.
Regards,
John.
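The dirty-rectangle idea above could be sketched like this. It is a deliberately minimal version that tracks a single bounding box rather than a list of rectangles, and it assumes a 32-bit-per-pixel mode where the pitch equals the width in pixels (a real framebuffer may report a larger pitch). The names `dirty_rect`, `dirty_add` and `dirty_flush` are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Bounding box of everything drawn since the last flush; x1/y1 are exclusive. */
struct dirty_rect {
    uint32_t x0, y0, x1, y1;
    bool valid;
};

/* Grow the bounding box to include the w-by-h region at (x, y). */
static inline void dirty_add(struct dirty_rect *d, uint32_t x, uint32_t y,
                             uint32_t w, uint32_t h)
{
    if (w == 0 || h == 0)
        return;
    if (!d->valid) {
        d->x0 = x; d->y0 = y; d->x1 = x + w; d->y1 = y + h;
        d->valid = true;
        return;
    }
    if (x < d->x0) d->x0 = x;
    if (y < d->y0) d->y0 = y;
    if (x + w > d->x1) d->x1 = x + w;
    if (y + h > d->y1) d->y1 = y + h;
}

/* Copy only the dirty region from the back buffer to the front buffer.
 * Both buffers are 'width' pixels wide. Returns the number of bytes copied. */
static inline size_t dirty_flush(struct dirty_rect *d, uint32_t *front,
                                 const uint32_t *back,
                                 uint32_t width, uint32_t height)
{
    if (!d->valid)
        return 0;
    assert(d->x1 <= width && d->y1 <= height);

    size_t row_bytes = (size_t)(d->x1 - d->x0) * sizeof(uint32_t);
    size_t copied = 0;
    for (uint32_t y = d->y0; y < d->y1; y++) {
        size_t offset = (size_t)y * width + d->x0;
        memcpy(front + offset, back + offset, row_bytes);
        copied += row_bytes;
    }
    d->valid = false; /* nothing is dirty until the next dirty_add */
    return copied;
}
```

Each drawing routine would call `dirty_add` for the area it touched, and the frame swap would call `dirty_flush` instead of copying the whole 1920x1080 buffer. A single bounding box is simple but can over-copy when two small updates are far apart; keeping a short list of rectangles is the usual refinement.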
- hgoel
Re: Controlling CPU speed
jnc100 wrote: You may wish to optimize your screen updating routines first before trying to implement hardware blitting. Firstly, you probably want to support the concept of only updating the video memory with those parts of the back buffer which have actually changed by keeping track of the updated areas as 'dirty rectangles'. Secondly, when copying large areas of memory, your memcpy routine can probably be improved with the use of mmx/sse stores/loads if support for those are available.
I spent some time looking through the Intel HD Graphics docs, and it seems very clear that it will be a while before implementing hardware acceleration is a good idea. For now I'm going to follow your suggestion and optimize my drawing routines. I do have the VFPU enabled, so I should really be putting it to use where I can. Currently my memcpy uses rep movsb from glibc, but I can create a version optimized for the graphics code.
"If the truth is a cruel mistress, than a lie must be a nice girl"