Question about HLT
Posted: Fri Nov 04, 2016 8:24 am
by Ycep
Is this code for idle task:
Code:
x:
hlt
hlt
hlt
hlt
hlt
hlt
hlt
hlt
jmp x
better than:
If it is, would this be even faster:
Code:
call _LowerDownSchedulerFrequency
x:
hlt
hlt
hlt
hlt
hlt
hlt
hlt
hlt
jmp x
Re: Question about HLT
Posted: Fri Nov 04, 2016 8:42 am
by JAAman
no...
why would you think it would be?
the first one is much harder to write and maintain because it is more obscure and you can't immediately understand why it was written that way, thus it is much worse
never ever try to optimize anything unless you have already proven that it will have a significant impact on the whole code (code readability and maintainability is much more important in most cases than performance)
in other words: why are you trying to optimize something that will only run when there is nothing to do?
Re: Question about HLT
Posted: Fri Nov 04, 2016 9:01 am
by Love4Boobies
What does it even mean to do nothing faster?
Re: Question about HLT
Posted: Fri Nov 04, 2016 9:09 am
by Ycep
I thought it would generate less heat and use less power.
Re: Question about HLT
Posted: Fri Nov 04, 2016 9:33 am
by simeonz
You will move to the next adjacent hlt only after an interrupt, which means on an i/o event, a timer event, or an IPI scheduler event. In other words, before moving to the following adjacent instruction in the idle task, the processor will most likely chew a ton of other instructions and the prefetch/speculative execution optimizations will be voided. I would say, keeping it simple with one hlt instruction is the way to go.
Re: Question about HLT
Posted: Fri Nov 04, 2016 10:03 am
by Brendan
Hi,
Lukand wrote: I thought to generate less heat and use less power.
In theory, putting multiple "hlt" instructions might cause the CPU to enter the "HLT state" (after interrupting the previous "hlt") a fraction of a nanosecond sooner. However, if the previous "hlt" was on one cache line and the next "hlt" is on the next cache line, then you'll probably have the extra expense (and power consumption) involved in fetching an additional cache line, so it'd be much worse.
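To make the cache line point concrete, here is a rough C sketch of the arithmetic. It assumes "hlt" encodes as one byte and a short "jmp" as two bytes, and takes the cache line size as a parameter (64 bytes is common on x86, but that is an assumption). The function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encodings: "hlt" is one byte (0xF4), a short "jmp" is two
 * bytes (0xEB rel8). */
enum { HLT_BYTES = 1, JMP_SHORT_BYTES = 2 };

/* Size in bytes of an idle loop made of n_hlt "hlt" instructions
 * followed by one short "jmp" back to the top. */
static inline size_t idle_loop_bytes(size_t n_hlt)
{
    return n_hlt * HLT_BYTES + JMP_SHORT_BYTES;
}

/* Number of cache lines covered by len bytes of code starting at addr. */
static inline size_t cache_lines_touched(uintptr_t addr, size_t len,
                                         size_t line_size)
{
    assert(len > 0 && line_size > 0);
    uintptr_t first = addr / line_size;
    uintptr_t last = (addr + len - 1) / line_size;
    return (size_t)(last - first) + 1;
}
```

With these assumptions the eight-"hlt" loop is 10 bytes and the single-"hlt" loop is 3 bytes. Whether either one straddles a cache line depends entirely on where the assembler or linker places it, so the longer loop is simply more likely to need a second line.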
However....
In practice, entering and leaving the "HLT state" costs power (and time) too. If one IRQ occurs almost immediately after another, then the CPU does the first IRQ then switches to the "HLT state" then switches out of the "HLT state" almost immediately; and this will make performance (IRQ latency) much worse while also making power consumption worse. For this reason, multiple "hlt" instructions would probably make things worse (even when they're all in the same cache line). To avoid this problem; it might be better to do a loop of "nop" instructions until a certain amount of time (e.g. 1 us) has passed since the last IRQ, before using "hlt".
However...
Some CPUs (most newer CPUs) optimise "nop" so that it doesn't take any time (e.g. so it gets discarded after/during decoding and isn't actually executed in the normal sense). For these CPUs you'll probably want to use a "pause" instruction in the loop instead. Also, for CPUs with hyper-threading (or whatever AMD is going to call their version of it) you will want "pause" anyway, and might want to unroll the "pause loop" a little (so that the overhead of the loop itself doesn't affect the performance of the other logical CPU in the core as much).
Note that the optimum amount to unroll a loop (for all the loops mentioned) depends on multiple factors (e.g. if the CPU has a "loop detector" and what size it is), and would be different for different CPUs.
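The "spin for a while, then halt" idea could be sketched in C roughly as follows. This is only an illustration: the 1 us threshold is the example figure from above, and the `idle_ops` hooks (clock source, last-IRQ timestamp, `cpu_relax` for "pause", `cpu_halt` for "hlt") are hypothetical names that a kernel would have to supply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* What the idle loop should do next. */
enum idle_action { IDLE_SPIN, IDLE_HALT };

/* Hypothetical threshold: keep spinning for 1 us after the last IRQ. */
#define IDLE_SPIN_NS 1000u

/* Decide between spinning and halting. Both timestamps come from an
 * assumed monotonic nanosecond clock. Unsigned subtraction keeps the
 * comparison correct if the counter wraps around. */
static inline enum idle_action idle_decide(uint64_t now_ns,
                                           uint64_t last_irq_ns)
{
    return (now_ns - last_irq_ns < IDLE_SPIN_NS) ? IDLE_SPIN : IDLE_HALT;
}

/* Hooks the kernel would supply (all hypothetical). */
struct idle_ops {
    uint64_t (*now_ns)(void);       /* monotonic time */
    uint64_t (*last_irq_ns)(void);  /* when the last IRQ was handled */
    void (*cpu_relax)(void);        /* would execute "pause" */
    void (*cpu_halt)(void);         /* would execute "hlt" */
};

/* One iteration of the idle task body; a real idle task would call
 * this in an endless loop. */
static inline enum idle_action idle_step(const struct idle_ops *ops)
{
    assert(ops != NULL);
    enum idle_action a = idle_decide(ops->now_ns(), ops->last_irq_ns());
    if (a == IDLE_SPIN)
        ops->cpu_relax();
    else
        ops->cpu_halt();
    return a;
}
```

Keeping the decision in a separate pure function makes the threshold easy to tune per CPU, which matters because the best value (and the best amount of unrolling) is hardware dependent.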
However...
For a real OS it's very likely that an IRQ handler will cause a task switch. For example, you might have a task waiting for a time-out; or waiting to receive data from the network, or from a file (from a storage device controller), or from a keyboard or mouse; where an IRQ means that the task can stop waiting. This mostly ruins all of the above.
Also; for multi-CPU systems, an IRQ received by one CPU might cause a task to unblock on a different CPU. In this situation; if the other CPU is using HLT then you need to send an IPI ("inter-processor interrupt") to it to wake it up. To avoid the need to send a (relatively expensive) IPI you mostly want to use "monitor" and "mwait" (if the CPU supports these instructions) instead of "hlt"; so that the other CPU wakes up when you write to some kind of "list of ready to run tasks".
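A rough C sketch of that wake-on-write pattern is below. The privileged instructions are hidden behind hypothetical `arm` and `wait` hooks (standing in for "monitor" and "mwait"), and the `sim_*` functions are only stand-ins so the sketch can run in user space; `ready_flag` is an invented name for the word that other CPUs write to when they queue a task.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical per-CPU word that other CPUs write to after putting a
 * task on this CPU's list of ready-to-run tasks. */
struct ready_flag {
    atomic_uint pending;
};

/* Hooks standing in for the privileged instructions (hypothetical). */
struct mwait_ops {
    void (*arm)(const void *addr);  /* would execute "monitor" on addr */
    void (*wait)(void);             /* would execute "mwait" */
};

/* Called by another CPU after queueing a task: a plain store to the
 * monitored address is what wakes the waiter, so no IPI is sent. */
static inline void signal_work(struct ready_flag *f)
{
    atomic_store(&f->pending, 1);
}

/* Idle until work is flagged; returns how many times the CPU waited.
 * The flag is checked again after arming the monitor so that a write
 * landing between the first check and "monitor" is not missed. */
static inline unsigned idle_wait_for_work(struct ready_flag *f,
                                          const struct mwait_ops *ops)
{
    unsigned waits = 0;
    assert(f != NULL && ops != NULL);
    while (atomic_load(&f->pending) == 0) {
        ops->arm((const void *)&f->pending);
        if (atomic_load(&f->pending) != 0)
            break;
        ops->wait();
        waits++;
    }
    return waits;
}

/* User-space stand-ins: sim_wait pretends another CPU queued a task
 * while this CPU was asleep. */
static struct ready_flag sim_flag;
static inline void sim_arm(const void *addr) { (void)addr; }
static inline void sim_wait(void) { signal_work(&sim_flag); }
```

In a real kernel the hooks would be replaced by the actual instructions (after checking that the CPU supports them), and the flag would normally sit in its own cache line so unrelated writes do not wake the CPU.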
Finally; it's very likely that the CPU has multiple other options for controlling power consumption (e.g. clock throttling, various "deeper sleep states", etc) that complicate everything even more.
Cheers,
Brendan
Re: Question about HLT
Posted: Fri Nov 04, 2016 12:03 pm
by Ycep
Brendan wrote: Hi,
For a real OS it's very likely that an IRQ handler will cause a task switch. For example, you might have a task waiting for a time-out; or waiting to receive data from the network, or from a file (from a storage device controller), or from a keyboard or mouse; where an IRQ means that the task can stop waiting.
That is cooperative, not preemptive. What if a process just does a while(true);?
Re: Question about HLT
Posted: Fri Nov 04, 2016 12:27 pm
by SpyderTL
Lukand wrote:
Brendan wrote: Hi,
For a real OS it's very likely that an IRQ handler will cause a task switch. For example, you might have a task waiting for a time-out; or waiting to receive data from the network, or from a file (from a storage device controller), or from a keyboard or mouse; where an IRQ means that the task can stop waiting.
That is cooperative, not preemptive. What if a process just does a while(true);?
True. But I think you may be missing the point.
The chance that your OS is going to receive an interrupt and do absolutely no work is probably very near zero. So there's no reason to spend a lot of time trying to optimize for that particular scenario.
Re: Question about HLT
Posted: Fri Nov 04, 2016 3:36 pm
by Kevin
Lukand wrote: What if a process just does a while(true);?
Then your idle task doesn't even get to run because the system isn't idle.
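As a minimal illustration in C (the `task` structure and `pick_next` are hypothetical, and this first-match selection is not meant as a fair scheduling policy), the idle task is only the fallback when nothing else is runnable:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal task record; the fields are hypothetical. */
struct task {
    const char *name;
    int runnable;   /* non-zero when the task wants the CPU */
};

/* Pick the next task to run. The idle task is returned only when no
 * other task is runnable, so a task spinning in while(true) keeps the
 * idle task (and its "hlt") from running at all. */
static inline struct task *pick_next(struct task *tasks, size_t n,
                                     struct task *idle)
{
    assert(idle != NULL);
    for (size_t i = 0; i < n; i++)
        if (tasks[i].runnable)
            return &tasks[i];
    return idle;
}
```

A timer IRQ in a preemptive kernel would call something like `pick_next` on every tick; as long as the busy-looping task stays runnable, the idle task is never selected.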
Re: Question about HLT
Posted: Sat Nov 05, 2016 11:13 am
by Ycep
Oh whoops. I accidentally read "For a real OS it's very likely that an IRQ handler will not cause a task switch" instead of the original "For a real OS it's very likely that an IRQ handler will cause a task switch". Sorry. That is preemptive, not cooperative.
Thanks everybody for responding, especially Brendan.