Tickless systems
I'm reconfiguring the Linux kernel for compiling (the last version was linux-2.6.20.6), and found the option "Tickless system".
The idea is to lower the IRQ rate by programming the timer only when we need it (I suppose, for example, that if all the processes are sleeping waiting for I/O, we don't need the timer interrupting the idle process every 1/n seconds; we only need to wait for the mouse/kbd/hd/etc. IRQ).
Sounds very interesting. A question then came to my mind: are there any other OSs supporting this?
JJ
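To make the idea in the post above a bit more concrete, here is a minimal sketch in C of how a tickless kernel might decide when the timer is needed at all. The `struct task` layout, the `NO_DEADLINE` sentinel and the function name are assumptions made up for illustration; they are not taken from Linux or any other kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel: the task is blocked (e.g. on I/O) with no timeout. */
#define NO_DEADLINE UINT64_MAX

/* Hypothetical per-task record; wake_time is an absolute time in timer units. */
struct task {
    uint64_t wake_time;
};

/*
 * Return the earliest wake-up time among all tasks, or NO_DEADLINE if no
 * task needs the timer. In the NO_DEADLINE case a tickless kernel could
 * leave the timer unprogrammed and simply wait for a device IRQ.
 */
uint64_t next_timer_deadline(const struct task *tasks, size_t count)
{
    uint64_t earliest = NO_DEADLINE;

    for (size_t i = 0; i < count; i++) {
        if (tasks[i].wake_time < earliest)
            earliest = tasks[i].wake_time;
    }
    return earliest;
}
```

A scheduler built along these lines would call `next_timer_deadline()` before going idle and program a one-shot timer only when the result is not `NO_DEADLINE`.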
Korona wrote: This option only makes sense if you are using the APIC or the HPET timers. If you're using the PIT, the I/O port accesses will be way too slow and you will waste a lot of time adjusting the timer.
I haven't played much with the PIT, but I thought you had to reprogram it every time it ticks, even if you always need the same delay.
That's probably because I've been programming a timer with this behavior in computer architecture classes (at my university we use a special processor they invented, the P3, whose name is the initials of "Little Pedagogic Processor" in Portuguese).
JJ
Hi,
Combuster wrote: If you count the OUTs, tickless systems are better whenever you get more than 4 interrupts without scheduling the process (4xEOI vs 1xEOI + 1x3 bytes to PIT). There's an older discussion on this you can try finding with a search.
It's not quite that simple...
First, I put the PIT into "hibyte only" mode and leave channel 0 selected, so that a new PIT count can be set with a single I/O port write, and so you get a range from 214.5 us to 54.9 ms (in 214.5 us increments).
Secondly, if the task doesn't use its entire time slice then the count will be reprogrammed before any IRQ occurs - no EOI necessary.
However, if a task uses only a small fraction of its time slice you'll be doing I/O port writes more often. For example, if the PIT is set to 100 Hz and each task runs for 1 ms and then blocks, you get one I/O port access for 10 task switches, but if the PIT is being used in "one-shot" mode you'd get 10 times as many I/O port accesses.
Lastly, if the PIT is being used in "one-shot" mode then you need to keep track of real time somehow. You could read the PIT's remaining count when a task doesn't use its entire time slice (more I/O port accesses but very precise), or use the CMOS/RTC's periodic IRQ or update IRQ (lots more I/O port accesses), or read the time from the CMOS/RTC's "hour/minute/second" fields (a potentially huge number of I/O port accesses and a complete lack of precision), or use some other way (e.g. RDTSC, which is very hard to get right but is extremely precise).
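As a rough illustration of the "hibyte only" arithmetic above, the sketch below converts a requested delay into the single byte that would be written to the PIT, and back into the delay actually obtained. It assumes the nominal PIT input clock of 1193182 Hz; the function names are made up for this example, and the port write itself (one byte to channel 0's data port, 0x40) is left out so the code can be compiled and checked in user space.

```c
#include <stdint.h>

#define PIT_HZ 1193182u  /* nominal PIT input clock in Hz (assumed) */

/*
 * Convert a delay in microseconds to the high byte of the PIT count.
 * In "hibyte only" mode the low byte is always 0, so one step is 256
 * input clocks (about 214.5 us). The result is rounded to the nearest
 * step and clamped to 1..256 steps; 256 steps is returned as 0, which
 * the PIT treats as a count of 65536 (about 54.9 ms).
 */
uint8_t pit_hibyte_for_us(uint32_t delay_us)
{
    uint64_t steps = ((uint64_t)delay_us * PIT_HZ + 128000000u) / 256000000u;

    if (steps < 1)
        steps = 1;
    if (steps > 256)
        steps = 256;
    return (uint8_t)steps;  /* 256 wraps to 0 on purpose */
}

/* Delay in nanoseconds that a given high byte actually produces. */
uint32_t pit_hibyte_to_ns(uint8_t hibyte)
{
    uint32_t clocks = (hibyte ? hibyte : 256u) * 256u;

    return (uint32_t)(((uint64_t)clocks * 1000000000u) / PIT_HZ);
}
```

For example, a 1 ms time slice maps to a high byte of 5, which is really about 1.07 ms, so the scheduler should account for the delay it actually got rather than the one it asked for.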
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Hi,
Korona wrote: What is the advantage of doing that? When there is no running thread you don't have to save clock cycles.
It reduces CPU power consumption (less heat, less chance of thermal throttling, less electricity, longer laptop battery life, less fan noise, less "server room" air-conditioning, less global warming, etc).
Taking the CPU out of the HLT state (and out of any other power saving state) for no reason doesn't really help SMP performance either. For example, unnecessary bus/memory bandwidth used by one CPU affects the performance of other CPUs trying to use bus/memory bandwidth; and for hyper-threading, unnecessary work done by one logical CPU affects the performance of another logical CPU in the same chip/core.
Cheers,
Brendan