Hi,
rdos wrote:Brendan wrote:Sadly, Linux does something like this too - take a high precision timer like the local APIC timer, and use it as a general purpose timing thing so that you can bury it under millions of networking timeouts (that have no need for high precision). It's stupid because there's always a minimum amount of time between delays, and when there's too many things using the same timer you have to group things together to avoid the "minimum time between delays" problem. For example, if "foo" should happen in 1000 ns and "bar" should happen in 1234 ns, then you can't setup a 234 ns delay and have to bunch them together, and "foo" ends up happening 234 ns too late. Things that don't need such high precision should use a completely different timer to avoid screwing up the precision for things that do need high precision.
I think the major difference between RDOS and Linux is how ISRs and timer-callbacks are coded. In RDOS, you should keep both ISRs and timer-callbacks short. A typical ISR and/or timer callback only consists of a signal to wake a server-thread, along with clearing some interrupt conditions. User-apps cannot use timers directly at all (they are kernel-only and run as ISRs). Since timer-callbacks are generally shorter than the overall interrupt latency, mixing precision timers is not a problem. You would not gain anything by using separate hardware for high precision timers as it is the interrupt latency that determines response times, not the resolution of the timer. In order to get ns resolution for timed events, it is necessary to run on a dedicated core without preemption and interrupt load.
You're missing the point.
Imagine you're using the PIT in "one shot" mode (the local APIC timer is similar, with different numbers). You set the count, the PIT decreases it at a fixed rate, and an IRQ occurs when the count reaches zero. The largest value you can use for the count is 65536, which gives a maximum delay of 54.9254 ms. What is the minimum value you can set the count to? If you set the count too low you're going to have problems with race conditions, and the timer and the PIC chip can fail to keep up (leading to missed IRQs, etc). The minimum count you can actually use in practice might be 119, giving a minimum delay of 99.7 us.
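That arithmetic can be sketched as follows (a sketch under the assumptions above; the practical minimum of 119 is the number quoted in this post, not a hardware-defined limit, and the function names are mine):

```c
#include <stdint.h>

/* PIT input clock: 1193182 Hz, so one tick is ~838.095 ns. */
#define PIT_HZ        1193182ULL
#define PIT_MIN_COUNT 119      /* practical minimum, as assumed above     */
#define PIT_MAX_COUNT 65536    /* a count of 0 is treated as 65536 by the PIT */

/* Convert a requested delay in nanoseconds to a one-shot PIT count,
   clamped to the usable range described above. */
static uint32_t pit_count_for_ns(uint64_t ns)
{
    uint64_t count = (ns * PIT_HZ) / 1000000000ULL;
    if (count < PIT_MIN_COUNT) count = PIT_MIN_COUNT;
    if (count > PIT_MAX_COUNT) count = PIT_MAX_COUNT;
    return (uint32_t)count;
}

/* The delay (in nanoseconds) a given count actually produces. */
static uint64_t pit_ns_for_count(uint32_t count)
{
    return ((uint64_t)count * 1000000000ULL) / PIT_HZ;
}
```

With integer truncation, 1.2345 ms comes out as a count of 1472 rather than the rounded 1473 below; either is within one tick (~838 ns) of the requested delay.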
Basically, for the PIT you can get 838 ns precision, but only for delays between 99.7 us and 54925.4 us. Now let's assume the PIT is being used by the scheduler, and the scheduler wants a task to run for exactly 1.2345 ms. That's easy to do - you set the count to 1473 and the IRQ occurs after 1.2345 ms (actually "1.2345144424780244416538973528759 ms", but it's close enough). That gives the scheduler about 838 ns precision, which is very nice.
Now imagine someone decides to use the timer for something else at the same time. The scheduler wants a task to run for exactly 1.2345 ms, but the networking stack has asked for a delay that will expire in 1.200 ms time. After 1.2 ms has passed the IRQ for the networking stack occurs, and you want another IRQ in 34.5 us (because 1.2 ms + 34.5 us = 1.2345 ms). Unfortunately 34.5 us is below the minimum you can ask for in practice, so you have to use the minimum itself, resulting in an IRQ that occurs after 99.7 us. The scheduler's 1.2345 ms delay ends up being a 1.2997 ms delay because someone else was using the timer too. Instead of getting 838 ns precision the scheduler can only really get 99.7 us precision; and using the timer for multiple things has made it worse than 100 times less precise.
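A tiny simulation makes the lateness concrete (names and structure are mine, not from any real kernel; only the ~99.7 us practical minimum from above is modelled, working directly in nanoseconds):

```c
#include <stdint.h>

/* Practical minimum delay of the shared one-shot timer (~99.7 us). */
#define MIN_DELAY_NS 99700ULL

static uint64_t clamp_delay(uint64_t ns)
{
    return ns < MIN_DELAY_NS ? MIN_DELAY_NS : ns;
}

/* Fire two deadlines through one shared timer: the networking stack's
   deadline first, then the scheduler's. Returns how late the
   scheduler's deadline ends up (both deadlines in ns from "now",
   with net_deadline <= sched_deadline). */
static uint64_t scheduler_lateness_ns(uint64_t net_deadline,
                                      uint64_t sched_deadline)
{
    uint64_t now = clamp_delay(net_deadline);  /* first IRQ: networking */
    now += clamp_delay(sched_deadline - now);  /* second IRQ: scheduler */
    return now - sched_deadline;
}
```

For the numbers above, `scheduler_lateness_ns(1200000, 1234500)` gives 65200 ns: the 34.5 us remainder gets stretched to the 99.7 us minimum. With a dedicated timer the scheduler's deadline would not be clamped at all.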
rdos wrote:Brendan wrote:What is "elapsed time"? I'd use TSC for this if I could (and fall back to HPET if TSC can't be used, and fall back to ACPI's counter if both HPET and TSC can't be used).
Elapsed time is how many tics have elapsed since the system started (simply explained, but not entirely true). A tic is one period on the PIT, which is convenient since 2^32 tics is one hour.
2^32 tics is closer to 0.99988669323428908795205349655706 hours. For every hour you can expect to be wrong by half a second; which adds up to about 68.6 seconds per week. You'd want to fix this problem (and also fix other causes of drift) by adding a tiny bias. For example, every "10 ms" you might add 10.0000001 ms to your counter. In this case "2^32 tics is one hour" isn't more convenient than anything else (e.g. you could just as easily use "64-bit nanoseconds since the start of the year 2000" if you wanted; or maybe 32-bit seconds and 32-bit fractions of a second) because you're just adding a pre-computed amount.
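The drift is easy to check from the PIT's 1193182 Hz input clock (a small sketch; the function names are mine):

```c
#include <stdint.h>

#define PIT_HZ 1193182ULL

/* Real elapsed time represented by 2^32 PIT tics: ~3599.59 s,
   not the 3600 s you'd get by calling it "one hour". */
static double tics_2_32_in_seconds(void)
{
    return 4294967296.0 / (double)PIT_HZ;
}

/* Error accumulated per week if each 2^32 tics is counted as an hour:
   (3600 - 3599.59) s/hour * 24 * 7 is roughly 68.7 s. */
static double error_seconds_per_week(void)
{
    return (3600.0 - tics_2_32_in_seconds()) * 24.0 * 7.0;
}
```

This is why a pre-computed bias per timer IRQ is needed anyway, at which point the "2^32 tics = 1 hour" unit buys nothing over plain nanoseconds or fixed-point seconds.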
rdos wrote:I thus use 8 bytes to represent elapsed time.
Which means you can't just "add [ticks],value" and have to update both dwords atomically with something like "lock cmpxchg". Not that it really matters much, given that the biggest problem is that whenever the timer IRQ occurs the cache line is modified and all other CPUs have to fetch the new contents of the cache line, causing excessive cache traffic on many-CPU systems (avoiding this is the biggest advantage of "loosely synchronised TSC").
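In C11 terms the update looks roughly like this (a sketch, not RDOS code; on 32-bit x86 the compiler implements the 64-bit atomic add with a `lock cmpxchg8b` loop, since a plain "add [ticks],1" would only cover the low dword):

```c
#include <stdatomic.h>
#include <stdint.h>

/* A 64-bit tick counter that any CPU may read while the timer IRQ
   updates it. The atomic RMW keeps both dwords consistent, but every
   update still dirties the cache line for all readers. */
static _Atomic uint64_t ticks;

/* Called from the timer IRQ. */
static void timer_irq_tick(void)
{
    atomic_fetch_add_explicit(&ticks, 1, memory_order_relaxed);
}

/* Safe to call from any CPU. */
static uint64_t elapsed_ticks(void)
{
    return atomic_load_explicit(&ticks, memory_order_relaxed);
}
```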
rdos wrote:Brendan wrote:Your normal "generic timer" stuff uses NMI? Sounds seriously painful to me.
No. I have reserved NMI for the crash debugger. When the scheduler hits a fatal error, it will send NMI to all other cores to freeze them, regardless of whether they have interrupts enabled or not. Thus, NMI is not available.
It doesn't take a genius to set an "entering crash debugger" flag, and test that flag at the start of the NMI handler to determine if the cause of the NMI was the watchdog or a crash.
rdos wrote:So, if the user loads the PIC device-driver, it will look for PITs for timers / elapsed time, as those are commonly found on older hardware without APIC. If the user loads the APIC device-driver, it would make different choices, selecting between APIC timer, HPET or PIT. The only choice users have is to select which interrupt controller is available.
So if the user says there's a PIC, you ignore the flag in the ACPI tables that says whether there are PIC chips, and also assume there's no HPET even if there is one? If the user says there are APICs, you ignore the details in the ACPI MADT that say whether there are APICs, and assume there's no PIT even if there is one?
If someone cuts off their arm and is bleeding to death, instead of making the obvious assumption (that they want some first aid) based on easily observable facts, do you ask them if they want a hamburger?
rdos wrote:As can be seen, it reports that the PIT exists, but doesn't have any interrupts. It also reports that HPET has both IRQ 0 and 8.
That's probably 100% correct - there is a PIT (but it might only be used for speaker control), and HPET (plus SMM) is used to emulate PIT channel 0 and IRQ0 for legacy OSs that don't enable "ACPI mode". An OS that looks at the AML is assumed to use "ACPI mode", where the firmware doesn't bother using SMM to emulate the PIT (the OS just uses the HPET directly).
Cheers,
Brendan