Using APIC timer as a "system board" timer when HPET fails
Re: Using APIC timer as a "system board" timer when HPET fai
Rdos, have you tried Eclipse on Windows? It has good SVN support.
If a trainstation is where trains stop, what is a workstation ?
- Combuster
Re: Using APIC timer as a "system board" timer when HPET fai
So what? It's not like any commercial IDE I have ever seen is exactly good at it... There is no good version control integration in the primary tools I use.
Re: Using APIC timer as a "system board" timer when HPET fai
The merge operation is (almost) complete. The problem with the branch code is that it uses the timestamp counter (a per-core resource) to keep real time. The optimal solution is to use the HPET clock to get elapsed time, even if it doesn't support MSI interrupts. If the HPET doesn't exist at all, the code should revert to the PIT timer for elapsed time (speaker channel). When the HPET doesn't support MSI, and thus is potentially really buggy with its interrupts, the local APIC timer would be used for timers. The local APIC timer could also be used for timers under special circumstances when the HPET is set up for main timers.
Edit: Now all three modes of operation work (shared APIC timer + PIT for elapsed time, shared APIC timer + HPET for elapsed time and APIC timer for preemption + HPET timers + HPET for elapsed time). A fourth mode (PIT channel 0 for timers & preemption + PIT channel 2 for elapsed time) also works when no APIC is available on older/low-end systems with only PIC. Maybe I should search for the HPET on these systems as well and use it for elapsed time instead of PIT channel 2?
- gravaera
Re: Using APIC timer as a "system board" timer when HPET fai
These rdos threads...they always deliver jajajaja
Anyway, dropping in to say that git es #1, es best VCS, always commit, never lose data...from start staging always up, I get so much work done with git, es the best!
17:56 < sortie> Paging is called paging because you need to draw it on pages in your notebook to succeed at it.
Re: Using APIC timer as a "system board" timer when HPET fai
Hi,
> rdos wrote: Edit: Now all three modes of operation work (shared APIC timer + PIT for elapsed time, shared APIC timer + HPET for elapsed time and APIC timer for preemption + HPET timers + HPET for elapsed time). A fourth mode (PIT channel 0 for timers & preemption + PIT channel 2 for elapsed time) also works when no APIC is available on older/low-end systems with only PIC. Maybe I should search for the HPET on these systems as well and use it for elapsed time instead of PIT channel 2?
You shouldn't really be limited to specific permutations.
During boot or kernel initialisation you should "discover" timers and counters, and create a list containing the capabilities and characteristics for each of them (e.g. if you can read the count, how precise it is, how much overhead is involved, if it can generate a fixed frequency IRQ, if it can be used for a "one shot" IRQ, if it is affected by S2 and S3 sleep states, etc). For each of the timers you'd also have function pointers for "back-end code" to do various things (set it up, read the count, configure it for fixed frequency IRQs, set the next "one shot" IRQ, etc).
Later during boot or kernel initialisation, you should use the list to determine which timers and counters are most suitable for which roles. If the kernel decides that the third timer in the list is good for "something", then it'd use the function pointers for the third timer in the list for "something". The kernel wouldn't need to know or care what the timer actually is (the function pointers act as an effective abstraction layer).
For example, you might search the list to find the best counter to use for the master "real time" counter (e.g. something that lets you read the count, isn't affected by sleep states and doesn't need to be able to generate an IRQ at all - maybe HPET main counter or the ACPI timer, or maybe PIT or RTC if the corresponding back-end does some extra work); or you might search the list for per-CPU scheduler timers (something that supports "one shot" IRQs, that may be affected by sleep states, may not allow the count to be read, may not support "fixed frequency", etc - possibly local APIC timers, possibly one or more HPET timers, possibly PIT).
If a manufacturer invents a new type of timer, then you just add it to your list and let the kernel decide what to use it for. If you decide you want to add a new feature to the kernel (watchdog timer?) then you just search the list for a timer to suit the new role. If you find out that something doesn't work right on a specific chipset (e.g. maybe the ACPI counter is broken), then you might add a work-around in the code that creates the list of timers/counters (e.g. if chipset is XYZ then don't add the ACPI counter to the list). In all these cases the OS would automatically adjust.
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
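A minimal sketch of the capability list described in the post above, assuming a fixed-size table filled in by discovery code. The flag names, `struct timer_source`, `timer_register` and `timer_pick` are all hypothetical; they are not taken from RDOS or from any other kernel, and the "highest frequency wins" rule is only one possible tie-breaker.

```c
#include <stddef.h>
#include <stdint.h>

/* Capability flags for a timer or counter (illustrative names). */
enum {
    CAP_READ_COUNT   = 1u << 0, /* current count can be read */
    CAP_PERIODIC_IRQ = 1u << 1, /* can generate a fixed-frequency IRQ */
    CAP_ONESHOT_IRQ  = 1u << 2, /* can generate a "one shot" IRQ */
    CAP_SLEEP_SAFE   = 1u << 3  /* not affected by S2/S3 sleep states */
};

/* One entry in the discovered list: characteristics plus back-end hooks. */
struct timer_source {
    const char *name;
    unsigned    caps;
    uint64_t    freq_hz;                      /* rough measure of precision */
    uint64_t  (*read_count)(void);            /* NULL if unsupported */
    void      (*set_oneshot)(uint64_t ticks); /* NULL if unsupported */
};

#define MAX_TIMERS 16
static struct timer_source timer_list[MAX_TIMERS];
static size_t timer_count;

/* Called by discovery code for each device found; returns 0 on success. */
int timer_register(const struct timer_source *t)
{
    if (timer_count == MAX_TIMERS)
        return -1;
    timer_list[timer_count++] = *t;
    return 0;
}

/* Return the highest-frequency entry that has every required capability,
 * or NULL if nothing in the list qualifies. */
const struct timer_source *timer_pick(unsigned required)
{
    const struct timer_source *best = NULL;
    for (size_t i = 0; i < timer_count; i++) {
        const struct timer_source *t = &timer_list[i];
        if ((t->caps & required) != required)
            continue;
        if (best == NULL || t->freq_hz > best->freq_hz)
            best = t;
    }
    return best;
}
```

The caller only ever sees a `struct timer_source *`, so a chipset work-around amounts to not calling `timer_register` for the broken device.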
Re: Using APIC timer as a "system board" timer when HPET fai
> Brendan wrote: You shouldn't really be limited to specific permutations.
> During boot or kernel initialisation you should "discover" timers and counters, and create a list containing the capabilities and characteristics for each of them (e.g. if you can read the count, how precise it is, how much overhead is involved, if it can generate a fixed frequency IRQ, if it can be used for a "one shot" IRQ, if it is affected by S2 and S3 sleep states, etc). For each of the timers you'd also have function pointers for "back-end code" to do various things (set it up, read the count, configure it for fixed frequency IRQs, set the next "one shot" IRQ, etc).
That sounds way too complicated. There currently are no other timers in main-stream PCs, so why make it more difficult than it is? With only HPET, PIT and APIC timer to choose between, there is no need to create lists or anything, and the usage rules can be coded into the APIC device-driver when that is available, and in the PIC device-driver when that is used.
Additionally, I've looked at ACPI tables for several machines now, and they are not reliable with regard to HPET or PIT. Most BIOS developers probably just copy them, and do not investigate how they are configured before copying the configurations.
Re: Using APIC timer as a "system board" timer when HPET fai
Hi,
> rdos wrote: That sounds way too complicated. There currently are no other timers in main-stream PCs, so why make it more difficult than it is? With only HPET, PIT and APIC timer to choose between...
Erm:
- TSC
- Local APIC timer (possibly including "TSC deadline mode")
- Performance monitoring counters and/or IRQs
- HPET main counter
- HPET comparators
- ACPI's "power management timer" (32-bit or 24-bit) counter
- PIT channel 0
- PIT channel 2
- RTC periodic and/or update IRQ
- Watchdog timer/s (e.g. the "WDAT" and "WDRT" ACPI tables)
Then there's roles:
- Some sort of counter to measure real time (need accuracy, precision would be nice, per-CPU would be nice, don't need an IRQ)
- Some sort of timer to wake sleeping tasks (need accuracy, precision would be nice, per-CPU would be nice, need an IRQ)
- Some sort of counter to measure how much time each task has used (precision would be nice, per-CPU would be really nice, low overhead would be nice, don't need an IRQ, don't really need accuracy)
- Some sort of timer that the scheduler can use to know when a task has used all of the time it was given (some precision would be nice, accuracy doesn't matter much, don't need to be able to read the current count, do need an IRQ, "one shot" IRQ would be nice)
- Some sort of timer to keep track of power management (don't really need precision or accuracy, don't need to be able to read the current count, do need an IRQ, "one shot" IRQ would be nice)
- (optional) Some sort of timer to use for "poor man's profiling" (don't really need precision or accuracy, don't need to be able to read the current count, do need an IRQ and something capable of generating an NMI would be nice, "one shot" IRQ would be nice for pseudo-random delays)
- (optional) Some sort of timer to use for a watchdog (don't really need precision or accuracy, don't need to be able to read the current count, do need an IRQ and something capable of generating an NMI would be nice, fixed frequency is fine)
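One hedged way to express the roles above is as a "need" mask plus a "would be nice" mask, and to score each candidate against them. The sketch below assumes that model; the feature flags, `struct candidate`, `struct role` and `role_select` are invented for illustration, and a real kernel would also weigh precision and overhead.

```c
#include <stddef.h>

/* Feature flags a device may offer (illustrative names). */
enum {
    F_READABLE = 1u << 0, /* current count can be read */
    F_IRQ      = 1u << 1, /* can raise an IRQ */
    F_ONESHOT  = 1u << 2, /* supports "one shot" IRQs */
    F_PER_CPU  = 1u << 3, /* one instance per CPU */
    F_NMI      = 1u << 4  /* can be delivered as an NMI */
};

struct candidate { const char *name; unsigned features; };

/* A role lists what it must have and what "would be nice". */
struct role { const char *name; unsigned need; unsigned nice; };

/* Count the set bits in v. */
static unsigned count_bits(unsigned v)
{
    unsigned n = 0;
    for (; v != 0; v &= v - 1)
        n++;
    return n;
}

/* Return the index of the best candidate for the role, or -1 if none
 * satisfies every hard requirement. Ties go to the earlier entry. */
int role_select(const struct role *r, const struct candidate *c, size_t n)
{
    int best = -1;
    unsigned best_score = 0;
    for (size_t i = 0; i < n; i++) {
        if ((c[i].features & r->need) != r->need)
            continue;
        unsigned score = 1 + count_bits(c[i].features & r->nice);
        if (score > best_score) {
            best_score = score;
            best = (int)i;
        }
    }
    return best;
}
```

With a role such as `{ "scheduler", F_IRQ, F_ONESHOT | F_PER_CPU }`, a per-CPU one-shot device outranks a shared one, while a role whose hard requirements nothing meets simply gets no timer.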
> rdos wrote: Additionally, I've looked at ACPI-tables for several machines now, and they are not reliable in regards to HPET or PIT. Most BIOS developers probably just copy them, and do not investigate how they are configured previous to copying the configurations.
In which way are the ACPI tables not accurate (and why don't other OSs like Windows and Linux have problems)?
Cheers,
Brendan
Re: Using APIC timer as a "system board" timer when HPET fai
> Brendan wrote: TSC
Highly unreliable on older CPUs (when it is present), and per-core, which makes it useless for keeping elapsed time. Has no IRQ.
> Brendan wrote: Local APIC timer (possibly including "TSC deadline mode")
Affected by some types of power-management, but pretty useful for preemption. Not good for high precision timing due to possible effects of power-management.
> Brendan wrote: Performance monitoring counters and/or IRQs
Better left free for performance measurement.
> Brendan wrote: HPET main counter
The best alternative for measuring elapsed time.
> Brendan wrote: HPET comparators
The best alternative for high-precision timers, when they work. When the HPET doesn't support MSI delivery it seems to malfunction often. The configuration information returned on some motherboards is incorrect regarding IRQ routings. The ACPI tables are not always correct either. Some report IRQs that don't work, while some report no IRQ although the IRQs still work.
> Brendan wrote: ACPI's "power management timer" (32-bit or 24-bit) counter
HPET?
AFAIK, there is no guarantee in ACPI that this is not the HPET, or some channel on HPET.
> Brendan wrote: PIT channel 0
> PIT channel 2
Both can be used for elapsed time or high-precision (us) timers.
> Brendan wrote: RTC periodic and/or update IRQ
Can be used to synchronize elapsed time with real time. Not useful for anything else.
> Brendan wrote: Watchdog timer/s (e.g. the "WDAT" and "WDRT" ACPI tables)
These are better left out of this.
> Brendan wrote: Some sort of counter to measure real time (need accuracy, precision would be nice, per-CPU would be nice, don't need an IRQ)
No, should absolutely not be per-CPU, but per system. That means one of the PIT channels or HPET. TSC doesn't work, as it is both affected by power-management and is per-CPU. I have tried to use TSC for elapsed time, and it doesn't work. There is no reliable way to synchronize time between cores, especially not when TSCs start ticking at different frequencies when power-management "kicks in".
> Brendan wrote: Some sort of timer to wake sleeping tasks (need accuracy, precision would be nice, per-CPU would be nice, need an IRQ)
This is what I refer to as "timers". This can be APIC timer, PIT channel 0 or HPET comparator. The APIC timer is per-CPU, so if it is used, timers need to be per CPU. When using PIT channel 0 or HPET comparators, timers would be per-system. It might be possible to use combinations if both APIC timer and PIT channel 0 / HPET comparator are available.
> Brendan wrote: Some sort of counter to measure how much time each task has used (precision would be nice, per-CPU would be really nice, low overhead would be nice, don't need an IRQ, don't really need accuracy)
I use elapsed time for this. When the task is started, the elapsed counter is saved, and then when a new task is scheduled, elapsed time is read again, and the saved value is subtracted from it. There is no need for a separate hardware resource for this.
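The accounting scheme in the last reply (save the elapsed counter when a task starts, read it again at the next switch, take the difference) can be sketched as follows. `struct task` and both function names are hypothetical, and `now` stands for whatever system-wide elapsed-time counter the kernel uses, in its native ticks.

```c
#include <stdint.h>

/* Per-task accounting state (illustrative layout). */
struct task {
    uint64_t started_at; /* elapsed-time reading when the task was dispatched */
    uint64_t used;       /* total ticks consumed so far */
};

/* Called when the scheduler dispatches the task. */
void task_switch_in(struct task *t, uint64_t now)
{
    t->started_at = now;
}

/* Called when the scheduler switches away from the task. */
void task_switch_out(struct task *t, uint64_t now)
{
    t->used += now - t->started_at;
}
```

Because both readings come from the same counter, no separate hardware resource is involved; the cost is one counter read per task switch.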
> Brendan wrote: Some sort of timer that the scheduler can use to know when a task has used all of the time it was given (some precision would be nice, accuracy doesn't matter much, don't need to be able to read the current count, do need an IRQ, "one shot" IRQ would be nice)
APIC timer, if available, is most suitable for this. If timers also use APIC timer, there is a need to combine the timeouts, but this works. If APIC timer is not available, HPET or PIT channel 0 can be used (most often combined with timer function). The most efficient allocation is to use APIC timer for preemption and HPET comparator for timers.
> Brendan wrote: Some sort of timer to keep track of power management (don't really need precision or accuracy, don't need to be able to read the current count, do need an IRQ, "one shot" IRQ would be nice)
I'd use a normal timer (as above) for this. It doesn't need its own hardware resource.
> Brendan wrote: (optional) Some sort of timer to use for "poor man's profiling" (don't really need precision or accuracy, don't need to be able to read the current count, do need an IRQ and something capable of generating an NMI would be nice, "one shot" IRQ would be nice for pseudo-random delays)
This is more or less also the normal timers I have.
> Brendan wrote: (optional) Some sort of timer to use for a watchdog (don't really need precision or accuracy, don't need to be able to read the current count, do need an IRQ and something capable of generating an NMI would be nice, fixed frequency is fine)
Same as above. This is a normal timer.
Given the complex selection rules above, I doubt it is possible to write something generic that selects the best resources. Additionally, for such an algorithm to work there is a need to know several variables:
1. Does the hardware resource work?
2. Does the hardware resource trigger the IRQs it is supposed to trigger?
3. Does power-management affect frequencies?
These can at best be tested, but I'm not sure how to test if power-management will affect frequencies. If a resource is per-system, and legacy, it probably won't be affected by power-management, but these are probabilities, not parameters that are easily input into an algorithm.
Re: Using APIC timer as a "system board" timer when HPET fai
> Brendan wrote: In which way are the ACPI tables not accurate (and why don't other OSs like Windows and Linux have problems)?
On one particular machine, my 2-core AMD Athlon, the _CRS for _MEM is broken and makes ACPICA malfunction (it tries to output "ACPI_STACK_UNDERFLOW", but in the process of doing this makes several random writes via uninitialized pointers). This same machine also reports that HPET supports IRQ 0 and IRQ 8 (which doesn't work). The comparator itself reports that any interrupt routing is supported (obviously not correct). RDOS only works when APIC timer is used both for preemption and timers. Neither HPET nor PIT has functional IRQs. HPET can be used for elapsed time though, because the counter is running. It is the IRQ mappings that are wrong. The HPET doesn't support MSI delivery. Additionally, this machine frequently hangs during reboots, and is acting strange. It has Windows XP installed.
Re: Using APIC timer as a "system board" timer when HPET fai
Hi,
>> Brendan wrote: TSC
> rdos wrote: Highly unreliable on older CPUs (when it is present), and per-core, which makes it useless for keeping elapsed time. Has no IRQ.
Insanely awesome (extremely low overhead and extremely high precision) on recent CPUs though. You'd want to detect TSC capabilities and use TSC if it's suitable.
>> Brendan wrote: Local APIC timer (possibly including "TSC deadline mode")
> rdos wrote: Affected by some types of power-management, but pretty useful for preemption. Not good for high precision timing due to possible effects of power-management.
Very nice if it's not affected by power management (but still very useful even when it is affected by power management). You'd want to detect local APIC timer capabilities and use it if it's suitable.
>> Brendan wrote: Performance monitoring counters and/or IRQs
> rdos wrote: Better being free for performance measuring
It's not like the performance monitoring stuff is actually used for performance monitoring most of the time anyway. If there's no other choice, I'd rather use performance monitoring counters for timing than be unable to boot. You'd want to detect performance monitoring capabilities and use it if it's more suitable than something else.
>> Brendan wrote: HPET main counter
> rdos wrote: The best alternative for measuring elapsed time.
Second best (TSC on recent CPUs is much better).
>> Brendan wrote: HPET comparators
> rdos wrote: The best alternative for high-precision timers. When they work. When the HPET doesn't support MSI-delivery it seems like it often malfunctions. The configuration information returned on some motherboards is incorrect regarding IRQ routings. The ACPI tables are not always correct either. Some report IRQs when they don't work, while some report no IRQ, but they still work.
Probably second best (local APIC is better for high-precision timers). In cases where the local APIC timer gets messed up due to sleep states, you'd still want to use local APIC timers, but when the CPU enters a sleep state migrate the work to HPET until the CPU comes back out of the sleep state. You'd want to detect HPET capabilities and use it if it's more suitable than something else (whether it's "use it on its own" or "use it as a backup when CPUs are in sleep states").
>> Brendan wrote: ACPI's "power management timer" (32-bit or 24-bit) counter
> rdos wrote: HPET?
> AFAIK, there is no guarantee in ACPI that this is not the HPET, or some channel on HPET.
ACPI's "power management timer" is *not* HPET. It's a counter that is increased at a rate of about 3.5795 MHz. HPET typically runs at 10 MHz or more, which makes it separate. Of course it's possible (likely even) that some chipsets have a central 14.31818 MHz clock that is used to drive HPET directly, used via a "divide by 4" to drive ACPI's counter, and used via a "divide by 12" to drive the PIT (but that does not mean "HPET = ACPI's counter = PIT" - they're still all separate devices with separate control logic and capabilities).
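The divider arithmetic in the reply above can be checked with a few lines of C. This is only a worked calculation, assuming the 14.31818 MHz base clock is taken as 14318180 Hz; `divided_clock_hz` is a hypothetical helper, not part of any driver.

```c
#include <stdint.h>

/* Common base clock, 14.31818 MHz expressed in Hz. */
#define BASE_CLOCK_HZ 14318180ULL

/* Frequency after an integer divider, rounded to the nearest Hz. */
uint64_t divided_clock_hz(uint64_t base_hz, unsigned divisor)
{
    return (base_hz + divisor / 2) / divisor;
}
```

Dividing by 4 gives 3579545 Hz (the roughly 3.5795 MHz ACPI counter rate), and dividing by 12 gives 1193182 Hz (the PIT input clock).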
>> Brendan wrote: PIT channel 0
>> PIT channel 2
> rdos wrote: Both can be used for elapsed time or high-precision (us) timers.
Yes, but they're slow and ugly (e.g. "legacy IO port" accesses to read the current count); and for channel 2 the thing can roll over several times without you knowing.
>> Brendan wrote: RTC periodic and/or update IRQ
> rdos wrote: Can be used to synchronize elapsed time with real time. Not useful for anything else.
For old systems (where you've only got PIT and RTC and nothing else), you'd want to use PIT for the scheduler's timer (in "one shot" mode) and RTC for everything else.
>> Brendan wrote: Watchdog timer/s (e.g. the "WDAT" and "WDRT" ACPI tables)
> rdos wrote: These are better left out of this.
Why? Is your OS a general purpose desktop thing that doesn't have to care if it locks up completely due to a hardware fault (rather than some sort of embedded system that might be used for banking)?
>> Brendan wrote: Some sort of counter to measure real time (need accuracy, precision would be nice, per-CPU would be nice, don't need an IRQ)
> rdos wrote: No, should absolutely not be per-CPU, but per system. That means one of the PIT channels or HPET. TSC doesn't work, as it is both affected by power-management and is per-CPU. I have tried to use TSC for elapsed time, and it doesn't work. There is no reliable way to synchronize time between cores, especially not when TSCs start ticking at different frequencies when power-management "kicks-in".
Don't be silly - on recent systems (where TSC is guaranteed to run at a fixed frequency - e.g. the "TSC invariant" CPUID flag) TSC would be perfect for this (but synchronised to the RTC occasionally). For situations where TSC ticks at different frequencies on different CPUs, just synchronise more often to ensure that the TSC is always within an acceptable amount of error.
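For reference, the "TSC invariant" flag mentioned above is reported in CPUID leaf 0x80000007, EDX bit 8. The sketch below only decodes register values that the caller has already obtained; `tsc_is_invariant` and its parameters are hypothetical names, and executing CPUID itself (for example through a compiler intrinsic) is left out so the logic stays portable.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID_LEAF_ADV_POWER 0x80000007u /* advanced power management leaf */
#define EDX_INVARIANT_TSC    (1u << 8)   /* invariant TSC bit in that leaf's EDX */

/* max_ext_leaf: EAX returned by CPUID leaf 0x80000000.
 * edx:          EDX returned by CPUID leaf 0x80000007 (ignored if that
 *               leaf is not supported by the CPU). */
bool tsc_is_invariant(uint32_t max_ext_leaf, uint32_t edx)
{
    if (max_ext_leaf < CPUID_LEAF_ADV_POWER)
        return false; /* leaf missing: assume the conservative answer */
    return (edx & EDX_INVARIANT_TSC) != 0;
}
```

A result of `false` would push the kernel toward HPET or another system-wide counter for elapsed time, which is the fallback both posters describe.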
>> Brendan wrote: Some sort of timer to wake sleeping tasks (need accuracy, precision would be nice, per-CPU would be nice, need an IRQ)
> rdos wrote: This is what I refer to as "timers". This can be APIC timer, PIT channel 0 or HPET comparator. The APIC timer is per-CPU, so if it is used, timers needs to be per CPU. When using PIT channel 0 or HPET comparators, timers would be per-system. It might be possible to use combinations if both APIC timer and PIT channel 0 / HPET comparator is available.
Sadly, Linux does something like this too - take a high precision timer like the local APIC timer, and use it as a general purpose timing thing so that you can bury it under millions of networking timeouts (that have no need for high precision). It's stupid because there's always a minimum amount of time between delays, and when there's too many things using the same timer you have to group things together to avoid the "minimum time between delays" problem. For example, if "foo" should happen in 1000 ns and "bar" should happen in 1234 ns, then you can't set up a 234 ns delay and have to bunch them together, and "foo" ends up happening 234 ns too late. Things that don't need such high precision should use a completely different timer to avoid screwing up the precision for things that do need high precision.
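The "bunching" effect in the foo/bar example can be made concrete with a small sketch. It assumes one particular policy, namely that deadlines closer together than the hardware's minimum delay are all serviced at the latest deadline of their group; `coalesce_deadlines` is a hypothetical function and real kernels may round differently (for instance by firing early instead).

```c
#include <stddef.h>
#include <stdint.h>

/* deadline[]: requested expiry times in ns, sorted ascending.
 * fire[]:     receives the time at which each deadline is actually serviced.
 * min_gap:    smallest delay the shared hardware timer can be programmed for. */
void coalesce_deadlines(const uint64_t *deadline, uint64_t *fire,
                        size_t n, uint64_t min_gap)
{
    size_t start = 0;
    while (start < n) {
        /* Extend the group while the next deadline is too close to program. */
        size_t end = start;
        while (end + 1 < n && deadline[end + 1] - deadline[end] < min_gap)
            end++;
        /* Everything in the group is serviced at the group's last deadline. */
        for (size_t i = start; i <= end; i++)
            fire[i] = deadline[end];
        start = end + 1;
    }
}
```

With deadlines of 1000 ns and 1234 ns and a minimum delay larger than 234 ns, both fire at 1234 ns, so the first is 234 ns late, as in the example.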
>> Brendan wrote: Some sort of counter to measure how much time each task has used (precision would be nice, per-CPU would be really nice, low overhead would be nice, don't need an IRQ, don't really need accuracy)
> rdos wrote: I use elapsed time for this. When the task is started, the elapsed counter is saved, and then when a new task is scheduled, elapsed time is read again, and then subtracted from the saved value. There is no need for a separate hardware resource for this.
What is "elapsed time"? I'd use TSC for this if I could (and fall back to HPET if TSC can't be used, and fall back to ACPI's counter if both HPET and TSC can't be used).
>> Brendan wrote: Some sort of timer that the scheduler can use to know when a task has used all of the time it was given (some precision would be nice, accuracy doesn't matter much, don't need to be able to read the current count, do need an IRQ, "one shot" IRQ would be nice)
> rdos wrote: APIC timer, if available, is most suitable for this. If timers also use APIC timer, there is a need to combine the timeouts, but this works. If APIC timer is not available, HPET or PIT channel 0 can be used (most often combined with timer function). The most efficient allocation is to use APIC timer for preemption and HPET comparator for timers.
The most efficient way would be using performance monitoring counters for the scheduler, local APIC timer for high precision "sleep()", and HPET or PIT or TSC for low precision timing (e.g. network packet timeouts).
>> Brendan wrote: Some sort of timer to keep track of power management (don't really need precision or accuracy, don't need to be able to read the current count, do need an IRQ, "one shot" IRQ would be nice)
> rdos wrote: I'd use a normal timer (as of above) for this. It doesn't need its own hardware resource.
Same "jack of all trades" problem (screwing up high precision timing by using it for low precision timing).
>> Brendan wrote: (optional) Some sort of timer to use for "poor man's profiling" (don't really need precision or accuracy, don't need to be able to read the current count, do need an IRQ and something capable of generating an NMI would be nice, "one shot" IRQ would be nice for pseudo-random delays)
> rdos wrote: This is more or less also the normal timers I have.
Your normal "generic timer" stuff uses NMI? Sounds seriously painful to me.
>> Brendan wrote: (optional) Some sort of timer to use for a watchdog (don't really need precision or accuracy, don't need to be able to read the current count, do need an IRQ and something capable of generating an NMI would be nice, fixed frequency is fine)
> rdos wrote: Same as above. This is a normal timer.
Yes - same as above (seriously flawed).
> rdos wrote: Given the complex selection rules as of above, I doubt it is possible to write something generic that selects the best resources.
Given the complex selection rules above, I doubt it's possible to avoid writing something that selects the best resources.
The only other alternative that I can think of is telling unsuspecting end users "I'm not smart enough to figure it out, and even though you probably know less than me, I'm making you solve my design failure via compile time idiocy".
> rdos wrote: Additionally, for such an algorithm to work there is a need to know several variables:
> 1. Does the hardware resource work?
> 2. Does the hardware resource trigger the IRQs it is supposed to trigger?
> 3. Does power-management affect frequencies?
> These can at best be tested, but I'm not sure how to test if power-management will affect frequencies. If a resource is per-system, and legacy, it probably won't be affected by power-management, but these are probabilities not parameters that are easily input into an algorithm.
The conservative way would be to assume the answer to all those questions is "no" unless you know otherwise. For S4 (hibernate) and S3 (suspend) you can assume all timers lose their state (as you're effectively turning everything off, except RAM for S3) and reinitialise your timing (e.g. starting from getting the new time and date from the RTC) when you come out of S3/S4. For S2 only the CPUs are turned off, so you should never need to worry about PIT, RTC, HPET, ACPI counter (and only have to worry about TSC and local APIC). For TSC and local APIC behaviour, you can use CPUID (either the "TSC invariant" flag, or the vendor/model). That probably solves 99% of the problems, and the remaining problems can easily be handled with some special case work-arounds if/when they occur.
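The conservative defaults in the reply above fit in a small decision function. The sketch below is one reading of that paragraph, not code from either poster: the enums and `timer_needs_reinit` are hypothetical, and `cpu_timers_known_safe` stands for whatever CPUID-based knowledge the kernel has about TSC and local APIC behaviour.

```c
#include <stdbool.h>

enum timer_kind  { T_TSC, T_LAPIC, T_PIT, T_RTC, T_HPET, T_ACPI_PM };
enum sleep_state { SLEEP_S2 = 2, SLEEP_S3 = 3, SLEEP_S4 = 4 };

/* Decide whether a timer must be treated as having lost its state after
 * the given sleep state. Anything not known to be safe is assumed unsafe. */
bool timer_needs_reinit(enum timer_kind kind, enum sleep_state state,
                        bool cpu_timers_known_safe)
{
    /* S3 and S4: assume every timer loses its state. */
    if (state == SLEEP_S3 || state == SLEEP_S4)
        return true;

    /* S2: only the CPUs are turned off, so only CPU-side timers matter. */
    if (kind == T_TSC || kind == T_LAPIC)
        return !cpu_timers_known_safe;

    return false; /* PIT, RTC, HPET and the ACPI counter keep running */
}
```

Special-case work-arounds for a misbehaving chipset would sit in front of this function rather than inside it.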
Cheers,
Brendan
Re: Using APIC timer as a "system board" timer when HPET fai
Hi,
>> Brendan wrote: In which way are the ACPI tables not accurate (and why don't other OSs like Windows and Linux have problems)?
> rdos wrote: On one particular machine, my 2-core AMD Athlon, the _CRS for _MEM is broken and makes ACPICA malfunction (it tries to output "ACPI_STACK_UNDERFLOW", but in the process of doing this makes several random writes via uninitialized pointers). This same machine also reports that HPET support IRQ 0 and IRQ 8 (which doesn't work). The comparator itself report that any interrupt routing is supported (obviously not correct). RDOS only works when APIC timer is used both for preemption and timers. Neither HPET nor PIT has functional IRQs. HPET can be used for elapsed time though, because the counter is running. It is the IRQ mappings that are wrong. The HPET doesn't support MSI delivery. Additionally, this machine frequently hangs-up during reboots, and is acting strange. It has Windows XP installed.
So, it could be a mistake in the firmware's AML (but Windows XP doesn't have any problem), it could be a bug in ACPICA (would be nice to try Linux on the machine to rule that out), or it could be a bug in your "just slapped ACPICA and SMP support in last month and still having lots of problems with everything" code?
I guess you're right - a mistake in the firmware's AML is the most likely cause...
Cheers,
Brendan
Re: Using APIC timer as a "system board" timer when HPET fai
The TSC is only useful if you peg threads to cores, and even then you destroy the high precision if you have too long IRQs or other code running with disabled interrupts for extended periods of time. In a normal multitasking OS, this means you won't get higher precision with TSC than your worse interrupt latency. If the thread then is migrated to a new core, and reads its TSC, it will have erratic behavior if TSCes are not well synchronized. You need frequent IPIs to synchronize, preferably with NMI delivery to minimize latency.Brendan wrote:Insanely awesome (extremely low overhead and extremely high precision) on recent CPUs though. You'd want to detect TSC capabilities and use TSC if it's suitable.
Brendan wrote: It's not like the performance monitoring stuff is actually used for performance monitoring most of the time anyway. If there's no other choice, I'd rather use performance monitoring counters for timing than be unable to boot. You'd want to detect performance monitoring capabilities and use it if it's more suitable than something else.
Agreed, but I think there are other (better) choices. At least I know of no platform that doesn't have either PIT or HPET / APIC timer, but it is possible such might exist.
Brendan wrote: Yes, but they're slow and ugly (e.g. "legacy IO port" accesses to read the current count); and for channel 2 the thing can roll over several times without you knowing.
Not in my design. The preemption timer is set to 1ms, so there is no chance that channel 2 will roll over.
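The rollover argument can be illustrated with a minimal sketch. It assumes the PIT runs at its nominal 1,193,182 Hz and that channel 2 is a free-running 16-bit down-counter that decrements by one per tic and reloads at the full range; `ElapsedCounter` and `rollover_period_ms` are hypothetical names for illustration, not RDOS code. As long as the counter is read more often than once per rollover period (about 55 ms, so a 1 ms preemption tick leaves a wide margin), the wrapped difference between two readings is unambiguous.

```python
PIT_HZ = 1_193_182            # nominal PIT input clock in Hz (assumption)
COUNTER_MOD = 1 << 16         # a 16-bit counter wraps every 65536 tics


def rollover_period_ms():
    """Time for a free-running 16-bit counter to wrap once, in milliseconds."""
    return COUNTER_MOD * 1000 / PIT_HZ


class ElapsedCounter:
    """Accumulates elapsed tics from readings of a 16-bit down-counter.

    Correct only if update() is called at least once per rollover period;
    otherwise whole wraps are silently lost.
    """

    def __init__(self, first_reading):
        self.last = first_reading
        self.tics = 0

    def update(self, reading):
        # Down-counter: the value decreases, so elapsed = last - reading,
        # taken modulo 2^16 to handle a wrap between the two readings.
        delta = (self.last - reading) % COUNTER_MOD
        self.tics += delta
        self.last = reading
        return self.tics
```

With a 1 ms read interval each delta is roughly 1193 tics, far below the 65536-tic limit at which a wrap would go unnoticed.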
Brendan wrote: For old systems (where you've only got PIT and RTC and nothing else), you'd want to use PIT for the scheduler's timer (in "one shot" mode) and RTC for everything else.
No. You would use PIT channel 0 for timers / preemption and PIT channel 2 for elapsed time. The RTC is just too slow to be useful when the system tic is 1 / 1.193 us. That means you cannot use the legacy speaker, but it is not very useful anyway.
Brendan wrote: Why? Is your OS a general purpose desktop thing that doesn't have to care if it locks up completely due to a hardware fault (rather than some sort of embedded system that might be used for banking)?
I have a very sophisticated software watchdog timer that in practice takes care of all software-related faults. It is installed in the production release, and will reboot on any fault, including kernel panics. In practice, this is all I need. I have yet to encounter a situation in production where this is not enough. We built a dedicated hardware watchdog, but it caused more problems than it solved, so we no longer have it.
Brendan wrote: Sadly, Linux does something like this too - take a high precision timer like the local APIC timer, and use it as a general purpose timing thing so that you can bury it under millions of networking timeouts (that have no need for high precision). It's stupid because there's always a minimum amount of time between delays, and when there's too many things using the same timer you have to group things together to avoid the "minimum time between delays" problem. For example, if "foo" should happen in 1000 ns and "bar" should happen in 1234 ns, then you can't setup a 234 ns delay and have to bunch them together, and "foo" ends up happening 234 ns too late. Things that don't need such high precision should use a completely different timer to avoid screwing up the precision for things that do need high precision.
I think the major difference between RDOS and Linux is how ISRs and timer callbacks are coded. In RDOS, you should keep both ISRs and timer callbacks short. A typical ISR and / or timer callback only consists of a signal to wake a server thread, along with clearing some interrupt conditions. User apps cannot use timers directly at all (they are kernel-only and run as ISRs). Since timer callbacks are generally shorter than the overall interrupt latency, mixing precision timers is not a problem. You would not gain anything by using separate hardware for high precision timers, as it is the interrupt latency that determines response times, not the resolution of the timer. In order to get ns resolution for timed events, it is necessary to run on a dedicated core without preemption and interrupt load.
Brendan wrote: What is "elapsed time"? I'd use TSC for this if I could (and fall back to HPET if TSC can't be used, and fall back to ACPI's counter if both HPET and TSC can't be used).
Elapsed time is how many tics have elapsed since the system started (simply explained, but not entirely true). A tic is one period of the PIT, which is convenient since 2^32 tics is about one hour. I thus use 8 bytes to represent elapsed time. Elapsed time is also related to real time. To convert between elapsed time and real time you simply add an offset, which can be changed by setting real time. Elapsed time cannot be changed; instead it is guaranteed to increase monotonically. When timed waits are used, or timers are started, they use elapsed time. This also means that if you want a timer to fire exactly once per second, you simply add the number of tics in a second to the previous timeout and start a timer. In order to have sub-microsecond resolution for real time, the RTC update interrupt is used to synchronize the update of the RTC with the real time counter. At boot-up, the setting in the RTC is loaded as elapsed time, so normally the difference between elapsed time and real time is zero.
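A minimal sketch of this bookkeeping, assuming the nominal PIT frequency of 1,193,182 Hz; the class and function names are hypothetical and only illustrate the elapsed time / offset relationship described above, not the actual RDOS implementation.

```python
PIT_HZ = 1_193_182              # nominal PIT frequency in Hz (assumption)
TICS_PER_SECOND = PIT_HZ        # one tic is one PIT period


def tics_to_seconds(tics):
    """Convert a tic count to seconds."""
    return tics / PIT_HZ


class Clock:
    """Monotonic elapsed time plus an adjustable offset giving real time."""

    def __init__(self, rtc_tics_at_boot):
        # At boot the RTC setting is loaded as elapsed time, so the
        # offset between elapsed time and real time starts at zero.
        self.elapsed = rtc_tics_at_boot
        self.offset = 0

    def advance(self, tics):
        # Elapsed time only ever moves forward.
        if tics < 0:
            raise ValueError("elapsed time must increase monotonically")
        self.elapsed += tics

    def real_time(self):
        return self.elapsed + self.offset

    def set_real_time(self, new_real_time):
        # Setting the clock changes only the offset, never elapsed time,
        # so timers already scheduled in elapsed time are unaffected.
        self.offset = new_real_time - self.elapsed


def next_periodic_deadline(previous_deadline, period_tics=TICS_PER_SECOND):
    """Deadline of the next periodic timeout, free of cumulative drift."""
    return previous_deadline + period_tics
```

Computing each deadline from the previous deadline, rather than from the time the callback happened to run, keeps a once-per-second timer from drifting with interrupt latency.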
Brendan wrote: Same "jack of all trades" problem (screwing up high precision timing by using it for low precision timing).
Not true, since timer callbacks are ISRs, and are only allowed to clear some hardware conditions and signal a thread.
Brendan wrote: Your normal "generic timer" stuff uses NMI? Sounds seriously painful to me.
No. I have reserved NMI for the crash debugger. When the scheduler hits a fatal error, it will send an NMI to all other cores to freeze them, regardless of whether they have interrupts enabled or not. Thus, NMI is not available.
Brendan wrote: The only other alternative that I can think of is telling unsuspecting end users "I'm not smart enough to figure it out, and even though you probably know less than me, I'm making you solve my design failure via. compile time idiocy".
Currently, users need to specify which device drivers to load for their hardware, but I could imagine changing this, if necessary, to detect the available hardware and auto-create the configuration file. The kernel image is not built with compile-time switches, or by linking modules. It is created with a command-line tool that writes an image file that contains the separately compiled device-driver files, along with ordinary files, settings and autostarts. This file can even be built by software, as is done when we update the OS remotely.
So, if the user loads the PIC device-driver, it will look for PITs for timers / elapsed time, as those are commonly found on older hardware without APIC. If the user loads the APIC device-driver, it would make different choices, selecting between APIC timer, HPET or PIT. The only choice users have is to select which interrupt controller is available.
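The selection logic can be sketched roughly as follows. This is only an illustration of the modes of operation listed earlier in the thread; the function name, parameters and returned strings are hypothetical, and the real drivers may well decide differently in corner cases.

```python
def select_timer_sources(interrupt_controller, hpet_present=False, hpet_msi=False):
    """Pick hardware for preemption, timers and elapsed time.

    interrupt_controller: "PIC" or "APIC", i.e. which driver the user loaded.
    hpet_present / hpet_msi: whether an HPET exists and supports MSI delivery.
    """
    if interrupt_controller == "PIC":
        # Older hardware without APIC: everything comes from the PIT.
        return {
            "preemption": "PIT channel 0",
            "timers": "PIT channel 0",
            "elapsed": "PIT channel 2",
        }
    if interrupt_controller == "APIC":
        if not hpet_present:
            # No HPET at all: shared APIC timer, PIT for elapsed time.
            return {
                "preemption": "APIC timer",
                "timers": "APIC timer",
                "elapsed": "PIT channel 2",
            }
        if not hpet_msi:
            # HPET counter is usable, but its interrupts are not trusted.
            return {
                "preemption": "APIC timer",
                "timers": "APIC timer",
                "elapsed": "HPET",
            }
        # HPET with MSI: HPET drives timers, APIC timer only preempts.
        return {
            "preemption": "APIC timer",
            "timers": "HPET",
            "elapsed": "HPET",
        }
    raise ValueError("unknown interrupt controller: %r" % (interrupt_controller,))
```

On the 2-core AMD machine discussed here, this sketch would pick the second APIC case: the HPET counter runs, but it has no MSI support and its IRQ mappings are wrong.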
Re: Using APIC timer as a "system board" timer when HPET fai
Brendan wrote: So, it could be a mistake in the firmware's AML (but Windows XP doesn't have any problem), it could be a bug in ACPICA (would be nice to try Linux on the machine to rule that out), or it could be a bug in your "just slapped ACPICA and SMP support in last month and still having lots of problems with everything" code? I guess you're right - a mistake in the firmware's AML is the most likely cause...
You forget that I can debug ACPICA at source level as if it was an ordinary application. Besides, there are no SMP conditions here, as it is a single thread (and when done at boot-time, it only runs on the BSP) that runs the ACPI initialization. I know exactly where and why it faults. All the fields of Walkstate, except Next (which contains junk), are NULLs, and the status code is 13 (ACPI_STACK_UNDERFLOW).
To answer the question about Linux: no, it doesn't work. When I boot the live Mandriva 2011 DVD, it simply locks up. This doesn't tell whether it is the video-BIOS problem (the code to switch mode faults in V86 mode on RDOS), or if it is related to ACPI.
Re: Using APIC timer as a "system board" timer when HPET fai
ACPICA seems to trash its own environment so badly when reading the _CRS that I cannot seem to provide a fix for it. The only fix that works 100% is to exclude objects whose names contain "MEM" from being evaluated.
After this is done, the device-list for the 2-core AMD machine looks like the listing below. It reports that the PIT exists, but doesn't have any interrupts. It also reports that the HPET has both IRQ 0 and 8.
Code: Select all
\_SB_.PCI0
IO: 0CF8-0CFF
\_SB_.PCI0.LPC0.PMIO
IO: 0B10-0B1F
IO: 0B00-0B0F
IO: 4210-4217
IO: 4000-40FE
IO: 0CD4-0CDF
IO: 0CD2-0CD3
IO: 0CD0-0CD1
IO: 0C6F-0C6F
IO: 0C6C-0C6D
IO: 0C50-0C52
IO: 0C14-0C14
IO: 0C00-0C01
IO: 04D6-04D6
IO: 040B-040B
IO: 0228-022F
IO: 4100-411F
\_SB_.PCI0.LPC0.LNKA
IRQ: 3, sharable, high level
\_SB_.PCI0.LPC0.LNKB
IRQ: 11, sharable, high level
\_SB_.PCI0.LPC0.LNKC
IRQ: 5, sharable, high level
\_SB_.PCI0.LPC0.LNKD
IRQ: 10, sharable, high level
\_SB_.PCI0.LPC0.LNKE
IRQ: 0, sharable, high level
\_SB_.PCI0.LPC0.LNKF
IRQ: 0, sharable, high level
\_SB_.PCI0.LPC0.LNK0
IRQ: 11, sharable, high level
\_SB_.PCI0.LPC0.LNK1
IRQ: 0, sharable, high level
\_SB_.PCI0.LPC0.PIC_
IO: 00A0-00A1
IO: 0020-0021
IRQ: 2, exclusive, edge
\_SB_.PCI0.LPC0.DMA1
IO: 00C0-00DF
IO: 0094-009F
IO: 0080-0090
IO: 0000-000F
\_SB_.PCI0.LPC0.TMR_
IO: 0040-0043
\_SB_.PCI0.LPC0.HPET
Mem: FED00000-FED003FF
IRQ: 8, exclusive, edge
IRQ: 0, exclusive, edge
\_SB_.PCI0.LPC0.RTC_
IO: 0070-0073
\_SB_.PCI0.LPC0.SPKR
IO: 0061-0061
\_SB_.PCI0.LPC0.COPR
IO: 00F0-00FF
IRQ: 13, exclusive, edge
\_SB_.PCI0.SYSR
IO: 0220-0225
IO: 04D0-04D1
IO: 00E0-00EF
IO: 00A2-00BF
IO: 0091-0093
IO: 0074-007F
IO: 0065-006F
IO: 0062-0063
IO: 0044-005F
IO: 0022-003F
IO: 0010-001F
\_SB_.PCI0.FDC0
IO: 03F7-03F7
IO: 03F0-03F5
IRQ: 6, exclusive, edge
\_SB_.PCI0.UAR1
IO: 03F8-03FF
IRQ: 4, exclusive, edge
\_SB_.PCI0.UAR2
IO: 0000-0007
IRQ: 0, exclusive, edge
\_SB_.PCI0.LPT1
IO: 0378-037F
IRQ: 7, exclusive, edge
\_SB_.PCI0.ECP1
IO: 0778-077B
IO: 0378-037F
IRQ: 7, exclusive, edge
\_SB_.PCI0.PS2M
IRQ: 12, exclusive, edge
\_SB_.PCI0.PS2K
IO: 0064-0064
IO: 0060-0060
IRQ: 1, exclusive, edge
\_SB_.PCI0.PSMR
IO: 0064-0064
IO: 0060-0060
\_SB_.PCI0.EXPL
Mem: E0000000-EFFFFFFF
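A dump in this format can also be checked mechanically. The following is a minimal sketch, assuming the layout shown above (a device path starting with a backslash, followed by `IO:`, `Mem:` and `IRQ:` lines); `parse_device_list` and `SAMPLE` are hypothetical names, and the sample is just an excerpt of the listing.

```python
def parse_device_list(text):
    """Parse a device dump into {path: {"io": [...], "mem": [...], "irq": [...]}}."""
    devices = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("\\"):
            # A new device path such as \_SB_.PCI0.LPC0.HPET
            current = devices.setdefault(line, {"io": [], "mem": [], "irq": []})
        elif current is None:
            continue  # resource line before any device path; ignore it
        elif line.startswith("IRQ:"):
            # "IRQ: 8, exclusive, edge" -> 8
            current["irq"].append(int(line[4:].split(",")[0]))
        elif line.startswith("IO:") or line.startswith("Mem:"):
            kind, rng = line.split(":", 1)
            lo, hi = rng.strip().split("-")
            current[kind.lower()].append((int(lo, 16), int(hi, 16)))
    return devices


SAMPLE = r"""
\_SB_.PCI0.LPC0.TMR_
IO: 0040-0043
\_SB_.PCI0.LPC0.HPET
Mem: FED00000-FED003FF
IRQ: 8, exclusive, edge
IRQ: 0, exclusive, edge
"""

devices = parse_device_list(SAMPLE)
# Devices that claim IO or memory resources but report no IRQ at all.
no_irq = [name for name, res in devices.items() if not res["irq"]]
```

For the excerpt, `no_irq` contains only the PIT (`TMR_`), while the HPET entry lists IRQs 8 and 0, matching the observation above.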
Re: Using APIC timer as a "system board" timer when HPET fai
The walkstate being nulled or containing junk is a pretty good indicator that there's a bug in either ACPICA or the OS-specific code it depends on; it's highly unlikely that the AML code would be messing up the internal state of the interpreter like that. Even if it were, you should be able to track it through the OS hooks ACPICA depends on. If you can get any other major OS running on the platform, you can probably dump the DSDT and SSDTs (if there are any) with iasl just to make sure the AML code is sane.