Hi,
ReturnInfinity wrote: There are issues with Pure64 when run on core i5/i7 machines.
I never had any problem with Nehalem CPUs (Xeon specifically, but they're all basically the same) - the AP startup code I've been using for ages worked fine.
If you search through the forums for "AP startup" and "Brendan" you'll probably find half a dozen of my posts that all repeat a few basic recommendations. You've managed to do the opposite of all my recommendations, so here they are:
1) Get the address of the local APIC from the MP specification tables or ACPI tables (and not from an MSR, and definitely don't assume it's at 0xFEE00000). Just because Intel felt like calling the "APIC base MSR" an architectural MSR, it doesn't mean that older CPUs support it (Pentium or older CPUs don't), or that other CPU manufacturers also consider it architectural, or that no CPU manufacturer will break it in future. Ironically, Intel was the first CPU manufacturer to break this when they introduced x2APIC.
2) Don't use "broadcast to all but self". It has several problems, including:
- For Netburst CPUs with hyper-threading, if hyper-threading is disabled in the BIOS you start the disabled CPUs anyway. Note: Hyper-threading in Nehalem CPUs works a bit differently, because it can be totally disabled (rather than "pretend disabled" like in Netburst CPUs). I don't know if it's "safe" to broadcast the INIT/SIPI sequence in all cases on Nehalem (e.g. it's possible for BIOS/chipset bugs to cause strange behaviour that nobody noticed because no sane OS broadcasts the INIT/SIPI sequence).
- If the firmware detected that some CPUs were faulty (and the firmware makes sure these faulty CPUs aren't listed in the MP specification tables and ACPI tables to make sure the OS doesn't attempt to use them), then using "broadcast to all but self" will attempt to start/use faulty CPUs.
- There's no way to tell if CPUs that should've started actually did start (e.g. if there are 7 AP CPUs and only 5 of them start correctly, then your code doesn't notice that there have been any problems).
- I wouldn't trust the "delivery status" bit in the ICR for broadcast IPIs of any kind - it may only say if the IPI was sent correctly and if at least one CPU received it, and might not reliably indicate whether all CPUs (rather than just one) received the IPI/s.
- It breaks backward compatibility for x2APIC (you can accidentally start CPUs that are in "x2APIC mode" that your OS can't handle).
3) The INIT/SIPI/SIPI startup sequence described by Intel (which is what you're using) is a pain in the neck - it requires precise timing (which typically isn't easily available in a standard way during early boot), it's possible for the AP CPU to start executing code after receiving the first SIPI but before receiving the second SIPI, and it's slower (often the full 200 us delay isn't necessary). Don't use it. It's better to send the INIT IPI, then do the first delay (at least 10 ms, but longer seems fine), then send the first SIPI, then wait to see if the AP CPU starts or not (with a time-out that can be a lot longer than the 200 us that Intel say). If the first SIPI was enough to start the AP CPU then you're done (don't send the second SIPI at all). Otherwise, send the second SIPI, then wait to see if the AP CPU starts or not (again with a time-out that can be much longer than the 200 us that Intel say), and fail the CPU if it doesn't start after the second SIPI. In this case you will need some synchronisation between the AP CPU and the BSP to avoid race conditions (e.g. the AP CPU sets a flag to tell the BSP it started correctly, and when the BSP notices this flag is set, the BSP sets a flag to tell the AP CPU it can continue).
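To make the sequence in 3) a bit more concrete, here's a rough C sketch of the idea (not my actual code). Everything named here is an assumption made for illustration: the ap_boot_ops_t hooks, the two flags, the 2-tick time-out, and the sim_* functions, which only stand in for real local APIC and timer access so the sketch can be compiled and run off-target. A real kernel would also want proper atomics/barriers rather than plain volatile flags.

```c
#include <stdint.h>

/* Hypothetical hooks; a real kernel would write the ICR and poll a timer. */
typedef struct {
    void (*send_init)(uint8_t apic_id);
    void (*send_sipi)(uint8_t apic_id);
    void (*delay_ms)(unsigned ms);      /* coarse delay, 1 ms ticks are enough */
} ap_boot_ops_t;

#define AP_INIT_DELAY_MS   10   /* delay after the INIT IPI                 */
#define AP_SIPI_TIMEOUT_MS 2    /* assumed time-out per SIPI, in 1 ms ticks */

volatile int ap_started_flag;   /* set by the AP's startup code             */
volatile int ap_continue_flag;  /* set by the BSP once it saw the AP start  */

/* Poll the "started" flag until it is set or the time-out expires. */
static int wait_for_ap(const ap_boot_ops_t *ops, unsigned timeout_ms)
{
    for (unsigned i = 0; i < timeout_ms; i++) {
        if (ap_started_flag)
            return 1;
        ops->delay_ms(1);
    }
    return ap_started_flag != 0;
}

/* Returns how many SIPIs were needed (1 or 2), or 0 if the AP never started. */
int start_ap(const ap_boot_ops_t *ops, uint8_t apic_id)
{
    ap_started_flag = 0;
    ap_continue_flag = 0;

    ops->send_init(apic_id);
    ops->delay_ms(AP_INIT_DELAY_MS);

    for (int attempt = 1; attempt <= 2; attempt++) {
        ops->send_sipi(apic_id);
        if (wait_for_ap(ops, AP_SIPI_TIMEOUT_MS)) {
            ap_continue_flag = 1;   /* handshake: let the AP carry on */
            return attempt;         /* second SIPI skipped if first worked */
        }
    }
    return 0;                       /* caller marks this CPU as failed */
}

/* ---- Simulated hardware, only so the sketch runs without real CPUs ---- */
int sim_sipis_needed = 1;       /* how many SIPIs the pretend AP needs */
int sim_sipi_count;

static void sim_send_init(uint8_t apic_id) { (void)apic_id; sim_sipi_count = 0; }
static void sim_send_sipi(uint8_t apic_id)
{
    (void)apic_id;
    if (++sim_sipi_count >= sim_sipis_needed)
        ap_started_flag = 1;    /* pretend the AP ran and set its flag */
}
static void sim_delay_ms(unsigned ms) { (void)ms; }

const ap_boot_ops_t sim_ops = { sim_send_init, sim_send_sipi, sim_delay_ms };
```

With the simulated hooks, start_ap(&sim_ops, 3) returns 1 when the pretend AP needs one SIPI, 2 when it needs two, and 0 when it never responds. The return value also gives the BSP something to log, which is exactly what the broadcast approach can't give you.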
Also note that the way I do the AP CPU startup doesn't require precise timing - setting the PIT to 1000 Hz and using time-outs of "between 1 ms and 2 ms" for the SIPI delays is fine.
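For recommendations 1) and 2), the following is a minimal sketch of walking the ACPI MADT (the "APIC" table) to get the local APIC address and the list of enabled CPUs, and then building ICR values that target one CPU at a time instead of using a broadcast shorthand. It assumes the table has already been found, mapped and checksummed. The names madt_scan, madt_info_t, icr_init, icr_sipi and the MAX_CPUS limit are made up for this example; the offsets and bit positions are the ones I'd expect from the ACPI specification and Intel's xAPIC documentation, so check them against those before relying on this.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_CPUS 32                 /* arbitrary limit for this sketch */

typedef struct {
    uint64_t lapic_addr;            /* local APIC physical address          */
    uint8_t  apic_ids[MAX_CPUS];    /* enabled CPUs with 8-bit APIC IDs     */
    size_t   cpu_count;
    size_t   x2apic_entries;        /* x2APIC entries seen and left alone   */
} madt_info_t;

/* Read a little-endian 32-bit value without assuming alignment. */
static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the table doesn't look like a MADT. */
int madt_scan(const uint8_t *madt, madt_info_t *out)
{
    if (memcmp(madt, "APIC", 4) != 0)
        return -1;
    uint32_t length = le32(madt + 4);       /* total table length */
    if (length < 44)
        return -1;

    memset(out, 0, sizeof *out);
    out->lapic_addr = le32(madt + 36);      /* 32-bit address in the header */

    uint32_t off = 44;                      /* first variable-length entry */
    while (off + 2 <= length) {
        const uint8_t *e = madt + off;
        uint8_t type = e[0], len = e[1];
        if (len < 2 || off + len > length)
            break;                          /* malformed entry, stop */

        if (type == 0 && len >= 8) {        /* Processor Local APIC */
            if ((le32(e + 4) & 1u) && out->cpu_count < MAX_CPUS)
                out->apic_ids[out->cpu_count++] = e[3];
        } else if (type == 5 && len >= 12) { /* Local APIC Address Override */
            out->lapic_addr = (uint64_t)le32(e + 4) |
                              ((uint64_t)le32(e + 8) << 32);
        } else if (type == 9) {             /* Processor Local x2APIC */
            out->x2apic_entries++;          /* not started by this sketch */
        }
        off += len;
    }
    return 0;
}

typedef struct { uint32_t high, low; } icr_t;   /* xAPIC ICR halves */

/* INIT IPI for one CPU: delivery mode 101b, level assert, no shorthand. */
icr_t icr_init(uint8_t apic_id)
{
    icr_t v = { (uint32_t)apic_id << 24, 0x00004500u };
    return v;
}

/* Start-up IPI for one CPU: delivery mode 110b, vector = trampoline page. */
icr_t icr_sipi(uint8_t apic_id, uint32_t trampoline_phys)
{
    icr_t v = { (uint32_t)apic_id << 24,
                0x00004600u | ((trampoline_phys >> 12) & 0xFFu) };
    return v;
}
```

The list returned by madt_scan includes the BSP, so the caller would skip its own APIC ID before sending anything. CPUs the firmware left out of the table or marked as not enabled never get an IPI, and the same goes for anything listed with an x2APIC entry.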
About x2APIC....
Nehalem CPUs support x2APIC, where the local APIC has 2 modes - the "xAPIC" mode (for backward compatibility) that you're using, and the new "x2APIC" mode. When a CPU is using the x2APIC mode, the local APIC's registers are accessed via MSRs instead of being mapped into the physical address space, and there is no "local APIC base address". When a computer has lots of CPUs, some CPUs have APIC IDs of 255 or higher, which can't be represented by the 8-bit APIC IDs that are used in "xAPIC" mode. In this case the firmware is required to put those CPUs into "x2APIC mode", and create different entries in the ACPI tables for these CPUs. This means that a normal OS that doesn't support x2APIC and only understands the normal ACPI entries with 8-bit APIC IDs won't start CPUs that need 32-bit APIC IDs and will work correctly (just those extra CPUs won't be used). By broadcasting the INIT/SIPI/SIPI sequence you bypass this and you'd start any CPUs that are in x2APIC mode, even if your OS can't handle CPUs that are in x2APIC mode.
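If you want a CPU to check which of the two modes it's actually in, one option is to decode the mode bits of the APIC base MSR value. That's a different use than taking the base address from it, and it only makes sense on CPUs where CPUID reports x2APIC support. The sketch below is an illustration under those assumptions: it takes the raw 64-bit MSR value as a parameter so it doesn't execute RDMSR itself, and the names apic_mode_t and apic_mode_from_msr are made up for this example.

```c
#include <stdint.h>

#define APIC_BASE_EXTD (1ull << 10)   /* x2APIC mode enable bit       */
#define APIC_BASE_EN   (1ull << 11)   /* local APIC global enable bit */

typedef enum {
    APIC_MODE_DISABLED,   /* EN=0, EXTD=0 */
    APIC_MODE_XAPIC,      /* EN=1, EXTD=0: memory mapped registers */
    APIC_MODE_X2APIC,     /* EN=1, EXTD=1: registers accessed via MSRs */
    APIC_MODE_INVALID     /* EN=0, EXTD=1 is not a legal combination */
} apic_mode_t;

/* Classify a raw APIC base MSR value; the caller reads the MSR. */
apic_mode_t apic_mode_from_msr(uint64_t apic_base_msr)
{
    int en   = (apic_base_msr & APIC_BASE_EN) != 0;
    int extd = (apic_base_msr & APIC_BASE_EXTD) != 0;

    if (!en)
        return extd ? APIC_MODE_INVALID : APIC_MODE_DISABLED;
    return extd ? APIC_MODE_X2APIC : APIC_MODE_XAPIC;
}
```

An OS that doesn't handle x2APIC could use something like this in its AP startup code to refuse a CPU that reports APIC_MODE_X2APIC, although if you only start CPUs listed with normal 8-bit APIC ID entries you shouldn't get into that situation in the first place.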
Cheers,
Brendan