Multi CPU support

Posted: Tue Feb 20, 2007 2:29 pm
by PyroMathic
lo,

I just got a new PC with a Core 2 Duo inside, and I already have the 64-bit part working. I just don't have a clue where to start with multi-core support, since I would also like to be able to use both CPUs.

A couple of questions about MP systems:
I don't have the APIC working yet; I think I need this first, since I guess I need the I/O APIC to enable the second CPU?

And what should be done after this?


Any help is welcome.

Regards
PyroMathic

Re: Multi CPU support

Posted: Tue Feb 20, 2007 11:01 pm
by Brendan
Hi,
PyroMathic wrote:A couple of questions about MP systems:
I don't have the APIC working yet; I think I need this first, since I guess I need the I/O APIC to enable the second CPU?
First, download Intel's Multiprocessor Specification (developer.intel.com/design/pentium/datashts/24201606.pdf). It describes a lot of the details (chipset, BIOS, etc.) that aren't part of the actual CPU (and therefore aren't included in the CPU's manuals).

Also remember that there are local APICs and I/O APICs. The local APICs are part of each CPU and are used to receive interrupts from the I/O APIC and other local APICs, and to send "interprocessor" interrupts to other CPUs. The I/O APICs are like the old PIC chips - they monitor the interrupt lines from devices and send interrupts to local APICs. You don't need to care about the I/O APICs until you start writing device drivers, etc. (but you do need to play with local APICs).

Each local APIC has an "APIC ID", and each I/O APIC has an "APIC ID". This is an 8 bit number that could be anything, that is used to determine who receives an interrupt.

To start other CPUs you first need to detect what other CPUs are present (their local APIC IDs), and detect where the local APIC is mapped in memory (usually at 0xFEE00000, but not necessarily always). You can get this information from the Multiprocessor Specification tables or from ACPI. It's mostly the same in either case - finding and reading data from tables created by the BIOS.
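
For the MP-table route, the first step is locating the "MP floating pointer structure". Here is a minimal sketch in C, assuming the candidate memory region is already readable through a pointer (the MP specification lists the first KB of the EBDA, the last KB of base memory and the BIOS ROM area at 0xF0000-0xFFFFF as places to search). The function names are made up for illustration only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sum of n bytes modulo 256. A valid MP structure sums to zero. */
static uint8_t byte_sum(const uint8_t *p, size_t n)
{
    uint8_t sum = 0;
    while (n--)
        sum = (uint8_t)(sum + *p++);
    return sum;
}

/*
 * Hypothetical helper: scan `region` on 16-byte boundaries for the "_MP_"
 * signature and verify the checksum. Returns the offset of the structure
 * inside the region, or -1 if none was found.
 */
long mp_find_floating_pointer(const uint8_t *region, size_t size)
{
    for (size_t off = 0; off + 16 <= size; off += 16) {
        const uint8_t *p = region + off;

        if (memcmp(p, "_MP_", 4) != 0)
            continue;

        /* Byte 8 holds the structure length in 16-byte units. */
        size_t len = (size_t)p[8] * 16;
        if (len == 0 || off + len > size)
            continue;

        if (byte_sum(p, len) == 0)
            return (long)off;
    }
    return -1;
}

/*
 * Bytes 4..7 hold the physical address of the MP configuration table
 * (little-endian). Zero means no table is present and a default
 * configuration is in use.
 */
uint32_t mp_config_table_address(const uint8_t *fp)
{
    return (uint32_t)fp[4] | (uint32_t)fp[5] << 8 |
           (uint32_t)fp[6] << 16 | (uint32_t)fp[7] << 24;
}
```

In a kernel the region would be something like the BIOS ROM area mapped at its physical address; the configuration table found this way then lists one entry per CPU with its local APIC ID.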

Once you've got these details you need to send a sequence of IPIs (interprocessor interrupts) from the BSP (the boot CPU that's already running) to the CPU/s you want to start. This sequence is described in Intel's Multiprocessor Specification.

Intel's method uses very short time delays that are annoying to measure correctly. I've found the time delay after the INIT IPI doesn't matter much (as long as it's longer than 10 ms), and the time delay after the STARTUP IPI needs to be very short because the CPUs usually start on the first STARTUP IPI.

I use a modified form of Intel's method to avoid some of the timing requirements and don't support the "82489DX" (which is an old local APIC chip used in 80486). I send an INIT IPI, wait for about 20 ms, then send a STARTUP IPI, then wait for 500 ms for the CPU to start. If it hasn't started I send another STARTUP IPI and start waiting again (and give up if it doesn't start after another 500 ms).
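
The modified sequence described above could look roughly like this in C. It is only a sketch under assumptions, not necessarily the exact code used: `start_ap`, `send_ipi`, the `started` flag (assumed to be set by the started CPU's own startup code) and the `delay_ms` stub are hypothetical names. The ICR offsets (0x300 and 0x310) and the INIT/STARTUP delivery-mode encodings follow Intel's local APIC documentation. The delay stub just counts milliseconds so the logic can be exercised outside a kernel; a real kernel would wait on a timer instead.

```c
#include <stdbool.h>
#include <stdint.h>

#define LAPIC_ICR_LOW   0x300u       /* byte offset of ICR bits 0-31 */
#define LAPIC_ICR_HIGH  0x310u       /* byte offset of ICR bits 32-63 */
#define ICR_INIT        0x00004500u  /* delivery mode INIT, level asserted */
#define ICR_STARTUP     0x00004600u  /* delivery mode STARTUP, level asserted */
#define ICR_PENDING     (1u << 12)   /* delivery status: send pending */

/* Placeholder delay (assumption): only counts the requested milliseconds. */
unsigned elapsed_ms;
static void delay_ms(unsigned ms) { elapsed_ms += ms; }

/* Send one IPI to the CPU with the given local APIC ID. */
static void send_ipi(volatile uint32_t *lapic, uint8_t apic_id, uint32_t low)
{
    lapic[LAPIC_ICR_HIGH / 4] = (uint32_t)apic_id << 24; /* destination */
    lapic[LAPIC_ICR_LOW / 4] = low;       /* writing the low half sends it */
    while (lapic[LAPIC_ICR_LOW / 4] & ICR_PENDING)
        ;                                 /* wait until it has been sent */
}

/*
 * INIT, wait about 20 ms, STARTUP, wait up to 500 ms, then one retry.
 * `trampoline_phys` must be 4 KiB aligned and below 1 MiB, because the
 * STARTUP vector is the physical page number of the real-mode entry point.
 */
bool start_ap(volatile uint32_t *lapic, uint8_t apic_id,
              uint32_t trampoline_phys, volatile const uint32_t *started)
{
    uint32_t vector = (trampoline_phys >> 12) & 0xFFu;

    send_ipi(lapic, apic_id, ICR_INIT);
    delay_ms(20);

    for (int attempt = 0; attempt < 2; attempt++) {
        send_ipi(lapic, apic_id, ICR_STARTUP | vector);
        for (unsigned waited = 0; waited < 500; waited++) {
            if (*started)
                return true;
            delay_ms(1);
        }
    }
    return *started != 0;
}
```

Here `lapic` would point at the memory-mapped local APIC registers of the BSP (for example the address found in the BIOS tables), and the trampoline code is expected to set the `started` flag as soon as it runs.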
PyroMathic wrote:And what should be done after this?
What would you like to do after this? :)

My OS does memory detection, etc. before starting CPUs. When CPUs are started I make each CPU do its own CPU identification separately (as I support different CPUs in the same system). Then I detect NUMA information.

After this I decide what sort of kernel I'm starting - if all CPUs support long mode then I start a "64-bit kernel setup" module. Otherwise, if all CPUs support PAE and there's physical memory above 4 GB, I start a "36-bit/PAE kernel setup" module; failing that, I start a "32-bit kernel setup" module. These modules are responsible for creating paging data structures (page tables, etc.), switching to protected mode or long mode, and loading/starting kernel modules.


Cheers,

Brendan

Posted: Wed Feb 21, 2007 11:44 am
by gaf
Brendan wrote:To start other CPUs you first need to detect what other CPUs are present (their local APIC IDs), and detect where the local APIC is mapped in memory (usually at 0xFEE00000, but not necessarily always). You can get this information from the Multiprocessor Specification tables or from ACPI. It's mostly the same in either case - finding and reading data from tables created by the BIOS.

Do you have any experience with the Intel multiprocessor tables on multi-core systems?

As far as I know the Intel standard is rather outdated today, as ACPI has taken its place. This also means that the specification is no longer updated or extended as new technologies are introduced. Especially since support for hyper-threading has never been included in the standard, I have some doubts whether multi-core systems are detected correctly.

It should also be noted that there's an alternative way of detecting the number of cores and logical threads on a given CPU. The basic approach is to get the number of logical processors from a series of CPUID calls and use this information to reconstruct the APIC IDs of all other cores and HT processors from the ID of any processor (e.g. the bootstrap processor) on that physical package. Chapter 7.10 of the Intel Manual 3a describes the process in greater detail and provides some sample code.
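
A rough sketch in C of the arithmetic involved, using only CPUID leaf 1 (Intel's full procedure also uses leaf 4 to separate cores from threads). It assumes the caller has already executed CPUID with EAX = 1 and passes in the resulting EBX, where bits 31-24 hold the initial APIC ID and bits 23-16 the maximum number of addressable logical processor IDs in the package (only meaningful when the HTT flag, EDX bit 28, is set). The function names are hypothetical, and because the count is a maximum, the result is a list of candidate IDs that may include IDs with no CPU behind them.

```c
#include <stdint.h>

/* Number of bits needed to number `count` items (0 for count <= 1). */
static unsigned id_field_width(unsigned count)
{
    unsigned width = 0;
    while ((1u << width) < count)
        width++;
    return width;
}

/*
 * Hypothetical helper: derive the candidate APIC IDs that share a physical
 * package with the CPU that produced `cpuid1_ebx`. Returns how many IDs
 * were written to `ids` (at most `max_ids`).
 */
int package_candidate_ids(uint32_t cpuid1_ebx, uint8_t *ids, int max_ids)
{
    unsigned logical = (cpuid1_ebx >> 16) & 0xFFu; /* IDs per package */
    unsigned own_id  = cpuid1_ebx >> 24;           /* initial APIC ID */

    if (logical == 0)
        logical = 1;

    /* The low `width` bits select a logical CPU inside the package;
       clearing them gives the first ID of the package. */
    unsigned width = id_field_width(logical);
    unsigned base  = own_id & ~((1u << width) - 1u);

    int n = 0;
    for (unsigned i = 0; i < logical && n < max_ids; i++)
        ids[n++] = (uint8_t)(base + i);
    return n;
}
```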

Of course this approach only works for single processor machines: If you have to detect your I/O APIC or want to support the 4x4 design AMD introduced a while ago walking through the tables mentioned by Brendan will still be necessary.

regards,
gaf

Posted: Thu Feb 22, 2007 12:40 am
by Brendan
Hi,
gaf wrote:
Brendan wrote:To start other CPUs you first need to detect what other CPUs are present (their local APIC IDs), and detect where the local APIC is mapped in memory (usually at 0xFEE00000, but not necessarily always). You can get this information from the Multiprocessor Specification tables or from ACPI. It's mostly the same in either case - finding and reading data from tables created by the BIOS.

Do you have any experience with the Intel multiprocessor tables on multi-core systems?
Yes.

So far I've written code to detect other CPUs using both tables about 4 times (and have tested it on at least 2 real SMP machines). The way I do it is to look for the ACPI tables and then look for Intel's MP tables if ACPI isn't present, however I write the MP table stuff first so I can test it.

I've also done things from the other side for my Bochs BIOS - writing code to manually detect CPUs and then building the ACPI and MP specification tables for the OS.
gaf wrote:As far as I know the Intel standard is rather outdated today, as ACPI has taken its place. This also means that the specification is no longer updated or extended as new technologies are introduced. Especially since support for hyper-threading has never been included in the standard, I have some doubts whether multi-core systems are detected correctly.
Yes - anything made after (roughly) the year 2000 should have ACPI. My OS is "Pentium or later if you want SMP support", which goes back to machines made after the year 1993, and means there's about 7 years where SMP machines will have MP tables but won't have ACPI.

For multi-core and hyper-threading, the ACPI table reports all logical CPUs (if hyper-threading isn't disabled in the BIOS) while Intel's MP tables only report physical CPUs (i.e. multi-core and multi-chip, but not hyper-threading, AFAIK).

I'd recommend doing the same as I do - check for ACPI first, then use the MP tables if ACPI isn't present (although this depends on your OS's minimum requirements - for a "64-bit only" OS I wouldn't bother with Intel's MP tables).
gaf wrote:It should also be noted that there's an alternative way of detecting the number of cores and logical threads on a given CPU. The basic approach is to get the number of logical processors from a series of CPUID calls and use this information to reconstruct the APIC IDs of all other cores and HT processors from the ID of any processor (e.g. the bootstrap processor) on that physical package. Chapter 7.10 of the Intel Manual 3a describes the process in greater detail and provides some sample code.
Correct, but if you're going to support "multi-core" then you may as well support SMP - there's very little practical difference (except detection) - the code to start extra CPUs is the same, the OS needs the same re-entrancy locking, etc. The only real differences are optional optimizations (e.g. detecting which caches are shared within a chip, and adjusting the scheduling, etc. to keep caches hot while still allowing tasks to run on different CPUs).

Also note that Intel's documentation describes Intel's CPUs only. IIRC Intel uses CPUID with EAX = 0x00000004, while AMD uses CPUID with EAX = 0x80000008; AMD doesn't support Intel's method, and Intel doesn't support AMD's method.

For completeness, there is another way to detect the other CPUs. It's possible to do the multi-CPU startup sequence using broadcast IPIs (send to all except self), so that all extra CPUs start running. Once an extra CPU is running it'd increment a "total CPUs found" counter and store its APIC ID somewhere.

There are 3 problems with this "manual detection" method. First, you have to assume the local APIC is at 0xFEE00000 when it might not be. Second, it ignores the "disable hyper-threading" BIOS setting (in some situations disabling hyper-threading can improve performance). Lastly, when a faulty CPU is present the BIOS can start it, find out it's faulty and then disable it, but you'd end up trying to use the faulty CPU.

The last 2 reasons are also reasons why an OS should never use "send to all" IPIs for some of the interrupt types - you shouldn't assume your OS knows about all CPUs. The "send to all" broadcast IPI does work fine for normal IPIs as they are ignored by disabled CPUs, but SMI, NMI, STARTUP and INIT aren't ignored by disabled CPUs and can cause unexpected problems.


Cheers,

Brendan

Posted: Thu Feb 22, 2007 10:16 am
by gaf
Brendan wrote:I'd recommend doing the same as I do - check for ACPI first, then use the MP tables if ACPI isn't present
Actually I already have a decent SMP kernel that uses the Intel tables for CPU detection. While ACPI might have some advantages, its great complexity and the huge size of the specification keep me from adopting it. At least for the moment, a standard that defines its own language seems like overkill for my tiny project.
Brendan wrote:For multi-core and hyper-threading, the ACPI table reports all logical CPUs (if hyper-threading isn't disabled in the BIOS) while Intel's MP tables only report physical CPUs (i.e. multi-core and multi-chip, but not hyper-threading, AFAIK).
OK, that should clear up what I was wondering about: as the Intel tables of my Pentium 4 didn't list the secondary logical processor, I was afraid that support for these tables might have been discontinued (i.e. the tables are still there for backwards compatibility reasons, but support for new technologies like hyper-threading or multi-core systems is no longer added).
Brendan wrote:You have to assume the local APIC is at 0xFEE00000 when it might not be.
As far as I know the location of the APIC registers can also be determined by reading one of the machine-specific registers (Intel Manual 3a, Chapter 8.4.4 and Intel Manual 3b, Appendix B1, Register 0x1B).
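
As an illustration, decoding the value of that register (MSR 0x1B, IA32_APIC_BASE) might look like this in C. Reading the MSR itself needs RDMSR at ring 0, so this sketch only covers the decoding and assumes the 64-bit value has already been read. The function and macro names are made up, and the address mask assumes the base occupies bits 12 to 51 (the usable width depends on the CPU's physical address size, with the unused upper bits reading as zero).

```c
#include <stdbool.h>
#include <stdint.h>

#define IA32_APIC_BASE_MSR   0x1Bu                  /* MSR index for RDMSR */
#define APIC_BASE_BSP        (1ull << 8)            /* set on the boot CPU */
#define APIC_BASE_ENABLE     (1ull << 11)           /* APIC global enable */
#define APIC_BASE_ADDR_MASK  0x000FFFFFFFFFF000ull  /* assumed bits 12-51 */

/* Physical base address of this CPU's local APIC registers. */
uint64_t apic_base_address(uint64_t msr_value)
{
    return msr_value & APIC_BASE_ADDR_MASK;
}

/* True if the local APIC is globally enabled. */
bool apic_is_enabled(uint64_t msr_value)
{
    return (msr_value & APIC_BASE_ENABLE) != 0;
}

/* True if the CPU that read the MSR is the bootstrap processor. */
bool apic_is_bsp(uint64_t msr_value)
{
    return (msr_value & APIC_BASE_BSP) != 0;
}
```

For example, a BSP with the APIC enabled at the usual location would typically read back 0xFEE00900, which decodes to a base address of 0xFEE00000.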

cheers,
gaf

Posted: Thu Feb 22, 2007 11:24 am
by Brendan
Hi,
gaf wrote:
Brendan wrote:I'd recommend doing the same as I do - check for ACPI first, then use the MP tables if ACPI isn't present
Actually I already have a decent SMP kernel that uses the Intel tables for CPU detection. While ACPI might have some advantages, its great complexity and the huge size of the specification keep me from adopting it. At least for the moment, a standard that defines its own language seems like overkill for my tiny project.
Yes, but mostly no... ;)

Full ACPI support (including power management and the run-time side of things) is a huge and messy thing. Parsing ACPI's MADT (Multiple APIC Description Table) and ignoring everything else is very similar to parsing Intel's MP specification tables. Of course there are other tables that you could parse during boot but they're all optional.
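
To show how little is needed, here is a minimal sketch in C of walking a MADT that has already been located (via the RSDP and RSDT/XSDT, which is not shown) and copied or mapped into readable memory. `madt_list_cpus` is a hypothetical name; the offsets used (36-byte ACPI table header, local APIC address at offset 36, entries from offset 44, entry type 0 for a processor local APIC) follow the ACPI specification. The table checksum is not verified here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MADT_HEADER_SIZE    44u  /* ACPI header + LAPIC address + flags */
#define MADT_TYPE_LAPIC     0u   /* "Processor Local APIC" entry */
#define MADT_LAPIC_ENABLED  1u   /* bit 0 of the entry's flags field */

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/*
 * Collect the APIC IDs of all enabled CPUs listed in the MADT.
 * Returns the number of IDs stored in `ids`, or -1 if the buffer does
 * not look like a MADT. `*lapic_addr` receives the local APIC address.
 */
int madt_list_cpus(const uint8_t *madt, size_t avail,
                   uint32_t *lapic_addr, uint8_t *ids, int max_ids)
{
    if (avail < MADT_HEADER_SIZE || memcmp(madt, "APIC", 4) != 0)
        return -1;

    uint32_t length = read_le32(madt + 4);   /* total table length */
    if (length < MADT_HEADER_SIZE || length > avail)
        return -1;

    *lapic_addr = read_le32(madt + 36);

    int count = 0;
    size_t off = MADT_HEADER_SIZE;
    while (off + 2 <= length) {
        uint8_t type = madt[off];
        uint8_t len  = madt[off + 1];
        if (len < 2 || off + len > length)
            break;                           /* malformed entry, stop */

        /* Type 0 entry: processor ID, APIC ID, then 32-bit flags. */
        if (type == MADT_TYPE_LAPIC && len >= 8 &&
            (read_le32(madt + off + 4) & MADT_LAPIC_ENABLED) &&
            count < max_ids)
            ids[count++] = madt[off + 3];

        off += len;                          /* skip unknown entry types */
    }
    return count;
}
```

Every other entry type (I/O APICs, interrupt source overrides, and so on) is simply stepped over using its length byte, which is what keeps this close to parsing the MP specification tables.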
gaf wrote:
Brendan wrote:For multi-core and hyper-threading, the ACPI table reports all logical CPUs (if hyper-threading isn't disabled in the BIOS) while Intel's MP tables only report physical CPUs (i.e. multi-core and multi-chip, but not hyper-threading, AFAIK).
OK, that should clear up what I was wondering about: as the Intel tables of my Pentium 4 didn't list the secondary logical processor, I was afraid that support for these tables might have been discontinued (i.e. the tables are still there for backwards compatibility reasons, but support for new technologies like hyper-threading or multi-core systems is no longer added).
Hyper-threading is different in that there are some things that don't behave the same as "multi-chip SMP" - parts of the CPU that are shared between logical CPUs (e.g. MTRRs), performance implications (PAUSE, HLT, scheduler optimizations for better load balancing, etc.), work done by one logical CPU affecting the speed of code running on another logical CPU, etc. Combined with overall performance improvements that never reach 200% (what you'd expect with 2 separate CPUs), it's likely that an older OS that isn't optimized to suit hyper-threading won't get any performance benefit from hyper-threading (and in some cases will actually run faster when hyper-threading is disabled).

Multi-core is very similar to "multi-chip SMP", and an older OS designed for SMP will work quite well without any changes at all.

It all makes sense from a "backwards compatibility" perspective.
gaf wrote:
Brendan wrote:You have to assume the local APIC is at 0xFEE00000 when it might not be.
As far as I know the location of the APIC registers can also be determined by reading one of the machine-specific registers (Intel Manual 3a, Chapter 8.4.4 and Intel Manual 3b, Appendix B1, Register 0x1B).
For P6 and later CPUs you are entirely correct... ;)


Cheers,

Brendan

Posted: Mon Feb 26, 2007 12:16 pm
by PyroMathic
lo,

Thanks for all the replies.
I think I'm going to try to do it right and read the ACPI tables to get the local APIC IDs of the other CPUs.

Regards
PyroMathic