Re: x2APIC/NUMA Emulator
Posted: Sat Feb 21, 2009 3:43 am
Hi,
stlw wrote:
- Could somebody explain me a CPUID leaf 0xB output ? Especially EAX and EBX registers.

My understanding of it goes like this: the first listing below is what I'd expect for a CPU with hyper-threading. However, if the CPU doesn't support hyper-threading, then I'd expect it to go like the second listing.
Also note that because each level is found by shifting the x2APIC ID right, APIC IDs may not be sequential. For example, for a computer with 2 triple-core CPUs you'd probably end up with APIC IDs 0x00000000, 0x00000001, 0x00000002, 0x00000004, 0x00000005 and 0x00000006 (where APIC IDs 0x00000003 and 0x00000007 are unused), where the "number of bits to shift x2APIC ID right to get chip number" would be 2 (e.g. "0x00000004 >> 2 = chip number 1").
Code (CPU with hyper-threading):
EDX(31:0) = x2APIC ID for current logical CPU
RDX(63:32) = reserved = 0
if(ECX == 0) { // First level
    EAX(4:0) = number of bits to shift x2APIC ID right to get "core number"
    EAX(31:5) = reserved = 0
    RAX(63:32) = reserved = 0
    EBX(15:0) = number of logical CPUs per core
    EBX(31:16) = reserved = 0
    RBX(63:32) = reserved = 0
    ECX(7:0) = ECX(7:0) = *always* unchanged
    ECX(15:8) = 0x01 = returned values describe SMT
    ECX(31:16) = reserved = 0
    RCX(63:32) = reserved = 0
} else if(ECX == 1) { // Second level
    EAX(4:0) = number of bits to shift x2APIC ID right to get "chip number"
    EAX(31:5) = reserved = 0
    RAX(63:32) = reserved = 0
    EBX(15:0) = number of logical CPUs per chip
    EBX(31:16) = reserved = 0
    RBX(63:32) = reserved = 0
    ECX(7:0) = ECX(7:0) = *always* unchanged
    ECX(15:8) = 0x02 = returned values describe cores
    ECX(31:16) = reserved = 0
    RCX(63:32) = reserved = 0
} else { // Invalid initial value in ECX
    EAX(4:0) = reserved = 0
    EAX(31:5) = reserved = 0
    RAX(63:32) = reserved = 0
    EBX(15:0) = reserved = 0
    EBX(31:16) = reserved = 0
    RBX(63:32) = reserved = 0
    ECX(7:0) = ECX(7:0) = *always* unchanged
    ECX(15:8) = reserved = 0
    ECX(31:16) = reserved = 0
    RCX(63:32) = reserved = 0
}
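To make the two shift counts a bit more concrete, here's a rough Python sketch of how an OS might split an x2APIC ID once it has read EAX(4:0) for both levels. The function name `split_x2apic_id` and the example shift values (1 bit for SMT, 4 bits down to the chip) are my own assumptions for illustration, not values read from real hardware.

```python
def split_x2apic_id(x2apic_id, smt_shift, chip_shift):
    """Split an x2APIC ID into (chip, core_in_chip, thread_in_core).

    smt_shift  -- EAX(4:0) returned for ECX == 0 (shift to get "core number")
    chip_shift -- EAX(4:0) returned for ECX == 1 (shift to get "chip number")
    Both names are hypothetical; the values would come from CPUID leaf 0xB.
    """
    if not 0 <= smt_shift <= chip_shift < 32:
        raise ValueError("expected 0 <= smt_shift <= chip_shift < 32")
    # Lowest smt_shift bits select the logical CPU within a core
    thread = x2apic_id & ((1 << smt_shift) - 1)
    # Bits between the two shift counts select the core within a chip
    core = (x2apic_id >> smt_shift) & ((1 << (chip_shift - smt_shift)) - 1)
    # Everything above chip_shift selects the chip
    chip = x2apic_id >> chip_shift
    return chip, core, thread
```

With the assumed shifts of 1 and 4, `split_x2apic_id(0x13, 1, 4)` gives `(1, 1, 1)`, i.e. chip 1, core 1, second logical CPU in that core.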
Code (CPU without hyper-threading):
EDX(31:0) = x2APIC ID for current logical CPU
RDX(63:32) = reserved = 0
if(ECX == 0) { // First level
    EAX(4:0) = number of bits to shift x2APIC ID right to get "chip number"
    EAX(31:5) = reserved = 0
    RAX(63:32) = reserved = 0
    EBX(15:0) = number of logical CPUs per chip
    EBX(31:16) = reserved = 0
    RBX(63:32) = reserved = 0
    ECX(7:0) = ECX(7:0) = *always* unchanged
    ECX(15:8) = 0x02 = returned values describe cores
    ECX(31:16) = reserved = 0
    RCX(63:32) = reserved = 0
} else { // Invalid initial value in ECX
    EAX(4:0) = reserved = 0
    EAX(31:5) = reserved = 0
    RAX(63:32) = reserved = 0
    EBX(15:0) = reserved = 0
    EBX(31:16) = reserved = 0
    RBX(63:32) = reserved = 0
    ECX(7:0) = ECX(7:0) = *always* unchanged
    ECX(15:8) = reserved = 0
    ECX(31:16) = reserved = 0
    RCX(63:32) = reserved = 0
}
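The "2 triple-core CPUs" example can be checked with a few lines of Python. This is only a sketch under the assumption that the core field is rounded up to a whole number of bits (which is what the example implies); `chip_shift_for` and `enumerate_apic_ids` are hypothetical helper names, not anything from a real API.

```python
def chip_shift_for(cores_per_chip):
    """Smallest shift count that leaves room for every core on a chip.

    Assumption: the core field is rounded up to a whole number of bits.
    """
    if cores_per_chip < 1:
        raise ValueError("cores_per_chip must be at least 1")
    return (cores_per_chip - 1).bit_length()


def enumerate_apic_ids(chips, cores_per_chip):
    """List the x2APIC IDs for a system without hyper-threading."""
    shift = chip_shift_for(cores_per_chip)
    return [(chip << shift) | core
            for chip in range(chips)
            for core in range(cores_per_chip)]
```

For 2 chips with 3 cores each, `chip_shift_for(3)` is 2 and `enumerate_apic_ids(2, 3)` returns `[0, 1, 2, 4, 5, 6]`, so IDs 3 and 7 are skipped.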
stlw wrote:
- What about I/O APIC ?
It has some APIC ID as well and knows to send IPIs to one the APIC bus.
But I never seen any spec explaining if something has to change in the I/O APIC when local apics are in x2apic mode.
So it will keep sending the IPIs as usual and have the same 8-bit APIC ID ...
Have no idea how all it could work.

The best place to look is probably datasheets for Intel's X58 chipset. I took a quick look...
First, in all cases the I/O APIC is used to generate MSI interrupts (where local APICs receive MSI interrupts from both the I/O APIC and any PCI devices that support MSI). The chipset seems to have 2 modes: the normal/legacy mode, where MSI interrupts from the I/O APIC and from PCI devices are delivered "as is" to local APICs (with 8-bit APIC IDs only), and a mode where Interrupt Remapping is enabled.
If Interrupt Remapping is enabled then bits 4 to 19 in the MSI are an "interrupt handle", and the chipset uses this interrupt handle as the index into an array of "Interrupt Transformation Table Entries", where the corresponding "Interrupt Transformation Table Entry" contains the 32-bit APIC ID to use for the MSI (and delivery mode, destination mode, vector, etc).
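As a rough illustration of that lookup, here's a small Python sketch. The bit positions (4 to 19) are taken straight from my reading above and I haven't double-checked them against the datasheet; the table layout, the entry field names and the function names are all made up for the example.

```python
def interrupt_handle(msi_value):
    """Extract bits 4 to 19 (16 bits) of the MSI as the interrupt handle."""
    return (msi_value >> 4) & 0xFFFF


def remap(msi_value, table):
    """Look up the table entry selected by the MSI's interrupt handle.

    `table` is a hypothetical mapping from handle to an entry holding the
    32-bit destination APIC ID, vector, and so on.
    """
    return table[interrupt_handle(msi_value)]


# Hypothetical table with one entry that targets a 32-bit x2APIC ID
example_table = {
    0x1234: {"dest_apic_id": 0x00000100, "vector": 0x40},
}
```

Here `remap(0x12340, example_table)` would pick the entry for handle 0x1234, whose destination APIC ID (0x100) wouldn't fit in the legacy 8-bit field.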
Basically, Interrupt Remapping is used for virtualization (e.g. hypervisors) *and* for supporting 32-bit (x2APIC) APIC IDs.
Also, AFAIK an OS that uses x2APIC for local APICs may continue to use I/O APICs without any Interrupt Remapping (for example, using "broadcast to lowest priority", or by making sure target CPUs have 8-bit logical APIC IDs or 8-bit physical APIC IDs).
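For the "8-bit physical APIC IDs" option, an OS would need to know which CPUs the I/O APIC can still reach directly. A minimal Python sketch, assuming 0xFF is treated as the 8-bit broadcast destination and is therefore not usable as a target (the function name is hypothetical):

```python
def split_by_reachability(x2apic_ids):
    """Partition x2APIC IDs into (reachable, needs_remapping).

    Assumption: without Interrupt Remapping only 8-bit physical
    destinations are available, and 0xFF means broadcast.
    """
    reachable = [i for i in x2apic_ids if i < 0xFF]
    needs_remapping = [i for i in x2apic_ids if i >= 0xFF]
    return reachable, needs_remapping
```

An OS could then route I/O APIC interrupts only to the first list, and leave the CPUs in the second list to be woken by IPIs from other CPUs.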
I'm also wondering if AMD will implement x2APIC and Interrupt Remapping in the same way, or if OSs will need different code for Intel and AMD. I guess I won't find out until AMD start supporting x2APIC, but it's probably safe to assume they'll use similar methods (even if the implementation is slightly different).
Cheers,
Brendan