x2APIC/NUMA Emulator


Re: x2APIC/NUMA Emulator

Post by Brendan »

Hi,
stlw wrote:- Could somebody explain me a CPUID leaf 0xB output ? Especially EAX and EBX registers.
My understanding of it goes like this:


    EDX(31:0) = x2APIC ID for current logical CPU
    RDX(63:32) = reserved = 0

    if(ECX == 0) {   // First level
        EAX(4:0) = number of bits to shift x2APIC ID right to get "core number"
        EAX(31:5) = reserved = 0
        RAX(63:32) = reserved = 0
        EBX(15:0) = number of logical CPUs per core
        EBX(31:16) = reserved = 0
        RBX(63:32) = reserved = 0
        ECX(7:0) = input ECX(7:0) = *always* unchanged
        ECX(15:8) = 0x01 = returned values describe SMT
        ECX(31:16) = reserved = 0
        RCX(63:32) = reserved = 0
    } else if(ECX == 1) {   // Second level
        EAX(4:0) = number of bits to shift x2APIC ID right to get "chip number"
        EAX(31:5) = reserved = 0
        RAX(63:32) = reserved = 0
        EBX(15:0) = number of logical CPUs per chip
        EBX(31:16) = reserved = 0
        RBX(63:32) = reserved = 0
        ECX(7:0) = input ECX(7:0) = *always* unchanged
        ECX(15:8) = 0x02 = returned values describe cores
        ECX(31:16) = reserved = 0
        RCX(63:32) = reserved = 0
    } else {   // Invalid initial value in ECX
        EAX(4:0) = reserved = 0
        EAX(31:5) = reserved = 0
        RAX(63:32) = reserved = 0
        EBX(15:0) = reserved = 0
        EBX(31:16) = reserved = 0
        RBX(63:32) = reserved = 0
        ECX(7:0) = input ECX(7:0) = *always* unchanged
        ECX(15:8) = reserved = 0
        ECX(31:16) = reserved = 0
        RCX(63:32) = reserved = 0
    }
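The field layout above can be pulled apart with a few masks and shifts. This is only a minimal C sketch: the struct and function names (`leaf_b_level`, `decode_leaf_b`) are made up for illustration, and it assumes the caller has already executed CPUID with EAX = 0xB and the wanted level in ECX, and passes in the four 32-bit results.

```c
#include <stdint.h>

/* Decoded view of one CPUID leaf 0xB level (illustrative names only). */
struct leaf_b_level {
    uint32_t x2apic_id;     /* EDX(31:0): x2APIC ID of the current logical CPU */
    uint8_t  shift;         /* EAX(4:0): bits to shift the x2APIC ID right */
    uint16_t logical_cpus;  /* EBX(15:0): logical CPUs at this level */
    uint8_t  level_number;  /* ECX(7:0): same as the input ECX(7:0) */
    uint8_t  level_type;    /* ECX(15:8): 0 = invalid, 1 = SMT, 2 = cores */
};

/* Split the raw register values returned by CPUID leaf 0xB into fields. */
static struct leaf_b_level decode_leaf_b(uint32_t eax, uint32_t ebx,
                                         uint32_t ecx, uint32_t edx)
{
    struct leaf_b_level lvl;

    lvl.x2apic_id    = edx;
    lvl.shift        = (uint8_t)(eax & 0x1Fu);
    lvl.logical_cpus = (uint16_t)(ebx & 0xFFFFu);
    lvl.level_number = (uint8_t)(ecx & 0xFFu);
    lvl.level_type   = (uint8_t)((ecx >> 8) & 0xFFu);
    return lvl;
}
```

For example, a hyper-threaded CPU returning EAX = 1, EBX = 2, ECX = 0x100 for the first level would decode as "shift by 1, 2 logical CPUs per core, level type SMT".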
However, if the CPU doesn't support hyper-threading, then I'd expect it to go like this:


    EDX(31:0) = x2APIC ID for current logical CPU
    RDX(63:32) = reserved = 0

    if(ECX == 0) {   // First level
        EAX(4:0) = number of bits to shift x2APIC ID right to get "chip number"
        EAX(31:5) = reserved = 0
        RAX(63:32) = reserved = 0
        EBX(15:0) = number of logical CPUs per chip
        EBX(31:16) = reserved = 0
        RBX(63:32) = reserved = 0
        ECX(7:0) = input ECX(7:0) = *always* unchanged
        ECX(15:8) = 0x02 = returned values describe cores
        ECX(31:16) = reserved = 0
        RCX(63:32) = reserved = 0
    } else {   // Invalid initial value in ECX
        EAX(4:0) = reserved = 0
        EAX(31:5) = reserved = 0
        RAX(63:32) = reserved = 0
        EBX(15:0) = reserved = 0
        EBX(31:16) = reserved = 0
        RBX(63:32) = reserved = 0
        ECX(7:0) = input ECX(7:0) = *always* unchanged
        ECX(15:8) = reserved = 0
        ECX(31:16) = reserved = 0
        RCX(63:32) = reserved = 0
    }
Also note that this means APIC IDs may not be sequential. For example, for a computer with 2 triple-core CPUs you'd probably end up with APIC IDs 0x00000000, 0x00000001, 0x00000002, 0x00000004, 0x00000005 and 0x00000006 (where APIC IDs 0x00000003 and 0x00000007 are unused), because 3 cores need 2 bits of core number. In that case the "number of bits to shift x2APIC ID right to get chip number" would be 2 (e.g. "0x00000004 >> 2 = chip number 1").
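The arithmetic behind the triple-core example can be sketched in C. The helper names here (`bits_for`, `make_apic_id`, `chip_of`, `core_of`) are hypothetical, and the sketch assumes no hyper-threading, so the ID is just a chip number stacked above a core number.

```c
#include <stdint.h>

/* Smallest number of bits that can hold the values 0 .. count-1. */
static unsigned bits_for(unsigned count)
{
    unsigned bits = 0;

    while ((1u << bits) < count)
        bits++;
    return bits;
}

/* Build an APIC ID from a chip number and a core number (no SMT level). */
static uint32_t make_apic_id(uint32_t chip, uint32_t core, unsigned core_bits)
{
    return (chip << core_bits) | core;
}

/* Recover the chip number by shifting the core bits away. */
static uint32_t chip_of(uint32_t apic_id, unsigned shift)
{
    return apic_id >> shift;
}

/* Recover the core number by masking off everything above the core bits. */
static uint32_t core_of(uint32_t apic_id, unsigned shift)
{
    return apic_id & ((1u << shift) - 1u);
}
```

With 3 cores per chip, `bits_for(3)` gives 2, so chip 1 starts at `make_apic_id(1, 0, 2)` = 0x00000004, and 0x00000003 and 0x00000007 are never generated.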
stlw wrote:- What about I/O APIC ?
It has an APIC ID as well and knows how to send interrupts on the APIC bus.
But I have never seen any spec explaining whether something has to change in the I/O APIC when local APICs are in x2APIC mode.
So it will keep sending the interrupts as usual and have the same 8-bit APIC ID ...
I have no idea how it all could work.
The best place to look is probably datasheets for Intel's X58 chipset. I took a quick look...

First, in all cases the I/O APIC is used to generate MSI interrupts (where local APICs receive MSI interrupts from both the I/O APIC and any PCI devices that support MSI). The chipset seems to have 2 modes: the normal/legacy mode where MSI interrupts from the I/O APIC and from PCI devices are delivered "as is" to local APICs (with 8-bit APIC IDs only), or a mode where you enable Interrupt Remapping.

If Interrupt Remapping is enabled then bits 4 to 19 in the MSI are an "interrupt handle", and the chipset uses this interrupt handle as the index into an array of "Interrupt Transformation Table Entries", where the corresponding "Interrupt Transformation Table Entry" contains the 32-bit APIC ID to use for the MSI (and delivery mode, destination mode, vector, etc).

Basically, Interrupt Remapping is used for virtualization (e.g. hypervisors) *and* for supporting 32-bit (x2APIC) APIC IDs.

Also, AFAIK an OS that uses x2APIC for local APICs may continue to use I/O APICs without any Interrupt Remapping (for example, using "broadcast to lowest priority", or by making sure target CPUs have 8-bit logical APIC IDs or 8-bit physical APIC IDs).
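As a rough illustration of the "8-bit physical APIC ID" case, here is a C sketch that builds a legacy I/O APIC redirection table entry and refuses targets that don't fit. The function name `ioapic_rte_physical` is made up; the sketch assumes the usual redirection entry layout (vector in bits 7:0, destination in bits 63:56) with fixed delivery, physical destination mode, edge-triggered, active high and unmasked, and it treats 0xFF as the xAPIC broadcast ID rather than a real target.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Build a redirection table entry that targets one CPU by physical APIC ID.
 * Returns false if the target can't be expressed in the 8-bit destination
 * field (such CPUs would need Interrupt Remapping or another strategy).
 */
static bool ioapic_rte_physical(uint8_t vector, uint32_t apic_id, uint64_t *rte)
{
    if (apic_id >= 0xFFu)   /* 0xFF = broadcast; anything larger needs 32 bits */
        return false;

    /* All other fields left as 0: fixed delivery, physical mode, unmasked. */
    *rte = ((uint64_t)apic_id << 56) | vector;
    return true;
}
```

An OS could call this for each candidate CPU and fall back to a CPU with a small APIC ID whenever it returns false.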

I'm also wondering if AMD will implement x2APIC and Interrupt Remapping in the same way, or if OSs will need different code for Intel and AMD. I guess I won't find out until AMD start supporting x2APIC, but it's probably safe to assume they'll use similar methods (even if the implementation is slightly different).


Cheers,

Brendan

Re: x2APIC/NUMA Emulator

Post by bewing »

JohnnyTheDon wrote:ugh....

Will you be rewriting the bochs bios as well bewing?
That's my current intention. How hard could it be? :wink: