Adding SMP support
Hello!
I'm currently trying to implement SMP support in an operating system for the x86_64 architecture. I managed to find and parse the ACPI tables, and I wrote the wake-up sequence and the trampoline code, but the APs do not wake up. I should mention that the OS is loaded by Multiboot and runs on KVM.
I use the APIC in the x2APIC mode. When I enable the x2APIC mode, the EN bit in the Spurious Interrupt Register is already set, as seen in the debug messages.
This is the wake up sequence, with the values of the ICR. As I've read, in the x2APIC mode, I don't need to check the Delivery Bit in ICR, but I do, for debug purposes.
Stepping through with gdb, I see that the processors never even enter the trampoline code; they stay in the state the boot process leaves them in.
Also, one weird thing: when I send an INIT IPI with the Destination Shorthand 11, "All Excluding Self", the bootstrapping processor reboots.
What am I missing?
Re: Adding SMP support
Please don't post pictures of code. They're very difficult to work with.
CristiV wrote: This is the wake up sequence, with the values of the ICR. As I've read, in the x2APIC mode, I don't need to check the Delivery Bit in ICR, but I do, for debug purposes.
There's no delivery status bit in the ICR in x2APIC mode.
Don't read the ICR at all, just write zeroes to the reserved bits.
In x2APIC mode, the destination field is 32 bits, not 8 bits. Your IPIs are probably never reaching the target processors in the first place. Also, the destination is a logical x2APIC ID, which is not the same as the local x2APIC ID. The Intel SDM volume 3A section 10.12.10.2 explains how to find the logical x2APIC ID if you only have the local x2APIC ID.
In all modes, the vector field must be 0 for an INIT IPI.
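For reference, the derivation described in that SDM section can be sketched as a small helper. This is only a sketch, assuming the cluster-model formula logical ID = (x2APIC ID[19:4] << 16) | (1 << x2APIC ID[3:0]); the name `x2apic_logical_id` is hypothetical and does not come from the code in this thread.

```c
#include <stdint.h>

/* Hypothetical helper: derive the logical x2APIC ID from a local
 * x2APIC ID. Bits 19:4 of the local ID select the cluster and end up
 * in bits 31:16; bits 3:0 select one of 16 bit positions in bits 15:0.
 */
uint32_t x2apic_logical_id(uint32_t x2apic_id)
{
	uint32_t cluster = (x2apic_id >> 4) & 0xFFFFu;
	uint32_t bit = 1u << (x2apic_id & 0xFu);

	return (cluster << 16) | bit;
}
```

For the small IDs of a two-CPU guest this gives 0x1 for ID 0 and 0x2 for ID 1. The cluster bits only matter once IDs reach 16 or more.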
CristiV wrote: Also, one weird thing: when I send an INIT IPI with the Destination Shorthand 11, "All Excluding Self", the bootstrapping processor reboots.
Actual OSes never do this, since it might affect APs that were disabled by firmware, so I'd guess KVM doesn't support it.
Re: Adding SMP support
I calculated the Logical Address, using the Local APIC ID, because the MADT table doesn't have x2APIC entries. The ID for the BSP is the same as the one in LDR. After the INIT - SIPI sequence, the OS restarts.
[ 0.248968] dbg: [libkvmplat] <smp.c @ 88> eax: 0x4d00, edx: 0x2
[ 0.254468] dbg: [libkvmplat] <smp.c @ 94> eax: 0xd00, edx: 0x2
[ 0.277321] dbg: [libkvmplat] <smp.c @ 109> eax: 0x4e08, edx: 0x2
I enabled 2 cores through kvm
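To make the logged `eax` values easier to read, here is a small decoder for the low 32 bits of the ICR. It is a sketch based on the ICR field layout in the Intel SDM; the struct and function names are hypothetical and not part of the posted code.

```c
#include <stdint.h>

/* Fields of the low 32 bits of the ICR. */
struct icr_low {
	uint8_t vector;        /* bits 7:0 */
	uint8_t delivery_mode; /* bits 10:8, 5 = INIT, 6 = Start-up */
	uint8_t logical;       /* bit 11, 0 = physical, 1 = logical */
	uint8_t level_assert;  /* bit 14 */
	uint8_t trigger_level; /* bit 15, 0 = edge, 1 = level */
	uint8_t shorthand;     /* bits 19:18 */
};

/* Hypothetical helper: split an ICR low dword into its fields. */
struct icr_low icr_low_decode(uint32_t eax)
{
	struct icr_low f;

	f.vector = eax & 0xFF;
	f.delivery_mode = (eax >> 8) & 0x7;
	f.logical = (eax >> 11) & 0x1;
	f.level_assert = (eax >> 14) & 0x1;
	f.trigger_level = (eax >> 15) & 0x1;
	f.shorthand = (eax >> 18) & 0x3;
	return f;
}
```

Applied to the three logged values: 0x4d00 is an INIT IPI with level asserted, edge trigger and a logical destination; 0xd00 is the same with level de-asserted; 0x4e08 is a Start-up IPI with vector 0x08.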
Re: Adding SMP support
Could you post your code here if possible?
Re: Adding SMP support
CristiV wrote: I calculated the Logical Address, using the Local APIC ID, because the MADT table doesn't have x2APIC entries.
Even if it did have x2APIC entries, the MADT still only gives you local APIC IDs and not logical APIC IDs. The x2APIC entries are just for local APIC IDs that don't fit in 8 bits.
I realize now looking at the spec again that the x2APIC still supports physical destinations, so you don't actually have to use logical x2APIC IDs if you don't want to - physical ones should still work fine.
CristiV wrote: After the INIT - SIPI sequence, the OS restarts.
A triple fault on the AP could cause this.
CristiV wrote:
[ 0.248968] dbg: [libkvmplat] <smp.c @ 88> eax: 0x4d00, edx: 0x2
[ 0.254468] dbg: [libkvmplat] <smp.c @ 94> eax: 0xd00, edx: 0x2
INIT IPIs should have the trigger mode set to level. I'm sure it works fine either way, but the MP spec says to use level and not edge.
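Putting the points from this post together (physical destination, vector 0, level trigger), the value written to the x2APIC ICR for an INIT IPI could be assembled roughly as follows. This is a sketch under those assumptions; the macro and function names are hypothetical and differ from the defines posted in this thread.

```c
#include <stdint.h>

#define ICR_DM_INIT        (5u << 8)   /* delivery mode: INIT */
#define ICR_LEVEL_ASSERT   (1u << 14)
#define ICR_TRIGGER_LEVEL  (1u << 15)

/* Hypothetical helper: 64-bit value for the x2APIC ICR MSR (0x830) that
 * sends an INIT IPI to one CPU by physical x2APIC ID. The destination
 * mode bit (bit 11) stays 0 for physical mode and the vector stays 0.
 */
uint64_t x2apic_icr_init(uint32_t x2apic_id)
{
	uint32_t low = ICR_TRIGGER_LEVEL | ICR_LEVEL_ASSERT | ICR_DM_INIT;

	return ((uint64_t)x2apic_id << 32) | low;
}
```

With `wrmsr`, the low half (0xC500 here) goes in `eax` and the 32-bit destination goes in `edx`.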
Re: Adding SMP support
Ethin wrote: Could you post your code here if possible?
The whole code can be found in my git repo, if you want to test it yourself:
https://github.com/cristian-vijelie/unikraft/tree/smp
To configure it: make menuconfig -> Platform Configuration -> KVM guest, Platform Interface Options -> SMP support
Then, to enable debug messages, in the main config menu: Library Configuration -> ukdebug -> Enable debug messages, Kernel message level -> Show all types of messages
Finally, to build and run it:
make && sudo qemu-system-x86_64 -smp 2 -enable-kvm -m 128 -cpu host -kernel build/unikraft_kvm-x86_64 -serial stdio
To use gdb: sudo qemu-system-x86_64 -s -S -smp 2 -enable-kvm -m 128 -cpu host -kernel build/unikraft_kvm-x86_64 -serial stdio
and, in another terminal: gdb --eval-command="target remote :1234" ./build/unikraft_kvm-x86_64.dbg
I'll also post parts of the code:
Defines:
Code:
#define IA32_APIC_BASE 0x1b
#define x2APIC_BASE 0x800
#define x2APIC_SPUR 0x80F
#define x2APIC_ESR 0x828
#define x2APIC_ICR 0x830
#define x2APIC_BASE_EXTD 10
#define x2APIC_BASE_EN 11
#define x2APIC_CPUID_BIT 21
#define x2APIC_SPUR_EN 8
#define x2APIC_ICR_DMODE_SMI 0x200
#define x2APIC_ICR_DMODE_NMI 0x400
#define x2APIC_ICR_DMODE_INIT 0x500
#define x2APIC_ICR_DMODE_SUP 0x600
#define x2APIC_ICR_DESTMODE_LOGICAL 0x800
#define x2APIC_ICR_LEVEL_ASSERT 0x4000
#define x2APIC_ICR_TRIGGER_LEVEL 0x8000
#define x2apic_logical_dest(x) ((((x) & 0xfff0) << 16) | (1 << ((x) & 0x000f)))
Code:
void enable_cores(__u8 numcores)
{
	__u8 bspid, ret;
	int i, j;
	__u32 ecx, eax, edx;
	bspid = ukplat_lcpu_id();
	uk_pr_info("Bootstrapping processor has the ID %d\n", bspid);
	if (numcores > smp_numcores) {
		uk_pr_info("Too many cores have been selected to be enabled. "
			   "Truncating to %d!\n",
			   smp_numcores);
		numcores = smp_numcores;
	}
	memcpy((void *)0x8000, &_lcpu_start16, 4096);
	uk_pr_info("Copied AP boot code to 0x8000\n");
	uk_pr_debug("Computed logical ID for core 0: %d\n",
		    ((lapic_ids[0] & 0xff00) << 16)
			| (1 << (lapic_ids[0] & 0x00ff)));
	rdmsr(0x80D, &eax, &edx);
	uk_pr_debug("Logical ID from LDR: %d\n", eax);
	for (i = 0; i < numcores; i++) {
		if (i == bspid)
			continue;
		/* clear APIC errors */
		wrmsr(x2APIC_ESR, 0, 0);
		/* select AP and trigger INIT IPI */
		eax = x2APIC_ICR_LEVEL_ASSERT | x2APIC_ICR_DESTMODE_LOGICAL
		      | x2APIC_ICR_DMODE_INIT;
		edx = x2apic_logical_dest(lapic_ids[i]);
		uk_pr_debug("eax: 0x%x, edx: 0x%x\n", eax, edx);
		wrmsr(x2APIC_ICR, eax, edx);
		/* deassert */
		eax = x2APIC_ICR_DESTMODE_LOGICAL | x2APIC_ICR_DMODE_INIT;
		edx = x2apic_logical_dest(lapic_ids[i]);
		uk_pr_debug("eax: 0x%x, edx: 0x%x\n", eax, edx);
		wrmsr(x2APIC_ICR, eax, edx);
		/* wait 10 msec */
		mdelay(10);
		for (j = 0; j < 2; j++) {
			/* clear APIC errors */
			wrmsr(x2APIC_ESR, 0, 0);
			/* select AP and trigger STARTUP IPI for 0x8000 */
			eax = x2APIC_ICR_TRIGGER_LEVEL | x2APIC_ICR_LEVEL_ASSERT
			      | x2APIC_ICR_DESTMODE_LOGICAL
			      | x2APIC_ICR_DMODE_SUP | 0x08;
			edx = x2apic_logical_dest(lapic_ids[i]);
			uk_pr_debug("eax: 0x%x, edx: 0x%x\n", eax, edx);
			wrmsr(x2APIC_ICR, eax, edx);
			/* wait 200 usec */
			udelay(200);
		}
		mdelay(10);
	}
	bspdone = 1;
}
Code:
__u8 smp_init()
{
	__u8 ret;
	__u32 eax, edx;
	ret = enable_x2apic();
	if (ret) {
		uk_pr_err("x2APIC could not be enabled!\n");
		return -1;
	}
	rdmsr(x2APIC_SPUR, &eax, &edx);
	uk_pr_debug(
	    "Spurious Interrupt Register has the values %x; EN bit: %d\n", eax,
	    (eax & (1 << x2APIC_SPUR_EN)) != 0);
	if ((eax & (1 << x2APIC_SPUR_EN)) == 0) {
		eax |= (1 << x2APIC_SPUR_EN);
		wrmsr(x2APIC_SPUR, eax, edx);
		uk_pr_debug("Spurious interrupt enabled\n");
	}
	find_madt();
	if (madt == NULL)
		return -1;
	get_lapicid();
	return 0;
}
Code:
static __u8 enable_x2apic(void)
{
	__u32 eax, edx, ecx;
	__asm__ __volatile__("mov $1, %%eax; cpuid;" : "=c"(ecx) : :);
	if (ecx & (1 << x2APIC_CPUID_BIT))
		uk_pr_debug("x2APIC is supported; enabling\n");
	else {
		uk_pr_info("x2APIC is not supported\n");
		return 1;
	}
	rdmsr(IA32_APIC_BASE, &eax, &edx);
	uk_pr_debug(
	    "IA32_APIC_BASE has the value %x; EN bit: %d, EXTD bit: %d\n", eax,
	    (eax & (1 << x2APIC_BASE_EN)) != 0,
	    (eax & (1 << x2APIC_BASE_EXTD)) != 0);
	/* set the x2APIC enable bit */
	eax |= (1 << x2APIC_BASE_EXTD);
	wrmsr(IA32_APIC_BASE, eax, edx);
	uk_pr_info("x2APIC is enabled\n");
	return 0;
}
The trampoline code, which doesn't seem to be reached:
Code:
#define ENTRY(x) .globl x; .type x,%function; x:
#define END(x) .size x, . - x
.code16
ENTRY(_lcpu_start16)
r_base = .
	cli
	cld
	wbinvd
	mov %cs, %ax
	mov %ax, %ds
	mov %ax, %es
	mov %ax, %ss
	movw $(trampoline_stack_end - r_base), %sp
	movl %cr0, %eax
	orl $1, %eax
	movl %eax, %cr0
	ljmpl *(_lcpu_start32_vector - r_base)
END(_lcpu_start16)
.code32
.align 32
ENTRY(_lcpu_start32)
	cld
	/* 1: enable pae */
	movl %cr4, %eax
	orl $X86_CR4_PAE, %eax
	movl %eax, %cr4
	/* 2: enable long mode */
	movl $0xc0000080, %ecx
	rdmsr
	orl $X86_EFER_LME, %eax
	orl $X86_EFER_NXE, %eax
	wrmsr
	/* 3: load pml4 pointer */
	movl $cpu_pml4, %eax
	movl %eax, %cr3
	/* 4: enable paging */
	movl %cr0, %eax
	orl $X86_CR0_PG, %eax
	movl %eax, %cr0
	jmp _lcpu_start64
	/* NOTREACHED */
haltme2:
	cli
	hlt
	jmp haltme2
END(_lcpu_start32)
.align 64
gdt64:
	.quad 0x0000000000000000
gdt64_cs:
	.quad GDT_DESC_CODE_VAL		/* 64bit CS */
gdt64_ds:
	.quad GDT_DESC_DATA_VAL		/* DS */
	.quad 0x0000000000000000	/* TSS part 1 (via C) */
	.quad 0x0000000000000000	/* TSS part 2 (via C) */
gdt64_end:
.align 64
.type gdt64_ptr, @object
gdt64_ptr:
	.word gdt64_end-gdt64-1
	.quad gdt64
.type mxcsr_ptr, @object
mxcsr_ptr:
	.long 0x1f80			/* Intel SDM power-on default */
#include "pagetable.S"
.code64
.align 32
ENTRY(_lcpu_start64)
	lgdt (gdt64_ptr)
	/* let lret jump just one instruction ahead, but set %cs
	 * to the correct GDT entry while doing that.
	 */
	pushq $(gdt64_cs-gdt64)
	pushq $1f
	lretq
1:
	/* Set up the remaining segment registers */
	movq $(gdt64_ds-gdt64), %rax
	movq %rax, %ds
	movq %rax, %es
	movq %rax, %ss
	xorq %rax, %rax
	movq %rax, %fs
	movq %rax, %gs
	/* spinlock, wait for the BSP to finish */
spin:
	pause
	cmpb $0, bspdone
	jz spin
	lock incb smp_aprunning
	movq $_lcpu_entry_default, %rax
	jmp *%rax
END(_lcpu_start64)
.align 32
_lcpu_start32_vector:
	.long _lcpu_start32 - r_base
	.word 8, 0
.align 32
_lcpu_start64_vector:
	.long _lcpu_start64 - r_base
	.word 16, 0
trampoline_stack:
	.space 0x1000
trampoline_stack_end:
Re: Adding SMP support
Octocontrabass wrote: A triple fault on the AP could cause this.
Shouldn't kvm send me an error, or anything, if this happened?
Re: Adding SMP support
Octocontrabass wrote: A triple fault on the AP could cause this.
CristiV wrote: Shouldn't kvm send me an error, or anything, if this happened?
If I remember right, no, it doesn't. I'd recommend you set up an IDT before you do SMP initialization so you can figure out the problem. Until you do, figuring this out is going to be painful. If you set up an IDT you'll at least be able to, hopefully, figure out the problem just based on the fired interrupt.
Re: Adding SMP support
CristiV wrote: The trampoline code, which doesn't seem to be reached:
If it's reached, it will probably triple fault because it switches to protected mode without setting the GDTR.
CristiV wrote: Shouldn't kvm send me an error, or anything, if this happened?
I would expect it to do the same thing real hardware will do, unless you specifically configure otherwise. Real hardware will usually reboot if any CPU triple faults.
Ethin wrote: I'd recommend you set up an IDT before you do SMP initialization so you can figure out the problem.
An IDT on the BSP won't help you when it's an AP triple faulting.
You might try disabling KVM and using "-d int" and "-no-reboot" to see if it's really a triple fault. (Unfortunately, it seems "-d int" isn't reliable with KVM.)
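On the GDTR point above: before setting CR0.PE the trampoline needs a GDT with at least flat code and data descriptors, loaded with `lgdt`. As a rough illustration of what goes into such a table, here is how a legacy 8-byte segment descriptor is packed. This is a sketch; `gdt_descriptor` is a hypothetical helper and is not part of the posted code.

```c
#include <stdint.h>

/* Hypothetical helper: pack a legacy 8-byte segment descriptor. */
uint64_t gdt_descriptor(uint32_t base, uint32_t limit,
			uint8_t access, uint8_t flags)
{
	uint64_t d = 0;

	d |= (uint64_t)(limit & 0xFFFFu);            /* limit 15:0 */
	d |= (uint64_t)(base & 0xFFFFFFu) << 16;     /* base 23:0 */
	d |= (uint64_t)access << 40;                 /* type, S, DPL, P */
	d |= (uint64_t)((limit >> 16) & 0xFu) << 48; /* limit 19:16 */
	d |= (uint64_t)(flags & 0xFu) << 52;         /* AVL, L, D/B, G */
	d |= (uint64_t)((base >> 24) & 0xFFu) << 56; /* base 31:24 */
	return d;
}
```

A flat 32-bit code segment is `gdt_descriptor(0, 0xFFFFF, 0x9A, 0xC)` and a flat data segment is `gdt_descriptor(0, 0xFFFFF, 0x92, 0xC)`.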
Re: Adding SMP support
Octocontrabass wrote: If it's reached, it will probably triple fault because it switches to protected mode without setting the GDTR. [...] An IDT on the BSP won't help you when it's an AP triple faulting.
I should've clarified that I meant an IDT on each AP. As well as a GDT.
Re: Adding SMP support
Octocontrabass wrote: You might try disabling KVM and using "-d int" and "-no-reboot" to see if it's really a triple fault. (Unfortunately, it seems "-d int" isn't reliable with KVM.)
I was afraid of this. Without the "-cpu host" option, which cannot exist without KVM, I have to use the xAPIC mode, and I must mess with the page table. I'll update you when I manage to do this.
Re: Adding SMP support
I've done some more digging, and I found out that the AP starts, and right after the Startup IPI, it starts executing code at address 0x1, even though the Vector field specifies the address 0x8000. Any idea why this happens?
gdb-peda$ info threads
Id Target Id Frame
* 1 Thread 1.1 (CPU#0 [running]) rdmsr (hi=<synthetic pointer>, lo=<synthetic pointer>, msr=0x828)
at cpu.h:175
2 Thread 1.2 (CPU#1 [running]) 0x0000000000000019 in ?? ()
Re: Adding SMP support
Are you sure the problem isn't GDB? Last I checked, it assumes the CS base is always 0, so it displays nonsense when the CPU is in real mode with CS set to any nonzero value.
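To illustrate the arithmetic involved: in real mode the linear address is the segment value times 16 plus the offset. Assuming the AP was started by a Start-up IPI with vector 0x08 (so CS = 0x0800, IP = 0), an offset of 0x19 as shown by GDB would correspond to linear address 0x8019, which is inside the trampoline page. A minimal sketch, with a hypothetical function name:

```c
#include <stdint.h>

/* Real-mode linear address: the segment base is the selector times 16. */
uint32_t real_mode_linear(uint16_t cs, uint16_t ip)
{
	return ((uint32_t)cs << 4) + ip;
}
```

So `real_mode_linear(0x0800, 0x0019)` gives 0x8019, while a debugger that treats the CS base as 0 would report plain 0x19.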
Re: Adding SMP support
Octocontrabass wrote: Are you sure the problem isn't GDB? Last I checked, it assumes the CS base is always 0, so it displays nonsense when the CPU is in real mode with CS set to any nonzero value.
I didn't know this.
Well, I've loaded a GDT, using the example on this forum. Not an IDT, yet. But it still breaks somewhere.
Code:
.section .text
.code16
ENTRY(_lcpu_start16)
r_base = .
	cli
	cld
	ljmp $0, $0x8040
.align 16
_L8010_GDT_table:
	.long 0, 0
	.long 0x0000FFFF, 0x00CF9A00	/* flat code */
	.long 0x0000FFFF, 0x008F9200	/* flat data */
	.long 0x00000068, 0x00CF8900	/* tss */
_L8030_GDT_value:
	.word _L8030_GDT_value - _L8010_GDT_table - 1
	.long 0x8010
	.long 0, 0
.align 64
_L8040:
	xorw %ax, %ax
	movw %ax, %ds
	lgdtl 0x8030
	movw $(trampoline_stack_end - r_base), %sp
	movl %cr0, %eax
	orl $1, %eax
	movl %eax, %cr0
	ljmp *(_lcpu_start32_vector - r_base)
END(_lcpu_start16)
Re: Adding SMP support
What vector are you sending? Remember that the vector of the SIPI determines where the processor begins initialization. The start address is 000VV000H, where VV is the initialization vector. So if you didn't send a vector or specified 0 for it, you'd be starting at 00000000H. That code can jump to your actual init code if you want it to. (Section 8.4 of the Intel SDM, volume 3A, provides more info on MP init.)
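The 000VV000H rule can be written down as a pair of small helpers. This is only a sketch of the arithmetic; both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Start address selected by a Start-up IPI vector: 000VV000H. */
uint32_t sipi_start_address(uint8_t vector)
{
	return (uint32_t)vector << 12;
}

/* Hypothetical helper: vector for a 4 KiB-aligned trampoline below 1 MiB. */
uint8_t sipi_vector_for(uint32_t trampoline_addr)
{
	assert((trampoline_addr & 0xFFFu) == 0); /* must be page aligned */
	assert(trampoline_addr < 0x100000u);     /* must be below 1 MiB */
	return (uint8_t)(trampoline_addr >> 12);
}
```

For a trampoline copied to 0x8000, `sipi_vector_for(0x8000)` yields 0x08, which matches the `| 0x08` in the Start-up IPI code posted earlier in the thread.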