Hi,
giovanig wrote:Hi Brendan, thank you for your answer.
Who said that the BSP's APIC ID is zero? It may not be.
Is it better to read and test the APIC_BASE_MSR instead? Something like this:
It's better to get the physical address of the local APIC, the APIC ID for all CPUs (including those using x2APIC that have 32-bit IDs and not 8-bit IDs), and determine which APIC ID corresponds to the BSP, by parsing either the ACPI table (if possible) or the MP specification table (if there's no ACPI).
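As a rough sketch of the ACPI route, the following walks a MADT ("APIC" table) that is already mapped in memory and collects the APIC IDs of enabled CPUs, handling both 8-bit local APIC entries (type 0) and 32-bit x2APIC entries (type 9). The function name and parameters are made up for illustration. It assumes the table has already been located and checksummed, and it ignores the 64-bit address override entry.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value from a possibly unaligned pointer. */
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: returns the number of enabled CPUs found and stores
 * their APIC IDs in ids[]; *lapic_phys receives the 32-bit local APIC
 * physical address from the MADT. */
size_t madt_collect_apic_ids(const uint8_t *madt, uint32_t *ids,
                             size_t max_ids, uint32_t *lapic_phys)
{
    uint32_t total = rd32(madt + 4);   /* "Length" field of the table header */
    size_t off = 44;                   /* 36-byte header + address + flags */
    size_t count = 0;

    if (total < 44)
        return 0;
    *lapic_phys = rd32(madt + 36);

    while (off + 2 <= total) {
        uint8_t type = madt[off];
        uint8_t len = madt[off + 1];

        if (len < 2 || off + len > total)
            break;                     /* malformed entry, stop parsing */

        if (type == 0 && len >= 8) {
            /* Processor Local APIC: 8-bit ID at +3, flags at +4 */
            if ((rd32(madt + off + 4) & 1u) && count < max_ids)
                ids[count++] = madt[off + 3];
        } else if (type == 9 && len >= 16) {
            /* Processor Local x2APIC: 32-bit ID at +4, flags at +8 */
            if ((rd32(madt + off + 8) & 1u) && count < max_ids)
                ids[count++] = rd32(madt + off + 4);
        }
        off += len;
    }
    return count;
}
```

Entries whose "enabled" flag (bit 0) is clear are skipped, which is what keeps disabled logical CPUs out of the start-up list.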
giovanig wrote:Never broadcast the "INIT, SIPI, SIPI" sequence - it incorrectly attempts to start CPUs that failed their built-in self tests and logical CPUs that are disabled because hyper-threading is disabled in the BIOS.
How is the correct sequence? Should I send an INIT, SIPI, SIPI sequence for each AP? Could you give me an example?
Correct sequence should be something like:
Code:
for each CPU mentioned by ACPI or MP spec {
    send INIT IPI;
    delay()
    send Startup IPI;
    delay()
    send Startup IPI;
    wait for AP CPU to set some flag or something, with a time-out in case
    if(time-out) {
        remember AP failed to start and maybe display an error message
    }
}
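For the "send INIT IPI" and "send Startup IPI" steps, here is a minimal sketch of the values that would go into the local APIC's interrupt command register for one specific target CPU. It assumes xAPIC mode (8-bit destination in the high dword at offset 0x310, command in the low dword at offset 0x300, with the high dword written first). The function names are hypothetical, and the actual register writes and delays are left out.

```c
#include <assert.h>
#include <stdint.h>

#define ICR_DELIVERY_INIT    (5u << 8)   /* delivery mode 101b = INIT */
#define ICR_DELIVERY_STARTUP (6u << 8)   /* delivery mode 110b = Startup */
#define ICR_LEVEL_ASSERT     (1u << 14)
/* Destination shorthand (bits 18-19) stays 0: one target, no broadcast. */

/* High dword: APIC ID of the single AP being started. */
uint32_t icr_high_for(uint8_t apic_id)
{
    return (uint32_t)apic_id << 24;
}

/* Low dword for the INIT IPI. */
uint32_t icr_low_init(void)
{
    return ICR_DELIVERY_INIT | ICR_LEVEL_ASSERT;
}

/* Low dword for a Startup IPI. The vector is the 4 KiB page number of the
 * trampoline, so the trampoline must be page aligned and below 1 MiB. */
uint32_t icr_low_sipi(uint32_t trampoline_phys)
{
    assert((trampoline_phys & 0xFFFu) == 0 && trampoline_phys < 0x100000u);
    return ICR_DELIVERY_STARTUP | ICR_LEVEL_ASSERT | (trampoline_phys >> 12);
}
```

For example, a trampoline copied to 0x8000 (an address picked only for illustration) would give a Startup IPI low dword of 0x4608.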
I also tend not to use Intel's method. Sometimes (often) CPUs start after the first Startup IPI and the second Startup IPI isn't needed. Instead I do something like:
Code:
for each CPU mentioned by ACPI or MP spec {
    send INIT IPI;
    delay()
    send Startup IPI;
    wait for AP CPU to set some flag or something, with a time-out in case
    if AP CPU didn't start before time-out {
        send Startup IPI;
        wait for AP CPU to set some flag or something, with a time-out in case
        if AP CPU didn't start before time-out {
            remember AP failed to start and maybe display an error message
        }
    }
}
Note that this is racy. For example, the AP CPU may start immediately after the BSP stops waiting for it. To avoid this I add some synchronisation: when an AP CPU starts it sets a flag to tell the BSP that it started, and when the BSP sees this flag (and knows that the AP CPU did start) it sets another flag to tell the AP CPU that it can continue. Also, the last time-out should be relatively long.
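The two-flag handshake could look something like the sketch below. The struct and function names are invented for illustration, a poll count stands in for a real timer, and C11 atomics stand in for whatever the kernel actually uses. It assumes one handshake structure per AP, so a late starter can't be mistaken for the next CPU.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* One of these per AP (hypothetical layout). */
struct ap_handshake {
    atomic_bool ap_started;       /* set by the AP as soon as it runs */
    atomic_bool ap_may_continue;  /* set by the BSP once it has seen the AP */
};

/* AP side: the first thing the AP does. */
void ap_announce(struct ap_handshake *h)
{
    atomic_store(&h->ap_started, true);
}

/* AP side: the AP polls this and stays parked until it returns true. */
bool ap_released(struct ap_handshake *h)
{
    return atomic_load(&h->ap_may_continue);
}

/* BSP side: wait for the AP, giving up after max_polls checks.
 * Returns true if the AP started and was told to continue. */
bool bsp_wait_for_ap(struct ap_handshake *h, unsigned long max_polls)
{
    while (max_polls--) {
        if (atomic_load(&h->ap_started)) {
            atomic_store(&h->ap_may_continue, true);
            return true;
        }
    }
    return false;
}
```

If the AP starts just after the BSP has given up, it announces itself but is never released, so it stays parked instead of running around in a system that thinks it failed.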
There are also ways to speed this up a little (imagine there are 128 CPUs and each CPU takes 10 ms to start - it'd create a noticeable delay during boot if you start them one at a time). One way is for the BSP to start 1 CPU, then both CPUs start 2 more, then 4 CPUs start 4 more, etc. Another way is to send the INIT IPI to each AP CPU, then have one delay, then send Startup IPIs to each CPU, etc.
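To put rough numbers on the doubling approach, this small sketch counts how many rounds are needed if every running CPU starts one more CPU per round. The function name is made up, and it assumes each round costs about as much as starting a single CPU.

```c
/* Number of doubling rounds needed until total_cpus CPUs are running,
 * starting from the BSP alone. */
unsigned doubling_rounds(unsigned total_cpus)
{
    unsigned long long running = 1;   /* only the BSP at first */
    unsigned rounds = 0;

    while (running < total_cpus) {
        running *= 2;                 /* every running CPU starts one more */
        rounds++;
    }
    return rounds;
}
```

With 128 CPUs at 10 ms each that is 7 rounds (about 70 ms), compared with 127 one-at-a-time starts (about 1270 ms).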
giovanig wrote:If there's 7 AP CPUs, they all wait for the BSP to create one stack?
Yes, they all wait for Stacks_Ready = true, as the code above.
Is "Stacks_Ready" volatile? None of the code you've posted shows how the AP CPU sets its stack. I'm assuming you do something like "address_of_stack = (my_APIC_ID * stack_size) + address_of_stack_pool" which would be wrong (as APIC IDs are not guaranteed to be sequential).
I normally allocate the AP CPU's stack and then store the "address_of_stack" somewhere (e.g. in the trampoline) where the AP CPU I'm starting can get it.
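A minimal sketch of that idea: the stack is chosen by a sequential slot number that the BSP assigns as it starts each CPU, not by the APIC ID, and the result is placed in a small data area the AP can read. The mailbox layout, the names and the static pool are all assumptions for illustration; a real kernel would use its own allocator and put the mailbox inside the trampoline page.

```c
#include <stdbool.h>
#include <stdint.h>

#define AP_STACK_SIZE 4096u
#define MAX_CPUS      8u

/* Stand-in for a real stack allocator. */
static uint8_t stack_pool[MAX_CPUS][AP_STACK_SIZE];

/* Hypothetical data area, filled in by the BSP before it sends the IPIs. */
struct trampoline_mailbox {
    uint32_t  apic_id;    /* which CPU this start-up attempt is for */
    uintptr_t stack_top;  /* initial stack pointer (stacks grow down) */
};

/* slot is a sequential index (0, 1, 2, ...) handed out by the BSP, so
 * sparse APIC IDs such as 0, 2, 4, 6 still get adjacent stacks. */
bool prepare_ap_stack(unsigned slot, uint32_t apic_id,
                      struct trampoline_mailbox *mb)
{
    if (slot >= MAX_CPUS)
        return false;
    mb->apic_id = apic_id;
    mb->stack_top = (uintptr_t)&stack_pool[slot][0] + AP_STACK_SIZE;
    return true;
}
```

Because only one AP is started at a time in this scheme, a single mailbox can be reused for each CPU once the previous one has picked up its stack.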
giovanig wrote:Code:
DISK_IMAGE = 0x8000 ; = BOOT_IMAGE_PHY_ADDR from memory_map
; System Information
DISK_IMAGE_SETUP = (DISK_IMAGE + DISK_SECT_SIZE)
; SETUP entry point
SETUP_ENTRY = (DISK_IMAGE_SETUP + ELF_HDR_SIZE)
TRAMPOLINE_STACK = 0x3000 ; SMP trampoline stack (descendent, 4K)
;========================================================================
; TRAMPOLINE
;
; Desc: "trampolines" additional CPUs into protected mode in SMP
; configurations =
;========================================================================
trampoline:
cli ; disable interrupts
xor ax,ax ; data segment base = 0x00000
mov ds,ax
mov es,ax
mov ss,ax
mov sp,#TRAMPOLINE_STACK ; set stack pointer
;; mov [0xB8004], #(0x32 & 0xFF) ;
The AP CPU never uses the real mode stack ("TRAMPOLINE_STACK"), which is good because you've got all AP CPUs running this code (and using "TRAMPOLINE_STACK") at the same time, but the RAM you've reserved for the real mode stack is wasted.
giovanig wrote:As soon as I get this boot working, I can add ACPI support in our OS, so we can get these information from there.
There's 2 parts to ACPI. The first part is just reading various pieces of data from tables during boot. This is relatively easy and impossible for correct code to avoid. The second part is the "run-time" stuff, which includes an AML interpreter and other things. The second part is an unholy mess. You only need the first part at the moment (the easy part, not the unholy mess part).
giovanig wrote:I've made another test to see if the APs are reaching the trampoline code. I added the following code to print 'A' in the screen with background in blue:
That only says if one AP CPU started. The BSP could put a '0' in display memory and the AP CPUs could "lock inc byte []" it so you see a number that indicates how many AP CPUs have incremented it. You could even have several of these (one for each different place in the code), and insert dummy loops (e.g. just after each AP CPU's "lock inc []") to slow things down in case things happen too fast for you to see.
I'm usually too lazy for that though. I tend to just put endless loops ("jmp $") in the code to see if it crashes before it reaches a certain point (or locks up when it does reach that point). You can usually get a good idea where the crash occurs after shifting the endless loop a few times.
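For completeness, the counter idea expressed in C rather than assembly might look like the sketch below. Here an ordinary atomic byte stands in for the character cell in display memory, and an atomic add plays the role of "lock inc byte []"; the function name is hypothetical.

```c
#include <stdatomic.h>

/* Called by each AP when it reaches the checkpoint. The BSP is assumed to
 * have initialised *cell to '0' beforehand. Returns the new value. */
unsigned char checkpoint_hit(atomic_uchar *cell)
{
    return (unsigned char)(atomic_fetch_add(cell, 1) + 1);
}
```

With more than 9 CPUs the character runs past '9' into punctuation and letters, which still works as a count if you know the ASCII order.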
Cheers,
Brendan