Startup IPI does not launch the AP code to the right address

Oxmose · Post by **Oxmose** » Tue Mar 10, 2020 7:48 pm

Hey there!

I am picking my old kernel back up because I need it for some class work.
The issue I have is the following:
On QEMU and VirtualBox everything works, but on real hardware, when I send the startup IPI to a core, that core triple faults and the computer restarts.
The code I have at the start address (0x8000) is simply:

Code:

ap_start:
hlt
jmp ap_start

So it looks like the INIT IPI and the startup IPI themselves work, since the AP does start.
However, when debugging on QEMU I can see that the AP core actually starts fetching at 0x0.

The value I write for the SIPI in the LAPIC register is 0x0000_4608.
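For readers following along, the value above can be decoded field by field. This is a small host-side sketch, not code from the kernel in question; the macro and function names are made up for illustration. It assumes the usual xAPIC layout of the ICR low dword: vector in bits 0-7, delivery mode in bits 8-10 (110b for Start-Up), and the level bit at bit 14.

```c
#include <stdint.h>

/* ICR low dword fields (xAPIC layout, assumed here). */
#define ICR_DELIVERY_STARTUP (6u << 8)   /* delivery mode 110b = Start-Up */
#define ICR_LEVEL_ASSERT     (1u << 14)  /* level = assert */

/* Hypothetical helper: ICR low dword for a Startup IPI with this vector. */
static uint32_t icr_sipi(uint8_t vector)
{
    return ICR_LEVEL_ASSERT | ICR_DELIVERY_STARTUP | vector;
}

/* Hypothetical helper: the AP starts in real mode at physical
 * address vector * 0x1000 (CS = vector << 8, IP = 0). */
static uint32_t sipi_entry_address(uint8_t vector)
{
    return (uint32_t)vector << 12;
}
```

With vector 0x08 this reproduces 0x4608 and an entry point of 0x8000, so the value itself matches the intended start address. Note that nothing in the low dword selects the target core; that comes from the destination field in the ICR high dword.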

I checked the memory content at 0x8000 (mapped 1:1) and it is the correct code.
If you have any idea of what is going on let me know!
Thanks in advance!

bzt · Post by **bzt** » Thu Mar 12, 2020 8:56 am

Hi,

I don't know what's wrong with your code, but a few hints:
1. is your code at 0x8000 compiled for 16 bits?
2. what happens if an AP receives an interrupt? Do you have a proper stack for the CPU? What about the IDT? Is the PIC masked properly?
3. I believe you should set bits 18-19 of ICR to 1. According to the docs, this means send interrupt to all cores except the current one, and this works for me.

Here's my code that works on real hardware too.

I do (note I use the PS2 oscillator for delays):
1. write 0xC4500 to ICR
2. wait 10 millisec
3. write 0xC4607 to ICR
4. wait 1 millisec
5. write 0xC4607 to ICR again
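The steps above can be written down as data, which makes the constants easier to check. This is only a sketch under assumptions: `struct ipi_step` and `build_broadcast_startup` are hypothetical names, the field layout is the xAPIC ICR low dword, and the delays are the ones listed in the post.

```c
#include <stddef.h>
#include <stdint.h>

#define ICR_DELIVERY_INIT    (5u << 8)    /* delivery mode 101b = INIT */
#define ICR_DELIVERY_STARTUP (6u << 8)    /* delivery mode 110b = Start-Up */
#define ICR_LEVEL_ASSERT     (1u << 14)   /* level = assert */
#define ICR_ALL_BUT_SELF     (3u << 18)   /* destination shorthand 11b */

/* One step of the sequence (hypothetical type). */
struct ipi_step {
    uint32_t icr_low;         /* value to write to the ICR low dword */
    unsigned delay_ms_after;  /* wait this long before the next step */
};

/* Fill out[] with the broadcast INIT, SIPI, SIPI sequence. Returns the step count. */
static size_t build_broadcast_startup(uint8_t vector, struct ipi_step out[3])
{
    uint32_t common = ICR_ALL_BUT_SELF | ICR_LEVEL_ASSERT;

    out[0] = (struct ipi_step){ common | ICR_DELIVERY_INIT, 10 };
    out[1] = (struct ipi_step){ common | ICR_DELIVERY_STARTUP | vector, 1 };
    out[2] = (struct ipi_step){ common | ICR_DELIVERY_STARTUP | vector, 0 };
    return 3;
}
```

Called with vector 0x07, this yields 0xC4500, 0xC4607, 0xC4607, matching the values in the list. A kernel would then write each `icr_low` to the LAPIC ICR and call its own delay routine between steps.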

And in the AP startup code at 0x700:0000, I also explicitly set CS:IP and SS:SP (to avoid further possible troubles). Hope this helps.

Cheers,
bzt

nullplan · Post by **nullplan** » Thu Mar 12, 2020 9:35 am

bzt wrote:1. is your code at 0x8000 compiled for 16 bits?

The given code is a hlt and a short jmp, those are mode independent.

bzt wrote:2. what happens if an AP receives an interrupt? Do you have a proper stack for the CPU? What about the IDT? Is the PIC masked properly?

PIC in multi-core mode? That's a recipe for disaster. But you might be on to something here. After an INIT IPI, the core should be in reset state, which means, if my manual is not mistaken, that the IDTR is set to a base address of 0 and a limit of 64k. Given that the CPU is in real mode at that point, if an interrupt did occur, it would be vectored according to the IVT assumed present at 0. However, the very same manual also says that the flags register is initialized to a value of 2, so the interrupt flag is clear. That leaves NMIs, though. Is there a valid IVT present at address 0?

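Try the following code. It should either work or triple fault:

Code:

ap_start_:
mov ax,cs
mov ss,ax
xor sp,sp       ; first push wraps sp to the top of the segment
push word 0
push word 0
push word 0     ; six zero bytes: IDT limit 0, base 0
mov bp,sp       ; sp cannot be used as a base register in 16-bit addressing
lidt [bp]       ; [bp] defaults to the ss segment
popf            ; pop a zero word into FLAGS, so IF stays clear
.loop:
hlt
jmp .loop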

bzt wrote:3. I believe you should set bits 18-19 of ICR to 1. According to the docs, this means send interrupt to all cores except the current one, and this works for me.

I would strongly counsel against that. Sending a startup IPI to all nodes will send it to deactivated nodes as well. Sometimes they are deactivated for a reason. Sometimes cores are broken (remember triple-core CPUs? Those were quad-core CPUs with one broken core), or maybe the user really wanted to disable cores for whatever reason (e.g. power, heat), or maybe in a hyperthreading system, the hyperthreading is broken and only one thread per core works (AMD's first foray into hyperthreading had a bit of a bumpy start).

Yeah, it'll work most of the time, but I work in the software industry, I know exactly how bad code can become before it will stop working most of the time.

Oxmose · Post by **Oxmose** » Thu Mar 12, 2020 2:27 pm

Hello, and first of all thanks for the replies.

To answer the points raised:
1. My code is 16-bit compatible. The AP should not receive any interrupt at the moment it starts (it does not even have time to execute the first instruction).
I use the IO-APIC and the masks are correctly set.

2. The actual code is:

Code:

cli
ap_start:
    hlt
    jmp ap_start

3. I don't want to broadcast the IPI. I want to send it to each core in turn so that I control which ones get started.

4. Actually, I feel like the first instruction is not even fetched.

5. The IVT from the BIOS is not modified by my kernel at all (except maybe by GRUB, but I'm not sure about that).

Oxmose · Post by **Oxmose** » Sat Mar 14, 2020 8:46 pm

Hi everyone,

Long story short, it was a rookie mistake.
On QEMU and VirtualBox, the LAPIC ID == CPU ID.

I was assuming that it was the same on actual hardware. But on my computer the LAPIC IDs are as follows:
CPU 0 : LAPIC 0
CPU 1 : LAPIC 2
CPU 2 : LAPIC 4
...

So I was using the wrong LAPIC ID when sending the INIT IPI and the Startup IPI.
Now that I take this into account, everything runs smoothly.
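One way to avoid this assumption is to read the LAPIC IDs from the ACPI MADT instead of deriving them from a CPU index. The following is a rough sketch, not the code used in this kernel: `collect_lapic_ids` and `icr_high_for` are hypothetical names, `entries` is assumed to point at the MADT's variable-length entry area (after the fixed table header), only the Enabled flag is checked, and the destination field layout is the xAPIC one (APIC ID in bits 24-31 of the ICR high dword).

```c
#include <stddef.h>
#include <stdint.h>

#define MADT_TYPE_LOCAL_APIC 0u   /* Processor Local APIC entry */
#define MADT_LAPIC_ENABLED   1u   /* flags bit 0 */

/* Walk the MADT entries and collect the APIC IDs of enabled processors.
 * Entry layout for type 0: type, length, ACPI processor ID, APIC ID, flags (4 bytes).
 * Returns the number of IDs stored in ids[]. */
static size_t collect_lapic_ids(const uint8_t *entries, size_t len,
                                uint8_t *ids, size_t max_ids)
{
    size_t off = 0, n = 0;

    while (off + 2 <= len) {
        uint8_t type = entries[off];
        uint8_t elen = entries[off + 1];

        if (elen < 2 || off + elen > len)
            break;                          /* malformed entry: stop walking */

        if (type == MADT_TYPE_LOCAL_APIC && elen >= 8) {
            uint8_t apic_id = entries[off + 3];
            uint8_t flags0  = entries[off + 4];   /* low byte of the flags field */

            if ((flags0 & MADT_LAPIC_ENABLED) && n < max_ids)
                ids[n++] = apic_id;
        }
        off += elen;
    }
    return n;
}

/* ICR high dword that targets one specific LAPIC (xAPIC physical destination). */
static uint32_t icr_high_for(uint8_t apic_id)
{
    return (uint32_t)apic_id << 24;
}
```

With IDs gathered this way, the INIT and Startup IPIs for "CPU 1" on the machine above would be addressed to LAPIC ID 2, by writing `icr_high_for(2)` to the ICR high dword before writing the low dword.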

Thanks for your help!

OSDev.org
