OSDev.org

Posted: **Sat Nov 29, 2014 4:35 am**

Hi,

I have a hobby OS that I am experimenting with. I have followed James Molloy's tutorial as a guide.

So far I have been able to do the following:

Boot from my custom boot loader.
Switch to long mode directly, following the OSDev tutorial.
Set up a video driver.
Map the whole memory using a Page Map Level 4 (PML4).
Configure the PIC and plug in service routines for the PIT and RTC.

Up to this point everything is working fine. I decided to start working on SMP. I was able to parse the APIC data structure and start the APs, either through a broadcast or one by one.

Now, I started to focus on booting up one core to be able to test with.

I was able to send the INIT IPI and the SIPI successfully to boot up an AP core. I was also able to jump to the normal code that I booted the BSP with, and reach my C kernel main function. I was also able to point the new AP's CR3 to the PML4 initialized by the BSP.

I have reserved a page of memory at address 0xA000 as an area through which the two cores can exchange data. My problem is that whatever the AP (the second core) writes at memory location 0xA000 cannot be seen by the BSP. The BSP always sees the old values, while the AP sees the new values it wrote.

My second question: at this point the PIC is enabled and is firing interrupts to the BSP. I read on OSDev that I should disable the PIC and enable the APIC! Why do I have to disable it? And if I disable it, where should I connect the interrupt service routines? This is kind of vague to me, and I don't have a clear understanding of how to continue from where I am right now.

Kindly, if you can point out any actions that I can take or investigate to solve my first problem, that would be great. I would also appreciate it if anyone could explain how to disable the PIC, enable the APIC, and move my existing service routines to the APIC.

Thanks,
Karim.

Posted: **Sat Nov 29, 2014 6:23 am**

Hi,

kemosparc wrote: I have reserved a page of memory at address 0xA000 as an area through which the two cores can exchange data. My problem is that whatever the AP (the second core) writes at memory location 0xA000 cannot be seen by the BSP. The BSP always sees the old values, while the AP sees the new values it wrote.

That's very unlikely (especially if you haven't messed with the MTRRs, badly). If you're using C, make sure you use "volatile" where appropriate. Also make sure you actually are using the same addresses (e.g. not the real mode address 0xA000:0xA000 or something); and make sure there aren't any race conditions (e.g. BSP reads before AP writes, causing BSP to see the old data).
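
As a rough illustration of the race-condition point, here is a minimal sketch of a "mailbox" in the shared page. It swaps plain "volatile" for C11 atomics so that the ordering between the data and the flag is explicit. The names `struct mailbox`, `mailbox_publish` and `mailbox_try_read` are made up for this example; nothing here comes from the original code.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout for the shared page; the names are illustrative. */
struct mailbox {
    uint64_t    payload;  /* data written by the AP */
    atomic_bool ready;    /* set by the AP once the payload is in place */
};

/* AP side: write the payload first, then publish it with a release store. */
void mailbox_publish(struct mailbox *mb, uint64_t value)
{
    mb->payload = value;
    atomic_store_explicit(&mb->ready, true, memory_order_release);
}

/* BSP side: returns true and fills *out only once the AP has published.
 * A false return means "not written yet", which covers the case where
 * the BSP looks before the AP has stored anything. */
bool mailbox_try_read(struct mailbox *mb, uint64_t *out)
{
    if (!atomic_load_explicit(&mb->ready, memory_order_acquire))
        return false;
    *out = mb->payload;
    atomic_store_explicit(&mb->ready, false, memory_order_relaxed);
    return true;
}
```

In a kernel, both cores would use the same pointer, for example `(struct mailbox *)0xA000`, assuming that address is mapped identically in the PML4 both CPUs share.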

kemosparc wrote: My second question: at this point the PIC is enabled and is firing interrupts to the BSP. I read on OSDev that I should disable the PIC and enable the APIC! Why do I have to disable it?

You don't have to, but it's better if you do. Otherwise all IRQs go to the BSP (and you can't have multiple CPUs handling multiple IRQs at the same time). Also the PIC is mostly bad in general because it causes a lot of IRQ sharing for PCI devices, and for this reason it's best to use IO APIC if/where possible (including on single-CPU systems).

kemosparc wrote: And if I disable it, where should I connect the interrupt service routines? This is kind of vague to me, and I don't have a clear understanding of how to continue from where I am right now.

Instead of using the PIC to send IRQs to the CPU, you'd be using the IO APIC/s to send IRQs to the CPU. In both cases, the CPU executes the corresponding interrupt handler (described in the IDT).
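
For the "disable the PIC" part, a common approach is to mask every input on both 8259 chips by writing 0xFF to their data ports (0x21 and 0xA1). The sketch below assumes that approach; `outb` here is a stand-in that records writes into an array so the example can run outside a kernel, and a real kernel would use its own port-write routine instead.

```c
#include <stdint.h>

#define PIC1_DATA 0x21   /* master 8259 data port (interrupt mask register) */
#define PIC2_DATA 0xA1   /* slave 8259 data port (interrupt mask register) */

/* Simulated port space for this sketch only. */
static uint8_t fake_port[0x100];

/* Stand-in for a real port write; both ports used here are below 0x100. */
void outb(uint16_t port, uint8_t value)
{
    fake_port[port & 0xFF] = value;
}

/* Mask all 16 PIC inputs so the PIC stops delivering IRQs to the BSP. */
void pic_mask_all(void)
{
    outb(PIC1_DATA, 0xFF);
    outb(PIC2_DATA, 0xFF);
}
```

Many kernels also remap the PIC to unused vectors before masking it, so that any spurious PIC interrupt does not land on a CPU exception vector.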

There are some differences though. For example, "EOI" has to be different (no point sending it to the PIC when you're using the IO APIC). Also, for IO APIC each "IO APIC input" has its own entry that says which interrupt vector the CPU should use.
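
To make the "each input has its own entry" point concrete, here is a minimal sketch of programming one IO APIC redirection table entry. The register file is simulated with an array; in a kernel, `ioapic_write` would write the index to IOREGSEL (offset 0x00) and the value to IOWIN (offset 0x10) of the memory-mapped IO APIC. The function names are made up for this example.

```c
#include <stdint.h>

#define IOAPIC_REDTBL_BASE 0x10u        /* first redirection table register */
#define IOAPIC_MASKED      (1u << 16)   /* bit 16 of the low dword masks the input */

/* Simulated IO APIC register file, indexed the way IOREGSEL selects registers. */
static uint32_t fake_ioapic[0x100];

void ioapic_write(uint8_t reg, uint32_t value)
{
    fake_ioapic[reg] = value;
}

/* Route one IO APIC input to `vector` on the CPU whose local APIC ID is
 * `apic_id`. All other low-dword bits stay zero, which means fixed
 * delivery, physical destination, active high, edge triggered. */
void ioapic_route(uint8_t input, uint8_t vector, uint8_t apic_id)
{
    uint8_t lo = (uint8_t)(IOAPIC_REDTBL_BASE + 2u * input);
    uint8_t hi = (uint8_t)(lo + 1u);

    ioapic_write(lo, IOAPIC_MASKED);             /* mask while reprogramming */
    ioapic_write(hi, (uint32_t)apic_id << 24);   /* destination: bits 56-63 of the entry */
    ioapic_write(lo, vector);                    /* unmask and set the vector */
}
```

The zeroed polarity and trigger bits suit legacy ISA-style inputs. PCI interrupts are normally level triggered and active low, so for those bits 15 and 13 of the low dword would also need to be set.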

The main difference is how hardware connects devices to IO APIC inputs. For PCI devices (but not legacy ISA stuff) this will be painful, as it can mean using ACPI's AML to determine what is connected to which IO APIC input.

Cheers,

Brendan

Posted: **Sat Nov 29, 2014 3:40 pm**

It turned out to be a stack problem and I fixed it.

One question though: if I decide to continue with the PIC for now, how can I send an interrupt from the BSP to an AP and trigger execution of a service routine on the AP?

Thanks
Karim.

Posted: **Sun Nov 30, 2014 1:26 am**

Hi,

kemosparc wrote: One question though: if I decide to continue with the PIC for now, how can I send an interrupt from the BSP to an AP and trigger execution of a service routine on the AP?

Your BSP's IRQ handler would figure out which CPU to send an IPI to, then send an IPI to that CPU (via the local APIC). The other CPU would need an interrupt handler for that IPI. This may mean 16 interrupt handlers for the PIC chips (used by the BSP), plus 16 interrupt handlers for the IPIs (for the AP CPUs).

Note that there's overhead for starting an interrupt handler - e.g. pipeline flush, switch from CPL=3 to CPL=0 (including protection checks and potential cache and TLB misses while accessing IDT and GDT), then stall while waiting for instruction fetch (and more potential cache/TLB misses). If the BSP re-sends the interrupt to an AP, you pay the "starting an interrupt handler" overhead twice. It will be more efficient to just do it on the BSP without sending it to another CPU.

For a micro-kernel (where the kernel's IRQ handler only sends an "IRQ occurred" message to the device driver/s in user-space) it'd be faster to just send the messages from the BSP. For a monolithic kernel; often you've got a "top half" interrupt handler and "bottom half" interrupt handler where the top half adds the bottom half to a queue or something; and in this case it'd be faster for the "top half" to run on the BSP (where only the "bottom halves" run on AP CPUs).
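
A minimal sketch of the "top half adds the bottom half to a queue" idea follows. It is a plain ring buffer with no locking, so it only illustrates the shape; with the top half on the BSP and bottom halves on APs, the queue would need a spinlock or atomic indices. All names are invented for this example.

```c
#include <stdbool.h>
#include <stddef.h>

typedef void (*bottom_half_fn)(void *arg);

struct work_item {
    bottom_half_fn fn;
    void          *arg;
};

#define QUEUE_SIZE 16u   /* illustrative capacity, a power of two */

static struct work_item queue[QUEUE_SIZE];
static unsigned head, tail;   /* head: next item to run, tail: next free slot */

/* Top half: called from the IRQ handler. It only records the work and
 * returns, so the interrupt handler itself stays short. */
bool top_half_enqueue(bottom_half_fn fn, void *arg)
{
    if (tail - head == QUEUE_SIZE)
        return false;                       /* queue full, work not queued */
    queue[tail % QUEUE_SIZE] = (struct work_item){ fn, arg };
    tail++;
    return true;
}

/* Bottom half runner: called outside interrupt context, e.g. on an AP.
 * Returns false when there was nothing to do. */
bool bottom_half_run_one(void)
{
    if (head == tail)
        return false;
    struct work_item item = queue[head % QUEUE_SIZE];
    head++;
    item.fn(item.arg);
    return true;
}

/* Example bottom half used only to exercise the queue. */
void count_bottom_half(void *arg)
{
    ++*(int *)arg;
}
```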

Also; please don't forget that there will be interrupt sharing involved. For a very rough estimate, you might find an average of 20 PCI devices in a computer, but there are only 4 PCI IRQs. This means that for each PCI IRQ you have to ask an average of 5 device drivers if their device was responsible for the IRQ (where you can expect 4 of the 5 device drivers to check their device and say "No, IRQ wasn't from this device"). For micro-kernels, the IRQ handler just sends a message to each device driver and (as a consequence of the way communication affects scheduling) the scheduler would take care of running all the device drivers on different CPUs in parallel. For monolithic kernels you'd want to do something similar; where the "top half" does very little (and only runs on the BSP) and each device driver has its own "bottom half" where these "bottom halves" can be run on different CPUs in parallel.
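
The "ask each driver" step can be sketched as below. Each driver registers a check function for the shared line, and the dispatcher calls all of them. `struct fake_dev` stands in for real hardware, with its `pending` field playing the role of a device's interrupt status bit; every name here is an assumption made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* A driver's check returns true only if its own device raised the IRQ. */
typedef bool (*irq_check_fn)(void *dev);

struct shared_handler {
    irq_check_fn check;
    void        *dev;
};

#define MAX_SHARED 8   /* illustrative limit on drivers per line */

struct irq_line {
    struct shared_handler handlers[MAX_SHARED];
    size_t                count;
};

bool irq_line_register(struct irq_line *line, irq_check_fn check, void *dev)
{
    if (line->count == MAX_SHARED)
        return false;
    line->handlers[line->count].check = check;
    line->handlers[line->count].dev   = dev;
    line->count++;
    return true;
}

/* Ask every driver on the line and return how many claimed the IRQ. All
 * of them are asked because several devices may assert the line at once. */
size_t irq_line_dispatch(struct irq_line *line)
{
    size_t claimed = 0;
    for (size_t i = 0; i < line->count; i++) {
        if (line->handlers[i].check(line->handlers[i].dev))
            claimed++;
    }
    return claimed;
}

/* Stand-in for a real device. */
struct fake_dev {
    bool pending;
    int  serviced;
};

bool fake_dev_check(void *dev)
{
    struct fake_dev *d = dev;
    if (!d->pending)
        return false;      /* "No, IRQ wasn't from this device" */
    d->pending = false;    /* acknowledge the device */
    d->serviced++;
    return true;
}
```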

Finally; I should mention MSI ("Message Signalled Interrupts"), where the PCI device bypasses the IO APIC input and sends its interrupt directly (via a "message" over the bus). In this case you can avoid IRQ sharing completely (which helps a lot for reducing overhead - no need for "4 out of 5 device drivers say the IRQ wasn't from their device" overhead). More importantly; for MSI you don't have to care which IO APIC input the device is connected to; which avoids the need to use ACPI's AML to find out. The nice thing is that MSI is mandatory for PCI Express, so you can expect most PCI devices in modern computers to support it. The sad part is that "most" isn't the same as "all" (so you will still need to figure out which IO APIC input is used by the device for some devices); and that you need to be using the IO APIC to be able to use MSI.
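
For reference, the "message" on x86 is a memory write whose address and data encode the target CPU and the vector. The sketch below builds those two values for the simplest case (fixed delivery, physical destination, edge triggered). The function names are invented, and programming them into a device's MSI capability in PCI configuration space is not shown.

```c
#include <stdint.h>

/* MSI message address: 0xFEE in bits 31-20, destination local APIC ID in
 * bits 19-12. The remaining bits are left at zero here (no redirection
 * hint, physical destination mode). */
uint32_t msi_address(uint8_t dest_apic_id)
{
    return 0xFEE00000u | ((uint32_t)dest_apic_id << 12);
}

/* MSI message data: vector in bits 7-0. Delivery mode (bits 10-8) is left
 * at 000 for "fixed", and the trigger mode bit is left at 0 for edge. */
uint16_t msi_data(uint8_t vector)
{
    return vector;
}
```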

Cheers,

Brendan

Posted: **Sun Nov 30, 2014 3:36 am**

Hi,

Thanks a lot for your thorough reply.

I know it is not an easy task and it needs a lot of attention to detail.

But as I am still trying to learn and would like to explore and experiment, I have some very specific questions based on your explanation:

What I would like to do is this: when the BSP receives a PIT interrupt (Int # 32) via the PIC, I send the other AP core an interrupt (Int 32 also).

Attached is the layout of the lower 32 bits of the ICR from the Intel manual, and I will be referring to it in my questions:

First, is it possible to send an IPI to another core from the BSP while I am inside an interrupt service routine?
Second, what delivery mode should I use for the IPI in that case (bits 8-10 in the attached diagram)?
Third, where should I tell the AP that I am interrupting with int # 32? The vector field (bits 0-7 in the attached diagram)?
Fourth, does the AP core receive the interrupt as a normal one in that case? Meaning, if I have loaded the interrupt vector table on the AP with lidt and entry # 32 points to a service routine, does this work?
Fifth, is there any possibility that my interrupt gets lost? Do I have to implement a mechanism for the BSP to make sure that the AP received the IPI?

Thanks a lot for the help

Karim.

Posted: **Sun Nov 30, 2014 5:31 am**

Hi,

kemosparc wrote: What I would like to do is this: when the BSP receives a PIT interrupt (Int # 32) via the PIC, I send the other AP core an interrupt (Int 32 also).

For that to work, the BSP and the AP would need different IDTs. E.g. for the BSP's IDT, the IDT entry for "int 32" would point to the code that sends the IPI to the AP; and for the AP the IDT entry for "int 32" would point to the code that handles the IRQ.

It'd probably be easier to use different interrupt vectors, and have the same IDT for all CPUs. For example, the IDT entry for "int 32" points to code that sends an "int 64" to the AP; and the IDT entry for "int 64" handles the IRQ.

kemosparc wrote: First, is it possible to send an IPI to another core from the BSP while I am inside an interrupt service routine?

Yes.

However, there's no way to atomically set both the high 32 bits of the ICR and the low 32 bits at the same time. This means that (e.g.) a normal piece of kernel code might set the high 32 bits of ICR (then get interrupted by your IRQ handler that changes the high 32 bits of ICR) then set the low 32 bits of ICR (causing unpredictable bugs). To prevent that you'd just have to disable IRQs whenever doing anything with the ICR.
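
A minimal sketch of that rule: keep IRQs off across both ICR writes. The local APIC is simulated with an array (ICR low at offset 0x300, ICR high at 0x310), and `irq_save_and_disable` / `irq_restore` are stand-ins for whatever the kernel uses (for example saving RFLAGS, then `cli`, then restoring). All of these names are assumptions for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define LAPIC_ICR_LOW  0x300u
#define LAPIC_ICR_HIGH 0x310u

static uint32_t fake_lapic[0x400 / 4];  /* simulated local APIC registers */
static bool     irqs_enabled = true;    /* simulated interrupt flag */
static unsigned unsafe_icr_writes;      /* ICR writes made with IRQs enabled */

void lapic_write(uint32_t offset, uint32_t value)
{
    if ((offset == LAPIC_ICR_LOW || offset == LAPIC_ICR_HIGH) && irqs_enabled)
        unsafe_icr_writes++;            /* this is the bug being guarded against */
    fake_lapic[offset / 4] = value;
}

/* Stand-ins for the kernel's "save flags and cli" / "restore flags". */
bool irq_save_and_disable(void)
{
    bool was_enabled = irqs_enabled;
    irqs_enabled = false;
    return was_enabled;
}

void irq_restore(bool was_enabled)
{
    irqs_enabled = was_enabled;
}

/* Send an IPI without letting an IRQ handler run between the two writes. */
void lapic_send_ipi(uint8_t dest_apic_id, uint32_t icr_low)
{
    bool was_enabled = irq_save_and_disable();
    lapic_write(LAPIC_ICR_HIGH, (uint32_t)dest_apic_id << 24);
    lapic_write(LAPIC_ICR_LOW, icr_low);   /* writing the low half sends the IPI */
    irq_restore(was_enabled);
}
```

Restoring the previous state, rather than unconditionally re-enabling, lets the same function be called from inside an IRQ handler where interrupts are already off.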

kemosparc wrote: Second, what delivery mode should I use for the IPI in that case (bits 8-10 in the attached diagram)?

Most of the choices are for extremely special purposes only (SMI, NMI, INIT, StartUp). The "lowest priority" is sadly annoying, as it's only supported on some CPUs and not others. This only leaves one choice ("fixed").
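
As a small illustration of the encoding, the sketch below builds the low 32 bits of the ICR from a vector and a delivery mode, with the level bit (bit 14) set to "assert". The enum and function names are invented for this example; the bit positions follow the ICR layout in the Intel manual.

```c
#include <stdint.h>

/* Delivery mode values for ICR bits 8-10. */
enum icr_delivery_mode {
    ICR_FIXED           = 0,
    ICR_LOWEST_PRIORITY = 1,
    ICR_SMI             = 2,
    ICR_NMI             = 4,
    ICR_INIT            = 5,
    ICR_STARTUP         = 6
};

#define ICR_LEVEL_ASSERT (1u << 14)   /* level = assert */

/* Low dword of the ICR: vector in bits 0-7, delivery mode in bits 8-10,
 * physical destination mode and no shorthand (all other bits zero). */
uint32_t icr_low(uint8_t vector, enum icr_delivery_mode mode)
{
    return (uint32_t)vector | ((uint32_t)mode << 8) | ICR_LEVEL_ASSERT;
}
```

With these definitions, `icr_low(47, ICR_FIXED)` evaluates to 0x0000402F and `icr_low(64, ICR_FIXED)` to 0x00004040.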

kemosparc wrote: Third, where should I tell the AP that I am interrupting with int # 32? The vector field (bits 0-7 in the attached diagram)?

If you have one "AP interrupt vector" for each IRQ, then yes.

Alternatively, it would be possible for BSP to store the IRQ number somewhere (e.g. in a volatile global variable) and have one "AP interrupt vector"; where the AP's IRQ handler gets the IRQ from that volatile global variable. This would be slower and would involve additional synchronisation (the BSP can't change that volatile global variable until after it knows all AP CPUs have read it).

kemosparc wrote: Fourth, does the AP core receive the interrupt as a normal one in that case? Meaning, if I have loaded the interrupt vector table on the AP with lidt and entry # 32 points to a service routine, does this work?

It's not quite the same as a normal IRQ. The AP CPU would have to send an EOI to the local APIC when it's finished handling the IPI (in addition to sending an EOI to the PIC after the device drivers are finished handling the IRQ).
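
A minimal sketch of the AP side: do the work, then write to the local APIC's EOI register (offset 0xB0). The local APIC is simulated with an array so the example runs outside a kernel, and `ipi_handler_body` is a hypothetical name for the C part of the AP's interrupt handler.

```c
#include <stdint.h>

#define LAPIC_EOI 0x0B0u   /* local APIC end-of-interrupt register */

static uint32_t fake_lapic[0x400 / 4];  /* simulated local APIC registers */
static unsigned eoi_count;              /* how many EOIs were written */
static unsigned ipi_events;             /* how many IPIs were handled */

void lapic_write(uint32_t offset, uint32_t value)
{
    if (offset == LAPIC_EOI)
        eoi_count++;
    fake_lapic[offset / 4] = value;
}

/* C body of the AP's IPI handler (the assembly stub that saves registers
 * and executes iretq is not shown). */
void ipi_handler_body(void)
{
    ipi_events++;               /* placeholder for the real work */
    lapic_write(LAPIC_EOI, 0);  /* tell the local APIC the IPI is handled */
}
```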

kemosparc wrote: Fifth, is there any possibility that my interrupt gets lost? Do I have to implement a mechanism for the BSP to make sure that the AP received the IPI?

There are a few different cases here.

For newer CPUs, the local APIC will automatically retry, and it's impossible for the interrupt to get lost (as long as the interrupt vector is valid). However, before sending an IPI, you must wait for the "Delivery Status" flag in the ICR to become clear first (to ensure that the CPU has finished sending the previous IPI before you try to send another).

For older CPUs (Pentium and P6) it's possible to get checksum errors (the local APIC doesn't retry). In this case you have 2 choices:

Send the IPI; then monitor the "Delivery Status" flag in the ICR until the IPI was sent; then check the Error Status Register to see if there was a checksum error and retry if there was (possibly with a limit of 3 retries or something so you don't end up doing "retry forever", where after 3 retries you do a "hardware is dodgy" kernel panic).
Before sending an IPI, wait for the "Delivery Status" flag in the ICR to become clear first (just like for newer CPUs) then check the Error Status Register to see if the previous IPI failed and do a "hardware is dodgy" kernel panic if there was a problem. If everything is fine, send your IPI and don't wait after.

For these 2 choices; the latter option is faster - typically the previous IPI was already sent correctly and you don't need to wait for anything at all.
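
The second option could be sketched roughly as follows. The local APIC is simulated with an array (ESR at offset 0x280, ICR low at 0x300, ICR high at 0x310), `fake_latched_errors` is a test hook standing in for errors the hardware recorded, and `kernel_panic` just sets a flag. Treating the "send checksum" and "send accept" ESR bits as failure is an assumption of this sketch, as are all the names; disabling IRQs around the ICR writes is left out for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

#define LAPIC_ESR         0x280u
#define LAPIC_ICR_LOW     0x300u
#define LAPIC_ICR_HIGH    0x310u
#define ICR_SEND_PENDING  (1u << 12)  /* "Delivery Status" flag */
#define ESR_SEND_CHECKSUM (1u << 0)
#define ESR_SEND_ACCEPT   (1u << 2)

static uint32_t fake_lapic[0x400 / 4];  /* simulated local APIC registers */
static uint32_t fake_latched_errors;    /* test hook: errors the "hardware" saw */
static bool     panicked;

uint32_t lapic_read(uint32_t offset)
{
    return fake_lapic[offset / 4];
}

void lapic_write(uint32_t offset, uint32_t value)
{
    if (offset == LAPIC_ESR) {
        /* On real hardware a write makes the latched errors readable. */
        fake_lapic[offset / 4] = fake_latched_errors;
        fake_latched_errors = 0;
        return;
    }
    if (offset == LAPIC_ICR_LOW)
        value &= ~ICR_SEND_PENDING;     /* the simulated send completes at once */
    fake_lapic[offset / 4] = value;
}

void kernel_panic(const char *why)
{
    (void)why;
    panicked = true;
}

/* Wait for the previous IPI, check that it went out cleanly, then send the
 * next one without waiting afterwards. Returns false if it panicked. */
bool lapic_send_ipi_checked(uint8_t dest_apic_id, uint32_t icr_low)
{
    while (lapic_read(LAPIC_ICR_LOW) & ICR_SEND_PENDING)
        ;                               /* previous IPI still being sent */

    lapic_write(LAPIC_ESR, 0);          /* update the ESR before reading it */
    if (lapic_read(LAPIC_ESR) & (ESR_SEND_CHECKSUM | ESR_SEND_ACCEPT)) {
        kernel_panic("hardware is dodgy: previous IPI failed");
        return false;
    }

    lapic_write(LAPIC_ICR_HIGH, (uint32_t)dest_apic_id << 24);
    lapic_write(LAPIC_ICR_LOW, icr_low);
    return true;
}
```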

Cheers,

Brendan

Posted: **Sun Nov 30, 2014 9:12 am**

Thanks for the reply.

Is there a way to find out the ID of the processor that is currently executing?

Thanks,
Karim.

Posted: **Sun Nov 30, 2014 9:49 am**

Hi,

Do I have to do anything for the AP to accept the interrupts? I used this code to send the interrupt:

        Write_Local_APIC(LAPIC_ICR1, 0x01000000); // Destination: APIC ID 1 (bits 24-31)
        Write_Local_APIC(LAPIC_ICR0, 0x0000402F); // Fixed delivery, level assert, vector 47 (0x2F)

This is the same way I sent the INIT and the SIPI.

I used this statically in the code and I did not put it in the PIT yet.

I attached a service routine to interrupt 47 in the BSP's interrupt vector table and loaded the same table address on the AP using lidt.

But the interrupts do not go through.

Thanks
karim.

Posted: **Sun Nov 30, 2014 10:16 am**

Hi,

Do I have to enable the APIC on the AP to be able to receive an IPI?

Thanks,
Karim

Posted: **Sun Nov 30, 2014 12:47 pm**

Which chapters of the Intel manuals do not contain the answer?

SMP Cache Coherency and More understanding of interrupts