AP core initialization

Posted: Mon Nov 17, 2014 10:55 am
by sankarp
Hi,

I am trying to understand the (hardware) delays during a processing core's initialization. The core is moved between the offline and online states through the Linux hot-plug mechanism. Bringing a core online starts with sending an INIT IPI to the destination core's local APIC, followed by a STARTUP IPI. I noticed a few hardcoded delays in this process in the Linux source code. The data sheet - Appendix B.4 from http://www.intel.com/design/pentium/dat ... 201606.pdf - contains pseudo-code (given below) that instructs adding these delays.

BSP sends AP an INIT IPI
BSP DELAYs (10mSec)
If (APIC_VERSION is not an 82489DX)
{
BSP sends AP a STARTUP IPI
BSP DELAYs (200µSEC)
BSP sends AP a STARTUP IPI
BSP DELAYs (200µSEC)
}
BSP verifies synchronization with executing AP
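
For reference, the two IPIs in the pseudo-code above are normally issued by writing the local APIC's Interrupt Command Register (ICR). The following is only a sketch of the encodings, assuming xAPIC mode (8-bit destination APIC ID in bits 24-31 of the high dword) and a targeted, non-broadcast IPI. The helper names are hypothetical and are not taken from Linux or from the Intel document.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Fields in the low dword of the ICR (offset 0x300 from the APIC base). */
#define ICR_DM_INIT      (0x5u << 8)   /* delivery mode 101b: INIT     */
#define ICR_DM_STARTUP   (0x6u << 8)   /* delivery mode 110b: STARTUP  */
#define ICR_LEVEL_ASSERT (1u << 14)    /* level = assert               */

/* High dword of the ICR (offset 0x310): destination APIC ID, xAPIC mode. */
static uint32_t icr_high(uint8_t apic_id)
{
    return (uint32_t)apic_id << 24;
}

/* Low dword for an INIT IPI. */
static uint32_t icr_low_init(void)
{
    return ICR_DM_INIT | ICR_LEVEL_ASSERT;
}

/* The AP starts in real mode at vector * 0x1000, so the entry code
 * must sit on a 4 KiB boundary below 1 MiB. */
static bool trampoline_ok(uint32_t phys)
{
    return (phys & 0xFFFu) == 0 && phys < 0x100000u;
}

/* Low dword for a STARTUP IPI; the vector field is the page number. */
static uint32_t icr_low_startup(uint32_t trampoline_phys)
{
    assert(trampoline_ok(trampoline_phys));
    return ICR_DM_STARTUP | ICR_LEVEL_ASSERT | (trampoline_phys >> 12);
}
```

With a hypothetical trampoline at physical address 0x8000, the STARTUP IPI's low dword would carry vector 0x08. The high dword is written first, because writing the low dword is what actually sends the IPI.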

I am not sure of the reason for these hard-coded delays in the startup code. I would really appreciate it if someone could help me understand them.

- Sankar

Re: AP core initialization

Posted: Mon Nov 17, 2014 11:30 am
by Brendan
Hi,

When a CPU receives the "INIT IPI", it has to reset itself to a default state (including wiping its caches, invalidating TLBs, MTRRs, etc). This takes time. Different CPUs probably take different amounts of time - e.g. slow CPUs with very little internal state might be able to reset themselves quickly; and fast CPUs with a lot of internal state (branch buffers, streaming loop detectors, AVX, etc) might take far longer to reset.

If a CPU receives a SIPI while it's resetting its internal state, it can't store the fact that it has received the SIPI in the internal state that it's in the middle of resetting. To fix that, Intel have given us a guarantee. They've mostly said "all CPUs will take less than 10 ms to restore themselves to default state after receiving an INIT IPI". This means an OS has to wait 10 ms after sending the INIT IPI to be sure the CPU has had the time it needs to be ready to receive the SIPI.

For the 200 us delays, this is probably similar - a guarantee from Intel that all CPUs will be able to fetch their first instruction from RAM into cache and start executing within 200 us. This is likely to be "conservative" - e.g. enough time for the CPU to execute several instructions and set some sort of flag (to inform whoever started it that it actually did start).

Also note that most CPUs do start on the first SIPI, where the second SIPI "re-starts" it; and I suspect that the second SIPI is only there just in case the first was lost. For my code, if the AP CPU has already set some sort of "I'm running" flag after the first SIPI, then I skip the second SIPI and the second delay.

For the second SIPI, that delay should not be a fixed delay but a time-out. E.g. "while( (AP_CPU_started_flag_not_set_by_AP) && (timeout_not_expired) ) { // Keep waiting }". If it is used as a time-out like this, then it should be much longer (e.g. 500 ms and not 200 us).
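
A minimal sketch of the sequence described above: INIT, 10 ms, first SIPI, 200 us, then a second SIPI only if the AP has not reported in, followed by a time-out loop instead of a fixed delay. The `ap_boot_ops` hooks, the `sim_*` helpers and the 100 us polling step are all hypothetical. A real kernel would back the hooks with its own APIC and timer code, and the simulation exists only so the logic can be tried on a host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical platform hooks (not a real kernel API). */
struct ap_boot_ops {
    void (*send_init)(uint8_t apic_id);
    void (*send_sipi)(uint8_t apic_id, uint8_t vector);
    void (*delay_us)(uint32_t us);
    bool (*ap_started)(uint8_t apic_id);   /* reads the AP's "I'm running" flag */
};

enum {
    INIT_DELAY_US    = 10000,   /* 10 ms after the INIT IPI        */
    SIPI_DELAY_US    = 200,     /* 200 us after the first SIPI     */
    FINAL_TIMEOUT_US = 500000,  /* 500 ms time-out after 2nd SIPI  */
    POLL_STEP_US     = 100      /* assumed polling granularity     */
};

/* Returns true if the AP set its flag, false if the time-out expired. */
static bool start_ap(const struct ap_boot_ops *ops, uint8_t apic_id, uint8_t vector)
{
    assert(ops);

    ops->send_init(apic_id);
    ops->delay_us(INIT_DELAY_US);

    ops->send_sipi(apic_id, vector);
    ops->delay_us(SIPI_DELAY_US);
    if (ops->ap_started(apic_id))
        return true;                        /* skip second SIPI and delay */

    ops->send_sipi(apic_id, vector);
    for (uint32_t waited = 0; waited < FINAL_TIMEOUT_US; waited += POLL_STEP_US) {
        if (ops->ap_started(apic_id))
            return true;
        ops->delay_us(POLL_STEP_US);
    }
    return ops->ap_started(apic_id);
}

/* --- Host-side simulation, only for exercising the logic --- */
static unsigned sim_sipis_seen;    /* SIPIs delivered since the last INIT   */
static unsigned sim_sipis_needed;  /* AP "starts" after this many; 0 = never */
static uint64_t sim_elapsed_us;    /* total simulated delay                  */

static void sim_send_init(uint8_t id) { (void)id; sim_sipis_seen = 0; }
static void sim_send_sipi(uint8_t id, uint8_t v) { (void)id; (void)v; sim_sipis_seen++; }
static void sim_delay_us(uint32_t us) { sim_elapsed_us += us; }
static bool sim_ap_started(uint8_t id)
{
    (void)id;
    return sim_sipis_needed != 0 && sim_sipis_seen >= sim_sipis_needed;
}

static const struct ap_boot_ops sim_ops = {
    .send_init  = sim_send_init,
    .send_sipi  = sim_send_sipi,
    .delay_us   = sim_delay_us,
    .ap_started = sim_ap_started,
};

static void sim_reset(unsigned sipis_needed)
{
    sim_sipis_seen = 0;
    sim_sipis_needed = sipis_needed;
    sim_elapsed_us = 0;
}
```

Calling `sim_reset(1)` and then `start_ap(&sim_ops, 1, 0x08)` models an AP that comes up on the first SIPI, so only one SIPI is sent. `sim_reset(0)` models an AP that never starts, which exercises the full time-out path.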


Cheers,

Brendan

Re: AP core initialization

Posted: Mon Nov 17, 2014 11:59 am
by sankarp
Thanks much for the response, Brendan. I have a few follow-up questions:

1. Is there any documentation on what exactly happens in the processor when these IPIs are sent (like the micro-architectural state that is cleared, or how the first instruction is fetched on receiving the SIPI)?
2. Any idea if a timeout exists between the INIT IPI and the SIPI - does the processor expect these two IPIs to be sent within some time interval? I was wondering if the INIT IPI could be sent when the CPU is moved to the offline state and the SIPI sent when the core is moved to the online state, thereby accelerating the startup process.
3. If an INIT IPI is sent, any idea if the core will remain in an active state (not in a sleep state)? If so, sending the INIT IPI earlier would not work well, since it would waste power.

- Sankar

Re: AP core initialization

Posted: Mon Nov 17, 2014 1:28 pm
by Brendan
Hi,
sankarp wrote:1. Is there any documentation on what exactly happens in the processor when these IPIs are sent (like the micro-architectural state that is cleared, or how the first instruction is fetched on receiving the SIPI)?
I only know of the piece in the Intel manuals that describes the final software visible state (and none of the micro-arch specific internal stuff), and a few "hints" you might find in specification updates/errata. Note: There is at least one Intel CPU (I forget which) that doesn't invalidate TLBs during INIT, and the work-around is that you have to invalidate TLBs when taking it offline.
sankarp wrote:2. Any idea if a timeout exists between the INIT IPI and the SIPI - does the processor expect these two IPIs to be sent within some time interval? I was wondering if the INIT IPI could be sent when the CPU is moved to the offline state and the SIPI sent when the core is moved to the online state, thereby accelerating the startup process.
I can't think of a reason for the CPU to care if the INIT IPI arrives a long time before the SIPI. However, there are few reasons to take a CPU offline at all (e.g. rather than leaving it "online in its lowest power state"), and the largest reason is for actual hot-plug (e.g. "online -> offline -> physically removed" and "physically inserted -> offline -> online") where sending the INIT IPI during the "online->offline" transition would be pointless.
sankarp wrote:3. If an INIT IPI is sent, any idea if the core will remain in an active state (not in a sleep state)? If so, sending the INIT IPI earlier would not work well, since it would waste power.
On receiving an INIT IPI it goes into a "wait for SIPI state"; but this "wait for SIPI state" isn't described in detail by anything I've seen.


Cheers,

Brendan

Re: AP core initialization

Posted: Tue Nov 18, 2014 1:26 pm
by sankarp
Thanks much for the information - it really helps.

Re: AP core initialization

Posted: Mon Jun 08, 2015 8:49 pm
by sankarp
[UPDATE] 10 ms delay is not needed for multi-core processors (from a certain family number). Refer to this post: https://lkml.org/lkml/2015/5/12/102

Re: AP core initialization

Posted: Mon Jun 08, 2015 11:54 pm
by Brendan
Hi,
sankarp wrote:[UPDATE] 10 ms delay is not needed for multi-core processors (from a certain family number). Refer to this post: https://lkml.org/lkml/2015/5/12/102
I'd be willing to bet this wasn't tested properly, and will break on old CPUs (e.g. Pentium Pro), and/or Atom, and/or possibly even on the latest Intel desktop/server CPUs under certain conditions.


Cheers,

Brendan