
Cores initialization on intel i7

Posted: Tue Mar 13, 2012 12:10 pm
by giovanig
Hello everyone,

I am implementing core initialization on an Intel i7-2600 processor. I followed the usual initialization sequence:

BSP sends AP an INIT IPI
BSP DELAYs (10mSec)
BSP sends AP a STARTUP IPI
BSP DELAYs (200μSEC)
BSP sends AP a STARTUP IPI
BSP DELAYs (200μSEC)

My code is:

Code: Select all

if(APIC::id() == 0) { // Boot strap CPU (BSP)
	// Initialize shared CPU counter
	si->bm.n_cpus = 1;

	// Broadcast INIT IPI to all APs excluding self
	APIC::ipi_init();

	// Broadcast STARTUP IPI to all APs excluding self
	// Non-boot CPUs will run a simplified boot strap just to
	// trampoline them into protected mode
	// 0x3000 is the trampoline code, which enables protected mode and jumps to this same function
 	APIC::ipi_start(0x3000);
      ...... //BSP do other things here
} else { // Additional CPUs (APs)
	// Each AP increments the CPU counter
	CPU::finc(reinterpret_cast<volatile int &>(si->bm.n_cpus));
	// Wait for the boot strap CPU to get us a stack
	while(!Stacks_Ready);
}
The problem is that the system restarts after APIC::ipi_start. The same code runs fine on an Intel Core 2 Q9550 processor, but on the Intel i7 the machine restarts. Does anyone have a clue?

Thanks

Re: Cores initialization on intel i7

Posted: Wed Mar 14, 2012 1:56 am
by Brendan
Hi,

Code: Select all

if(APIC::id() == 0) { // Boot strap CPU (BSP)// Initialize shared CPU counter
Who said that the BSP's APIC ID is zero? It may not be.

Code: Select all

	// Broadcast INIT IPI to all APs excluding self
Never broadcast the "INIT, SIPI, SIPI" sequence - it incorrectly attempts to start CPUs that failed their built-in self tests and logical CPUs that are disabled because hyper-threading is disabled in the BIOS.

Code: Select all

      ...... //BSP do other things here
Do you make the BSP wait until all AP CPUs have started? How? Is a time-out and an "AP CPU number ?? failed to start" error message involved?

Code: Select all

	CPU::finc(reinterpret_cast<volatile int &>(si->bm.n_cpus));
"Volatile" is not enough to prevent race conditions.

Code: Select all

	// Wait for the boot strap CPU to get us a stack
	while(!Stacks_Ready);
If there's 7 AP CPUs, they all wait for the BSP to create one stack?

Where is the trampoline code? Where is the APIC code ("APIC::ipi_init()", "APIC::ipi_start()", etc)? Where are your time delays?
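
On the point above that "volatile" is not enough: volatile only stops the compiler from caching the value; the increment itself still has to be a single atomic read-modify-write (on x86, typically a lock-prefixed instruction). A minimal sketch using std::atomic, assuming a C++11 compiler. The names n_cpus and register_cpu are hypothetical, and the thread does not show whether CPU::finc already uses a locked instruction.

```cpp
#include <atomic>

// Shared CPU counter; the BSP counts as the first CPU.
std::atomic<int> n_cpus{1};

// Called once by each AP. fetch_add is one atomic read-modify-write,
// so two APs can never observe the same old value.
// Returns the value before the increment, which is unique per caller.
int register_cpu() {
    return n_cpus.fetch_add(1, std::memory_order_acq_rel);
}
```

Because the returned value is unique per caller, it could also serve as a dense CPU index that does not depend on APIC IDs.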


Cheers,

Brendan

Re: Cores initialization on intel i7

Posted: Wed Mar 14, 2012 8:00 am
by giovanig
Hi Brendan, thank you for your answer.
Brendan wrote:Who said that the BSP's APIC ID is zero? It may not be.
Would it be better to read and test the APIC_BASE_MSR instead? Something like this:

Code: Select all

static unsigned long long rdmsr(unsigned int reg) {
        unsigned long long v;
        asm volatile("rdmsr" : "=A" (v) : "c" (reg));
        return v;
}
// APIC_BASE_MSR =  0x800
// 0x100 = BSP flag (processor is the BSP)
static int is_bsp() {
      return rdmsr(APIC_BASE_MSR) & 0x100;
}
Brendan wrote:Never broadcast the "INIT, SIPI, SIPI" sequence - it incorrectly attempts to start CPUs that failed their built-in self tests and logical CPUs that are disabled because hyper-threading is disabled in the BIOS.
What is the correct sequence? Should I send an INIT, SIPI, SIPI sequence to each AP? Could you give me an example?
Brendan wrote:Do you make the BSP wait until all AP CPUs have started? How? Is a time-out and an "AP CPU number ?? failed to start" error message involved?
No, the BSP does other things, and by the time it allocates the stacks the APs are already up. The final part of the BSP code is:

Code: Select all

// Move the boot image to after SETUP, so there will be nothing else
// below SETUP to be preserved
// (SETUP code + data + 1 stack per CPU)
register char * dst = MMU::align_page(entry + size + si->bm.n_cpus * sizeof(MMU::Page));
memcpy(dst, bi, si->bm.img_size);

// Passes a pointer to the just allocated stack pool to other CPUs
Stacks = dst;
Stacks_Ready = true;
Brendan wrote:If there's 7 AP CPUs, they all wait for the BSP to create one stack?
Yes, they all wait for Stacks_Ready = true, as in the code above.
Brendan wrote:Where is the trampoline code? Where is the APIC code ("APIC::ipi_init()", "APIC::ipi_start()", etc)? Where are your time delays?

Code: Select all

DISK_IMAGE 	 = 0x8000 ; = BOOT_IMAGE_PHY_ADDR from memory_map 
; System Information 
DISK_IMAGE_SETUP =       (DISK_IMAGE + DISK_SECT_SIZE)
; SETUP entry point 
SETUP_ENTRY =            (DISK_IMAGE_SETUP + ELF_HDR_SIZE)
TRAMPOLINE_STACK = 0x3000 ; SMP trampoline stack (descendent, 4K)
;========================================================================
; TRAMPOLINE
;
; Desc: "trampolines" additional CPUs into protected mode in SMP
;	configurations							=
;========================================================================
trampoline:
		cli ; disable interrupts
		xor ax,ax ; data segment base = 0x00000
		mov ds,ax
		mov es,ax
		mov ss,ax
                mov sp,#TRAMPOLINE_STACK ; set stack pointer
	;; 		mov [0xB8004], #(0x32 & 0xFF) ; 

; Set GDTR
                lgdt	GDTR
                   
; Enable Protected Mode
                mov	eax,cr0
                or	al,#0x01	; set PE flag (bit 0)
                mov	cr0,eax
     
; Adjust selectors
                mov	bx,#2 * 8	; adjust data selectors to use
                mov	ds,bx		; GDT[2] (DATA) with RPL = 00
                mov	es,bx
                mov     fs,bx
                mov     gs,bx
                mov	ss,bx

; As Linux as86 can't generate 32 bit instructions, we have to code it by hand.
; The instruction below is a inter segment jump to GDT[GDT_CODE]:SETUP.
; Jump into "SETUP" (actually ix86 Protected Mode starts here)
;		jmp	0x0008:#SETUP_ENTRY ; here it jumps to the code I showed 
		.byte	0x66
		.byte	0xEA
		.long	SETUP_ENTRY
		.word	0x0008

//ICR_OTHERS		= (3 << 18)
//ICR_LEVEL		= (1 << 15)
//ICR_ASSERT		= (1 << 14)
//ICR_INIT		= (5 <<  8)
//ICR0_31 =		0x300
static void ipi_init() {
	// Broadcast INIT IPI to all APs excluding self
	write(ICR0_31, ICR_OTHERS | ICR_LEVEL | ICR_ASSERT | ICR_INIT);
 	while((read(ICR0_31) & ICR_PENDING));
 }

//ICR_STARTUP		= (6 <<  8)
void APIC::ipi_start(Log_Addr entry)
{
    unsigned int vector = (entry >> 12) & 0xff;

    // Broadcast STARTUP IPI to all APs excluding self twice
    write(ICR0_31, ICR_OTHERS | ICR_LEVEL | ICR_ASSERT | ICR_STARTUP | vector);
    while((read(ICR0_31) & ICR_PENDING));
    i8255::ms_delay(10); // ~ 10ms delay

    write(ICR0_31, ICR_OTHERS | ICR_LEVEL | ICR_ASSERT | ICR_STARTUP | vector);
    while((read(ICR0_31) & ICR_PENDING));

    // Give other CPUs a time to wake up (> 100ms)
    i8255::ms_delay(100);
};

static void ms_delay(int milliseconds) {
	for(; milliseconds > 0; milliseconds--) {
	    // Disable speaker so we can use channel 2 of i8253
	    port_b(port_b() & ~(SPEAKER | I8253_GATE2));

	    // Program i8253 channel 2 to count 1 ms
	    i8253::config(2, i8253::CLOCK/1000, false, false);

	    // Enable i8253 channel 2 counting
	    port_b(port_b() | I8253_GATE2);

	    // Wait for i8253 counting to finish
	    while(!(port_b() & I8253_OUT2));
	}
}
I think everything is here. Thank you again.

Re: Cores initialization on intel i7

Posted: Wed Mar 14, 2012 9:15 am
by giovanig
Just a correction: the APIC_BASE_MSR address is 0x1b, not 0x800:

Code: Select all

// APIC_BASE_MSR = 0x1b
// 0x100 = BSP flag (processor is the BSP)
static int is_bsp() {
      return rdmsr(APIC_BASE_MSR) & 0x100;
}
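
For reference, the BSP flag is only one field of that MSR. A small sketch that decodes a raw IA32_APIC_BASE value (MSR 0x1B) into its documented fields; the struct and function names are hypothetical, the raw value is assumed to come from something like the rdmsr() helper above, and reserved upper bits are assumed to read as zero.

```cpp
#include <cstdint>

// Fields of the IA32_APIC_BASE MSR (address 0x1B).
struct ApicBase {
    bool bsp;            // bit 8: this processor is the BSP
    bool x2apic;         // bit 10: x2APIC mode enabled
    bool enabled;        // bit 11: APIC global enable
    std::uint64_t base;  // bits 12 and up: physical base of the local APIC registers
};

// Decode a raw MSR value that was read elsewhere.
ApicBase decode_apic_base(std::uint64_t msr) {
    ApicBase r;
    r.bsp     = (msr >> 8) & 1;
    r.x2apic  = (msr >> 10) & 1;
    r.enabled = (msr >> 11) & 1;
    r.base    = msr & ~std::uint64_t{0xFFF};  // drop the flag bits
    return r;
}
```

For example, a raw value of 0xFEE00900 decodes to a BSP with the APIC enabled at physical address 0xFEE00000.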

Re: Cores initialization on intel i7

Posted: Wed Mar 14, 2012 9:21 am
by JAAman
I added quote and code tags to your last 2 posts, as they make it much easier to read (and as a result they increase the chances of getting a good reply). They are also required by the forum rules, so try to remember to include them in the future.

Re: Cores initialization on intel i7

Posted: Wed Mar 14, 2012 10:44 am
by giovanig
Hi Brendan,

I've found your reply about the correct INIT,SIPI,SIPI sequence in another post: http://forum.osdev.org/viewtopic.php?f= ... tartup+ipi

I've tried to implement this initialization in my ipi_init and ipi_start methods. First, I created a shared array that the APs use to tell the BSP they are alive:

Code: Select all

//aps_status[AP] = 0 means not initialized
//aps_status[AP] = 1 means initialized
//aps_status[AP] = 2 means BSP ACK
volatile unsigned int APIC::aps_status[Traits<PC>::MAX_CPUS];
Then APs start, set aps_status[AP] to 1 and wait for the ACK:

Code: Select all

 APIC::aps_status[APIC::id()] = 1; // AP is initialized
 while(APIC::aps_status[APIC::id()] != 2) ; //wait for BSP's ACK
Then I changed the BSP's ipi_init and ipi_start:

Code: Select all

//MAX_CPUS = 8
//ICR0_31 = 0x300
//ICR32_63 = 0x310
void APIC::ipi_init() {
    for(unsigned int ap_number = 1; ap_number < Traits<PC>::MAX_CPUS; ap_number++) {
        APIC::aps_status[ap_number] = 0; //AP not initialized
        write(ICR32_63, (ap_number << 24));
        write(ICR0_31, ICR_LEVEL | ICR_ASSERT | ICR_INIT);
    }
    i8255::ms_delay(10);
};

void APIC::ipi_start(Log_Addr entry)
{
    unsigned int vector = (entry >> 12) & 0xff;
    
    for(unsigned int ap_number = 1; ap_number < Traits<PC>::MAX_CPUS; ap_number++) {
        write(ICR32_63, (ap_number << 24));
        write(ICR0_31, ICR_LEVEL | ICR_ASSERT | ICR_STARTUP | vector);
    }
    
    i8255::ms_delay(200);
    
    for(unsigned int ap_number = 1; ap_number < Traits<PC>::MAX_CPUS; ap_number++) {
        if(APIC::aps_status[ap_number] == 1) //if AP is initialized
            APIC::aps_status[ap_number] = 2; //send an ACK
        else { // send a second SIPI to the AP
            write(ICR32_63, (ap_number << 24));
            write(ICR0_31, ICR_LEVEL | ICR_ASSERT | ICR_STARTUP | vector);
            i8255::ms_delay(200);
            if(APIC::aps_status[ap_number] == 1) //if AP is initialized
              APIC::aps_status[ap_number] = 2; //send an ACK
            //else AP was not initialized
        }
    }
};
I know that CPU 0 is the BSP because I ran the code without calling ipi_start and it ran, of course, without the APs. I also know that there are 7 APs, which is why MAX_CPUS is 8. You said in the other post that only the CPUs recognized by the BIOS should be initialized. How can I get this information?

Finally, I ran the new ipi_init and ipi_start implementation but got the same error: the machine restarted. I must be missing something.
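
For comparison, here is a sketch of how the two ICR dwords for a targeted (non-broadcast) INIT or STARTUP IPI can be composed in xAPIC mode. The constant and function names are hypothetical. The values follow the example encodings in the Intel SDM (0x4500 for INIT, 0x46xx for STARTUP), which leave the trigger-mode bit (bit 15) clear; the code posted above sets that bit as well, and nothing here claims that difference is the cause of the restart.

```cpp
#include <cstdint>

// ICR field encodings (xAPIC mode).
constexpr std::uint32_t ICR_DELIVERY_INIT    = 5u << 8;
constexpr std::uint32_t ICR_DELIVERY_STARTUP = 6u << 8;
constexpr std::uint32_t ICR_DELIVERY_PENDING = 1u << 12;  // read-only status bit
constexpr std::uint32_t ICR_LEVEL_ASSERT     = 1u << 14;

// High dword (register offset 0x310): destination APIC ID in bits 24-31.
constexpr std::uint32_t icr_high(std::uint8_t apic_id) {
    return static_cast<std::uint32_t>(apic_id) << 24;
}

// Low dword (register offset 0x300) for an INIT IPI, no destination shorthand.
constexpr std::uint32_t icr_low_init() {
    return ICR_LEVEL_ASSERT | ICR_DELIVERY_INIT;
}

// Low dword for a STARTUP IPI. The vector is the 4 KiB page number of the
// real-mode entry point, so the trampoline must be page-aligned below 1 MiB.
constexpr std::uint32_t icr_low_startup(std::uint32_t entry) {
    return ICR_LEVEL_ASSERT | ICR_DELIVERY_STARTUP | ((entry >> 12) & 0xFF);
}
```

Writing the low dword is what sends the IPI, so the high dword has to be written first, and polling ICR_DELIVERY_PENDING in the low dword before the next write (as the broadcast version did) avoids overwriting an IPI that is still in flight.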

Cheers,
Giovani

Re: Cores initialization on intel i7

Posted: Wed Mar 14, 2012 11:31 am
by Velko
APIC IDs are not necessarily sequential. For example, my Core i5 has APICs #0, #2, #4 and #6. You cannot fire IPIs blindly. There's no way of knowing what the effect will be.

The quick (and incorrect) way to find out APIC IDs: boot the computer up in Linux and take a look in /proc/cpuinfo.

More correct (but somewhat obsolete) way: scan and parse MultiProcessor configuration tables. You may find this article interesting.

Most correct way: get that information from ACPI.
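
As a rough illustration of the ACPI route, the sketch below walks a raw MADT (the ACPI table with signature "APIC") and collects the APIC IDs of enabled "Processor Local APIC" entries (entry type 0). It assumes the table has already been located and checksum-verified, handles only 8-bit xAPIC IDs (not the x2APIC entry type), and the function name is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Collect the APIC IDs of all enabled type-0 entries from a raw MADT image.
std::vector<std::uint8_t> enabled_apic_ids(const std::vector<std::uint8_t>& madt) {
    std::vector<std::uint8_t> ids;
    // 36-byte table header + 4-byte local APIC address + 4-byte flags.
    const std::size_t kEntriesOffset = 44;
    if (madt.size() < kEntriesOffset) return ids;

    // Table length: little-endian 32-bit field at offset 4 of the header.
    std::size_t length = madt[4] | (madt[5] << 8) | (madt[6] << 16) |
                         (static_cast<std::size_t>(madt[7]) << 24);
    if (length > madt.size()) length = madt.size();

    std::size_t pos = kEntriesOffset;
    while (pos + 2 <= length) {
        std::uint8_t type = madt[pos];
        std::uint8_t len  = madt[pos + 1];
        if (len < 2 || pos + len > length) break;  // malformed entry, stop
        if (type == 0 && len >= 8) {
            std::uint8_t apic_id = madt[pos + 3];  // byte 2 is the ACPI processor ID
            bool enabled = madt[pos + 4] & 1;      // flags, bit 0
            if (enabled) ids.push_back(apic_id);
        }
        pos += len;  // entries are variable-length
    }
    return ids;
}
```

The resulting list is what the BSP would iterate over when sending INIT and STARTUP IPIs, instead of counting from 1 to MAX_CPUS.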

Re: Cores initialization on intel i7

Posted: Wed Mar 14, 2012 11:49 am
by giovanig
I looked at /proc/cpuinfo and the APICIDs are sequential, from 0 to 7.

As soon as I get this boot working, I can add ACPI support to our OS, so we can get this information from there.

Thanks

Re: Cores initialization on intel i7

Posted: Wed Mar 14, 2012 3:08 pm
by giovanig
I've run another test to see whether the APs are reaching the trampoline code. I added the following code to print 'A' on the screen with a blue background:

Code: Select all

trampoline:
cli ; disable interrupts
mov ax,#0xb800
mov es,ax 
seg es 
mov [0],#0x41
seg es
mov [1],#0x1f
; the normal trampoline code that I posted before
I was able to see 'A' on the screen before the restart.

Giovani

Re: Cores initialization on intel i7

Posted: Wed Mar 14, 2012 10:08 pm
by Brendan
Hi,
giovanig wrote:Hi Brendan, thank you for your answer.
Who said that the BSP's APIC ID is zero? It may not be.
Is it better to read and test the APIC_BASE_MSR instead? Something like this:
It's better to get the physical address of the local APIC, the APIC ID for all CPUs (including those using x2APIC that have 32-bit IDs and not 8-bit IDs), and determine which APIC ID corresponds to the BSP, by parsing either the ACPI table (if possible) or the MP specification table (if there's no ACPI).
giovanig wrote:
Never broadcast the "INIT, SIPI, SIPI" sequence - it incorrectly attempts to start CPUs that failed their built-in self tests and logical CPUs that are disabled because hyper-threading is disabled in the BIOS.
What is the correct sequence? Should I send an INIT, SIPI, SIPI sequence to each AP? Could you give me an example?
Correct sequence should be something like:

Code: Select all

for each CPU mentioned by ACPI or MP spec {
    send INIT IPI;
    delay()
    send Startup IPI;
    delay()
    send Startup IPI;
    wait for AP CPU to set some flag or something, with a time-out in case
    if(time-out) {
        remember AP failed to start and maybe display an error message
    }
}
I also tend not to use Intel's method. Sometimes (often) CPUs start after the first Startup IPI and the second Startup IPI isn't needed. Instead I do something like:

Code: Select all

for each CPU mentioned by ACPI or MP spec {
    send INIT IPI;
    delay()
    send Startup IPI;
    wait for AP CPU to set some flag or something, with a time-out in case
    if AP CPU didn't start before time-out {
        send Startup IPI;
        wait for AP CPU to set some flag or something, with a time-out in case
        if AP CPU didn't start before time-out {
            remember AP failed to start and maybe display an error message
        }
    }
}
Note that this is racy. For example, the AP CPU may have started immediately after the BSP stops waiting for the AP CPU to start. To avoid this I add some synchronisation - when an AP CPU starts it sets some sort of flag to tell BSP that it started, and when BSP sees this flag and knows that the AP CPU did start it sets another flag to tell the AP CPU that it can continue. Also, the last time-out should be relatively long.
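
The second variant plus the handshake can be sketched in C++ roughly as follows. Everything platform-specific is hidden behind hypothetical hooks (send_init, send_startup, delay_us), and the time-out values (1 ms after the first Startup IPI, 100 ms after the second) are illustrative assumptions, not recommended figures.

```cpp
#include <atomic>
#include <cstdint>
#include <functional>

enum ApState : int { AP_NOT_STARTED = 0, AP_STARTED = 1, AP_MAY_CONTINUE = 2 };

// Platform hooks; all hypothetical and supplied by the kernel.
struct ApicHooks {
    std::function<void(std::uint8_t)> send_init;     // INIT IPI to one APIC ID
    std::function<void(std::uint8_t)> send_startup;  // Startup IPI to one APIC ID
    std::function<void(unsigned)> delay_us;          // busy-wait for N microseconds
};

// Poll the flag in 100 us steps until the AP reports in or the time-out expires.
static bool wait_started(std::atomic<int>& state, const ApicHooks& h, unsigned timeout_us) {
    for (unsigned waited = 0; waited < timeout_us; waited += 100) {
        if (state.load(std::memory_order_acquire) == AP_STARTED) return true;
        h.delay_us(100);
    }
    return state.load(std::memory_order_acquire) == AP_STARTED;
}

// Returns true if the AP started; false means "remember that it failed".
bool start_ap(std::uint8_t apic_id, std::atomic<int>& state, const ApicHooks& h) {
    state.store(AP_NOT_STARTED, std::memory_order_release);
    h.send_init(apic_id);
    h.delay_us(10000);                        // 10 ms after INIT
    h.send_startup(apic_id);
    bool up = wait_started(state, h, 1000);   // short time-out after the first SIPI
    if (!up) {
        h.send_startup(apic_id);
        up = wait_started(state, h, 100000);  // the last time-out is relatively long
    }
    if (up) state.store(AP_MAY_CONTINUE, std::memory_order_release);  // BSP's ACK
    return up;
}
```

On the AP side, the matching code would store AP_STARTED into its own flag and then spin until it reads AP_MAY_CONTINUE. An AP that wakes up after the BSP has given up never receives the ACK, so it stays parked instead of running unaccounted for.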

There are also ways to speed this up a little (imagine there's 128 CPUs and each CPU takes 10 ms to start - it'd create a noticeable delay during boot if you start them one at a time). One way is for BSP to start 1 CPU, then both CPUs start 2 more, then 4 CPUs start 4 more, etc. Another way is to send the INIT IPI to each AP CPU, then have one delay, then send Startup IPIs to each CPU, etc.
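
To put numbers on the doubling scheme: starting 127 APs one at a time at 10 ms each takes about 1270 ms, while doubling the number of running CPUs each round needs only 7 rounds, or roughly 70 ms, for 128 CPUs. This ignores any contention between CPUs that start others at the same time. A tiny sketch of the round count (the function name is hypothetical):

```cpp
// Number of doubling rounds needed until `total_cpus` CPUs are running,
// starting from one running CPU (the BSP) and assuming every running CPU
// starts exactly one more CPU per round.
unsigned doubling_rounds(unsigned total_cpus) {
    unsigned long long running = 1;  // wide type so the doubling cannot wrap
    unsigned rounds = 0;
    while (running < total_cpus) {
        running *= 2;
        ++rounds;
    }
    return rounds;
}
```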
giovanig wrote:
If there's 7 AP CPUs, they all wait for the BSP to create one stack?
Yes, they all wait for Stacks_Ready = true, as the code above.
Is "Stacks_Ready" volatile? None of the code you've posted shows how the AP CPU sets its stack. I'm assuming you do something like "address_of_stack = (my_APIC_ID * stack_size) + address_of_stack_pool" which would be wrong (as APIC IDs are not guaranteed to be sequential).

I normally allocate the AP CPU's stack and then store the "address_of_stack" somewhere (e.g. in the trampoline) where the AP CPU I'm starting can get it.
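
One way to sketch that idea: the BSP builds a small table that maps each AP's APIC ID to a stack chosen by position in the CPU list, so gaps in the APIC ID space (0, 2, 4, 6) do not matter. The struct and function names are hypothetical, and the list of AP IDs is assumed to come from ACPI or the MP table.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct ApBootInfo {
    std::uint8_t   apic_id;    // as reported by ACPI or the MP table
    std::uintptr_t stack_top;  // initial stack pointer (stacks grow down)
};

// Assign one stack per AP by position in the list, not by APIC ID value.
std::vector<ApBootInfo> assign_stacks(const std::vector<std::uint8_t>& ap_ids,
                                      std::uintptr_t pool_base,
                                      std::size_t stack_size) {
    std::vector<ApBootInfo> out;
    for (std::size_t i = 0; i < ap_ids.size(); ++i) {
        ApBootInfo info;
        info.apic_id   = ap_ids[i];
        info.stack_top = pool_base + (i + 1) * stack_size;  // top of the i-th slot
        out.push_back(info);
    }
    return out;
}
```

Before starting a given AP, the BSP would copy that AP's stack_top into a known location (for example a variable inside the trampoline page) for the AP to load.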
giovanig wrote:

Code: Select all

DISK_IMAGE 	 = 0x8000 ; = BOOT_IMAGE_PHY_ADDR from memory_map 
; System Information 
DISK_IMAGE_SETUP =       (DISK_IMAGE + DISK_SECT_SIZE)
; SETUP entry point 
SETUP_ENTRY =            (DISK_IMAGE_SETUP + ELF_HDR_SIZE)
TRAMPOLINE_STACK = 0x3000 ; SMP trampoline stack (descendent, 4K)
;========================================================================
; TRAMPOLINE
;
; Desc: "trampolines" additional CPUs into protected mode in SMP
;	configurations							=
;========================================================================
trampoline:
		cli ; disable interrupts
		xor ax,ax ; data segment base = 0x00000
		mov ds,ax
		mov es,ax
		mov ss,ax
                mov sp,#TRAMPOLINE_STACK ; set stack pointer
	;; 		mov [0xB8004], #(0x32 & 0xFF) ;
The AP CPU never uses the real mode stack ("TRAMPOLINE_STACK"), which is good because you've got all AP CPUs running this code (and using "TRAMPOLINE_STACK") at the same time, but the RAM you've reserved for the real mode stack is wasted.
giovanig wrote:As soon as I get this boot working, I can add ACPI support to our OS, so we can get this information from there.
There's 2 parts to ACPI. The first part is just reading various pieces of data from tables during boot. This is relatively easy and impossible for correct code to avoid. The second part is the "run-time" stuff, which includes an AML interpreter and other things. The second part is an unholy mess. You only need the first part at the moment (the easy part, not the unholy mess part).
giovanig wrote:I've run another test to see whether the APs are reaching the trampoline code. I added the following code to print 'A' on the screen with a blue background:
That only says if one AP CPU started. The BSP could put a '0' in display memory and the AP CPUs could "lock inc byte []" it so you see a number that indicates how many AP CPUs have incremented it. You could even have several of these (one for each different place in the code), and insert dummy loops (e.g. just after each AP CPU's "lock inc []") to slow things down in case things happen too fast for you to see.

I'm usually too lazy for that though. I tend to just put endless loops ("jmp $") in the code to see if it crashes before it reaches a certain point (or locks up when it does reach that point). You can usually get a good idea where the crash occurs after shifting the endless loop a few times.


Cheers,

Brendan

Re: Cores initialization on intel i7

Posted: Thu Mar 15, 2012 12:24 pm
by giovanig
Hi Brendan, thank you again for your reply.

I've changed the INIT,SIPI,SIPI sequence as you said and the machine is now able to boot and to start all the 7 APs. I also added the synchronization mechanism and then the BSP prints the value of each AP, that's how I know they are starting.
Brendan wrote:Is "Stacks_Ready" volatile? None of the code you've posted shows how the AP CPU sets its stack. I'm assuming you do something like "address_of_stack = (my_APIC_ID * stack_size) + address_of_stack_pool" which would be wrong (as APIC IDs are not guaranteed to be sequential).
Exactly. It is like this:

Code: Select all

register char * sp = const_cast<char *>(Stacks) - sizeof(MMU::Page) * APIC::id();
ASM("movl %0, %%esp" : : "r" (sp));
I know that if the APIC IDs are not sequential, this will break. In my case they are sequential for now; I will change this after implementing the ACPI table parsing.
Brendan wrote:That only says if one AP CPU started. The BSP could put a '0' in display memory and the AP CPUs could "lock inc byte []" it so you see a number that indicates how many AP CPUs have incremented it. You could even have several of these (one for each different place in the code), and insert dummy loops (e.g. just after each AP CPU's "lock inc []") to slow things down in case things happen too fast for you to see.

I'm usually too lazy for that though. I tend to just put endless loops ("jmp $") in the code to see if it crashes before it reaches a certain point (or locks up when it does reach that point). You can usually get a good idea where the crash occurs after shifting the endless loop a few times.
As I said, I know that the APs are starting because the BSP prints the synchronization variable of each AP. Now the problem is that after each core has set up its own stack, there is a call to the next part of the boot process, which configures paging, the IDT, the GDT, and so on.

Code: Select all

if(APIC::id() == 0) {
//BSP sends INIT,SIPI,SIPI
register char * dst = MMU::align_page(entry + size + si->bm.n_cpus * sizeof(MMU::Page)); //si->bm.n_cpus is 8 here

// Passes a pointer to the just allocated stack pool to other CPUs
Stacks = dst;
Stacks_Ready = true; //Stacks_Ready is volatile

} else { // Additional CPUs (APs)

    // Informs BSP that this AP is initialized 
    CPU::finc(reinterpret_cast<volatile int &>(si->bm.aps_status[APIC::id()])); // now aps_status[APIC::ID()] is 1. BSP will increment to 2 inside the ipi_start() method
    
    //Wait for BSP's ACK
    while(reinterpret_cast<volatile int &>(si->bm.aps_status[APIC::id()]) != 2) ;

   // Each AP increments the CPU counter
    CPU::finc(reinterpret_cast<volatile int &>(si->bm.n_cpus));

    // Wait for the boot strap CPU to get us a stack
    while(!Stacks_Ready);
}

//All cores execute the following lines, including the BSP
// Setup a single page stack for SETUP after its data segment
// Boot strap CPU gets the highest address stack
// SP = "entry" + "size" + #CPU * sizeof(Page)
// Be careful: we'll lose our old stack now, so everything we still
// need to reach PC_Setup() must be in regs or globals!
register char * sp = const_cast<char *>(Stacks) - sizeof(MMU::Page) * APIC::id();
ASM("movl %0, %%esp" : : "r" (sp));

// Pass the boot image to SETUP
ASM("pushl %0" : : "r" (Stacks));
    
if(APIC::id() != 0) { 
    CPU::cr0(CPU::CR0_PE | CPU::CR0_ET); // write 0x11 to CR0
    //while(1) ;
}

// Call setup()
// the assembly is necessary because the compiler generates
// relative calls and we need an absolute one
ASM("call *%0" : : "r" (&setup));
If I run the code with while(1) inside if(APIC::id() != 0) {}, the system does not restart (the APs stay there), and the BSP is able to print the APs' synchronization variables (all equal to 2), so I am sure that they are up. But if I run the code without while(1), the system restarts at that call *%0.

Do you have any idea about what is going on here?

Giovani

Re: Cores initialization on intel i7

Posted: Thu Mar 15, 2012 1:24 pm
by Brendan
Hi,
giovanig wrote:If I run the code with while(1) inside if(APIC::id() != 0) {}, the system does not restart (the APs stay there), and the BSP is able to print the APs' synchronization variables (all equal to 2), so I am sure that they are up. But if I run the code without while(1), the system restarts at that call *%0.

Do you have any idea about what is going on here?
The "while(1)" only affects AP CPUs, so the BSP must be able to execute the "call *%0" without trouble.

The "ASM("movl %0, %%esp" : : "r" (sp));" worries me a lot - it's never safe to mess with the location of a stack while a C function is relying on it for things like local variables.

Beyond that, I'm not too sure. It's probably best to test this on Bochs where you can get decent debugging information, and probably best to write the entire thing in assembly so you know exactly what the final code actually does (rather than hoping that the compiler generates what you think it might).


Cheers,

Brendan

Re: Cores initialization on intel i7

Posted: Thu Mar 15, 2012 1:27 pm
by aod
Hey guys,
could you recommend some widely accessible resources on SMP initialisation?

Re: Cores initialization on intel i7

Posted: Thu Mar 15, 2012 1:40 pm
by giovanig
Hello,

I ran the same code on QEMU, and it ran. The problem is on the real machine. I suppose it is something related to the control registers, or even the stack.

Re: Cores initialization on intel i7

Posted: Thu Mar 15, 2012 1:44 pm
by Brendan
Hi,
giovanig wrote:I ran the same code on QEMU, and it ran. The problem is on the real machine. I suppose it is something related to the control registers, or even the stack.
Default (single-CPU) Qemu, or multi-CPU Qemu?


Cheers,

Brendan