try to boot other cpu in lapic
Posted: Thu Aug 23, 2012 3:24 am
by hfcao
In order to support multiprocessors, I am trying to write code to boot the other CPUs. I am using the following source as a reference:
http://www.scs.stanford.edu/histar/src/kern/dev
and the file is apic.c.
I have some questions: 1) Must the LAPIC be located at 0xfee00000, even in a 64-bit OS? If so, should I write to that address directly, without mapping it?
2) Since the OS I am writing is a 64-bit OS, I map 0xfee00000 to a virtual address in my OS. However, when I run the code, the other CPUs do not boot correctly; it looks like a deadlock. Can anyone give me some tips? Thanks in advance~
Re: try to boot other cpu in lapic
Posted: Thu Aug 23, 2012 8:45 am
by Combuster
1) The location is a hardcoded value on old CPUs, and can be found in the APIC base model-specific register (MSR) on anything current.
2) The APIC is mapped into the physical address space. Paging and segmentation will apply.
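For reference, a minimal sketch of decoding the APIC base MSR mentioned above. It assumes the value has already been read with RDMSR from MSR 0x1B (IA32_APIC_BASE); the struct and function names are hypothetical and not taken from the HiStar code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IA32_APIC_BASE is MSR 0x1B. Reading it needs the privileged RDMSR
 * instruction, so this sketch only decodes a value that was already read. */
#define APIC_BASE_MSR        0x1Bu
#define APIC_BASE_BSP        (1ull << 8)           /* this CPU is the BSP */
#define APIC_BASE_ENABLE     (1ull << 11)          /* local APIC globally enabled */
#define APIC_BASE_ADDR_MASK  0x000FFFFFFFFFF000ull /* 4 KiB aligned physical base */

static_assert((APIC_BASE_ADDR_MASK & 0xFFFull) == 0,
              "APIC base is always 4 KiB aligned");

struct apic_base_info {      /* hypothetical helper type */
    uint64_t phys_base;      /* physical address of the local APIC registers */
    bool     is_bsp;
    bool     enabled;
};

static struct apic_base_info decode_apic_base(uint64_t msr_value)
{
    struct apic_base_info info;
    info.phys_base = msr_value & APIC_BASE_ADDR_MASK;
    info.is_bsp    = (msr_value & APIC_BASE_BSP) != 0;
    info.enabled   = (msr_value & APIC_BASE_ENABLE) != 0;
    return info;
}
```

The result is a physical address, so a 64-bit kernel with paging enabled still has to map it (ideally uncached) before touching any register.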
Re: try to boot other cpu in lapic
Posted: Thu Aug 23, 2012 11:05 am
by Brendan
Hi,
This code is for systems with an external local APIC (e.g. an 82489DX chip) and isn't suitable for anything with a local APIC built into the CPU. This means that it's useless for anything that has a "Pentium or later" CPU in it. In theory this code will work on "80486 or older" systems that support SMP; but in practice these systems are so rare that it's impossible to tell if any of them still exist.
hfcao wrote:I have some questions: 1) Must the LAPIC be located at 0xfee00000, even in a 64-bit OS? If so, should I write to that address directly, without mapping it?
The safest way is to assume the local APIC could be anywhere (and not just at 0xFEE00000) and find the physical address of the local APIC from the ACPI or MP specification tables when/while you're finding out the APIC IDs for AP CPUs.
hfcao wrote:2) Since the OS I am writing is a 64-bit OS, I map 0xfee00000 to a virtual address in my OS. However, when I run the code, the other CPUs do not boot correctly; it looks like a deadlock. Can anyone give me some tips? Thanks in advance~
My tip is, never automatically trust any information provided by any university. Even if a university says "frozen water is cold", double check.
Cheers,
Brendan
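As an illustration of taking the local APIC address and the APIC IDs from ACPI, here is a rough sketch that walks an ACPI MADT ("APIC" table) already located and mapped by the caller. The `madt_summary` type, `parse_madt()` and the 64-CPU limit are assumptions made for this example; checksum validation and x2APIC entries are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MADT_HEADER_LEN      44u  /* 36-byte SDT header + LAPIC address + flags */
#define MADT_TYPE_LAPIC      0u   /* Processor Local APIC entry */
#define MADT_TYPE_LAPIC_OVR  5u   /* Local APIC Address Override entry */
#define MAX_CPUS             64u  /* arbitrary limit for this sketch */

struct madt_summary {             /* hypothetical result type */
    uint64_t lapic_phys;          /* physical address of the local APIC */
    uint8_t  apic_ids[MAX_CPUS];  /* APIC IDs of enabled processors */
    size_t   cpu_count;
};

/* Unaligned reads; assumes a little-endian host, which x86 is. */
static uint32_t rd32(const uint8_t *p) { uint32_t v; memcpy(&v, p, 4); return v; }
static uint64_t rd64(const uint8_t *p) { uint64_t v; memcpy(&v, p, 8); return v; }

/* Returns 0 on success, -1 if the table looks malformed. */
static int parse_madt(const uint8_t *madt, struct madt_summary *out)
{
    assert(madt != NULL && out != NULL);
    if (memcmp(madt, "APIC", 4) != 0)
        return -1;
    uint32_t total = rd32(madt + 4);       /* table length including header */
    if (total < MADT_HEADER_LEN)
        return -1;

    out->lapic_phys = rd32(madt + 36);     /* 32-bit default address */
    out->cpu_count = 0;

    for (uint32_t off = MADT_HEADER_LEN; off + 2 <= total; ) {
        uint8_t type = madt[off];
        uint8_t len  = madt[off + 1];
        if (len < 2 || off + len > total)
            return -1;                     /* truncated or corrupt entry */
        if (type == MADT_TYPE_LAPIC && len >= 8) {
            uint32_t flags = rd32(madt + off + 4);
            if ((flags & 1u) && out->cpu_count < MAX_CPUS)  /* bit 0: enabled */
                out->apic_ids[out->cpu_count++] = madt[off + 3];
        } else if (type == MADT_TYPE_LAPIC_OVR && len >= 12) {
            out->lapic_phys = rd64(madt + off + 4);         /* 64-bit override */
        }
        off += len;
    }
    return 0;
}
```

The BSP's own APIC ID also shows up in this list, so a real kernel would skip the entry that matches the CPU it is currently running on.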
Re: try to boot other cpu in lapic
Posted: Fri Aug 24, 2012 12:29 am
by hfcao
Thanks for your tips. I don't know how to detect whether the AP has been woken up by the BSP or not.
Re: try to boot other cpu in lapic
Posted: Fri Aug 24, 2012 1:53 pm
by gravaera
Yo:
You don't detect that -- you assume that the AP is in the correct state based on the multiprocessor specification. If a board does not comply with the predefined MP state at handover to the OS, then it is not MP compatible, and you can assume that multiprocessing should not be used.
Realistically though, a core not being left in the "wait for IPI" state isn't going to interfere with your MP setup; at least, not to my knowledge \o/
--Peace out,
gravaera
Re: try to boot other cpu in lapic
Posted: Fri Aug 24, 2012 3:21 pm
by Brendan
Hi,
hfcao wrote:I don't know how to detect whether the AP has been woken up by the BSP or not.
Normally in your AP CPU startup code you'd set some sort of flag or variable (in memory) as soon as you can. The BSP would wait for this flag to be set, with a timeout. If the flag hasn't been set within some amount of time (e.g. maybe 20 ms) the BSP stops waiting and adds a "Failed to start AP" error message to the boot log or something.
Cheers,
Brendan
Re: try to boot other cpu in lapic
Posted: Fri Aug 24, 2012 9:55 pm
by hfcao
Thank you so much for your replies~
1) I am trying to understand the following code, but there is one statement I still don't understand:
Code: Select all
apic_write(LAPIC_ICRLO, LAPIC_DLMODE_STARTUP | (pa >> 12));
Why should the physical address be shifted right by 12 bits?
2) When I try to modify this code, it seems that the AP does not get the right address to start executing from; it just halts. I suspect the start address I am sending to the AP is wrong.
Code: Select all
void
apic_start_ap(uint32_t apicid, physaddr_t pa)
{
    // Universal Start-up Algorithm from Intel MultiProcessor spec
    int r;
    uint16_t *dwordptr;
    // "The BSP must initialize CMOS shutdown code to 0Ah ..."
    outb(IO_RTC, NVRAM_RESET);
    outb(IO_RTC + 1, NVRAM_RESET_JUMP);
    // "and the warm reset vector (DWORD based at 40:67) to point
    // to the AP startup code ..."
    dwordptr = pa2kva(0x467);
    dwordptr[0] = 0;
    dwordptr[1] = pa >> 4;
    // ... prior to executing the following sequence:"
    if ((r = ipi_init(apicid)) < 0)
        panic("unable to send init");
    timer_delay(10 * 1000000); // 10ms
    for (uint32_t i = 0; i < 2; i++) {
        apic_icr_wait();
        apic_write(LAPIC_ICRHI, apicid << 24);
        apic_write(LAPIC_ICRLO, STARTUP | (pa >> 12));
        timer_delay(200 * 1000); // 200us
    }
}
Re: try to boot other cpu in lapic
Posted: Fri Aug 24, 2012 10:51 pm
by Brendan
Hi,
Code: Select all
void apic_start_ap(uint32_t apicid, physaddr_t pa)
{
    // Universal Start-up Algorithm from Intel MultiProcessor spec
Ok.
Code: Select all
    int r;
Unnecessary variable (see below).
Code: Select all
    uint16_t *dwordptr;
    // "The BSP must initialize CMOS shutdown code to 0Ah ..."
    outb(IO_RTC, NVRAM_RESET);
    outb(IO_RTC + 1, NVRAM_RESET_JUMP);
    // "and the warm reset vector (DWORD based at 40:67) to point
    // to the AP startup code ..."
    dwordptr = pa2kva(0x467);
    dwordptr[0] = 0;
    dwordptr[1] = pa >> 4;
This is all unnecessary (only needed for 80486 or older SMP systems that don't exist).
Code: Select all
    // ... prior to executing the following sequence:"
    if ((r = ipi_init(apicid)) < 0)
        panic("unable to send init");
    timer_delay(10 * 1000000); // 10ms
I don't know how many bugs are in "ipi_init()" - possibly a lot.
Should just do "if ( ipi_init(apicid) < 0)" because you don't use the value in "r" anyway.
Code: Select all
    for (uint32_t i = 0; i < 2; i++) {
        apic_icr_wait();
        apic_write(LAPIC_ICRHI, apicid << 24);
        apic_write(LAPIC_ICRLO, STARTUP | (pa >> 12));
        timer_delay(200 * 1000); // 200us
    }
}
Probably don't need to do the first "apic_write()" here, as the destination APIC ID should still be set after the "ipi_init()".
The second "apic_write()" looks right, but I don't know what the code in the "apic_write()" function does, or if LAPIC_ICRLO is correct, or if STARTUP is correct, or if "pa" is sane (e.g. if pa is the address 0x123, then..).
Cheers,
Brendan
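On the "pa >> 12" question and the "is pa sane" remark: the STARTUP IPI carries only an 8-bit vector, and the AP begins executing in real mode at physical address vector << 12 (CS = vector << 8, IP = 0). The trampoline therefore has to be 4 KiB aligned and below 1 MiB. A small sketch of that encoding follows; the helper names are hypothetical, and 0x600 is the STARTUP value quoted later in the thread.

```c
#include <assert.h>
#include <stdint.h>

#define ICR_DELIVERY_STARTUP 0x00000600u  /* delivery mode 110b in bits 8-10 */

/* Returns the ICR low value for a STARTUP IPI that starts the AP at `pa`,
 * or 0 if `pa` cannot be expressed as a startup vector. */
static uint32_t sipi_icr_low(uint32_t pa)
{
    if (pa & 0xFFFu)        /* must be 4 KiB aligned */
        return 0;
    if (pa >= 0x100000u)    /* must be below 1 MiB (reachable in real mode) */
        return 0;
    uint32_t vector = pa >> 12;   /* page number of the trampoline */
    assert(vector <= 0xFFu);      /* guaranteed by the range check above */
    return ICR_DELIVERY_STARTUP | vector;
}

/* Address at which the AP starts executing for a given vector. */
static uint32_t sipi_entry_address(uint8_t vector)
{
    return (uint32_t)vector << 12;
}
```

For example, a trampoline at 0x8000 gives vector 0x08. An address such as 0x123 would silently become vector 0 after the shift, so the AP would start at address 0 rather than at the intended code.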
Re: try to boot other cpu in lapic
Posted: Fri Aug 24, 2012 11:27 pm
by hfcao
1) The code of apic_write() looks like this:
Code: Select all
static void
apic_write(uint32_t off, uint32_t val)
{
    *(volatile uint32_t *) (pa2kva(LAPIC_BASE) + off) = val;
}
2)
Brendan wrote:I don't know how many bugs are in "ipi_init()" - possibly a lot.
The following is ipi_init():
Code: Select all
static int
ipi_init(uint32_t apicid)
{
    // Intel MultiProcessor spec. section B.4.1
    apic_write(LAPIC_ICRHI, apicid << LAPIC_ID_SHIFT);
    apic_write(LAPIC_ICRLO, apicid | LAPIC_DLMODE_INIT | LAPIC_LVL_TRIG |
               LAPIC_LVL_ASSERT);
    apic_icr_wait();
    timer_delay(10 * 1000000); // 10ms
    apic_write(LAPIC_ICRLO, apicid | LAPIC_DLMODE_INIT | LAPIC_LVL_TRIG |
               LAPIC_LVL_DEASSERT);
    apic_icr_wait();
    return (apic_read(LAPIC_ICRLO) & LAPIC_DLSTAT_BUSY) ? -1 : 0;
}
3)
Brendan wrote:The second "apic_write()" looks right, but I don't know what the code in the "apic_write()" function does, or if LAPIC_ICRLO is correct, or if STARTUP is correct, or if "pa" is sane (e.g. if pa is the address 0x123, then..).
LAPIC_ICRLO is 0x300, and STARTUP is 0x00000600. Are there any errors? Thanks~
Re: try to boot other cpu in lapic
Posted: Fri Aug 24, 2012 11:59 pm
by Brendan
Hi,
hfcao wrote:1) The code of apic_write() looks like this:
Code: Select all
static void
apic_write(uint32_t off, uint32_t val)
{
    *(volatile uint32_t *) (pa2kva(LAPIC_BASE) + off) = val;
}
That looks OK to me.
hfcao wrote:2)
Brendan wrote:I don't know how many bugs are in "ipi_init()" - possibly a lot.
The following is ipi_init():
Code: Select all
static int
ipi_init(uint32_t apicid)
{
    // Intel MultiProcessor spec. section B.4.1
    apic_write(LAPIC_ICRHI, apicid << LAPIC_ID_SHIFT);
    apic_write(LAPIC_ICRLO, apicid | LAPIC_DLMODE_INIT | LAPIC_LVL_TRIG |
               LAPIC_LVL_ASSERT);
    apic_icr_wait();
    timer_delay(10 * 1000000); // 10ms
    apic_write(LAPIC_ICRLO, apicid | LAPIC_DLMODE_INIT | LAPIC_LVL_TRIG |
               LAPIC_LVL_DEASSERT);
    apic_icr_wait();
    return (apic_read(LAPIC_ICRLO) & LAPIC_DLSTAT_BUSY) ? -1 : 0;
}
That code is only for "80486 or older". You only need to send one "INIT assert".
I assume that "LAPIC_DLSTAT_BUSY" is the delivery status flag. This tells you if the local APIC is still sending, not if it failed to send. Normally you'd test this in a loop (with a timeout) before you send an IPI (not after). For example:
Code: Select all
while( delivery_status_is_set() ) {
    /* Wait for it to finish sending previous IPI */
}
send_IPI(); /* Send the next IPI */
To detect errors, see the Error Status Register (but be very careful with race conditions).
hfcao wrote:3)
Brendan wrote:The second "apic_write()" looks right, but I don't know what the code in the "apic_write()" function does, or if LAPIC_ICRLO is correct, or if STARTUP is correct, or if "pa" is sane (e.g. if pa is the address 0x123, then..).
LAPIC_ICRLO is 0x300, and STARTUP is 0x00000600. Are there any errors? Thanks~
That sounds right to me too.
Cheers,
Brendan
Re: try to boot other cpu in lapic
Posted: Sat Aug 25, 2012 12:57 am
by hfcao
Thanks~
Brendan wrote:To detect errors, see the Error Status Register (but be very careful with race conditions).
I use the following code to print the contents of the Error Status Register:
Code: Select all
printf("lapic error:ESR:%x\n", apic_read(ESR));
and apic_read() is:
Code: Select all
static uint32_t
apic_read(uint32_t off)
{
    return *(volatile uint32_t *) (pa2kva(LAPIC_BASE) + off);
}
and it just prints "lapic error:ESR:0".
Does that mean it is right?
Re: try to boot other cpu in lapic
Posted: Sat Aug 25, 2012 1:25 am
by hfcao
I forgot to say that the platform I use is a 64-bit AMD machine. Does that matter? (I think the MP Specification is an open standard, but I don't know whether I am right or wrong.)
Re: try to boot other cpu in lapic
Posted: Sat Aug 25, 2012 4:30 pm
by Brendan
Hi,
hfcao wrote:
Brendan wrote:To detect errors, see the Error Status Register (but be very careful with race conditions).
I use the following code to print the contents of the Error Status Register:
hfcao wrote:and it just prints "lapic error:ESR:0".
Does that mean it is right?
It could mean that it was right, or it could just mean that the hardware didn't update the Error Status Register until after you read it.
hfcao wrote:I forgot to say that the platform I use is a 64-bit AMD machine. Does that matter? (I think the MP Specification is an open standard, but I don't know whether I am right or wrong.)
If the MP Specification tables exist and are valid, then it shouldn't matter if the CPU is 64-bit or if the OS is 64-bit. However, the MP Specification tables have become obsolete (replaced by ACPI's "MADT/APIC" table), and for modern hardware the MP Specification tables may not exist or may not be valid (e.g. just a minimal stub that says there's only one CPU and not much else, even though there are more CPUs, etc).
Also note that whatever the problem is that you're currently having (I lost track), it may have nothing to do with the AP startup code itself. For a silly/hypothetical example, a bug in your virtual memory manager could mean that you're not actually reading/writing to the local APIC at all; or a bug in your real mode "trampoline" might make it look like the AP CPU didn't start even though it did.
Cheers,
Brendan
Re: try to boot other cpu in lapic
Posted: Sat Aug 25, 2012 5:43 pm
by Owen
Brendan wrote: However, the MP Specification tables have become obsolete (replaced by ACPI's "MADT/APIC" table), and for modern hardware the MP Specification tables may not exist or may not be valid (e.g. just a minimal stub that says there's only one CPU and not much else, even though there are more CPUs, etc).
It's much worse than that: The MP specification tables exist, appear valid, and are wrong in subtle ways.
In general, expect the MP tables to be generated by some BIOS code which has been copied and pasted from BIOS to BIOS for the last 5 years with no maintenance (because nobody cares about NT 4 any more) and only continues to exist because it would cost money to dike it out and nobody cares ("It boots Windows 2000 through 7. Ship it")