Problem starting some QEMU APs

Posted: Sat Oct 08, 2016 8:31 am
by heat
Hi,

I've run into a problem recently. When waking up the APs, some of them wake up and usually one or two don't. I've been running some tests and here are the results (pc = 0x95 is the only valid IP for a woken AP, because I place my SMP trampoline code at 0x0):
QEMU with -smp 2:

Code:

* CPU #0: pc=0xffffffff8010e3b1 (halted) thread_id=14502
  CPU #1: pc=0x0000000000000095 (halted) thread_id=14503
QEMU with -smp 3 (as you can see, CPU2 doesn't wake up and stays in the BIOS, even after a second SIPI):

Code:

* CPU #0: pc=0xffffffff8010e3b1 (halted) thread_id=14665
  CPU #1: pc=0x0000000000000095 (halted) thread_id=14666
  CPU #2: pc=0x00000000000fec56 (halted) thread_id=14667
QEMU with -smp 4 (oddly enough, CPU2 doesn't wake up, but my OS thinks it did; might this be some bug?):

Code:

* CPU #0: pc=0xffffffff8010e3b1 (halted) thread_id=14801
  CPU #1: pc=0x0000000000000095 (halted) thread_id=14802
  CPU #2: pc=0x00000000000fec56 (halted) thread_id=14803
  CPU #3: pc=0x0000000000000095 (halted) thread_id=14804
If I send the IPIs with the "all excluding self" destination shorthand, every CPU wakes up (as expected), although that's obviously undesirable (CPUs marked as disabled and such would get woken up too).
Right now I'm reading from the MADT to get every "enabled" CPU.
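For context, picking the "enabled" CPUs out of the MADT can be sketched roughly as below. This is only an illustration, not the code from this kernel: the helper name CollectEnabledLapicIds and its parameters are made up for the example, and it assumes a little-endian host and that the caller already has the whole table mapped and knows its length. It relies on the standard layout: entries start at offset 44, and a type 0 (Processor Local APIC) entry carries the APIC ID in byte 3 and the flags in bytes 4-7, with bit 0 meaning "enabled".

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// MADT layout: 36-byte SDT header, 32-bit local APIC address, 32-bit flags,
// then a list of variable-length entries.
constexpr std::size_t kMadtEntriesOffset = 44;
constexpr std::uint8_t kEntryLocalApic = 0;           // "Processor Local APIC"
constexpr std::uint32_t kLapicFlagEnabled = 1u << 0;  // bit 0 of the entry flags

// Hypothetical helper: stores the APIC IDs of all enabled type 0 entries in
// `out` (at most `capacity` of them) and returns how many were stored.
std::size_t CollectEnabledLapicIds(const std::uint8_t* madt, std::size_t madt_length,
                                   std::uint8_t* out, std::size_t capacity)
{
	std::size_t count = 0;
	std::size_t offset = kMadtEntriesOffset;
	while (offset + 2 <= madt_length)
	{
		const std::uint8_t type = madt[offset];
		const std::uint8_t length = madt[offset + 1];
		if (length < 2 || offset + length > madt_length)
			break; // malformed entry: stop instead of looping forever
		if (type == kEntryLocalApic && length >= 8)
		{
			std::uint32_t flags;
			std::memcpy(&flags, madt + offset + 4, sizeof(flags)); // unaligned-safe read
			if ((flags & kLapicFlagEnabled) && count < capacity)
				out[count++] = madt[offset + 3]; // byte 3 is the APIC ID
		}
		offset += length;
	}
	return count;
}
```

Each ID returned this way (other than the BSP's own) would then be passed to something like WakeUpProcessor.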
This is my WakeUpProcessor function:

Code:

void SendIPI(uint8_t id, uint32_t type, uint32_t page)
{
	*lapic_ipiid |= (uint32_t)id << 24;
	uint64_t icr = type << 8;
	icr |= page & 0xFF;
	icr |= (1 << 14);
	// BAD HACK: We shouldn't use broadcast to all but self, but instead fix the existing code to work on qemu
	//icr |= (3 << 18); With this uncommented, every CPU wakes up
	*lapic_icr = icr;
}
void WakeUpProcessor(uint8_t lapicid)
{
	SendIPI(lapicid, 5, 0);
	uint64_t tick = get_tick_count();
	while(get_tick_count() - tick < 200)
		asm volatile("hlt");
	core_stack = (volatile uint64_t)Memory::GetVirtualPages(1, 2, VMM_TYPE_STACK, VMM_WRITE) + 0x2000;
	Memory::Map((void*)(core_stack - 0x2000), 2, VMM_WRITE | VMM_GLOBAL | VMM_NOEXEC);
	SendIPI(lapicid, 6, 0);
	tick = get_tick_count();
	while(get_tick_count() - tick < 1000)
	{
		if(ap_done == 1)
		{
			printf("AP core woke up! LAPICID %d at tick %d\n", lapicid, get_tick_count());
			break;
		}
	}
	if(ap_done == 0)
	{
		printf("Trying second SIPI\n");
		SendIPI(lapicid, 6, 0);
		tick = get_tick_count();
		while(get_tick_count() - tick < 1000)
		{
			if(ap_done == 1)
			{
				printf("AP core woke up! LAPICID %d at tick %d\n", lapicid, get_tick_count());
				break;
			}
		}
	}
	if(ap_done == 0)
	{
		printf("Failed to start an AP with LAPICID %d\n", lapicid);
	}
	ap_done = 0;
}
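The magic numbers in SendIPI correspond to fields in the low dword of the local APIC's ICR. A small sketch with named constants may make them easier to follow; the names (MakeIcrLow, SipiVectorFor, the k... constants) are made up for illustration and are not part of the code above. The bit positions are the documented ones: vector in bits 0-7, delivery mode in bits 8-10 (5 = INIT, 6 = Start-up), level assert in bit 14, destination shorthand in bits 18-19.

```cpp
#include <cstdint>

// Field encodings for the low dword of the ICR.
constexpr std::uint32_t kDeliveryInit        = 5u << 8;  // delivery mode 101b: INIT
constexpr std::uint32_t kDeliveryStartup     = 6u << 8;  // delivery mode 110b: Start-up (SIPI)
constexpr std::uint32_t kLevelAssert         = 1u << 14;
constexpr std::uint32_t kShorthandNone       = 0u << 18; // use the destination field
constexpr std::uint32_t kShorthandAllButSelf = 3u << 18; // the "bad hack" broadcast

// Hypothetical helper: builds the value that gets written to the ICR low dword.
constexpr std::uint32_t MakeIcrLow(std::uint32_t delivery, std::uint8_t vector,
                                   std::uint32_t shorthand = kShorthandNone)
{
	return delivery | kLevelAssert | shorthand | vector;
}

// For a SIPI the vector is the 4 KiB page number of the trampoline, so the
// trampoline has to sit on a page boundary below 1 MiB.
constexpr std::uint8_t SipiVectorFor(std::uint32_t trampoline_phys)
{
	return static_cast<std::uint8_t>(trampoline_phys >> 12);
}
```

With the trampoline at 0x0 as in this post, SendIPI(lapicid, 5, 0) corresponds to MakeIcrLow(kDeliveryInit, 0) and SendIPI(lapicid, 6, 0) to MakeIcrLow(kDeliveryStartup, SipiVectorFor(0x0)).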
Here's the SMP trampoline code, if someone is interested: http://pastebin.com/g0pusuQU
Note that the AP initialization might be a bit broken, since I'm trying to get all the CPUs to wake up before actually using them.

Best regards,

heat

Re: Problem starting some QEMU APs

Posted: Sun Oct 09, 2016 7:11 am
by heat
Hi,

I've just fixed it. My problem was that I was OR-ing the LAPIC IDs together in the ICR destination field (the `*lapic_ipiid |= ...` line in SendIPI) instead of just setting it using '='. #-o
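To illustrate why the OR-ing matters, here is a small sketch that replays the sequence on an ordinary variable instead of the real memory-mapped register. The function names are made up for the example, and it assumes the destination register starts at 0 and that QEMU numbers the APIC IDs 0, 1, 2, ..., which matches the dumps above.

```cpp
#include <cstdint>

// Buggy variant, equivalent to `*lapic_ipiid |= (uint32_t)id << 24;`:
// the bits of every previously targeted ID stay set in the destination field.
inline void SetDestinationBuggy(volatile std::uint32_t* icr_high, std::uint8_t id)
{
	*icr_high = *icr_high | (static_cast<std::uint32_t>(id) << 24);
}

// Fixed variant: plain assignment, so only the requested ID is targeted.
inline void SetDestinationFixed(volatile std::uint32_t* icr_high, std::uint8_t id)
{
	*icr_high = static_cast<std::uint32_t>(id) << 24;
}

// Extracts the destination APIC ID (bits 24-31) from a register value.
inline std::uint8_t DestinationOf(std::uint32_t icr_high_value)
{
	return static_cast<std::uint8_t>(icr_high_value >> 24);
}
```

Replaying the wake-up order 1, 2, 3 with the buggy variant gives destinations 1, 3 (1 | 2) and 3. That would be consistent with the results in the first post: with -smp 3 the IPIs meant for CPU2 go to a nonexistent APIC ID 3, and with -smp 4 they reach CPU3 instead, so the OS reports that LAPIC ID 2 woke up while CPU2 is still sitting in the BIOS.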

Best regards,

heat