
Do reading APIC IDs need spin-locks?

Posted: Fri May 14, 2010 6:28 am
by rdos
I have recently redesigned my kernel to support SMP. It was a lot of work (the hardest part being the change from hardware task switching to software task switching), but I think I now have a stable OS again for single-processor systems. However, I still have problems with SMP systems. I have one test machine with Hyper-Threading (Intel Atom), and it (almost) works, but sometimes the scheduler crashes. I'm still not 100% sure why, but SMP and uniprocessor systems use different locks, with SMP using spin-locks.

It seems like I read inconsistent APIC IDs for the same core, and this happens only rarely, when many threads are running in short bursts. The scheduler reads the APIC ID every time it needs to lock something. My machine does not seem to support MSR access to the APIC, so I use it in memory-mapped mode. Is it necessary to use some kind of lock (spin-lock) around the code reading the APIC ID from the memory-mapped area? If this is locally implemented, it should work without synchronization, but the problem does seem to be related to inconsistent APIC IDs. The wrong APIC IDs do not seem to be totally random; rather, one of the processors gets the other processor's ID.

Does anybody have an idea about this? Maybe I should use the TR method instead to get the processor block for the current processor? GS would not be possible, nor would any other segment register, as my kernel is segmented.

Re: Do reading APIC IDs need spin-locks?

Posted: Fri May 14, 2010 8:04 am
by gerryg400
No, you don't need spinlocks to read the local APIC. However, where do you put the id when you read it? I haven't seen your code of course but make sure you don't put the id in a global variable (that another core might read). Sorry if that's too obvious!! Also, depending on your code you may need to have interrupts disabled...

- gerryg400

Re: Do reading APIC IDs need spin-locks?

Posted: Fri May 14, 2010 8:13 am
by Brendan
Hi,
rdos wrote:It seems like I read inconsistent APIC IDs for the same core, and this happens only rarely, when many threads are running in short bursts. The scheduler reads the APIC ID every time it needs to lock something. My machine does not seem to support MSR access to the APIC, so I use it in memory-mapped mode. Is it necessary to use some kind of lock (spin-lock) around the code reading the APIC ID from the memory-mapped area? If this is locally implemented, it should work without synchronization, but the problem does seem to be related to inconsistent APIC IDs. The wrong APIC IDs do not seem to be totally random; rather, one of the processors gets the other processor's ID.
You don't need a lock for accessing the local APIC's ID. You do need to make sure you're running on the same CPU though. For example, you can't get the APIC ID when running on CPU #0, then allow the scheduler to preempt the task and let the task run on CPU#1 while it's still using the ID it got from CPU#0.
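To make the failure mode concrete, here is a minimal user-space simulation in C. It is only a sketch: `sim_current_cpu`, `sim_preempt_disabled`, `read_local_apic_id` and `sim_maybe_migrate` are made-up stand-ins for the hardware and the scheduler, not anything from rdos's kernel. It shows that an ID read before a possible migration is stale afterwards, while an ID read with preemption held off stays valid.

```c
#include <stdint.h>

/* Simulated state (hypothetical): the CPU the thread is executing on,
 * and whether the scheduler is currently allowed to move it. */
static uint32_t sim_current_cpu = 0;
static int sim_preempt_disabled = 0;

/* Stand-in for the memory-mapped local APIC ID read: it always reports
 * the CPU that is executing the read. */
static uint32_t read_local_apic_id(void)
{
    return sim_current_cpu;
}

/* Simulated preemption point: the thread is moved to new_cpu unless
 * preemption has been disabled. */
static void sim_maybe_migrate(uint32_t new_cpu)
{
    if (!sim_preempt_disabled)
        sim_current_cpu = new_cpu;
}

/* Racy pattern: read the ID, get preempted, keep using the old ID.
 * Returns 1 if the saved ID no longer matches the executing CPU. */
static int racy_id_is_stale(void)
{
    sim_current_cpu = 0;
    sim_preempt_disabled = 0;
    uint32_t id = read_local_apic_id();
    sim_maybe_migrate(1);               /* thread resumes on CPU 1 */
    return id != read_local_apic_id();
}

/* Safe pattern: hold off preemption for as long as the ID is in use. */
static int pinned_id_is_stale(void)
{
    sim_current_cpu = 0;
    sim_preempt_disabled = 1;
    uint32_t id = read_local_apic_id();
    sim_maybe_migrate(1);               /* no effect while pinned */
    int stale = id != read_local_apic_id();
    sim_preempt_disabled = 0;
    return stale;
}
```

In a real kernel the "pinning" would be whatever already prevents a task switch on the current CPU, for example disabled interrupts or a held scheduler lock.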

The idea of using the TR (or GS/FS or something else) instead is to reduce overhead (it's usually quicker to get something from normal cache/RAM than from a device, even the local APIC) and to make sure the IDs aren't scattered (e.g. in theory, for a system with 2 CPUs the local APIC IDs could be 8 and 128, which makes it painful to use as the index of an array or something, and it'd be better to use 0 and 1 instead). If you used any of these methods, then you'd probably still have the same bug (e.g. getting the CPU number on one CPU, then using it after you've switched to a different CPU).
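As a rough illustration of the "scattered IDs" point, a kernel could build a small translation table at boot that maps each 8-bit local APIC ID to a dense CPU number. The sketch below is in C and every name in it (`cpu_map_init`, `cpu_map_register`, `cpu_map_lookup`, `CPU_INDEX_NONE`) is hypothetical. It assumes 8-bit xAPIC IDs and that it runs single-threaded during boot.

```c
#include <stdint.h>

#define MAX_APIC_IDS   256   /* 8-bit xAPIC IDs (assumption) */
#define CPU_INDEX_NONE 0xFF  /* marks an APIC ID with no CPU behind it */

static uint8_t apic_to_index[MAX_APIC_IDS];
static unsigned cpu_count;

/* Mark every APIC ID as unused. Call once before registering CPUs. */
static void cpu_map_init(void)
{
    for (unsigned i = 0; i < MAX_APIC_IDS; i++)
        apic_to_index[i] = CPU_INDEX_NONE;
    cpu_count = 0;
}

/* Give the CPU with this APIC ID the next free dense index (0, 1, 2, ...).
 * Registering the same ID twice returns the index it already has.
 * Returns CPU_INDEX_NONE if no index is left. */
static unsigned cpu_map_register(uint8_t apic_id)
{
    if (apic_to_index[apic_id] == CPU_INDEX_NONE) {
        if (cpu_count >= CPU_INDEX_NONE)
            return CPU_INDEX_NONE;
        apic_to_index[apic_id] = (uint8_t)cpu_count++;
    }
    return apic_to_index[apic_id];
}

/* Translate an APIC ID into its dense index, or CPU_INDEX_NONE. */
static unsigned cpu_map_lookup(uint8_t apic_id)
{
    return apic_to_index[apic_id];
}
```

With the IDs from the example, registering 8 and then 128 yields indices 0 and 1, which can be used directly as indices into a per-CPU array.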


Cheers,

Brendan

Re: Do reading APIC IDs need spin-locks?

Posted: Fri May 14, 2010 1:58 pm
by rdos
gerryg400 wrote:No, you don't need spinlocks to read the local APIC. However, where do you put the id when you read it? I haven't seen your code of course but make sure you don't put the id in a global variable (that another core might read). Sorry if that's too obvious!! Also, depending on your code you may need to have interrupts disabled...

- gerryg400
Here is the code that reads the ID (or rather gets the selector associated with the current core):

mov ax,apic_mem_sel ; this is a selector mapped to the linear start address of the APIC
; It is mapped with paging to the physical address of the APIC (same in all address-spaces).
mov ds,ax
mov ebx,ds:APIC_ID ; read the local APIC ID register
shr ebx,24 ; the ID is in the top 8 bits
add bx,bx ; scale by 2: each array entry is a 16-bit selector
mov ax,apic_data_sel
mov ds,ax
mov fs,ds:[bx].apic_arr ; Get the selector from an array that is created at boot-time.

The code is executed without disabling interrupts.
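For readers who do not read segmented assembly, the same lookup can be sketched in C. This is an illustration only, not the kernel's code: `sim_apic_id_reg` is a made-up variable standing in for the memory-mapped APIC ID register, and only the name `apic_arr` is borrowed from the listing above.

```c
#include <stdint.h>

#define APIC_ID_SHIFT 24   /* the listing shifts right by 24: ID in bits 31..24 */
#define MAX_APIC_IDS  256

/* Stand-in for the memory-mapped APIC ID register (hypothetical).
 * In a kernel this would be a volatile read from the APIC mapping. */
static volatile uint32_t sim_apic_id_reg;

/* One 16-bit selector per possible APIC ID, filled in at boot. */
static uint16_t apic_arr[MAX_APIC_IDS];

/* Return the selector of the per-processor block for the executing CPU.
 * The result is only meaningful while the caller cannot be moved to
 * another CPU. */
static uint16_t current_cpu_selector(void)
{
    uint32_t id = sim_apic_id_reg >> APIC_ID_SHIFT;  /* shr ebx,24 */
    return apic_arr[id];   /* C indexing does the "add bx,bx" scaling */
}
```

The shift leaves at most 8 bits, so the index always stays inside the 256-entry table.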

EDIT: I just realised that I put this code before the scheduler lock, meaning that the current thread could be preempted before it takes the lock, and then the scheduler will think it is running on another processor (!).

Re: Do reading APIC IDs need spin-locks?

Posted: Fri May 14, 2010 3:17 pm
by rdos
It is resolved. I moved the code for getting the APIC ID / processor selector inside the code that locks the scheduler, and this seems to have resolved the last SMP issue on my eMachine MiniPC. I can now run all of my multithreaded test applications without problems.

Now I just need to "fine-tune" the scheduler so it selects the best processor when threads are blocked/unblocked. I also need a synchronized counter (TSC) on multicore CPUs where TSCs aren't synchronized.

My Athlon X2 PC doesn't run, but there are other issues with it. It worked better when I used the PIC, but now that I have switched to the IOAPIC it crashes very early in the boot. I probably need to implement the relevant parts of ACPI to be able to set up the PIC/IOAPIC correctly.