Page 1 of 2
Mutex Problem (Or Anything Else) in SMP Scheduling
Posted: Sun Apr 30, 2017 1:07 am
by Js2xxx
Hi everyone. I've made threads run in my OS with round-robin algorithm and with uniprocessor(and it works well). Now I want to make threads run with another CPU. So I've correctly made AP's awake and let it load another thread. It works well for several milliseconds (the first picture: "B" runs on an AP and "I" runs on the BSP, the "!" suggests causing reentrant).
However, after that, everything is broken and the system reset (the second picture).
What's wrong? Is it a problem in mutex? Or any other causes?
Thanks for any help and advice.
Some useful(I think it is) information about my OS:
1.The timer handler(written in asm) and the scheduler(written in C) is shared with all the CPUs.
2.Every CPU has got it's own thread queue(based on array and with proper enqueue and dequeue operation).
3.The CPUs' logical IDs are stored in IA32_TSC_AUX and the scheduler can determine which thread queue to choose.
4.Before calling the scheduler, there's a spinlock and after that the lock will be released so the scheduler is atomic.
5.The thread context is stored in stacks.
Re: Mutex Problem (Or Anything Else) in SMP Scheduling
Posted: Fri May 05, 2017 5:45 pm
by max
Hey,
you can't really expect help if you don't post any code or have done any more research on what's going wrong.
My crystal ball says you're causing a triple fault somewhere in your code.
Greets
Re: Mutex Problem (Or Anything Else) in SMP Scheduling
Posted: Fri May 05, 2017 6:54 pm
by Geri
at this level, you cant have mutexes/semaphores. those are high-level features, that you ensure with software assistance in your kernel. cache coherency is not ensured beethwen cores, you need a few thousands of cycles to sync the cache beethwen cores, a mutex-like feature will not magically work.
in my kernel, i organize everything from core 0. that controls what thread will execute what thread. your solution sounds nice, but its too nice to be true, dont try writing spectacularry optimistic code involving kernel funcionality to be placed on the AP-s, becouse in most cases .... uh.. what example i could i illustrate it to an asian... so a dragon will come and rape your lolis.
Re: Mutex Problem (Or Anything Else) in SMP Scheduling
Posted: Sat May 06, 2017 6:43 am
by LtG
Geri wrote:at this level, you cant have mutexes/semaphores. those are high-level features, that you ensure with software assistance in your kernel. cache coherency is not ensured beethwen cores, you need a few thousands of cycles to sync the cache beethwen cores, a mutex-like feature will not magically work.
I guess I don't understand what you are talking about at all. Are you saying that x86 doesn't have cache coherency? I mean, isn't cache coherency _only_ about multi-core to begin with? You're saying that either x86 is incoherent or it lacks multi-core cache coherency, but cache coherency is only relevant when there's multiple cores, so..?
Also where does that "few thousand cycles" come from? Any source? Yes, moving cache lines between cores costs, but it would be pretty stupid if it costs more than actually going all the way to RAM, and going to RAM I think is something like 100-300 cycles and cache would be thousands according to you?
Geri wrote:
in my kernel, i organize everything from core 0. that controls what thread will execute what thread. your solution sounds nice, but its too nice to be true, dont try writing spectacularry optimistic code involving kernel funcionality to be placed on the AP-s, becouse in most cases .... uh.. what example i could i illustrate it to an asian... so a dragon will come and rape your lolis.
I plan to do "kernel" functionality on every core, not just BSP and as far as I know there's absolutely no issues doing it that way, and that's what pretty much every "real" OS does too...
You do have to make sure that your kernel functionality doesn't mess with what the other cores are doing but that's just basic multi-threading, if you don't understand multi-threading then you shouldn't be bringing the AP's up until you do understand.
Depending on your system, going to BSP for every kernel call may have significant performance impact..
Re: Mutex Problem (Or Anything Else) in SMP Scheduling
Posted: Sat May 06, 2017 7:07 am
by Korona
Geri wrote:at this level, you cant have mutexes/semaphores. those are high-level features, that you ensure with software assistance in your kernel. cache coherency is not ensured beethwen cores, you need a few thousands of cycles to sync the cache beethwen cores, a mutex-like feature will not magically work.
That is just bullshit. You need kernel level mutexes/semaphores to implement a SMP-safe kernel. The most common form of mutexes at the kernel level are spinlocks that protect data structures from concurrent modification. This has to be coupled with disabling interrupts if the structures are also accessed by IRQ handlers. More complex kernels will require more complex mutexes that e.g. can block/unblock kernel threads.
Re: Mutex Problem (Or Anything Else) in SMP Scheduling
Posted: Sat May 06, 2017 7:22 am
by dozniak
LtG wrote:Geri wrote:at this level, you cant have mutexes/semaphores. those are high-level features, that you ensure with software assistance in your kernel. cache coherency is not ensured beethwen cores, you need a few thousands of cycles to sync the cache beethwen cores, a mutex-like feature will not magically work.
I guess I don't understand what you are talking about at all. Are you saying that x86 doesn't have cache coherency? I mean, isn't cache coherency _only_ about multi-core to begin with? You're saying that either x86 is incoherent or it lacks multi-core cache coherency, but cache coherency is only relevant when there's multiple cores, so..?
Don't listen to him, he has absolutely no clue.
Re: Mutex Problem (Or Anything Else) in SMP Scheduling
Posted: Sat May 06, 2017 11:28 am
by Geri
dozniak:
who didnt yet did smp (or anything), leave the topic.
oh jesus, you are out from the game
Re: Mutex Problem (Or Anything Else) in SMP Scheduling
Posted: Sat May 06, 2017 11:31 am
by Geri
LtG wrote:
Also where does that "few thousand cycles" come from? Any source?
i benchmarked it
LtG wrote:
I think is something like 100-300 cycles and cache would be thousands according to you?
you thinking. i thinking, and i benchmarking. dont just think. do.
Re: Mutex Problem (Or Anything Else) in SMP Scheduling
Posted: Sat May 06, 2017 1:07 pm
by LtG
Geri wrote:LtG wrote:
Also where does that "few thousand cycles" come from? Any source?
i benchmarked it
What did you benchmark? I thought you said there's no cache coherency, so what is it that you benchmarked?
Care to show the code used to do the benchmarking? Maybe you accidentally included some other things in the benchmark that radically affect the results.
Here's some info w.r.t. cache coherency:
https://spcl.inf.ethz.ch/Publications/. ... -bench.pdf
Re: Mutex Problem (Or Anything Else) in SMP Scheduling
Posted: Sat May 06, 2017 2:25 pm
by dozniak
Geri wrote:who didnt yet did smp (or anything), leave the topic.
You didn', so why don't you leave the game?
Re: Mutex Problem (Or Anything Else) in SMP Scheduling
Posted: Sat May 06, 2017 2:38 pm
by Brendan
Hi,
Geri wrote:at this level, you cant have mutexes/semaphores. those are high-level features, that you ensure with software assistance in your kernel.
At which level? The original poster said they're using spinlocks (and not mutexes) for their scheduler, and didn't say what mutexes were being used for. With the lack of information available its reasonable to assume that the mutexes are only used at a higher level.
Korona wrote:You need kernel level mutexes/semaphores to implement a SMP-safe kernel. The most common form of mutexes at the kernel level are spinlocks that protect data structures from concurrent modification.
Most people define "spinlock" as a kind of lock that spins when the lock can't be acquired (that never does a task switch), and "mutex" as a kind of lock that does a task switch when the lock can't be acquired (potentially after spinning briefly). With these definitions a mutex can't be used by the scheduler's task switch code without causing a nesting problem (e.g. when the scheduler can't acquire a mutex needed to switch tasks, the task blocks and the scheduler tries to acquire a mutex to switch tasks, and ...).
Geri wrote: cache coherency is not ensured beethwen cores, you need a few thousands of cycles to sync the cache beethwen cores, a mutex-like feature will not magically work.
For normal caches; cache coherency is ensured between cores, unless you've done something extremely wrong (misconfigured the MTRRs such that one CPU thinks an area is uncached while a different CPU thinks the same area is cached).
The TLBs are the only caches where coherency (between the TLB on one CPU and the caches on the same CPU, and between the TLB on a CPU and normal caches on any other CPU, and between the TLB on one CPU and the TLBs on another CPU) isn't ensured.
If you have done something extremely wrong (misconfigured the MTRRs) it's impossible to sync the caches. The only things you can do is invalidate specific cache lines, invalidate the entire cache or disable the cache/s.
Geri wrote:in my kernel, i organize everything from core 0. that controls what thread will execute what thread.
That's extremely poor for scalability - you'd end up with most CPUs wasted most of the time because they're waiting for one CPU to reschedule; plus all CPUs fighting over the same cache lines.
Geri wrote:your solution sounds nice, but its too nice to be true, dont try writing spectacularry optimistic code involving kernel funcionality to be placed on the AP-s, becouse in most cases ....
AP CPUs using kernel functionality, including AP CPUs doing scheduling independently, is how every sane OS has always done it.
LtG wrote:Geri wrote:LtG wrote:
Also where does that "few thousand cycles" come from? Any source?
i benchmarked it
What did you benchmark? I thought you said there's no cache coherency, so what is it that you benchmarked?
As a random guess, I'd assume Geri benchmarked a WBINVD instruction - they're very expensive (but unnecessary for cache coherency).
Cheers,
Brendan
Re: Mutex Problem (Or Anything Else) in SMP Scheduling
Posted: Sat May 06, 2017 3:16 pm
by Geri
Brendan: the problem is that he was not very specific about his problem. his only hint was that the problem was directly with his smp code (thats why he named the topic like that). a lot of thing can go wrong in his code - so i told him hints what is generally his issue. if we dont see any code, i dont think i or we can be more specific. reading his post about his SMP implementation, it is shockingly complex (compared to mine). basically he tried to gain very lot of performance and/or increase the effective clock cycle of his sheduler. he sacraficed the simplicity and the cleanness for a brutal optimization by placing a kernel mode task switching code snippet on the AP-s, making it almost undebuggable. this isnt a question of an used instruction, this basically could and will lead to various logic errors in the handling, and we cant distant-heal it.
Re: Mutex Problem (Or Anything Else) in SMP Scheduling
Posted: Sat May 06, 2017 3:38 pm
by Brendan
Hi,
Geri wrote:Brendan: the problem is that he was not very specific about his problem. his only hint was that the problem was directly with his smp code (thats why he named the topic like that). a lot of thing can go wrong in his code - so i told him hints what is generally his issue. if we dont see any code, i dont think i or we can be more specific.
Yes; and this is why max asked for some code (in the very first reply to this topic), and why everyone (except you) was waiting for Js2xxx to respond to max's request for some code and didn't reply.
Geri wrote:reading his post about his SMP implementation, it is shockingly complex (compared to mine). basically he tried to gain very lot of performance and/or increase the effective clock cycle of his sheduler. he sacraficed the simplicity and the cleanness for a brutal optimization by placing a kernel mode task switching code snippet on the AP-s, making it almost undebuggable. this isnt a question of an used instruction, this basically could and will lead to various logic errors in the handling, and we cant distant-heal it.
Complexity is relative, and (compared to most multi-CPU schedulers) this scheduler is simple.
When you say it's "too complex" what you're really saying is "simple compared to most multi-CPU schedulers, but complex relative to your scheduler". From this we can infer that your scheduler is "simplistic" compared to the scheduler Js2xxx described and that your scheduler is "extremely simplistic" compared to most multi-CPU schedulers.
Cheers,
Brendan
Re: Mutex Problem (Or Anything Else) in SMP Scheduling
Posted: Sat May 06, 2017 8:31 pm
by Js2xxx
I haven't logged in for some days and just now I saw a lot of posts... It scared me...
Well, someone asked me for the code, so here it is:
Code: Select all
; The whole code of the timer handler: (It works well in single processor)
TimerLock db 0
MsgReentrant db "!", 0
aqtemp dq 0 ; other variables used in the handler are extern.
TimerHandler:
sub rsp, 8
mov [rel aqtemp], rsp
push rax
push rcx
push rdx
push rbx
push qword [rel aqtemp]
push rbp
push rsi
push rdi
push r8
push r9
push r10
push r11
push r12
push r13
push r14
push r15
mov ecx, IA32_X2APIC_LVT_TIMER
rdmsr
bts eax, 16
wrmsr
mov ecx, IA32_X2APIC_EOI
xor rax, rax
xor rdx, rdx
wrmsr
call GetPcsrId
movzx rax, ax
lea rbx, [rel ReentrantFlag]
inc dword [rbx + rax * 4]
cmp dword [rbx + rax * 4], 0
ja .reentrant
mov rsp, 0xFFFFFF8000590000
call SetTempTss
mov ax, 0x11 << 3
ltr ax
sti
mov rdi, TimerLock
call EnterSpinLock
call GetPcsrId
mov rdi, rax
call Schedule
cli
mov ecx, IA32_X2APIC_LVT_TIMER
rdmsr
btr eax, 16
wrmsr
mov rax, [rel CurrentT]
mov rsp, [rax]
lea rax, [rsp + 22 * 8]
mov rbx, 0xFFFFFF8000601800
mov qword [rbx + 4], rax
call GetPcsrId
mov rdi, TimerLock
call LeaveSpinLock
shl ax, 3
mov ax, 0x9 << 3
ltr ax
jmp .next
.reentrant:
mov rdi, MsgReentrant
call DispStr
.next:
call GetPcsrId
movzx rax, ax
lea rbx, [rel ReentrantFlag]
dec dword [rbx + rax * 4]
pop r15
pop r14
pop r13
pop r12
pop r11
pop r10
pop r9
pop r8
pop rdi
pop rsi
pop rbp
add rsp, 8
pop rbx
pop rdx
pop rcx
pop rax
add rsp, 8
iretq
And the main code of the scheduler is here:
Code: Select all
public void Schedule(int pcsr)
{
CurrentT = &(ReadyPtr[pcsr]);
(*CurrentT)->Ticks--;
if((*CurrentT)->Ticks <= 0)
{
(*CurrentT)->Ticks = (*CurrentT)->Priority;
Switch(pcsr);
}
ReadyPtr[pcsr] = &(ReadyThreads[pcsr].Data[ReadyThreads[pcsr].Front]);
CurrentT = &(ReadyPtr[pcsr]);
(*CurrentT)->State = Running;
TssDesc[pcsr]->AttrLow = TSS | DESC_P;
}
I believe there're a lot of mistakes but I can't find them. Thanks for any help and advice.
Re: Mutex Problem (Or Anything Else) in SMP Scheduling
Posted: Sat May 06, 2017 8:53 pm
by Brendan
Hi,
I see multiple problems in the timer IRQ handler's code that could/will cause problems when different CPUs are handling timer IRQs at the same time. Specifically, "aqtemp" and the stack at 0xFFFFFF8000590000 are shared state that would need a lock to prevent one CPU from trashing another CPU's data.
I'd also suggest commenting the code properly, so that you will be able to maintain it later on (after you've forgotten why you've done things), and so that other people can figure out why you're doing horrific things like "mov rsp, 0xFFFFFF8000590000".
Finally, I'd say that the entire scheduler is dependant on timer IRQs and completely useless for tasks blocking/unblocking, and needs to be redesigned and rewritten because of that.
Note: You might like the "steps for writing a scheduler" I wrote at the end of this post.
Cheers,
Brendan