Mutex Problem (Or Anything Else) in SMP Scheduling
Mutex Problem (Or Anything Else) in SMP Scheduling
Hi everyone. I've made threads run in my OS with round-robin algorithm and with uniprocessor(and it works well). Now I want to make threads run with another CPU. So I've correctly made AP's awake and let it load another thread. It works well for several milliseconds (the first picture: "B" runs on an AP and "I" runs on the BSP, the "!" suggests causing reentrant).
However, after that, everything is broken and the system reset (the second picture).
What's wrong? Is it a problem in mutex? Or any other causes?
Thanks for any help and advice.
Some useful(I think it is) information about my OS:
1.The timer handler(written in asm) and the scheduler(written in C) is shared with all the CPUs.
2.Every CPU has got it's own thread queue(based on array and with proper enqueue and dequeue operation).
3.The CPUs' logical IDs are stored in IA32_TSC_AUX and the scheduler can determine which thread queue to choose.
4.Before calling the scheduler, there's a spinlock and after that the lock will be released so the scheduler is atomic.
5.The thread context is stored in stacks.
However, after that, everything is broken and the system reset (the second picture).
What's wrong? Is it a problem in mutex? Or any other causes?
Thanks for any help and advice.
Some useful(I think it is) information about my OS:
1.The timer handler(written in asm) and the scheduler(written in C) is shared with all the CPUs.
2.Every CPU has got it's own thread queue(based on array and with proper enqueue and dequeue operation).
3.The CPUs' logical IDs are stored in IA32_TSC_AUX and the scheduler can determine which thread queue to choose.
4.Before calling the scheduler, there's a spinlock and after that the lock will be released so the scheduler is atomic.
5.The thread context is stored in stacks.
Doing steadfastly, or doing nil.
- max
- Member
- Posts: 616
- Joined: Mon Mar 05, 2012 11:23 am
- Libera.chat IRC: maxdev
- Location: Germany
- Contact:
Re: Mutex Problem (Or Anything Else) in SMP Scheduling
Hey,
you can't really expect help if you don't post any code or have done any more research on what's going wrong.
My crystal ball says you're causing a triple fault somewhere in your code.
Greets
you can't really expect help if you don't post any code or have done any more research on what's going wrong.
My crystal ball says you're causing a triple fault somewhere in your code.
Greets
Re: Mutex Problem (Or Anything Else) in SMP Scheduling
at this level, you cant have mutexes/semaphores. those are high-level features, that you ensure with software assistance in your kernel. cache coherency is not ensured beethwen cores, you need a few thousands of cycles to sync the cache beethwen cores, a mutex-like feature will not magically work.
in my kernel, i organize everything from core 0. that controls what thread will execute what thread. your solution sounds nice, but its too nice to be true, dont try writing spectacularry optimistic code involving kernel funcionality to be placed on the AP-s, becouse in most cases .... uh.. what example i could i illustrate it to an asian... so a dragon will come and rape your lolis.
in my kernel, i organize everything from core 0. that controls what thread will execute what thread. your solution sounds nice, but its too nice to be true, dont try writing spectacularry optimistic code involving kernel funcionality to be placed on the AP-s, becouse in most cases .... uh.. what example i could i illustrate it to an asian... so a dragon will come and rape your lolis.
Operating system for SUBLEQ cpu architecture:
http://users.atw.hu/gerigeri/DawnOS/download.html
http://users.atw.hu/gerigeri/DawnOS/download.html
Re: Mutex Problem (Or Anything Else) in SMP Scheduling
I guess I don't understand what you are talking about at all. Are you saying that x86 doesn't have cache coherency? I mean, isn't cache coherency _only_ about multi-core to begin with? You're saying that either x86 is incoherent or it lacks multi-core cache coherency, but cache coherency is only relevant when there's multiple cores, so..?Geri wrote:at this level, you cant have mutexes/semaphores. those are high-level features, that you ensure with software assistance in your kernel. cache coherency is not ensured beethwen cores, you need a few thousands of cycles to sync the cache beethwen cores, a mutex-like feature will not magically work.
Also where does that "few thousand cycles" come from? Any source? Yes, moving cache lines between cores costs, but it would be pretty stupid if it costs more than actually going all the way to RAM, and going to RAM I think is something like 100-300 cycles and cache would be thousands according to you?
I plan to do "kernel" functionality on every core, not just BSP and as far as I know there's absolutely no issues doing it that way, and that's what pretty much every "real" OS does too...Geri wrote: in my kernel, i organize everything from core 0. that controls what thread will execute what thread. your solution sounds nice, but its too nice to be true, dont try writing spectacularry optimistic code involving kernel funcionality to be placed on the AP-s, becouse in most cases .... uh.. what example i could i illustrate it to an asian... so a dragon will come and rape your lolis.
You do have to make sure that your kernel functionality doesn't mess with what the other cores are doing but that's just basic multi-threading, if you don't understand multi-threading then you shouldn't be bringing the AP's up until you do understand.
Depending on your system, going to BSP for every kernel call may have significant performance impact..
Re: Mutex Problem (Or Anything Else) in SMP Scheduling
That is just bullshit. You need kernel level mutexes/semaphores to implement a SMP-safe kernel. The most common form of mutexes at the kernel level are spinlocks that protect data structures from concurrent modification. This has to be coupled with disabling interrupts if the structures are also accessed by IRQ handlers. More complex kernels will require more complex mutexes that e.g. can block/unblock kernel threads.Geri wrote:at this level, you cant have mutexes/semaphores. those are high-level features, that you ensure with software assistance in your kernel. cache coherency is not ensured beethwen cores, you need a few thousands of cycles to sync the cache beethwen cores, a mutex-like feature will not magically work.
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].
Re: Mutex Problem (Or Anything Else) in SMP Scheduling
Don't listen to him, he has absolutely no clue.LtG wrote:I guess I don't understand what you are talking about at all. Are you saying that x86 doesn't have cache coherency? I mean, isn't cache coherency _only_ about multi-core to begin with? You're saying that either x86 is incoherent or it lacks multi-core cache coherency, but cache coherency is only relevant when there's multiple cores, so..?Geri wrote:at this level, you cant have mutexes/semaphores. those are high-level features, that you ensure with software assistance in your kernel. cache coherency is not ensured beethwen cores, you need a few thousands of cycles to sync the cache beethwen cores, a mutex-like feature will not magically work.
Learn to read.
Re: Mutex Problem (Or Anything Else) in SMP Scheduling
dozniak:
who didnt yet did smp (or anything), leave the topic.
oh jesus, you are out from the game
who didnt yet did smp (or anything), leave the topic.
oh jesus, you are out from the game
Last edited by Geri on Sat May 06, 2017 11:32 am, edited 1 time in total.
Operating system for SUBLEQ cpu architecture:
http://users.atw.hu/gerigeri/DawnOS/download.html
http://users.atw.hu/gerigeri/DawnOS/download.html
Re: Mutex Problem (Or Anything Else) in SMP Scheduling
i benchmarked itLtG wrote: Also where does that "few thousand cycles" come from? Any source?
you thinking. i thinking, and i benchmarking. dont just think. do.LtG wrote: I think is something like 100-300 cycles and cache would be thousands according to you?
Operating system for SUBLEQ cpu architecture:
http://users.atw.hu/gerigeri/DawnOS/download.html
http://users.atw.hu/gerigeri/DawnOS/download.html
Re: Mutex Problem (Or Anything Else) in SMP Scheduling
What did you benchmark? I thought you said there's no cache coherency, so what is it that you benchmarked?Geri wrote:i benchmarked itLtG wrote: Also where does that "few thousand cycles" come from? Any source?
Care to show the code used to do the benchmarking? Maybe you accidentally included some other things in the benchmark that radically affect the results.
Here's some info w.r.t. cache coherency:
https://spcl.inf.ethz.ch/Publications/. ... -bench.pdf
Re: Mutex Problem (Or Anything Else) in SMP Scheduling
You didn', so why don't you leave the game?Geri wrote:who didnt yet did smp (or anything), leave the topic.
Learn to read.
Re: Mutex Problem (Or Anything Else) in SMP Scheduling
Hi,
The TLBs are the only caches where coherency (between the TLB on one CPU and the caches on the same CPU, and between the TLB on a CPU and normal caches on any other CPU, and between the TLB on one CPU and the TLBs on another CPU) isn't ensured.
If you have done something extremely wrong (misconfigured the MTRRs) it's impossible to sync the caches. The only things you can do is invalidate specific cache lines, invalidate the entire cache or disable the cache/s.
Cheers,
Brendan
At which level? The original poster said they're using spinlocks (and not mutexes) for their scheduler, and didn't say what mutexes were being used for. With the lack of information available its reasonable to assume that the mutexes are only used at a higher level.Geri wrote:at this level, you cant have mutexes/semaphores. those are high-level features, that you ensure with software assistance in your kernel.
Most people define "spinlock" as a kind of lock that spins when the lock can't be acquired (that never does a task switch), and "mutex" as a kind of lock that does a task switch when the lock can't be acquired (potentially after spinning briefly). With these definitions a mutex can't be used by the scheduler's task switch code without causing a nesting problem (e.g. when the scheduler can't acquire a mutex needed to switch tasks, the task blocks and the scheduler tries to acquire a mutex to switch tasks, and ...).Korona wrote:You need kernel level mutexes/semaphores to implement a SMP-safe kernel. The most common form of mutexes at the kernel level are spinlocks that protect data structures from concurrent modification.
For normal caches; cache coherency is ensured between cores, unless you've done something extremely wrong (misconfigured the MTRRs such that one CPU thinks an area is uncached while a different CPU thinks the same area is cached).Geri wrote: cache coherency is not ensured beethwen cores, you need a few thousands of cycles to sync the cache beethwen cores, a mutex-like feature will not magically work.
The TLBs are the only caches where coherency (between the TLB on one CPU and the caches on the same CPU, and between the TLB on a CPU and normal caches on any other CPU, and between the TLB on one CPU and the TLBs on another CPU) isn't ensured.
If you have done something extremely wrong (misconfigured the MTRRs) it's impossible to sync the caches. The only things you can do is invalidate specific cache lines, invalidate the entire cache or disable the cache/s.
That's extremely poor for scalability - you'd end up with most CPUs wasted most of the time because they're waiting for one CPU to reschedule; plus all CPUs fighting over the same cache lines.Geri wrote:in my kernel, i organize everything from core 0. that controls what thread will execute what thread.
AP CPUs using kernel functionality, including AP CPUs doing scheduling independently, is how every sane OS has always done it.Geri wrote:your solution sounds nice, but its too nice to be true, dont try writing spectacularry optimistic code involving kernel funcionality to be placed on the AP-s, becouse in most cases ....
As a random guess, I'd assume Geri benchmarked a WBINVD instruction - they're very expensive (but unnecessary for cache coherency).LtG wrote:What did you benchmark? I thought you said there's no cache coherency, so what is it that you benchmarked?Geri wrote:i benchmarked itLtG wrote: Also where does that "few thousand cycles" come from? Any source?
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Re: Mutex Problem (Or Anything Else) in SMP Scheduling
Brendan: the problem is that he was not very specific about his problem. his only hint was that the problem was directly with his smp code (thats why he named the topic like that). a lot of thing can go wrong in his code - so i told him hints what is generally his issue. if we dont see any code, i dont think i or we can be more specific. reading his post about his SMP implementation, it is shockingly complex (compared to mine). basically he tried to gain very lot of performance and/or increase the effective clock cycle of his sheduler. he sacraficed the simplicity and the cleanness for a brutal optimization by placing a kernel mode task switching code snippet on the AP-s, making it almost undebuggable. this isnt a question of an used instruction, this basically could and will lead to various logic errors in the handling, and we cant distant-heal it.
Operating system for SUBLEQ cpu architecture:
http://users.atw.hu/gerigeri/DawnOS/download.html
http://users.atw.hu/gerigeri/DawnOS/download.html
Re: Mutex Problem (Or Anything Else) in SMP Scheduling
Hi,
When you say it's "too complex" what you're really saying is "simple compared to most multi-CPU schedulers, but complex relative to your scheduler". From this we can infer that your scheduler is "simplistic" compared to the scheduler Js2xxx described and that your scheduler is "extremely simplistic" compared to most multi-CPU schedulers.
Cheers,
Brendan
Yes; and this is why max asked for some code (in the very first reply to this topic), and why everyone (except you) was waiting for Js2xxx to respond to max's request for some code and didn't reply.Geri wrote:Brendan: the problem is that he was not very specific about his problem. his only hint was that the problem was directly with his smp code (thats why he named the topic like that). a lot of thing can go wrong in his code - so i told him hints what is generally his issue. if we dont see any code, i dont think i or we can be more specific.
Complexity is relative, and (compared to most multi-CPU schedulers) this scheduler is simple.Geri wrote:reading his post about his SMP implementation, it is shockingly complex (compared to mine). basically he tried to gain very lot of performance and/or increase the effective clock cycle of his sheduler. he sacraficed the simplicity and the cleanness for a brutal optimization by placing a kernel mode task switching code snippet on the AP-s, making it almost undebuggable. this isnt a question of an used instruction, this basically could and will lead to various logic errors in the handling, and we cant distant-heal it.
When you say it's "too complex" what you're really saying is "simple compared to most multi-CPU schedulers, but complex relative to your scheduler". From this we can infer that your scheduler is "simplistic" compared to the scheduler Js2xxx described and that your scheduler is "extremely simplistic" compared to most multi-CPU schedulers.
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Re: Mutex Problem (Or Anything Else) in SMP Scheduling
I haven't logged in for some days and just now I saw a lot of posts... It scared me...
Well, someone asked me for the code, so here it is:
And the main code of the scheduler is here:
I believe there're a lot of mistakes but I can't find them. Thanks for any help and advice.
Well, someone asked me for the code, so here it is:
Code: Select all
; The whole code of the timer handler: (It works well in single processor)
TimerLock db 0
MsgReentrant db "!", 0
aqtemp dq 0 ; other variables used in the handler are extern.
TimerHandler:
sub rsp, 8
mov [rel aqtemp], rsp
push rax
push rcx
push rdx
push rbx
push qword [rel aqtemp]
push rbp
push rsi
push rdi
push r8
push r9
push r10
push r11
push r12
push r13
push r14
push r15
mov ecx, IA32_X2APIC_LVT_TIMER
rdmsr
bts eax, 16
wrmsr
mov ecx, IA32_X2APIC_EOI
xor rax, rax
xor rdx, rdx
wrmsr
call GetPcsrId
movzx rax, ax
lea rbx, [rel ReentrantFlag]
inc dword [rbx + rax * 4]
cmp dword [rbx + rax * 4], 0
ja .reentrant
mov rsp, 0xFFFFFF8000590000
call SetTempTss
mov ax, 0x11 << 3
ltr ax
sti
mov rdi, TimerLock
call EnterSpinLock
call GetPcsrId
mov rdi, rax
call Schedule
cli
mov ecx, IA32_X2APIC_LVT_TIMER
rdmsr
btr eax, 16
wrmsr
mov rax, [rel CurrentT]
mov rsp, [rax]
lea rax, [rsp + 22 * 8]
mov rbx, 0xFFFFFF8000601800
mov qword [rbx + 4], rax
call GetPcsrId
mov rdi, TimerLock
call LeaveSpinLock
shl ax, 3
mov ax, 0x9 << 3
ltr ax
jmp .next
.reentrant:
mov rdi, MsgReentrant
call DispStr
.next:
call GetPcsrId
movzx rax, ax
lea rbx, [rel ReentrantFlag]
dec dword [rbx + rax * 4]
pop r15
pop r14
pop r13
pop r12
pop r11
pop r10
pop r9
pop r8
pop rdi
pop rsi
pop rbp
add rsp, 8
pop rbx
pop rdx
pop rcx
pop rax
add rsp, 8
iretq
Code: Select all
public void Schedule(int pcsr)
{
CurrentT = &(ReadyPtr[pcsr]);
(*CurrentT)->Ticks--;
if((*CurrentT)->Ticks <= 0)
{
(*CurrentT)->Ticks = (*CurrentT)->Priority;
Switch(pcsr);
}
ReadyPtr[pcsr] = &(ReadyThreads[pcsr].Data[ReadyThreads[pcsr].Front]);
CurrentT = &(ReadyPtr[pcsr]);
(*CurrentT)->State = Running;
TssDesc[pcsr]->AttrLow = TSS | DESC_P;
}
Last edited by Js2xxx on Sat May 06, 2017 8:57 pm, edited 1 time in total.
Doing steadfastly, or doing nil.
Re: Mutex Problem (Or Anything Else) in SMP Scheduling
Hi,
I see multiple problems in the timer IRQ handler's code that could/will cause problems when different CPUs are handling timer IRQs at the same time. Specifically, "aqtemp" and the stack at 0xFFFFFF8000590000 are shared state that would need a lock to prevent one CPU from trashing another CPU's data.
I'd also suggest commenting the code properly, so that you will be able to maintain it later on (after you've forgotten why you've done things), and so that other people can figure out why you're doing horrific things like "mov rsp, 0xFFFFFF8000590000".
Finally, I'd say that the entire scheduler is dependant on timer IRQs and completely useless for tasks blocking/unblocking, and needs to be redesigned and rewritten because of that. Note: You might like the "steps for writing a scheduler" I wrote at the end of this post.
Cheers,
Brendan
I see multiple problems in the timer IRQ handler's code that could/will cause problems when different CPUs are handling timer IRQs at the same time. Specifically, "aqtemp" and the stack at 0xFFFFFF8000590000 are shared state that would need a lock to prevent one CPU from trashing another CPU's data.
I'd also suggest commenting the code properly, so that you will be able to maintain it later on (after you've forgotten why you've done things), and so that other people can figure out why you're doing horrific things like "mov rsp, 0xFFFFFF8000590000".
Finally, I'd say that the entire scheduler is dependant on timer IRQs and completely useless for tasks blocking/unblocking, and needs to be redesigned and rewritten because of that. Note: You might like the "steps for writing a scheduler" I wrote at the end of this post.
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.