Hi,
Thank you for your answers!
Code: Select all
flag = ccb->hd_timerq && tcba->alarmusec <= ccb->sched_ticks;
pri = tcba->priority; <-- possible fault here
do {
@linuxyne: your code is different to mine. For example your code would throw a page fault right on the first call to sched_pick(). My code only resolves tcba fields after it verified that hd_timerq is not empty.
If compiler can show at compile time that flag remains true
That's my point, it is not possible no matter what. For example if no task calls alarm() or delay() system calls then hd_timerq will be 0 all the time and tcba wouldn't be mapped at all. Also it is not possible to evaluate wether sched_ticks is bigger than alarmusec at compile time. That purely depends on
when you call sched_pick() in run-time. There's no way to correctly predict the value of "flag" at compile time.
You can see that if we call sched_awake (with i == pri == PRI_SYS), there's no need to run the loop.
Not true. There's another condition, namely sched_ticks. Also, you can't possibly predict the value of "pri" at compile time either. How could you possibly know in which priority queue will a task call alarm() (or for that matter that there'll be a task calling alarm() in the first place)? Both predictions on "flag" and "pri" is a mistake which the compiler shouldn't have made (and it does not if I remove the "continue" keyword).
But let's assume you're right and the compiler assumes both flag == true and i == pri == PRI_SYS. What's with the rest of the sched_pick() function when returned from sched_awake()? Even if the compiler eliminates the loop the instructions after the "found:" label should be still there! Those are doing things like switching to the new address space and calculating the next scheduler interrupt, none of which sched_awake() do.
About jumping in the middle of sched_awake, one needs to look at the disassembly in the entirely to see what the generated code actually was before concluding that jumping in the middle was actually incorrect.
You are missing the point here. Even if the rest of sched_pick() is moved into sched_awake(), the sched_pick() does not have an input argument like sched_awake(), and %rdi is not set at all before the jump. That can't be good, no matter what's in the disassembly (and it's not, it's causing randomly page faults depending on current %rdi value when I do the sched_pick() call).
I am not sure what you mean by volatile did not help. Were the loops removed even after forcing the compiler to read from memory for tcba and ccb pointers?
Yes exactly. It generated the same code even if I added "volatile" to ccb and tcba. I've also tried to add volatile to the struct field hd_timerq.
If any compiler has the issues that are being suggested here, it would be unusable.
I wouldn't call it unusable, but optimization is certainly not as mature as many developer think.
Or, your code is just the right piece to expose them
More likely. It seems I've came up with a simple, yet so complex algorithm that confuses the compiler's optimizer. (Just for the records, what you see here is a recursive scheduler in which reentrancy was replaced by a loop and a state machine, as stored in cr_active queues).
Either this, or those scientologist bastards hacked my computer again. Last time they broke into my house to gain physical access to my non-networked computer when I was at work. Thank God I had that cam installed and they haven't found it, one more rock-solid proof against them
Seriously, who in their right mind would think that breaking into a house is "fair" by any means??? Clear the planet, you can start by eliminating the cult of scientology!!!
@Octrocontrabass: yes I know that. What I meant inlining was an unwanted compiler's choice. I call sched_awake() from other parts as well, so this is not a called-only-once function, nor a small one, therefore it shouldn't be inlined at all. Same stands for sched_pick(), which is also called from assembly in the syscall handler. And there's no removed unreachable code in sched_awake() at all, it was working well when the message sender called it when a new message arrived to a blocked task. And compiler thinking that any other parts of sched_pick() were unreachable is clearly a mistake too.
Unfortunately I can't show you a complete compileable code right now because I'm in a middle of a refactoring for SMP. But when I'm finished with that I'll push to my remote repo and came back to this and try to reproduce the error and let you know. My code is quite complex, not easy to cut out this part alone (ccb and tcb are platform specific structs, for example ccb_t is just a normal struct under AArch64, but it aligns with TSS segment on x86_64). But here's source for sched_awake() if it helps, nothing special as you can see, only single and double linked list handling:
Code: Select all
void sched_awake(tcb_t *tcb)
{
ccb_t *ccb;
if(tcb->magic != TCB_MAGIC || tcb->priority > PRI_IDLE || tcb->cpuid >= numcores)
return;
ccb = (ccb_t*)CCB_ADDRESS(tcb->cpuid);
if(tcb->state != TCB_RUNNING) {
if(ccb->hd_blocked == tcb->pid) {
ccb->hd_blocked = tcb->next;
}
if(ccb->hd_timerq == tcb->pid) {
ccb->hd_timerq = tcb->next;
if(ccb->hd_timerq)
vmm_map(LDYN_tcbalarm, PID2PAGE(ccb->hd_timerq));
}
if(tcb->prev) {
vmm_map(LDYN_tmp, PID2PAGE(tcb->prev));
((tcb_t*)LDYN_tmp)->next = tcb->next;
}
if(tcb->next) {
vmm_map(LDYN_tmp, PID2PAGE(tcb->next));
((tcb_t*)LDYN_tmp)->prev = tcb->prev;
}
sched_add(tcb);
if(ccb->id != ((ccb_t*)LDYN_ccb)->id && ccb->currentpid == idle_tcb.pid)
platform_awakecpu(ccb->id);
}
ccb->cr_active[tcb->priority] = tcb->pid;
}
A little twist, sched_pick() operates on current cpu core's ccb (mapped at LDYN_ccb), while sched_awake() operates on the ccb of the cpu core on which the appointed task is running. Also if sched_awake() removes a task from the timerq, it maps the tcb of the new head at LDYN_tcbalam, which is relevant for the next call of sched_pick().
Cheers,
bzt