Multitasking problems
Posted: Sun Aug 27, 2006 7:59 am
by matthias
I've a big problem with multitasking, first I had a lot of General Protection Faults, even triple faults. After changing the code I finally got the code not to fault but do something else instead :-X
It just hangs after the first task switch ::)
My scheduler code:
Code: Select all
unsigned long ticks = 0;
extern void switch_to(unsigned long);
thread* prev;
void schedule(struct regs* r)
{
/* get next thread */
SchedulerGetNextNode();
if(ticks % 10000 == 0)
{
puts("1 second passed\n");
}
ticks++;
int sw = 0;
if(node != prev)
{
sw = 1;
}
prev = node;
if(sw)
{
switch_to(node->gdt); /* after this nothing happens */
}
}
extern _schedule
; 32: IRQ0
_irq0:
cli
pusha
push ds
push es
push fs
push gs
mov ax, 0x10
mov ds, ax
mov es, ax
mov fs, ax
mov gs, ax
mov eax, esp
push eax
mov eax, _schedule
call eax
pop eax
pop gs
pop fs
pop es
pop ds
popa
add esp, 8
sti
iret
.globl _switch_to
_switch_to:
ljmp *(%esp)
ret
As an attachment I give you the source file. (Or this file)
p.s. have been debugging a whole week now :-[
Re:Multitasking problems
Posted: Sun Aug 27, 2006 3:22 pm
by Habbit
I really don't know a lot about multitasking, as I haven't yet reached that point of my kernel, but I can see a major flaw in your code: you jump to the task while you're still inside the interrupt handler.
I will explain myself:
Code: Select all
IRQ0:
cli
pusha and other pushes
call _schedule <<-------- C function schedule
{
account_for_passed_ticks;
find_next_thread();
switch_to(thread->gdt); <<---------- ASM proc switch_to
ljmp *(%esp) <<--- jump to the procedure
ret
}
other pops and popa
sti
iret
And the flaws are...
- Not really a flaw, but you're mixing Intel syntax (proc IRQ0) with AT&T (switch_to)... What for? Use nasm or gas, but not both (well, do what you want, just an opinion)
- You're using the hardware task switching method. Again, do what you want, but even if hardware task management is far easier for the OSdever, it 1) is way slower than well-designed software management, and 2) isn't available in 64-bit long mode on AMD64 processors, so it is not "future compatible"
- And the real flaw: you're jumping to the task while still inside the IRQ0 handler, as this language-mixed, call-nested view shows. So, when you jump to the next thread, INTERRUPTS ARE STILL DISABLED (execution has not yet reached STI). So, even if the task switch succeeds (which I don't know), the next clock interrupt will never arrive and the scheduler won't be called again.
So what can you do with the real flaw? Well, you could, for example, modify the return information in the stack of IRQ0.
This works for software task management, but I don't know if it will for the hardware, TSS-based version. Basically, the idea is this: the CS:EIP in force when the interrupt was detected is stored in the stack, and used by IRET to return to it. Just replace it with the address you want to jump to.
Assuming that information was stored just before calling the handler (that I don't know, look for it at the intel/amd manuals), the code would be like this:
Code: Select all
IRQ0:
cli
push %ebp
mov %esp, %ebp <<---- return address would be at 4(%ebp)
pusha and other pushes
<<-- save current thread info from the stack to the thread table
call _schedule <<---- C functions return value is at %EAX
mov %eax, 4(%ebp) <<-- modify return address
<<-- if the task will change, load the new state from the thread table to the stack
other pops and popa
pop %ebp
sti
iret <<-- this will return to the modified return address
void* schedule()
{
account_for_ticks;
nextThread = GetNextThread();
return nextThread->eip;
}
This code is 100% naïve, does not account for changes in the code segment of apps, has probably more errors than your code, and a lot of etceteras, but I think (correct me please, OSdevers of the forum) these are the basics of multitasking
[edit]aargh! i always go crazy with the stack direction T_T[/edit]
Re:Multitasking problems
Posted: Sun Aug 27, 2006 3:32 pm
by matthias
Habbit wrote:
lots of text
Well, I based this code on another OS's scheduling code. Some Polish guys were working on an OS called ChaOS. 2 years ago I sort of copy&pasted their code into another kernel of mine. Now, 2 years later, I ported it to my new kernel with many changes to the scheduling lists, but I didn't really change much in the switching method, or even in the task creation. That code worked like a choochoo in my previous kernel. As for the interrupts, putting a sti before the switch doesn't help ;) Tried it but failed. My eflags are also set up OK, as in the interrupt flag (IF) is set. I've been mangling my code for days now -_-.
As far as for the x86-64 stuff I do not intend to make such an OS yet, it will support it in future, so in 32-bits mode I prefer the security of tss-based switching and in 64-bits I'll have to obey the power of software switching. I have reasons for tss-based-switching, which are for instance security and just testing.
Also to do hardware task switching you'll have to do a call, jmp or emulate an interrupt (i.e. syscall) to force context switches (task gates, interrupt gates, etc.)
Re:Multitasking problems
Posted: Sun Aug 27, 2006 6:18 pm
by Habbit
matthias wrote:
As for the interrupts, putting a sti before the switch doesn't help ;) Tried it but failed.
How is the IRQ0 handler getting called? Through a task gate? If that is the case, an STI won't suffice: you will actually have to IRET from the IRQ0 handler, or the busy bit in that task's TSS descriptor will stay set, forbidding reentrance (and thus another call to the scheduler). Besides, the NT (nested task) bit in EFLAGS could be responsible.
Or so I think I read somewhere... ;D
Re:Multitasking problems
Posted: Mon Aug 28, 2006 5:05 am
by matthias
I set up the IDT entry as:
Code: Select all
idt_set_gate(32, (unsigned)irq0, 0x08, 0x8E);
idt code:
Code: Select all
/* Use this function to set an entry in the IDT. A lot simpler
* than twiddling with the GDT ;) */
void idt_set_gate(unsigned char num, unsigned long base, unsigned short sel, unsigned char flags)
{
/* The interrupt routine's base address */
idt[num].base_lo = (base & 0xFFFF);
idt[num].base_hi = (base >> 16) & 0xFFFF;
/* The segment or 'selector' that this IDT entry will use
* is set here, along with any access flags */
idt[num].sel = sel;
idt[num].always0 = 0;
idt[num].flags = flags;
}
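For reference, the flags byte 0x8E passed to idt_set_gate above can be decoded as in the small C11 sketch below. The helper names (decode_gate_flags, is_task_gate, demo_gate_flags) are made up for illustration and are not part of the kernel in this thread. It suggests that 0x8E describes a present, ring-0, 32-bit interrupt gate rather than a task gate (type 0x5).

```c
#include <assert.h>
#include <stdint.h>

/* Fields of the flags byte of a protected-mode IDT entry. */
struct gate_flags {
    unsigned present;  /* bit 7: entry is valid */
    unsigned dpl;      /* bits 6..5: privilege needed to invoke via INT */
    unsigned type;     /* bits 3..0: gate type (bit 4 is 0 for gates) */
};

/* Split the flags byte into its fields. */
struct gate_flags decode_gate_flags(uint8_t flags)
{
    struct gate_flags g;
    g.present = (flags >> 7) & 0x1u;
    g.dpl     = (flags >> 5) & 0x3u;
    g.type    = flags & 0xFu;
    return g;
}

/* Type 0x5 is a task gate; 0xE and 0xF are 32-bit interrupt and trap gates. */
int is_task_gate(uint8_t flags)
{
    return decode_gate_flags(flags).type == 0x5u;
}

/* Example with the value used for IRQ0 in this thread. */
void demo_gate_flags(void)
{
    struct gate_flags g = decode_gate_flags(0x8E);
    assert(g.present == 1 && g.dpl == 0 && g.type == 0xE);
    assert(!is_task_gate(0x8E));
}
```

With an interrupt gate the handler runs in the interrupted task's context, so the busy-bit concern about task gates would not apply to this IDT entry itself.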
About the eflags, I flushed the eflags register so I'm certain the nested bit is not '1'. Even without flushing the eflags it does not work.
Also, when I don't make the switch_to call, it writes to the screen every time a second passes, so my interrupt handlers are OK. The only problem is that when it gets to the instruction that jumps to the TSS, it just doesn't do anything at all. I've tested some other nasm code:
Code: Select all
global _switch_to
_switch_to:
push ebp
mov ebp, esp
jmp far [ss:ebp+4]
pop ebp
ret
The code gets to the point of jmp far [ss:ebp+4] and then it's over with the fun :(
And yes I do send an EOI
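The far jump in both versions of switch_to reads a 6-byte operand from the stack: a 32-bit offset followed by a 16-bit selector. On entry the stack holds the return address and then the argument, so the low 16 bits of the argument end up as the selector (in the NASM version, [ebp+4] is again the return address because of the extra push ebp). The sketch below models that layout in C. The function names are hypothetical, and it assumes node->gdt holds a complete TSS selector (GDT index * 8, plus RPL) rather than a bare index.

```c
#include <assert.h>
#include <stdint.h>

/* Memory operand of a 32-bit far jump (m16:32): offset first, selector after. */
struct far_ptr {
    uint32_t offset;    /* ignored by the CPU when the selector names a TSS */
    uint16_t selector;
};

/* Read the 6 bytes a far jump would fetch from mem (little-endian layout). */
static struct far_ptr read_far_ptr(const uint8_t *mem)
{
    struct far_ptr p;
    p.offset   = (uint32_t)mem[0] | ((uint32_t)mem[1] << 8) |
                 ((uint32_t)mem[2] << 16) | ((uint32_t)mem[3] << 24);
    p.selector = (uint16_t)(mem[4] | (mem[5] << 8));
    return p;
}

/* Model of the stack on entry to switch_to(arg): return address, then arg. */
static void build_call_stack(uint8_t stack[8], uint32_t ret_addr, uint32_t arg)
{
    for (int i = 0; i < 4; i++) {
        stack[i]     = (uint8_t)(ret_addr >> (8 * i));
        stack[4 + i] = (uint8_t)(arg >> (8 * i));
    }
}

/* Selector that "ljmp *(%esp)" would load for a given call. */
uint16_t selector_seen_by_ljmp(uint32_t ret_addr, uint32_t arg)
{
    uint8_t stack[8];
    build_call_stack(stack, ret_addr, arg);
    return read_far_ptr(stack).selector;
}

/* Example: only the low 16 bits of the argument act as the selector. */
void demo_far_ptr(void)
{
    assert(selector_seen_by_ljmp(0x00101234u, 0x28u) == 0x28u);
    assert(selector_seen_by_ljmp(0x00101234u, 0x12340030u) == 0x30u);
}
```

If the argument were something other than a selector for a valid, non-busy TSS descriptor, the jump would load the wrong descriptor, which is one thing worth ruling out here.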
Re:Multitasking problems
Posted: Mon Aug 28, 2006 1:48 pm
by Ryu
Habbit wrote:
You're using the hardware task switching method. Again, do what you want, but even if hardware task management is far easier for the OSdever, it 1) is way slower than a good designed software management, and 2) won't work for AMD64 processors, so it is not "future compatible"
I actually recommend starting with hardware switches, because this allows you to get familiar with the processor's behavior, which will eventually lead to tuned-up software switching and scheduling. But even with software switching on Intel-based processors you still need a TSS to switch into different rings. Is this really "future compatible"? I've not yet looked into AMD processors, but if they are different I plan to create an entirely new task switcher, and most likely an entirely new scheduler if the differences are great enough, for example if there's no equivalent of Intel's "Task Gates". But if that is the case, you still have to make two different software task switchers, which doesn't tell me a software switcher will be compatible in any case.
Re:Multitasking problems
Posted: Mon Aug 28, 2006 2:16 pm
by Habbit
Ryu wrote:
But even with software switching on Intel-based processors you still need a TSS to switch into different rings. Is this really "future compatible"? I've not yet looked into AMD processors, but if they are different I plan to create an entirely new task switcher, and most likely an entirely new scheduler if the differences are great enough, for example if there's no equivalent of Intel's "Task Gates".
There are still TSSs in AMD64. What is not available is the task switching mechanism. That is, the TSS is just an information storage. In fact, it has been revamped to that purpose: its fields to store CR3 and the GPR have been removed to store the new "interrupt stack table", a set of clean stacks for certain interrupts (used for something like what the hardware switcher was: providing a clean stack for the double fault - or, by the way, any - interrupt handler). There are many more changes, so you should take a look at either the AMD or the Intel manual (or even better, at both)
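As a rough illustration of that revamped layout, the 64-bit TSS can be written as a C struct. This is a sketch in C11: the field names are chosen here for illustration, the sizes and offsets follow the layout given in the AMD and Intel manuals, and __attribute__((packed)) is a GCC/Clang extension that is assumed to be available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 64-bit (long mode) TSS: no saved registers or CR3, only stack pointers. */
struct tss64 {
    uint32_t reserved0;
    uint64_t rsp[3];      /* RSP0..RSP2: stacks used on privilege change */
    uint64_t reserved1;
    uint64_t ist[7];      /* IST1..IST7: the interrupt stack table */
    uint64_t reserved2;
    uint16_t reserved3;
    uint16_t iomap_base;  /* offset of the I/O permission bitmap */
} __attribute__((packed));

/* Compile-time checks against the documented layout. */
static_assert(sizeof(struct tss64) == 104, "64-bit TSS is 104 bytes");
static_assert(offsetof(struct tss64, rsp) == 0x04, "RSP0 at offset 0x04");
static_assert(offsetof(struct tss64, ist) == 0x24, "IST1 at offset 0x24");
```

An IDT entry in long mode can name one of the seven IST slots, which is how a handler such as the double fault handler can get a known-good stack without a hardware task switch.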
Re:Multitasking problems
Posted: Mon Aug 28, 2006 2:56 pm
by matthias
I redesigned the IRQ handling, PIC remapping (auto EOI on master) and IDT, but I still don't get it to switch :(
Any ideas you guys?
Re:Multitasking problems
Posted: Fri Sep 01, 2006 6:21 am
by matthias
I detected some strange behaviour. If I try the taskswitch on a real system, it just hangs, however in a virtual machine (MS VPC) it reboots and says it encountered a fault. What can be concluded from this?
And another question I had reading upon the topic:
http://www.mega-tokyo.com/forum/index.php?board=1;action=display;threadid=10129
Could my problem be of the same issue?
Once again my IRQ routine (updated)
Code: Select all
; 32: IRQ0
_irq0:
cli
push byte 0
push byte 32
jmp irq_common_stub
extern _irq_handler
irq_common_stub:
pusha
push ds
push es
push fs
push gs
mov ax, 0x10
mov ds, ax
mov es, ax
mov fs, ax
mov gs, ax
mov eax, esp
push eax
mov eax, _irq_handler
call eax
pop eax
pop gs
pop fs
pop es
pop ds
popa
add esp, 8
iret
irq_handler() calls the schedule() function ;)
Re:Multitasking problems
Posted: Fri Sep 01, 2006 7:51 am
by Combuster
The stack is probably the problem here - what's missing in the taskswitch is the following:
(you push registers and all, good)
- save current esp, so that the next time it's scheduled you can restore it
- determine next task to run
- load the task's context (like page directory, esp0 in the tss, iopl bits)
- load the task's corresponding esp
(return, pop registers and iret back to execution)
By jumping to code you are simply throwing away your task's previous context - you'll never know what its esp was before you suspended it
modified excerpt from my taskswitcher
Code: Select all
; have your next task ready in [nexttask],
; maybe on the current stack or somewhere in global memory
; both should work
MOV EAX, [scheduler_curtask] ; load current task info pointer
MOV [EAX+STACK], ESP ; store ESP
MOV EAX, [nexttask] ; get next task info pointer
MOV [scheduler_curtask], EAX ; set current task info pointer
MOV ESP, [EAX+STACK] ; load new ESP
Note: If you do userspace threads this way you'll have to set ESP0 in your TSS as well
For the rest, it is wise to check what is on a thread's stack before you try to switch to it the first time. Bochs'll help you big time with that.
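The save/select/load steps listed above can be modelled in plain C to make the bookkeeping visible. This is only a sketch: struct task, struct cpu_model and switch_stack are hypothetical names, and cpu->esp stands in for the real ESP register, which can only be swapped from assembly as in the excerpt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-task record; only the saved stack pointer matters here. */
struct task {
    uint32_t saved_esp;
};

/* Stand-ins for the ESP register and for scheduler_curtask in the excerpt. */
struct cpu_model {
    uint32_t esp;
    struct task *current;
};

/* Save the outgoing task's ESP, then adopt the incoming task's ESP. */
void switch_stack(struct cpu_model *cpu, struct task *next)
{
    assert(cpu != NULL && next != NULL);
    if (cpu->current != NULL)
        cpu->current->saved_esp = cpu->esp;  /* MOV [EAX+STACK], ESP */
    cpu->current = next;                     /* MOV [scheduler_curtask], EAX */
    cpu->esp = next->saved_esp;              /* MOV ESP, [EAX+STACK] */
}
```

Because the outgoing ESP is stored before it is overwritten, switching back to the first task later resumes it exactly where its registers were pushed.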
Re:Multitasking problems
Posted: Fri Sep 01, 2006 8:13 am
by Pype.Clicker
matthias wrote:
I redesigned the IRQ handling, PIC remapping (auto EOI on master) and IDT, but I still don't get it to switch :(
Any ideas you guys?
i fear you've been fooled. The "auto EOI" feature cannot be used on a PC: you _have_ to send the EOI. that's how 80x86 processors expect the PIC to work, and programming the PIC to work in another way won't solve the issue.
Re:Multitasking problems
Posted: Fri Sep 01, 2006 8:16 am
by matthias
I already noticed that, but I'm sending EOI already, so this is not the problem.
Re:Multitasking problems
Posted: Sun Sep 03, 2006 5:01 am
by matthias
This is my new routine:
Code: Select all
tmp_stack: dd 0
_irq0:
cli
cld
pushad
push ds
push es
push fs
push gs
mov [tmp_stack], esp ; save current stack value
mov esp, _tss_stack ; move to new stack
call _schedule
mov esp, [tmp_stack] ; restore stack
pop gs
pop fs
pop es
pop ds
popad
mov al, 0x20
out 0x20, al
sti
iretd
SECTION .bss
resb 4096 ; This reserves 4KBytes of memory here
[global _tss_stack]
_tss_stack:
This works ok, but the task still doesn't run
Re:Multitasking problems
Posted: Mon Sep 04, 2006 2:58 am
by Pype.Clicker
may i ask what's supposed to be on the outgoing stack ? i mean, the first thing you'll do there will be popping out registers values and the like. You do have values to pop in first place, don't you?
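To make the question concrete: with a stack-switching scheme, a brand-new thread needs a hand-built stack that holds exactly what the handler's exit path pops. The C sketch below fills such a frame for the pop gs / pop fs / pop es / pop ds / popad / iretd sequence posted above. It assumes a ring-0 thread (so iretd pops only EIP, CS and EFLAGS) and reuses the 0x08 and 0x10 selectors from this thread; build_initial_frame and the macro names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define KERNEL_CS     0x08u   /* code selector used in the idt_set_gate call */
#define KERNEL_DS     0x10u   /* data selector loaded by the handlers */
#define EFLAGS_IF_SET 0x202u  /* bit 1 is always set; bit 9 is IF */
#define FRAME_DWORDS  15

/* Fill frame[0..14] in pop order; frame[0] sits at the lowest address,
 * which is the value the new thread's saved ESP should point to. */
void build_initial_frame(uint32_t frame[FRAME_DWORDS], uint32_t entry_eip)
{
    int i = 0;
    frame[i++] = KERNEL_DS;      /* pop gs */
    frame[i++] = KERNEL_DS;      /* pop fs */
    frame[i++] = KERNEL_DS;      /* pop es */
    frame[i++] = KERNEL_DS;      /* pop ds */
    frame[i++] = 0;              /* popad: edi */
    frame[i++] = 0;              /* popad: esi */
    frame[i++] = 0;              /* popad: ebp */
    frame[i++] = 0;              /* popad: esp slot, skipped by the CPU */
    frame[i++] = 0;              /* popad: ebx */
    frame[i++] = 0;              /* popad: edx */
    frame[i++] = 0;              /* popad: ecx */
    frame[i++] = 0;              /* popad: eax */
    frame[i++] = entry_eip;      /* iretd: eip */
    frame[i++] = KERNEL_CS;      /* iretd: cs */
    frame[i++] = EFLAGS_IF_SET;  /* iretd: eflags, interrupts enabled */
    assert(i == FRAME_DWORDS);
}
```

With TSS-based hardware switching the initial register values live in the new task's TSS instead, so the equivalent check there is that EIP, ESP, the segment selectors and EFLAGS in the TSS are all filled in before the first switch.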
btw,
1. doing a CLI at the start of an IRQ routine usually means you're using a trap gate instead of an interrupt gate in your IDT, but it's (imho) bad practice.
2. doing a STI at the end of an IRQ routine is hazardous. If you've received an IRQ during the handler, it will be processed _before_ you had the option to completely clean the stack from the pushed CS and EIP. after a while, that can make your stack grow and grow :P
3. i haven't read all the above posts in details, but i fail to see why you *need* the "_tss_stack" in first place. handling _schedule on the incoming stack would be as fine, no?