adding tss asm problems

Posted: Sat May 20, 2006 3:54 pm
by GLneo
Hi all. OK, I am working on adding user-space tasks, and I'm not sure how to do it in asm. I do have a little code, but the system doesn't switch.
Now for some code:
the asm switcher (where the problem is, I think):

Code: Select all

_task_timer:
    pushad
    push ds
    push es
    push fs
    push gs
    mov ax, 0x10
    mov ds, ax
    mov es, ax
    mov fs, ax
    mov gs, ax
    mov eax, _running_stack
    mov dword [eax], esp
    push eax
    call _task_timer_c
    pop ebx
    test ebx, eax
    je noswitch
    mov  [_running_stack], ebx
    cmp dword[ebx], 0x03
    jne notss
    mov ecx, [_TSS]
    mov word [ecx+8], KERNEL_DATA_SEL
    mov edx, [ebx+8]
    add edx, 4096
    mov dword[ecx+4], edx
 notss:
    mov esp, [ebx]
 noswitch:
    pop gs
    pop fs
    pop es
    pop ds
    popad
    iretd
And some of the C code that goes with it:

Code: Select all

void start_sys()
{
    TSS = (struct tss *)malloc(sizeof(struct tss));
    memset((void *)TSS, 0x00, sizeof(struct tss));
    set_a_gdt(5, (unsigned int)TSS, sizeof(struct tss)-1, 0x89, 0x00);
    ltr(KERNEL_TSS_SEG);
   
    add_process(make_task("NOT_DOING_NOTHIN'_TASK", task0, 0));
    running_stack = HEAD->stack;
    add_process(make_task("task0", task0, 3));
    add_process(make_task("task1", task1, 3));
}

Code: Select all

struct stack_data
{
   unsigned int gs;
   unsigned int fs;
   unsigned int es;
   unsigned int ds;
   unsigned int edi;
   unsigned int esi;
   unsigned int ebp;
   unsigned int esp;
   unsigned int ebx;
   unsigned int edx;
   unsigned int ecx;
   unsigned int eax;
   unsigned int eip;
   unsigned int cs;
   unsigned int eflags;
};

struct tss
{
    short backlink, __blh;
    int esp0;
    short ss0, __ss0h;
    int esp1;
    short ss1, __ss1h;
    int esp2;
    short ss2, __ss2h;
    int cr3;
    int eip;
    int eflags;
    int eax, ecx, edx, ebx;
    int esp, ebp, esi, edi;
    short es, __esh;
    short cs, __csh;
    short ss, __ssh;
    short ds, __dsh;
    short fs, __fsh;
    short gs, __gsh;
    short ldt, __ldth;
    short trace, bitmap;
}__attribute__((packed));

struct task_data 
{
    int ring;
    struct stack_data *stack;
    unsigned int stack_base;
    char name[33];
    unsigned int ss;
    unsigned int kstack;
    unsigned int PID;
    unsigned int P_PID;
    unsigned int time;
    unsigned int priority;
    unsigned int state;
    struct task_data *prev;
};
Do you see anything wrong? Ask if you need more code posted. Thanks.

Re:adding tss asm problems

Posted: Sat May 20, 2006 5:55 pm
by mystran
Try running that inside Bochs, and see if Bochs reports something interesting. Oh, and make sure that you don't have "reset_on_triple_fault" or some such thing enabled, because you want it to panic and give you a register dump.

Then proceed from there. I think there's some übercool way to add breakpoints to Bochs, but I've been using "cli;hlt" sequence, as Bochs is kind enough to report that it's stuck with halted processor when interrupts are disabled (so I know when to kill it to get a register dump).

Btw, my personal preference (after several attempts otherwise) is to separate thread switching (with kernel threads) and kernel entry/exit, so that the only problem that's solved in kernel entry/exit is saving/restoring the userspace registers, and calling the right C function depending on what happened. Userspace registers I simply save into the bottom of the kernel thread's stack.

This way all entry/exit need to do is push/pop the userspace registers and select kernel segments (actually mine does do some extra trickery to get a single entry/exit for all possible interrupts and exceptions, but that's just me, I guess).

Anyway, start by testing in Bochs; with an error message or something to go on, it's a lot easier to figure out what's wrong.

Re:adding tss asm problems

Posted: Sun May 21, 2006 5:11 pm
by GLneo
OK, I've found something. Bochs said:
00006510053e[CPU ] load_seg_reg: LDT invalid
00006510144i[CPU ] LOCK prefix unallowed (op1=0x53, attr=0x0, mod=0x0, nnn=0)
00006510178i[CPU ] LOCK prefix unallowed (op1=0x53, attr=0x0, mod=0x0, nnn=0)
00006510210p[CPU ] >>PANIC<< prefetch: running in bogus memory
So the error is probably in this code:

Code: Select all

struct task_data *make_task(char *t_name, void (*entry)(), int ring)
{
    struct task_data *new_process;
    void *stack_mem;
    struct stack_data *stack;
    new_process = (struct task_data *)malloc(sizeof(struct task_data));
    if(new_process == NULL)
        return NULL;
    new_process->PID = PID_count++;
    new_process->P_PID = NULL;
    new_process->time = get_pri_time(1);
    new_process->priority = 1;
    new_process->ring = ring;
    
    stack_mem = (unsigned int *)malloc(STACK_SIZE);
    stack_mem += STACK_SIZE - sizeof(struct stack_data);
    stack = stack_mem;
    if(ring == 0)
    {
        stack->gs = KERNEL_DATA_SEG;
        stack->fs = KERNEL_DATA_SEG;
        stack->es = KERNEL_DATA_SEG;
        stack->ds = KERNEL_DATA_SEG;
        stack->cs = KERNEL_CODE_SEG;
    }
    else
    {
        stack->gs = USER_DATA_SEG;
        stack->fs = USER_DATA_SEG;
        stack->es = USER_DATA_SEG;
        stack->ds = USER_DATA_SEG;
        stack->cs = USER_CODE_SEG;
    }    
    stack->edi = 0;
    stack->esi = 0;
    stack->esp = (unsigned int)stack;
    stack->ebp = stack->esp;
    stack->ebx = 0;
    stack->edx = 0;
    stack->ecx = 0;
    stack->eax = 0;
    stack->eip = (unsigned int)entry;
    stack->eflags = 0x00000202;
    
    strncpy(new_process->name, t_name, 32);
    new_process->stack = stack;
    new_process->stack_base = (int)stack_mem;
    new_process->ss = KERNEL_STACK_SEG;
    new_process->state = READY;
    
    return new_process;
}

Code: Select all

void ltr(int num)
{
   asm("ltr %0" :: "rm" (num));
}

Code: Select all

void set_a_gdt(int num, unsigned long base, unsigned long limit, unsigned char access, unsigned char gran)
{
    gdt[num].basel = (base & 0xFFFF);
    gdt[num].basem = (base >> 16) & 0xFF;
    gdt[num].baseh = (base >> 24) & 0xFF;
    gdt[num].limit = (limit & 0xFFFF);
    gdt[num].granularity = ((limit >> 16) & 0x0F);
    gdt[num].granularity |= (gran & 0xF0);
    gdt[num].access = access;
}
I'm still clueless; any help appreciated ;)

Re:adding tss asm problems

Posted: Sun May 21, 2006 11:33 pm
by crackers
I can't see the LDT field being set in the make_task function. My guess is that after malloc the LDT field holds a random nonzero number.

Re:adding tss asm problems

Posted: Mon May 22, 2006 5:37 am
by mystran
Bochs supposedly also gives you a register dump.

Start with CS and EIP. First see if CS is for kernel or user segment. It is useless trying to find an error in code that's switching to userspace, if the real problem happens to be an exception right after the return.
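Checking whether a CS value from the register dump is a kernel or a user selector comes down to its low two bits. A minimal sketch, assuming a flat setup like the one in the first post; `struct selector_fields`, `decode_selector`, `cs_is_user` and `selector_examples` are names made up for illustration, and 0x1B is only an example of a ring 3 selector, not a value taken from the posted code:

```c
#include <assert.h>
#include <stdint.h>

/* Fields of an x86 segment selector (the value held in CS, DS, ...). */
struct selector_fields {
    unsigned index; /* descriptor index in the GDT or LDT (bits 3..15) */
    unsigned ti;    /* table indicator: 0 = GDT, 1 = LDT (bit 2)       */
    unsigned rpl;   /* requested privilege level, 0..3 (bits 0..1)     */
};

struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;
    f.index = sel >> 3;
    f.ti    = (sel >> 2) & 1u;
    f.rpl   = sel & 3u;
    return f;
}

/* For CS the low two bits are the current privilege level:
 * 0 means kernel code, 3 means user code. */
int cs_is_user(uint16_t cs)
{
    return (cs & 3u) == 3u;
}

/* Usage examples with values that can be checked by hand. */
void selector_examples(void)
{
    /* 0x10 is the value the switcher in the first post loads into ds. */
    assert(decode_selector(0x10).index == 2);
    assert(decode_selector(0x10).rpl == 0);
    /* 0x1B = GDT index 3 with RPL 3 (an assumed example value). */
    assert(cs_is_user(0x1B));
    assert(!cs_is_user(0x08));
}
```

If the dump shows a CS with RPL 0, the crash happened in kernel code and the return to userspace may not be the culprit at all.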

EIP obviously gives you the point in code where you were executing. Use your favourite disassembler (like objdump) to get a listing of the code, and search for the exact position that EIP gives you +/- any relocation you did or didn't do while loading.

That way you know where the exception occurs, and whether it's kernel or user code. Also look at the other registers: do at least the segment registers make sense?

Another thing that might be worth looking at: CR3 is the current page directory, CR2 the last page fault address. If you have paging enabled, that is.

Finally, Bochs gives you a bit more info if you edit the log options for individual devices, and set CPU to have debug=report. That way you see all interrupts and exceptions and some other stuff of possible importance.

Oh, well, a week (or a few) ago I spent like 12 hours straight trying to figure out a very similar situation, with very similar symptoms. After some testing I reasoned that it'd probably be something to do with saving the thread's kernel stack into tss->esp0. So I fixed that in the scheduler code, with the result that it seemed to work for a short time, then started doing other weird things. I assumed I'd found yet another bug...

Ultimately, after almost proving most of my kernel correct, I found that my fix (in my scheduler code) was updating tss->esp0 to contain the stack of the old thread, not the new one: the preceding line called the stack switching function, so the local variables were no longer what they were supposed to be.

So my userspace was working correctly, but any interrupt/exception back to the kernel would potentially corrupt the pre-empted thread's stack, so getting the userspace code to report about itself would crash the thing.
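The ordering problem with tss->esp0 can be sketched roughly like this. It is a minimal illustration under stated assumptions, not the actual scheduler from this post: `struct tss_min`, `struct thread`, `current_thread`, `switch_stacks` and `schedule_to` are hypothetical names, and the real assembly stack switch is replaced by a stand-in that only records the new thread. The idea is that esp0 is loaded from the new thread before the switch, so it never depends on locals that change meaning once the stack has changed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Only the two TSS fields the CPU reads on a ring 3 -> ring 0 transition. */
struct tss_min {
    uint32_t esp0;   /* kernel stack pointer loaded on entry from user mode */
    uint16_t ss0;    /* kernel stack segment loaded on entry from user mode */
};

/* Hypothetical per-thread record; kstack_top is the top of its kernel stack. */
struct thread {
    uint32_t kstack_top;
};

struct thread *current_thread;   /* thread that owns the CPU right now */

/* Stand-in for the real assembly stack switch: it only records the new
 * thread. In a real kernel, code after this call runs on the other
 * thread's stack, so the caller's locals are no longer the same ones. */
void switch_stacks(struct thread *next)
{
    current_thread = next;
}

/* Point esp0 at the NEW thread's kernel stack, then switch. */
void schedule_to(struct tss_min *tss, struct thread *next)
{
    assert(tss != NULL && next != NULL);
    tss->esp0 = next->kstack_top;   /* done before the switch on purpose */
    switch_stacks(next);
}
```

An alternative with the same effect would be to update esp0 after the switch but read the thread from a global such as `current_thread` rather than from a local variable.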

And that wasn't really the worst bug I've seen. The worst one was a certain paging bug that took a few weeks to figure out... So, don't panic if you can't find an error instantly. If you persist, you will, eventually. :)

Re:adding tss asm problems

Posted: Mon May 22, 2006 4:54 pm
by GLneo
Wow, I just learned how to use that bochsdbg thing, and I think I have found that my switcher gets interrupted even though I have a cli. I can trace the cli, then a push sequence (beginning a handler), but before the pop sequence I get another push sequence, with no sti in between. Are CPU faults not disabled by cli? Because the interrupt happens right after: pop gs

Re:adding tss asm problems

Posted: Mon May 22, 2006 8:15 pm
by Kemp
CLI and STI CLear and SeT (right expansion?) Interrupts, not faults. If you switched fault handling off then all hell could break loose.

Re:adding tss asm problems

Posted: Tue May 23, 2006 4:34 am
by mystran
Like Kemp says, CLI/STI only affect IF in EFLAGS, which only controls whether IRQs are allowed to raise interrupts.

So exceptions and things like NMI can happen anyway, and the software INT instruction can still raise interrupts (assuming it's allowed; if it's not, you get a GPF instead).
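To make the IF point concrete, here is a small sketch that models CLI and STI as operations on an EFLAGS value. The function names are made up for illustration; the only hardware fact used is that IF is bit 9 of EFLAGS. The value 0x00000202 is the initial eflags that make_task stores for a new task, so such a task starts with IRQs enabled.

```c
#include <assert.h>
#include <stdint.h>

#define EFLAGS_IF (1u << 9)   /* interrupt enable flag, bit 9 of EFLAGS */

/* Nonzero if maskable hardware interrupts (IRQs) would be delivered. */
int maskable_irqs_enabled(uint32_t eflags)
{
    return (eflags & EFLAGS_IF) != 0;
}

/* Model of CLI: clears IF and touches nothing else. */
uint32_t eflags_after_cli(uint32_t eflags)
{
    return eflags & ~EFLAGS_IF;
}

/* Model of STI: sets IF and touches nothing else. */
uint32_t eflags_after_sti(uint32_t eflags)
{
    return eflags | EFLAGS_IF;
}

/* Usage examples with the eflags value from make_task. */
void eflags_examples(void)
{
    assert(maskable_irqs_enabled(0x00000202u));
    assert(eflags_after_cli(0x00000202u) == 0x00000002u);
    assert(!maskable_irqs_enabled(0x00000002u));
}
```

Nothing in this model can mask an exception: a GPF or page fault is raised regardless of the IF bit, which fits the second push sequence seen in the trace.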

The most probable cause of getting right back to the kernel after trying to go to userspace is that some exception triggers instantly when trying to prefetch instructions. So you definitely should have some kind of exception handlers, at least enough to report what exception you got (although Bochs with CPU debug=report will tell you that too).

edit: added more about software interrupts

Re:adding tss asm problems

Posted: Wed May 24, 2006 10:03 am
by GLneo
I do have a handler, and it said it is a GPF, but my handler is very bad at giving me the real error. My first question is just whether anybody sees any logic errors in which registers I save to and restore from. I have found one myself: look at the first thing I posted. I read from _running_stack, then later write back to [_running_stack] :P
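One way to get more out of a GPF handler is to decode the error code the CPU pushes with the exception. A minimal sketch, assuming the handler can read that 32-bit error code; `struct gpf_error`, `decode_gpf_error` and `gpf_examples` are hypothetical names, not part of the code posted in this thread. When the fault is tied to a selector, the code uses the selector error code layout shown here; a value of 0 means the fault was not tied to a particular selector.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded selector error code, as pushed for #GP (and #TS, #NP, #SS). */
struct gpf_error {
    unsigned ext;    /* bit 0: fault caused by an event external to the program */
    unsigned idt;    /* bit 1: index refers to a gate in the IDT                */
    unsigned ti;     /* bit 2: 0 = GDT, 1 = LDT (only meaningful if idt == 0)   */
    unsigned index;  /* bits 3..15: descriptor or gate index                    */
};

struct gpf_error decode_gpf_error(uint32_t code)
{
    struct gpf_error e;
    e.ext   = code & 1u;
    e.idt   = (code >> 1) & 1u;
    e.ti    = (code >> 2) & 1u;
    e.index = (code >> 3) & 0x1FFFu;
    return e;
}

/* Usage examples with hand-checkable values. */
void gpf_examples(void)
{
    /* 0x18: GDT entry 3 was involved in the fault. */
    assert(decode_gpf_error(0x18).index == 3);
    assert(decode_gpf_error(0x18).idt == 0);
    /* 0x42: IDT gate 8 was involved in the fault. */
    assert(decode_gpf_error(0x42).idt == 1);
    assert(decode_gpf_error(0x42).index == 8);
}
```

Printing these four fields next to CS and EIP in the handler would show which descriptor the CPU rejected.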