adding tss asm problems

GLneo

adding tss asm problems

Post by GLneo »

Hi all. I'm working on adding user-space tasks, and I'm not sure how to do the switch in asm. I have a little code, but the system doesn't switch.
Now for some code.
The asm switcher (where I think the problem is):

Code: Select all

_task_timer:
    pushad
    push ds
    push es
    push fs
    push gs
    mov ax, 0x10
    mov ds, ax
    mov es, ax
    mov fs, ax
    mov gs, ax
    mov eax, _running_stack
    mov dword [eax], esp
    push eax
    call _task_timer_c
    pop ebx
    test ebx, eax
    je noswitch
    mov  [_running_stack], ebx
    cmp dword[ebx], 0x03
    jne notss
    mov ecx, [_TSS]
    mov word [ecx+8], KERNEL_DATA_SEL
    mov edx, [ebx+8]
    add edx, 4096
    mov dword[ecx+4], edx
 notss:
    mov esp, [ebx]
 noswitch:
    pop gs
    pop fs
    pop es
    pop ds
    popad
    iretd
And some of the C code that goes with it:

Code: Select all

void start_sys()
{
    TSS = (struct tss *)malloc(sizeof(struct tss));
    memset((void *)TSS, 0x00, sizeof(struct tss));
    set_a_gdt(5, (unsigned int)TSS, sizeof(struct tss)-1, 0x89, 0x00);
    ltr(KERNEL_TSS_SEG);

    add_process(make_task("NOT_DOING_NOTHIN'_TASK", task0, 0));
    running_stack = HEAD->stack;
    add_process(make_task("task0", task0, 3));
    add_process(make_task("task1", task1, 3));
}

Code: Select all

struct stack_data
{
   unsigned int gs;
   unsigned int fs;
   unsigned int es;
   unsigned int ds;
   unsigned int edi;
   unsigned int esi;
   unsigned int ebp;
   unsigned int esp;
   unsigned int ebx;
   unsigned int edx;
   unsigned int ecx;
   unsigned int eax;
   unsigned int eip;
   unsigned int cs;
   unsigned int eflags;
};

struct tss
{
    short backlink, __blh;
    int esp0;
    short ss0, __ss0h;
    int esp1;
    short ss1, __ss1h;
    int esp2;
    short ss2, __ss2h;
    int cr3;
    int eip;
    int eflags;
    int eax, ecx, edx, ebx;
    int esp, ebp, esi, edi;
    short es, __esh;
    short cs, __csh;
    short ss, __ssh;
    short ds, __dsh;
    short fs, __fsh;
    short gs, __gsh;
    short ldt, __ldth;
    short trace, bitmap;
}__attribute__((packed));

struct task_data 
{
    int ring;
    struct stack_data *stack;
    unsigned int stack_base;
    char name[33];
    unsigned int ss;
    unsigned int kstack;
    unsigned int PID;
    unsigned int P_PID;
    unsigned int time;
    unsigned int priority;
    unsigned int state;
    struct task_data *prev;
};
Do you see anything wrong? Ask if you need more code posted. Thanks.
mystran

Re:adding tss asm problems

Post by mystran »

Try running that inside Bochs and see if Bochs reports something interesting. Oh, and make sure that you don't have "reset_on_triple_fault" or some such thing enabled, because you want it to panic and give you a register dump.

Then proceed from there. I think there's some übercool way to add breakpoints to Bochs, but I've been using a "cli;hlt" sequence, as Bochs is kind enough to report that it's stuck with a halted processor when interrupts are disabled (so I know when to kill it to get a register dump).
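
The two breakpoint styles mentioned above can be sketched as C macros. This is a minimal sketch, assuming GCC-style inline asm on an x86 target; the macro and function names are made up for illustration. The `xchg bx, bx` form is the Bochs "magic breakpoint", which only triggers when Bochs is built with its internal debugger and `magic_break: enabled=1` is set in bochsrc; elsewhere it behaves as a no-op.

```c
/* Hypothetical helper macros; the names are not from the original posts. */
#if defined(__i386__) || defined(__x86_64__)
/* Bochs magic breakpoint: exchanges BX with itself, so it changes nothing. */
#define BOCHS_MAGIC_BREAK() __asm__ volatile ("xchg %%bx, %%bx" ::: "memory")
/* Poor man's breakpoint: halt with interrupts off so Bochs reports a
 * halted CPU. Both instructions are privileged: kernel (ring 0) code only. */
#define HALT_HERE() __asm__ volatile ("cli; hlt" ::: "memory")
#else
/* Non-x86 builds: compile to nothing so the sketch still builds. */
#define BOCHS_MAGIC_BREAK() ((void)0)
#define HALT_HERE() ((void)0)
#endif

/* Example: mark a point of interest, then carry on. Returns its argument. */
static int checkpoint(int value)
{
    BOCHS_MAGIC_BREAK();
    return value;
}
```

HALT_HERE() is the "cli;hlt" trick: drop it just before the suspect instruction, wait for Bochs to report the halted processor, then read the register dump.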

Btw, my personal preference (after several attempts otherwise) is to separate thread switching (with kernel threads) and kernel entry/exit, so that the only problem that's solved in kernel entry/exit is saving/restoring the userspace registers, and calling the right C function depending on what happened. Userspace registers I simply save into the bottom of the kernel thread's stack.

This way all entry/exit need to do is push/pop the userspace registers and select kernel segments (actually mine does do some extra trickery to get a single entry/exit for all possible interrupts and exceptions, but that's just me, I guess).

Anyway, start by testing in Bochs. With an error message or something to go on, it's a lot easier to figure out what's wrong.
GLneo

Re:adding tss asm problems

Post by GLneo »

OK, I've found something. Bochs said:
00006510053e[CPU ] load_seg_reg: LDT invalid
00006510144i[CPU ] LOCK prefix unallowed (op1=0x53, attr=0x0, mod=0x0, nnn=0)
00006510178i[CPU ] LOCK prefix unallowed (op1=0x53, attr=0x0, mod=0x0, nnn=0)
00006510210p[CPU ] >>PANIC<< prefetch: running in bogus memory
So the error is probably in this code:

Code: Select all

struct task_data *make_task(char *t_name, void (*entry)(), int ring)
{
    struct task_data *new_process;
    void *stack_mem;
    struct stack_data *stack;
    new_process = (struct task_data *)malloc(sizeof(struct task_data));
    if(new_process == NULL)
        return NULL;
    new_process->PID = PID_count++;
    new_process->P_PID = NULL;
    new_process->time = get_pri_time(1);
    new_process->priority = 1;
    new_process->ring = ring;
    
    stack_mem = (unsigned int *)malloc(STACK_SIZE);
    stack_mem += STACK_SIZE - sizeof(struct stack_data);
    stack = stack_mem;
    if(ring == 0)
    {
        stack->gs = KERNEL_DATA_SEG;
        stack->fs = KERNEL_DATA_SEG;
        stack->es = KERNEL_DATA_SEG;
        stack->ds = KERNEL_DATA_SEG;
        stack->cs = KERNEL_CODE_SEG;
    }
    else
    {
        stack->gs = USER_DATA_SEG;
        stack->fs = USER_DATA_SEG;
        stack->es = USER_DATA_SEG;
        stack->ds = USER_DATA_SEG;
        stack->cs = USER_CODE_SEG;
    }    
    stack->edi = 0;
    stack->esi = 0;
    stack->esp = (unsigned int)stack;
    stack->ebp = stack->esp;
    stack->ebx = 0;
    stack->edx = 0;
    stack->ecx = 0;
    stack->eax = 0;
    stack->eip = (unsigned int)entry;
    stack->eflags = 0x00000202;
    
    strncpy(new_process->name, t_name, 32);
    new_process->stack = stack;
    new_process->stack_base = (int)stack_mem;
    new_process->ss = KERNEL_STACK_SEG;
    new_process->state = READY;
    
    return new_process;
}

Code: Select all

void ltr(int num)
{
   asm("ltr %0" :: "rm" (num));
}

Code: Select all

void set_a_gdt(int num, unsigned long base, unsigned long limit, unsigned char access, unsigned char gran)
{
    gdt[num].basel = (base & 0xFFFF);
    gdt[num].basem = (base >> 16) & 0xFF;
    gdt[num].baseh = (base >> 24) & 0xFF;
    gdt[num].limit = (limit & 0xFFFF);
    gdt[num].granularity = ((limit >> 16) & 0x0F);
    gdt[num].granularity |= (gran & 0xF0);
    gdt[num].access = access;
}
I'm still clueless. Any help appreciated ;)
crackers

Re:adding tss asm problems

Post by crackers »

I can't see the LDT field being set in the make_task function. My guess is that after malloc the LDT field holds a random number different from zero.
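
To check a guess like this against the register dump, it can help to split each segment selector into its fields. A minimal sketch, assuming the standard x86 selector layout (RPL in bits 0-1, table indicator in bit 2, descriptor index in bits 3-15); the struct and function names are made up for illustration. A selector with the table-indicator bit set refers to the LDT, so a garbage value with bit 2 set is one possible source of an "LDT invalid" report when no LDT is loaded.

```c
#include <stdint.h>

/* Fields of an x86 segment selector (names are illustrative). */
struct selector_fields {
    unsigned index;    /* descriptor index within the table */
    unsigned use_ldt;  /* table indicator: 0 = GDT, 1 = LDT */
    unsigned rpl;      /* requested privilege level, 0..3 */
};

/* Split a 16-bit selector into index, table indicator and RPL. */
static struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;
    f.index   = sel >> 3;
    f.use_ldt = (sel >> 2) & 1u;
    f.rpl     = sel & 3u;
    return f;
}

/* Build a selector from its parts. */
static uint16_t make_selector(unsigned index, unsigned use_ldt, unsigned rpl)
{
    return (uint16_t)((index << 3) | ((use_ldt & 1u) << 2) | (rpl & 3u));
}
```

For example, GDT entry 5 (the slot filled by set_a_gdt(5, ...)) corresponds to selector 0x28, and the 0x10 loaded in the asm switcher decodes to GDT index 2 with RPL 0.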
mystran

Re:adding tss asm problems

Post by mystran »

Bochs supposedly also gives you a register dump.

Start with CS and EIP. First see if CS is for kernel or user segment. It is useless trying to find an error in code that's switching to userspace, if the real problem happens to be an exception right after the return.

EIP obviously gives you the point in the code where you were executing. Use your favourite disassembler (like objdump) to get a listing of the code, and search for the exact position that EIP gives you, +/- any relocation you did or didn't do while loading.

That way you know where the exception occurs, and whether it's kernel or user code. Also look at the other registers: do at least the segment registers make sense?

Another thing that might be worth looking at: CR3 is the current page directory, CR2 the last page-fault address. If you have paging enabled, that is.

Finally, Bochs gives you a bit more info if you edit the log options for individual devices, and set CPU to have debug=report. That way you see all interrupts and exceptions and some other stuff of possible importance.

Oh, well, a week (or a few) ago I spent like 12 hours straight trying to figure out a very similar situation, with very similar symptoms. After some testing I reasoned that it probably had something to do with saving the thread's kernel stack into tss->esp0. So I fixed that in the scheduler code, with the result that it seemed to work for a short time, then started doing other weird things. I assumed I'd found yet another bug...

Ultimately, after almost proving most of my kernel, I found that my fix was updating (in my scheduler code) tss->esp0 to contain the stack of the old (and not the new) thread, because the preceding line called the stack switching function, so local variables weren't really what they were supposed to be.

So my userspace was working correctly, but any interrupt/exception back to the kernel would potentially corrupt the pre-empted thread's stack, and getting the userspace code to report about itself would crash the thing.
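
The esp0 mistake described above can be illustrated with a small user-space model. This is only a sketch under assumptions: `struct tss_min`, `struct thread` and `switch_to` are hypothetical stand-ins, not code from either kernel in this thread, and the actual stack switch is left out. The point is the ordering: take the kernel stack top from the thread being switched to, and write it into the TSS before the stack switch, so nothing is read from a local variable afterwards.

```c
#include <stdint.h>

/* Only the TSS fields the CPU reads on a ring 3 -> ring 0 transition. */
struct tss_min {
    uint32_t esp0;
    uint16_t ss0;
};

/* Hypothetical per-thread record. */
struct thread {
    uint32_t kstack_top;  /* top of this thread's kernel stack */
};

static struct thread *current;

/* Model of one scheduler step. */
static void switch_to(struct tss_min *tss, struct thread *next)
{
    /* Use the NEW thread's kernel stack, and set it before switching. */
    tss->esp0 = next->kstack_top;
    current = next;
    /* A real kernel would switch stacks here, as its last action. */
}
```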

And that wasn't really the worst bug I've seen. The worst one was a certain paging bug that took a few weeks to figure out... so don't panic if you can't find an error instantly. If you persist, you will, eventually. :)
GLneo

Re:adding tss asm problems

Post by GLneo »

Wow, I just learned how to use that bochsdbg thing, and what I think I've found is that my switcher gets interrupted even though I have a cli. I can trace a cli and then a push sequence (the beginning of a handler), but before the pop sequence I get another push sequence, with no sti in between. Are CPU faults not disabled by cli? Because the interrupt happens right after: pop gs
Kemp

Re:adding tss asm problems

Post by Kemp »

CLI and STI CLear and SeT (right expansion?) Interrupts, not faults. If you switched fault handling off then all hell could break loose.
mystran

Re:adding tss asm problems

Post by mystran »

Like Kemp says, CLI/STI only affect IF in EFLAGS, which only affects whether IRQs are allowed to raise interrupts.

So exceptions and things like NMI can happen anyway, and a software INT instruction can still raise interrupts (assuming it's allowed; if it's not, you get a GPF instead).
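
To make the IF point concrete, here is a small sketch that decodes the EFLAGS value 0x00000202 that make_task puts in new task frames. The macro and function names are illustrative; the bit positions (bit 1 reserved and always set, IF in bit 9, IOPL in bits 12-13) are the standard x86 layout.

```c
#include <stdint.h>

#define EFLAGS_RESERVED1  (1u << 1)  /* reserved, always reads as 1 */
#define EFLAGS_IF         (1u << 9)  /* maskable interrupts (IRQs) enabled */
#define EFLAGS_IOPL_SHIFT 12         /* I/O privilege level, bits 12-13 */

/* Nonzero if maskable interrupts are enabled in this EFLAGS image. */
static int eflags_interrupts_enabled(uint32_t eflags)
{
    return (eflags & EFLAGS_IF) != 0;
}

/* I/O privilege level (0..3) encoded in this EFLAGS image. */
static unsigned eflags_iopl(uint32_t eflags)
{
    return (eflags >> EFLAGS_IOPL_SHIFT) & 3u;
}
```

With 0x202, IF is set and IOPL is 0, so a task started from that frame runs with IRQs enabled. CLI clears only that one bit, which is why a fault can still arrive in the middle of the handler.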

The most probable cause of getting right back to the kernel after trying to go to userspace is that some exception triggers instantly when trying to prefetch instructions. So you definitely should have some kind of exception handlers, at least enough to report what exception you got (although bochs cpu:debug=report will tell you that too).
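
One thing worth checking when an exception fires right at the return to userspace is the frame that iretd pops. When iretd returns to a less privileged ring it pops five dwords (EIP, CS, EFLAGS, ESP, SS) rather than three, while the stack_data struct in the first post ends at eflags. The sketch below shows the longer frame; the struct, the function and its parameters are hypothetical, and whether this is the actual fault in this kernel is only a guess.

```c
#include <stdint.h>

/* What iretd pops when returning from ring 0 to ring 3, lowest address first. */
struct iret_frame_user {
    uint32_t eip;
    uint32_t cs;
    uint32_t eflags;
    uint32_t user_esp;  /* only popped on a privilege-level change */
    uint32_t user_ss;   /* only popped on a privilege-level change */
};

/* Fill a frame for a first entry into a ring 3 task (illustrative helper). */
static void fill_user_frame(struct iret_frame_user *f, uint32_t entry,
                            uint32_t user_stack_top,
                            uint16_t code_sel, uint16_t data_sel)
{
    f->eip      = entry;
    f->cs       = code_sel | 3u;   /* RPL 3 for user code */
    f->eflags   = 0x00000202;      /* IF set, reserved bit 1 set */
    f->user_esp = user_stack_top;
    f->user_ss  = data_sel | 3u;   /* RPL 3 for the user stack segment */
}
```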

edit: added more about software interrupts
GLneo

Re:adding tss asm problems

Post by GLneo »

I do have a handler, and it said it is a GPF, but my handler is very bad at giving me the real error. My first question is just whether anybody sees any logic errors in which registers I save to and restore from. I have found one myself: just look at the first thing I posted. I read from _running_stack, then later write back to [_running_stack] :P
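
One way to get more out of a GPF handler is to decode the error code the CPU pushes for that exception. A minimal sketch, assuming the standard selector error-code layout (bit 0 EXT, bit 1 IDT, bit 2 TI, bits 3-15 index); the names are made up, and a real handler would print the fields with the kernel's own output routine. An error code of 0 means the fault was not tied to a particular selector.

```c
#include <stdint.h>

/* Decoded form of a selector-style exception error code (illustrative). */
struct gpf_info {
    unsigned external;  /* 1 if caused by an event external to the program */
    unsigned table;     /* 0 = GDT, 1 = IDT, 2 = LDT */
    unsigned index;     /* descriptor or gate index within that table */
};

static struct gpf_info decode_gpf_error(uint32_t err)
{
    struct gpf_info g;
    g.external = err & 1u;
    if (err & 2u)
        g.table = 1;            /* IDT bit set: index refers to the IDT */
    else if (err & 4u)
        g.table = 2;            /* TI bit set: index refers to the LDT */
    else
        g.table = 0;            /* otherwise the GDT */
    g.index = (err >> 3) & 0x1FFFu;
    return g;
}
```

An error code of 0x28, for instance, points at GDT entry 5, which is the TSS slot in start_sys.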