More Multitasking Problems

Post by AJ »

Hello All,

[EDIT - uint32 = unsigned long - something which points to my main Win32 dev environment being c#!]

Sorry to appear with a multitasking question *again* but, well...

I have set up a TSS for my outgoing task (the kernel itself), and have loaded the TSS descriptor - this works fine. I then set up another task which is simply:

Code: Select all

void task()
{
  for(;;);
}
...and write its TSS descriptor with:

Code: Select all

gdt_set_descriptor(4, (uint32)&tss[1], 0x67, 0x89, 0x00);
This is the same function I use for setting the kernel descriptor, code segs, data segs, etc., so it works!
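For readers following along, the arguments in that call can be read against the standard 8-byte x86 segment descriptor: 0x67 is the limit (104 bytes minus one, the minimum size of a 32-bit TSS) and 0x89 is the access byte for a present, DPL 0, available 32-bit TSS. The sketch below shows one way such a function might pack those fields. `encode_descriptor` and its parameter order are assumptions for illustration only; the poster's actual gdt_set_descriptor is not shown in the thread.

```c
#include <stdint.h>

/* Hypothetical helper (not the poster's code): packs base, limit, access
 * and granularity into the 8-byte x86 segment descriptor format. */
static inline void encode_descriptor(uint8_t out[8], uint32_t base,
                                     uint32_t limit, uint8_t access,
                                     uint8_t granularity)
{
    out[0] = (uint8_t)(limit & 0xFF);          /* limit bits 7:0   */
    out[1] = (uint8_t)((limit >> 8) & 0xFF);   /* limit bits 15:8  */
    out[2] = (uint8_t)(base & 0xFF);           /* base bits 7:0    */
    out[3] = (uint8_t)((base >> 8) & 0xFF);    /* base bits 15:8   */
    out[4] = (uint8_t)((base >> 16) & 0xFF);   /* base bits 23:16  */
    out[5] = access;                           /* present, DPL, type */
    /* High nibble: granularity flags. Low nibble: limit bits 19:16. */
    out[6] = (uint8_t)((granularity & 0xF0) | ((limit >> 16) & 0x0F));
    out[7] = (uint8_t)((base >> 24) & 0xFF);   /* base bits 31:24  */
}
```

Dumping the eight bytes of GDT entry 4 in the Bochs debugger and comparing them against this layout is a quick way to confirm the descriptor really holds the intended base and limit.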

I then drop to an assembly function called mt_switch which does the following:

Code: Select all

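[global _mt_switch]
_mt_switch:
  jmp dword 0x20:0x00   ; far jump through the TSS selector (GDT entry 4); the offset is ignored
  jmp $                 ; hang here if control ever comes back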
I know that it doesn't take into account returning to the calling task, C calling conventions, etc. - but I'm not worried about that at the moment :)
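As a side note on the 0x20 in that far jump: it is a segment selector, where bits 15:3 hold the descriptor index, bit 2 selects GDT or LDT, and bits 1:0 are the requested privilege level. A minimal sketch of the decoding follows; the helper names are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helpers for picking apart an x86 segment selector. */
static inline uint16_t selector_index(uint16_t sel)
{
    return (uint16_t)(sel >> 3);        /* descriptor table index */
}

static inline uint16_t selector_uses_ldt(uint16_t sel)
{
    return (uint16_t)((sel >> 2) & 1);  /* 0 = GDT, 1 = LDT */
}

static inline uint16_t selector_rpl(uint16_t sel)
{
    return (uint16_t)(sel & 3);         /* requested privilege level */
}

static inline uint16_t make_selector(uint16_t index, uint16_t ldt, uint16_t rpl)
{
    return (uint16_t)((index << 3) | ((ldt & 1) << 2) | (rpl & 3));
}
```

With these, 0x20 decodes to index 4 in the GDT at RPL 0, which matches the entry written with gdt_set_descriptor(4, ...), and 0x03*0x08 = 0x18 is index 3, the kernel TSS.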

The thing is, the CPU appears to switch the task, but then triple faults as soon as it enters it. Bochs gives the following code as the culprit:

Code: Select all

add byte ptr ds:[eax], al:0000
All the new task's registers are loaded correctly (with null values), and the code and data segments have been switched as I would expect. EIP is at the new task's entry point when the triple fault occurs.

I wondered if it was my timer interrupt causing the fault initially, but it happens even if I cli before the task switch.

Does anyone have any ideas about what is happening? The full code is listed below. If you need the code for any other functions to diagnose this, I'll post them.

Cheers,
Adam

Code: Select all

void mt_install()
{
  //install multitasking support
  memsetd((void*)&tss[0], 0, 0x100/4);
  
    
  lldt(0); //loads ldtr with null
  
  tss[0].trace_bitmap = tss[1].trace_bitmap = 0;
  tss[0].io_map_addr =  tss[1].io_map_addr = 0;      /* I/O map just after the TSS */
  tss[0].ldtr = tss[1].ldtr = 0x00;                /* ldtr = 0 */
  
 
  printf("Task is at 0x%x", &task); /* this printed value and the value of eip after the triple fault appear to match */
  
  tss[1].fs = tss[1].gs = 0x10;                    /* fs=gs = data segment */
  tss[1].ds = tss[1].es = tss[1].ss = 0x10;      /* ds=es=ss = data segment */
  tss[1].esp = (uint32)task_stack + 1024;    /* sp points to task stack top - task_stack is declared as char task_stack[1024]; */
  tss[1].cs = 0x08;
  tss[1].eip = (uint32)&task;                     /* cs:eip point to task() */
  tss[1].cr3 = (uint32)read_cr3;               //use kernel page dir
  tss[1].eflags = (uint32)0x0202;
  
  //params = gdt entry, base, limit, access, granularity.
  mmu_gdt_set_descriptor(3, (uint32)&tss[0], 0x67, 0x89, 0x00);
  mmu_gdt_set_descriptor(4, (uint32)&tss[1], 0x67, 0x89, 0x00);
  set_task_register(0x03*0x08);           //does an ltr(0x18) - blank kernel tss
  
  mt_switch();     //does a jmp 0x20:0x00
}
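
For reference, the 0x67 limit used for both descriptors corresponds to the 104-byte layout of the 32-bit TSS that the CPU reads and writes during a task switch. The struct below is a sketch of that layout. Its field names are illustrative assumptions; the struct actually used in the post (with fields such as trace_bitmap and io_map_addr) is not shown in the thread.

```c
#include <stdint.h>
#include <stddef.h>

/* 32-bit TSS as the CPU lays it out. Field names are illustrative only.
 * Each 16-bit selector is followed by 16 reserved bits (the *_hi fields). */
struct tss32 {
    uint16_t link, link_hi;       /* 0x00 previous task link          */
    uint32_t esp0;                /* 0x04 ring 0 stack pointer        */
    uint16_t ss0, ss0_hi;         /* 0x08 ring 0 stack segment        */
    uint32_t esp1;                /* 0x0C */
    uint16_t ss1, ss1_hi;         /* 0x10 */
    uint32_t esp2;                /* 0x14 */
    uint16_t ss2, ss2_hi;         /* 0x18 */
    uint32_t cr3;                 /* 0x1C page directory base (physical) */
    uint32_t eip;                 /* 0x20 */
    uint32_t eflags;              /* 0x24 */
    uint32_t eax, ecx, edx, ebx;  /* 0x28 */
    uint32_t esp, ebp, esi, edi;  /* 0x38 */
    uint16_t es, es_hi;           /* 0x48 */
    uint16_t cs, cs_hi;           /* 0x4C */
    uint16_t ss, ss_hi;           /* 0x50 */
    uint16_t ds, ds_hi;           /* 0x54 */
    uint16_t fs, fs_hi;           /* 0x58 */
    uint16_t gs, gs_hi;           /* 0x5C */
    uint16_t ldtr, ldtr_hi;       /* 0x60 LDT selector                */
    uint16_t trap;                /* 0x64 bit 0 = debug trap flag     */
    uint16_t iomap_base;          /* 0x66 offset of I/O permission map */
};

/* The descriptor limit 0x67 is this size minus one. */
_Static_assert(sizeof(struct tss32) == 104, "32-bit TSS must be 104 bytes");
```

If the struct in use does not come out at exactly 104 bytes with these offsets, the CPU will load registers from the wrong fields. Note also that an I/O map base at or beyond the TSS limit means no I/O permission bitmap is present.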

Post by AJ »

More information - the TS bit in CR0 has been set by the processor, so it appears that the fault is definitely *after* the commit point.

Also, after the triple fault, there is a value in CR2 which points to a location in my GDT (in fact, the 208th entry - which I never use). So it seems to me that one of the exceptions was a page fault.

I just can't see the woods for the trees at the moment :?

Adam

Post by Combuster »

Have you checked that the (new) CR3 points to the physical address of a real page table, and all of the GDT, IDT, and code/data are properly mapped in there?
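
A rough sketch of what that check involves, assuming 32-bit non-PAE paging with 4 KiB pages only (no 4 MiB pages): the address is split into a page-directory index and a page-table index, and both entries must have their present bit set. The helper names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

#define ENTRY_PRESENT 0x1u  /* bit 0 of a page-directory or page-table entry */

/* Top 10 bits of the virtual address select the page-directory entry. */
static inline uint32_t pd_index(uint32_t virt)
{
    return virt >> 22;
}

/* Next 10 bits select the entry within that page table. */
static inline uint32_t pt_index(uint32_t virt)
{
    return (virt >> 12) & 0x3FFu;
}

/* CR3 holds the physical base of the page directory in bits 31:12;
 * the low 12 bits carry flags, not address bits. */
static inline uint32_t page_dir_base(uint32_t cr3)
{
    return cr3 & 0xFFFFF000u;
}

/* A 4 KiB page is reachable only if both levels are marked present. */
static inline bool is_mapped(uint32_t pde, uint32_t pte)
{
    return (pde & ENTRY_PRESENT) != 0 && (pte & ENTRY_PRESENT) != 0;
}
```

For example, a structure at virtual address 0xC0200000 is reached through page-directory entry 768 and page-table entry 512, so those are the two entries to inspect in the debugger.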

Post by urxae »

Are you using paging?

IIRC when paging the CPU only translates the starting address of the TSS to its physical address, and then assumes the rest is located consecutively in physical memory.

So you might want to make sure your TSS either doesn't cross a page boundary, or is located in consecutive pageframes (physical pages) as well as consecutive virtual pages.

I just put my TSS in a global variable and aligned it on a 128-byte boundary to be sure it doesn't cross a page boundary.
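
The page-boundary condition is easy to test for. The following is a minimal sketch, assuming 4 KiB pages and the 104 bytes of the TSS that the CPU touches during a task switch; the function and macro names are made up for illustration.

```c
#include <stdint.h>
#include <stdbool.h>

#define PAGE_BYTES 4096u
#define TSS_BYTES  104u   /* part of the TSS the CPU reads on a task switch */

/* True if the byte range [addr, addr + size) spans more than one page.
 * Assumes size > 0. */
static inline bool crosses_page_boundary(uint32_t addr, uint32_t size)
{
    return (addr / PAGE_BYTES) != ((addr + size - 1) / PAGE_BYTES);
}
```

Since 4096 is a multiple of 128 and the TSS is only 104 bytes, any TSS aligned on a 128-byte boundary sits entirely inside one 128-byte slot and therefore inside one page, which is why that alignment is sufficient.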

Post by AJ »

Thanks for these answers.
Combuster wrote: Have you checked that the (new) CR3 points to the physical address of a real page table, and all of the GDT, IDT, and code/data are properly mapped in there?
The new CR3 is exactly the same pre- and post- task switch.

The way I have my kernel set up is that at boot time it is loaded at 4MB. The entire 4-8MB range is then mapped starting at 0xC0000000. My kernel stack resides at 0xC0400000 (and grows downwards). At the point of the task switch, esp has grown down to 0xC03FFFBC. The GDT, IDT and code are currently all present in this 4MB range (in fact, the GDT resides at 0xC0200000). All this means that I am fairly confident that the entire range of pages is mapped in.
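
The layout described here can be written down as a small sanity check. This is a sketch under the stated assumptions only (physical 4MB-8MB mapped linearly at 0xC0000000); the constant and function names are invented for illustration.

```c
#include <stdint.h>
#include <stdbool.h>

#define KERNEL_VIRT_BASE 0xC0000000u  /* where the window is mapped     */
#define KERNEL_PHYS_BASE 0x00400000u  /* 4MB physical load address      */
#define KERNEL_WINDOW    0x00400000u  /* 4MB of contiguous mapped space */

/* True if virt falls inside the mapped kernel window. */
static inline bool in_kernel_window(uint32_t virt)
{
    return virt >= KERNEL_VIRT_BASE &&
           (virt - KERNEL_VIRT_BASE) < KERNEL_WINDOW;
}

/* Linear translation; only meaningful when in_kernel_window(virt) holds. */
static inline uint32_t kernel_virt_to_phys(uint32_t virt)
{
    return virt - KERNEL_VIRT_BASE + KERNEL_PHYS_BASE;
}
```

Under this mapping the GDT at 0xC0200000 lives at physical 0x00600000, and esp at 0xC03FFFBC is inside the window. The stack top 0xC0400000 is one byte past the end, which is fine for a descending stack because the first push writes below it. The same translation matters for CR3, which must hold a physical address rather than a virtual one.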
urxae wrote: IIRC when paging the CPU only translates the starting address of the TSS to its physical address, and then assumes the rest is located consecutively in physical memory.
Thanks for this - I checked in the Intel manuals (in fact, I have now read the task switching section several times :( ) and that is the case. At present, I have just declared my tss as an array within my code (I have also tried using kmalloc - my heap manager is capable of avoiding page boundaries if necessary). The first tss starts at 0xC0008590 - this means that the tss's should therefore avoid any page boundaries.

Thanks for helping me to start thinking rationally about this - by yesterday evening my code had all become a blur! :shock: I'll keep scouring through the options....

[EDIT]I have just written an extra section in my kernel's paging setup routine which confirms that the secondary loader has paged in the entire 4MB as read/write - and it has :wink: -- oh - wait a minute - the faulting address is actually in the IDT - entry at offset 0x40, which would make it the....DF handler?? [scratches head] also, I have done a cli before the task switch. [/EDIT]
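
The offset-to-vector arithmetic in that edit can be checked directly: protected-mode GDT and IDT entries are 8 bytes each, so a byte offset into either table divides by 8 to give the entry number. A tiny sketch, with a made-up helper name:

```c
#include <stdint.h>

#define DESCRIPTOR_BYTES 8u  /* size of one protected-mode GDT/IDT entry */

/* Entry number that contains the given byte offset into the table. */
static inline uint32_t table_index_from_offset(uint32_t byte_offset)
{
    return byte_offset / DESCRIPTOR_BYTES;
}
```

Offset 0x40 gives entry 8, and vector 8 is indeed the double fault (#DF) exception.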

Cheers,
Adam

Post by AJ »

Well bugg*r me! I just rewrote my entire multitasking module and it now appears to work :D

Thanks again for the replies given earlier. I still wish I knew what the original problem was so I can avoid it in the future, but for now, I'm just too happy to worry :roll:

Cheers,
Adam