double fault works in bochs but not vmware

Posted: Thu Dec 29, 2005 10:58 pm
by proxy
So I have been setting up my double fault handler, and it appears to work fine in bochs, but not in vmware :(

Anyway, here is some of my code:

Code:

TSS g_DoubleFaultTSS;
uint8_t g_DoubleFaultStack[1024];

/*---snip---*/
   std::memset(&g_DoubleFaultTSS, 0, sizeof(TSS));
   g_DoubleFaultTSS.esp0 = reinterpret_cast<uint32_t>(g_DoubleFaultStack + sizeof(g_DoubleFaultStack));
   g_DoubleFaultTSS.ss0 = GDT::kernel_ds;
   g_DoubleFaultTSS.eip = reinterpret_cast<uint32_t>(&_exception_08);
   g_DoubleFaultTSS.ds = GDT::kernel_ds;
   g_DoubleFaultTSS.es = GDT::kernel_ds;
   g_DoubleFaultTSS.fs = GDT::kernel_ds;
   g_DoubleFaultTSS.gs = GDT::kernel_ds;
   g_DoubleFaultTSS.ss = GDT::kernel_ds;   
   g_DoubleFaultTSS.cs = GDT::kernel_cs;
   g_DoubleFaultTSS.cr3 = reinterpret_cast<uint32_t>(kernel_process->pageDirectory());
I then point the GDT entry at GDT::double_fault_tss to this TSS. I also set my IDT entry for exception #8 to be a task gate, present and ring 0, with a segment selector of GDT::double_fault_tss.
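For reference, the two descriptors described above can be sketched as plain 64-bit values. This is a minimal sketch, not the poster's actual GDT/IDT code: the helper names make_tss_descriptor and make_task_gate are made up for illustration, and the selector value 0x28 in the usage line is only an example. The bit positions follow the IA-32 descriptor layout (access byte in bits 40-47, TSS type 0x9, task gate type 0x5).

```cpp
#include <cstdint>

// GDT descriptor for an available 32-bit TSS: P=1, DPL=0, S=0, type=0x9.
// 'base' is the linear address of the TSS, 'limit' is its size minus 1
// (byte granular, so G=0).
constexpr uint64_t make_tss_descriptor(uint32_t base, uint32_t limit) {
    uint64_t d = 0;
    d |= static_cast<uint64_t>(limit & 0xFFFFu);             // limit 15:0
    d |= static_cast<uint64_t>(base & 0xFFFFFFu) << 16;      // base 23:0
    d |= static_cast<uint64_t>(0x89u) << 40;                 // access byte
    d |= static_cast<uint64_t>((limit >> 16) & 0xFu) << 48;  // limit 19:16
    d |= static_cast<uint64_t>((base >> 24) & 0xFFu) << 56;  // base 31:24
    return d;
}

// IDT task gate: P=1, DPL=0, type=0x5. Only the TSS selector and the access
// byte carry information; the offset fields of the gate are unused.
constexpr uint64_t make_task_gate(uint16_t tss_selector) {
    uint64_t d = 0;
    d |= static_cast<uint64_t>(tss_selector) << 16;  // TSS segment selector
    d |= static_cast<uint64_t>(0x85u) << 40;         // access byte
    return d;
}

// Example: a 104-byte TSS at linear address 0x00123456, reached via selector 0x28.
static_assert(make_tss_descriptor(0x00123456u, 103u) == 0x0000891234560067ull,
              "TSS descriptor encoding");
static_assert(make_task_gate(0x28) == 0x0000850000280000ull,
              "task gate encoding");
```

Comparing the raw values the kernel actually writes against an encoding like this is a cheap way to rule out a descriptor mistake that one emulator tolerates and another does not.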

It appears to work perfectly in bochs: when I tell my OS to trip a double fault, it goes to my handler. But in vmware, it simply resets (I assume I got a triple fault).
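One thing worth checking when a task-gate handler faults immediately: on a hardware task switch the CPU loads ESP and SS from the esp and ss fields of the incoming TSS. The esp0/ss0 fields are only consulted on a later privilege change into ring 0 while that task is running. The snippet above fills esp0 but leaves esp at zero. Whether that explains the vmware reset is not established in this thread, so treat the following as a sketch under that assumption. The Tss32 layout and init_fault_task are hypothetical names, not the poster's TSS type.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical 32-bit TSS layout, following the IA-32 field order.
struct Tss32 {
    uint32_t link;
    uint32_t esp0, ss0, esp1, ss1, esp2, ss2;
    uint32_t cr3, eip, eflags;
    uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
    uint32_t es, cs, ss, ds, fs, gs;
    uint32_t ldt;
    uint16_t trap;
    uint16_t iomap_base;
};
static_assert(sizeof(Tss32) == 104, "a 32-bit TSS is 104 bytes");

// Fill a TSS for a ring 0 handler task that is entered through a task gate.
inline void init_fault_task(Tss32& tss, uint32_t entry, uint32_t stack_top,
                            uint32_t page_dir, uint16_t code_sel,
                            uint16_t data_sel) {
    std::memset(&tss, 0, sizeof(tss));
    tss.eip = entry;
    tss.esp = stack_top;   // this is what the task switch loads into ESP
    tss.ss  = data_sel;
    tss.esp0 = stack_top;  // only used on a privilege change into ring 0
    tss.ss0  = data_sel;
    tss.cs = code_sel;
    tss.ds = tss.es = tss.fs = tss.gs = data_sel;
    tss.cr3 = page_dir;
    tss.eflags = 0x2;      // reserved bit 1 set, IF clear
    tss.iomap_base = static_cast<uint16_t>(sizeof(Tss32));  // no I/O bitmap
}
```

In the kernel, stack_top would be the address just past the end of g_DoubleFaultStack, exactly as the original code computes it for esp0.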

My code to trigger a double fault is as follows:

Code:

            __asm__ __volatile__ ("mov $0x12345678, %esp"); /* trash esp */
            __asm__ __volatile__ ("push $0xdeadbeef"); /* push an immediate value (note the $; without it AT&T syntax reads memory at that address); this page faults, and then faults again when the CPU tries to enter the page fault handler */
proxy

Re:double fault works in bochs but not vmware

Posted: Fri Dec 30, 2005 12:38 am
by proxy
On a related note, does anyone think it is a good idea to make the page fault exception a task gate so that it will have a dedicated stack? The primary gain would be that page faults triggered by invalid stack access would be recoverable in supervisor mode. This would allow things such as more robust error reporting for kernel coding errors, growable kernel stacks, clean shutdown of kernel threads which overflow their stack, and so forth.

However, I have concerns about the potential performance hit of doing this.

Anyone have any thoughts about this?

proxy

Re:double fault works in bochs but not vmware

Posted: Fri Dec 30, 2005 2:11 am
by Cjmovie
I'm not sure about your double fault handler, but...
The page fault idea: I don't see much of a performance problem with this unless you're on a very limited system where chunks of memory are constantly being paged in and out of physical memory. The point is that in a lot of cases the stack grows to its final size fairly early on, so you get one large burst of faults at startup and then very few. And as long as you aren't on a system low on RAM, you shouldn't get all that many page faults for code that isn't paged in. Then just make sure dynamic memory (for use by malloc etc.) is _requested_, not faulted in, and I'd say you're in pretty good shape.

Re:double fault works in bochs but not vmware

Posted: Sun Jan 01, 2006 6:55 am
by Pype.Clicker
proxy wrote: On a related note, does anyone think it is a good idea to make the page fault exception a task gate so that it will have a dedicated stack? The primary gain would be that page faults triggered by invalid stack access would be recoverable in supervisor mode. This would allow things such as more robust error reporting for kernel coding errors, growable kernel stacks, clean shutdown of kernel threads which overflow their stack, and so forth.
Honestly, I'd discourage you from implementing page faults through a task gate. Beyond the performance penalty (page faults might be the second most common cause of kernel invocation after system calls), you'll have to handle the fault in a different address space than the one that triggered it, which may make handling much more complicated (e.g. you first have to retrieve the faulting context, then find the page directory based on that context ...)
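To make the "retrieve context first" step concrete: when an exception enters a handler through a task gate, the CPU saves the interrupted task's registers into that task's own TSS and writes its TSS selector into the link field of the handler's TSS. A rough sketch of following that back link is shown here. TaskState, TssTable and interrupted_task are hypothetical names, and the table stands in for whatever bookkeeping the kernel keeps about its GDT. Note that the CPU does not refresh the cr3 field of the outgoing TSS, so the page directory still has to come from the kernel's own records.

```cpp
#include <cstddef>
#include <cstdint>

// Subset of the 32-bit TSS fields the handler needs (hypothetical type).
struct TaskState {
    uint16_t link;  // selector of the interrupted task, written by the CPU
    uint32_t eip;   // saved into the outgoing task's TSS by the CPU
    uint32_t esp;
};

// Hypothetical bookkeeping: one slot per GDT index, nullptr where the
// entry is not a TSS.
struct TssTable {
    const TaskState* const* slots;
    std::size_t count;
};

// A selector is (index << 3) | TI | RPL. A TSS selector must refer to the
// GDT (TI = 0) and cannot be the null selector.
inline const TaskState* interrupted_task(const TssTable& table,
                                         const TaskState& handler) {
    const uint16_t sel = handler.link;
    if (sel & 0x4u) return nullptr;         // TI set: would point into an LDT
    const std::size_t index = sel >> 3;
    if (index == 0 || index >= table.count) return nullptr;
    return table.slots[index];
}
```

With the interrupted task's saved eip and esp in hand, plus the faulting linear address from CR2, the handler still has to map the right page directory before it can touch that task's memory, which is the extra complexity being warned about.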

However, you _can_ protect your kernel stack by way of segmentation: if a stack overflow exceeds the segment limit (using an expand-down segment) rather than hitting a missing page, a "stack fault" is raised instead, and that exception is _recommended_ to go through a task gate.
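As a rough illustration of how the expand-down limit check behaves: for a 32-bit expand-down data segment (B = 1), the valid offsets are those strictly above the effective limit, up to 0xFFFFFFFF. The helpers below model that check in plain C++. The names and the 16 KiB stack size are assumptions made for the example, not anything from this thread.

```cpp
#include <cstdint>

// Effective byte limit from the 20-bit limit field and the granularity bit.
constexpr uint32_t effective_limit(uint32_t limit20, bool page_granular) {
    return page_granular ? (((limit20 & 0xFFFFFu) << 12) | 0xFFFu)
                         : (limit20 & 0xFFFFFu);
}

// Models the limit check for an expand-down segment with B = 1: every byte
// of the access must lie in (limit, 0xFFFFFFFF]. A stack-segment access
// that fails this check raises a stack fault (#SS), not a page fault.
constexpr bool expand_down_access_ok(uint32_t limit20, bool page_granular,
                                     uint32_t offset, uint32_t size) {
    const uint32_t lim = effective_limit(limit20, page_granular);
    return size != 0 && offset > lim && (size - 1) <= (0xFFFFFFFFu - offset);
}

// Example: a 16 KiB stack occupying offsets 0xFFFFC000..0xFFFFFFFF needs an
// effective limit of 0xFFFFBFFF, i.e. limit field 0xFFFFB with G = 1.
static_assert(effective_limit(0xFFFFBu, true) == 0xFFFFBFFFu, "limit");
static_assert(expand_down_access_ok(0xFFFFBu, true, 0xFFFFC000u, 4), "lowest slot");
static_assert(!expand_down_access_ok(0xFFFFBu, true, 0xFFFFBFFCu, 4), "overflow");
```

The segment base then has to be chosen so that those offsets land on the memory backing the stack, which also means SS no longer matches a flat DS. That is a real cost in a kernel compiled for a flat memory model.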

And regarding portability, both mechanisms are equally tied to IA-32 ...