double fault works in bochs but not vmware


double fault works in bochs but not vmware

Post by proxy »

So I have been setting up my double fault handler, and it appears to work fine in Bochs, but not in VMware :(

Anyway, here's some of my code:


TSS g_DoubleFaultTSS;
uint8_t g_DoubleFaultStack[1024];

/*---snip---*/
   std::memset(&g_DoubleFaultTSS, 0, sizeof(TSS));
   g_DoubleFaultTSS.esp0 = reinterpret_cast<uint32_t>(g_DoubleFaultStack + sizeof(g_DoubleFaultStack));
   g_DoubleFaultTSS.ss0 = GDT::kernel_ds;
   g_DoubleFaultTSS.eip = reinterpret_cast<uint32_t>(&_exception_08);
   g_DoubleFaultTSS.ds = GDT::kernel_ds;
   g_DoubleFaultTSS.es = GDT::kernel_ds;
   g_DoubleFaultTSS.fs = GDT::kernel_ds;
   g_DoubleFaultTSS.gs = GDT::kernel_ds;
   g_DoubleFaultTSS.ss = GDT::kernel_ds;   
   g_DoubleFaultTSS.cs = GDT::kernel_cs;
   g_DoubleFaultTSS.cr3 = reinterpret_cast<uint32_t>(kernel_process->pageDirectory());
I then set the GDT entry at GDT::double_fault_tss to point to this TSS. I also set my IDT entry for exception #8 to a task gate: present, ring 0, with a segment selector of GDT::double_fault_tss.
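For reference, here is a sketch of how those two descriptors might be encoded (all function names are mine, not from the post above; the bit layouts follow the Intel SDM for 32-bit TSS descriptors and task gates). One thing worth double-checking in a Bochs-works/VMware-resets situation is the TSS descriptor's limit: the CPU raises an invalid-TSS fault during the task switch if the limit is below 0x67.

```cpp
#include <cassert>
#include <cstdint>

// Build a 32-bit available-TSS descriptor for the GDT.
// 'base' is the address of the TSS; 'limit' must be at least 0x67.
uint64_t make_tss_descriptor(uint32_t base, uint32_t limit) {
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFF);             // limit bits 0..15
    d |= (uint64_t)(base & 0xFFFFFF) << 16;      // base bits 0..23
    d |= (uint64_t)0x89 << 40;                   // P=1, DPL=0, type=0x9 (32-bit TSS, available)
    d |= (uint64_t)((limit >> 16) & 0xF) << 48;  // limit bits 16..19
    d |= (uint64_t)((base >> 24) & 0xFF) << 56;  // base bits 24..31
    return d;
}

// Build a task gate for the IDT. Only the TSS selector, P, and DPL
// matter; the offset fields are ignored by the CPU for task gates.
uint64_t make_task_gate(uint16_t tss_selector) {
    uint64_t g = 0;
    g |= (uint64_t)tss_selector << 16;           // TSS segment selector
    g |= (uint64_t)0x85 << 40;                   // P=1, DPL=0, type=0x5 (task gate)
    return g;
}
```

This is only an illustration of the descriptor layout, not a claim about where the original code goes wrong.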

It appears to work perfectly in Bochs: when I tell my OS to trip a double fault, it goes to my handler. But in VMware, it simply resets (I assume I get a triple fault).

My code to trigger a double fault is as follows:


            __asm__ __volatile__ ("mov $0x12345678, %esp");  /* trash esp */
            __asm__ __volatile__ ("pushl $0xdeadbeef");      /* push a value; this page faults, then page faults again inside the page fault handler, causing a double fault */

Re: double fault works in bochs but not vmware

Post by proxy »

On a related note, does anyone think it is a good idea to make the page fault exception a task gate so that it has a dedicated stack? The primary gain would be that page faults triggered by invalid stack accesses would be recoverable in supervisor mode. This would allow things such as more robust error reporting for kernel coding errors, growable kernel stacks, clean shutdown of kernel threads which overflow their stacks, and so forth.

However, I have concerns about the potential performance hit of doing this.

Anyone have any thoughts about this?

Cjmovie

Re: double fault works in bochs but not vmware

Post by Cjmovie »

I'm not sure about your double fault handler, but...

As for the page fault idea: I don't see much of a performance problem with it unless you're on a very limited system where chunks of memory are constantly being paged in and out of physical memory. The point is, in most cases the stack grows to its eventual size fairly early on, so you get one large burst of faults at startup but very few afterwards. And assuming you aren't on a system low on RAM, you won't get all that many (shouldn't, at least...) page faults for unmapped code pages. Then just make sure dynamic memory (for use by malloc etc.) is _requested_ from the allocator rather than faulted in, and I'd say you're off rather well.
Pype.Clicker

Re: double fault works in bochs but not vmware

Post by Pype.Clicker »

proxy wrote: On a related note, does anyone think it is a good idea to make the page fault exception a task gate so that it has a dedicated stack? The primary gain would be that page faults triggered by invalid stack accesses would be recoverable in supervisor mode. This would allow things such as more robust error reporting for kernel coding errors, growable kernel stacks, clean shutdown of kernel threads which overflow their stacks, and so forth.
Honestly, I'd discourage you from implementing page faults through a task gate. Beyond the performance penalty (page faults are probably the second most frequent cause of kernel invocation, after system calls), you'll have to handle the fault in a different address space than the one that triggered it, which may make handling much more complicated (e.g. you must first retrieve the faulting context, then find the page directory based on that context ...)

However, you _can_ protect your kernel stack by way of segmentation: if a stack overflow breaks the segment limit (using expand-down segments) rather than hitting a missing page, a "stack fault" will be issued, which is _recommended_ to go through a task gate.
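The expand-down rule being relied on here is worth spelling out: for an expand-down data segment, the valid offsets are those *above* the limit (up to 0xFFFFFFFF when the B flag is set), so a stack pushing down past the limit faults immediately instead of silently landing on an unmapped page. A tiny sketch of the check, purely for illustration:

```cpp
#include <cassert>
#include <cstdint>

// For a 32-bit expand-down segment (B flag set), an offset is valid only if
// it is strictly greater than the segment limit; offsets at or below the
// limit cause a fault (#SS for stack references).
bool expand_down_offset_valid(uint32_t offset, uint32_t limit) {
    return offset > limit;
}
```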

And regarding portability, both mechanisms are equally bound to IA-32 ...