Bochs/VMWare problems - weird results

kdx7214
Member
Posts: 25
Joined: Tue Jun 07, 2011 5:34 pm

Post by kdx7214 »

My basic setup right now uses PXE to load a kernel and switch to long mode. Long mode is where things are getting truly bizarre for me. I switch directly from 16-bit mode to long mode without using 32-bit pmode in between (as explained in a forum post by Brendan and a few articles in the wiki). This works fine on a live machine, and everything runs exactly as expected.

When I run on either bochs/etherboot or VMWare player I get truly bizarre differences. Since either is extremely convenient to use for development I would love to figure out what exactly I'm doing wrong to freak them out so bad. Here's what they're doing:

VMWare Player: So far this is the best PXE support I've found in a VM environment. But after I enter long mode I get a really weird GPF (exception 13) when I try to ret from a function call. I've enabled an IDT with handlers for all exceptions, so I know it's a GPF, but there are no other exceptions (like a page fault) beforehand to give any clues. I'm identity mapping the first 2 MB of RAM and from what I can tell it is set up correctly (i.e. it works on a real machine). I just have no clue why I get a GPF on a ret. If I hand code a print routine (without using calls) it works fine. My first suspicion was a stack problem, but I can push/pop over 100 qwords without any stack exceptions. I've double checked the code to make sure all push/pop pairs are matched as well.
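
For readers following along, here is a rough sketch of what "identity mapping the first 2 MB" can look like in terms of table contents. This is not the poster's code: it assumes a single 2 MB page (PS bit set in the page directory entry), and the table base 0x10000 and the function name are made up for illustration. It only builds the bytes so the entries can be compared against a memory dump from the VM.

```python
import struct

PAGE_PRESENT = 1 << 0   # P bit
PAGE_WRITABLE = 1 << 1  # R/W bit
PAGE_SIZE_2M = 1 << 7   # PS bit in a page-directory entry (2 MB page)


def build_identity_tables(base):
    """Return PML4, PDPT and PD (3 x 4 KiB) as bytes for physical address `base`.

    `base` is a hypothetical, 4 KiB-aligned location for the tables.
    Only the first entry of each table is filled, mapping 0..2 MB onto itself.
    """
    if base % 0x1000 != 0:
        raise ValueError("page tables must be 4 KiB aligned")

    pdpt_addr = base + 0x1000
    pd_addr = base + 0x2000
    flags = PAGE_PRESENT | PAGE_WRITABLE

    pml4 = [0] * 512
    pdpt = [0] * 512
    pd = [0] * 512

    pml4[0] = pdpt_addr | flags            # PML4[0] -> PDPT
    pdpt[0] = pd_addr | flags              # PDPT[0] -> PD
    pd[0] = 0x0 | flags | PAGE_SIZE_2M     # PD[0] -> physical 0, one 2 MB page

    return b"".join(struct.pack("<512Q", *table) for table in (pml4, pdpt, pd))


if __name__ == "__main__":
    tables = build_identity_tables(0x10000)
    for name, offset in (("PML4[0]", 0x0000), ("PDPT[0]", 0x1000), ("PD[0]", 0x2000)):
        entry = struct.unpack_from("<Q", tables, offset)[0]
        print(f"{name} = {entry:#018x}")
```

Comparing these three entries with what is actually in memory under VMWare or bochs is a cheap way to rule the tables in or out.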

After that I tried using bochs with etherboot (gpxe). This is kludgy as hell using the PCI pseudo-nic but it does load a kernel. First weird thing is that even with interrupts disabled and PXE unloaded the pseudo-nic sends interrupts. Weird thing the second, as soon as I try to set cr4 in my paging setup code it locks the vm. I can ctrl-c out and quit, but am unable to step to see where the problem is.

Any thoughts on this? I'd also love to find out what you guys use as a setup for VM type things for testing. Since I use pxe that sort of limits me from what I can tell, but having the debugger around in bochs is definitely useful. Basically I'm just looking for advice to prevent having to reboot all the time.

Thanks!
Mike
gerryg400
Member
Posts: 1801
Joined: Thu Mar 25, 2010 11:26 pm
Location: Melbourne, Australia

Re: Bochs/VMWare problems - weird results

Post by gerryg400 »

By 'ret', do you mean near 'ret' (opcode 0xc3) ? I ask because far ret and iret are sometimes a little difficult to get right in long mode.
If a trainstation is where trains stop, what is a workstation ?
stlw
Member
Posts: 357
Joined: Fri Apr 04, 2008 6:43 am

Re: Bochs/VMWare problems - weird results

Post by stlw »

kdx7214 wrote: After that I tried using bochs with etherboot (gpxe). This is kludgy as hell using the PCI pseudo-nic but it does load a kernel. First weird thing is that even with interrupts disabled and PXE unloaded the pseudo-nic sends interrupts. Weird thing the second, as soon as I try to set cr4 in my paging setup code it locks the vm. I can ctrl-c out and quit, but am unable to step to see where the problem is.

Any thoughts on this? I'd also love to find out what you guys use as a setup for VM type things for testing. Since I use pxe that sort of limits me from what I can tell, but having the debugger around in bochs is definitely useful. Basically I'm just looking for advice to prevent having to reboot all the time.

Thanks!
Mike
I can try to help you with Bochs. Can you explain more what is happening ?

Stanislav
kdx7214
Member
Posts: 25
Joined: Tue Jun 07, 2011 5:34 pm

Re: Bochs/VMWare problems - weird results

Post by kdx7214 »

gerryg400 wrote:By 'ret', do you mean near 'ret' (opcode 0xc3) ? I ask because far ret and iret are sometimes a little difficult to get right in long mode.
I've not looked at the actual opcode that is generated, but I'm using what I assume to be a near ret. It's just 'ret' on a line by itself. I hadn't considered that it might need something else, although I'm puzzled by the fact that it works on a live machine.
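
One way to check which return instruction actually ended up in the binary is to look at the byte at the instruction's file offset. The snippet below is only a sketch: the function name is made up, it assumes the offset really is the start of an instruction, and it assumes 64-bit code (where 0x48 is a REX.W prefix rather than `dec eax`).

```python
# Return-type opcodes from the x86 opcode map.
RETURN_OPCODES = {
    0xC3: "near ret",
    0xC2: "near ret imm16",
    0xCB: "far ret",
    0xCA: "far ret imm16",
    0xCF: "iret",
}
REX_W = 0x48  # only a prefix in 64-bit mode


def classify_return(code, offset):
    """Name the return instruction starting at `offset`, or 'other'."""
    opcode = code[offset]
    prefix = ""
    # A REX.W prefix in front of CB/CF gives the 64-bit far ret / iret forms.
    if opcode == REX_W and offset + 1 < len(code):
        opcode = code[offset + 1]
        prefix = "REX.W "
    name = RETURN_OPCODES.get(opcode)
    return prefix + name if name else "other"


if __name__ == "__main__":
    sample = bytes([0x90, 0xC3, 0x48, 0xCF])  # nop; ret; iretq
    for off in (0, 1, 2):
        print(f"offset {off}: {classify_return(sample, off)}")
```

A plain `ret` should show up as 0xC3, which is the near form gerryg400 asked about.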
stlw wrote:I can try to help you with Bochs. Can you explain more what is happening ?
There are two problems I'm finding with bochs. First I've unloaded the etherboot pxe module which means the network card should not be generating interrupts anymore. I've also done a cli. When running the bochs debugger trying to isolate the other problem I am almost constantly thrown into the real mode IVT for IRQ0 or IRQ14 (PIT or PNIC). From what I can tell this shouldn't be happening. Note that I've also disabled NMI just to be sure. No clue why this is happening or what to do about it.

The problem that led to the discovery of the first problem is that when I load cr4 with a simple "mov cr4, eax" it hangs the VM. Solid. I can ctrl-c and quit but am unable to step. What I would love to do is set a breakpoint on my code and use that in the debugger, but I have yet to get breakpoints to work as expected. It's a bit tough to get the exact breakpoint address, as I'm using nasm to generate a bin file, so there is no symbol table around to look at. What I've resorted to doing is using a hex editor to find the offset of the instruction, adding that to the load-time address of 0x7c00, and setting 4 or 5 breakpoints in 1-byte increments around that address. Sometimes it works and sometimes it doesn't.
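
The hex-editor step described above (file offset plus the 0x7c00 load address) can be scripted. This is a sketch under a few assumptions: the flat bin is loaded so that file offset 0 lands at 0x7c00, the assembler encodes "mov cr4, eax" as 0F 22 E0, and the function name is invented here. The demo prints the result in the form of a bochs `lb` (linear breakpoint) command.

```python
LOAD_ADDRESS = 0x7C00                    # load-time address from the post
MOV_CR4_EAX = bytes([0x0F, 0x22, 0xE0])  # assumed encoding of "mov cr4, eax"


def breakpoint_addresses(image, pattern=MOV_CR4_EAX, load_address=LOAD_ADDRESS):
    """Return load_address + offset for every occurrence of `pattern` in `image`."""
    addresses = []
    start = image.find(pattern)
    while start != -1:
        addresses.append(load_address + start)
        start = image.find(pattern, start + 1)
    return addresses


if __name__ == "__main__":
    # Synthetic stand-in for the kernel image: 0x42 nops, the instruction, hlt.
    image = b"\x90" * 0x42 + MOV_CR4_EAX + b"\xf4"
    for addr in breakpoint_addresses(image):
        print(f"lb {addr:#x}")
```

For a real image, read the file with `open(path, "rb").read()` and pass the bytes in. A match can still be a false positive if the same three bytes occur inside data or another instruction, so treat each address as a candidate.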

I'm going to try rebuilding bochs later today (after I've gotten some sleep) and see where that takes me.

Thanks!
Mike