VMX operations.
Hey, I've been trying to set up Intel's virtualization extensions using the VMXON instruction. While I do need help, I think this belongs in the design/theory area, since the topic is very poorly covered here and deserves some light shed on it.
Basically, what I've done so far is:
* Enter PMode (setting CR0.PE)
* Stay in CPL(0)
* Check for the VMX bit in the CPUID features list
* Set the CR4.VMXE bit
* Check IA32_FEATURE_CONTROL MSR to make sure bits 0 (lock bit) and 2 (outside SMX enabled bit) are set.
* Allocate a 16-byte aligned region of memory whose size comes from IA32_VMX_BASIC bits 44:32 (in my case, 4 KB).
* Execute: asm volatile("vmxon %0;"::"m"(region)); // where "region" is the aligned region of memory
I get a lovely GPF as soon as I execute the VMXON. I don't enable paging (yes, I know it's "required"), but the SDMs do say that it's possible to use VMX without paging in PMode. Should I just cave in, make a 1:1 map of the entire 4 GB space, and still use the "physical" region just to make it happy? Also, one of the checks that VMXON does is to make sure that A20M mode is disabled (it's enabled in the bootloader). I can't disable this in the kernel, can I? I'd imagine that would cause some serious addressing issues. [edit]Tried it, and it faults during the disabling routine.[/edit]
I see that most references to memory operands and register sizes are 64-bit, and that some instructions check for IA32_EFER.LMA to be set. Does this mean that I "must" be in 64-bit mode? I also see info about the upper 32 bits of operands being zeroed when using 32-bit registers/addresses, so isn't that somewhat contradictory?
I think this deserves a lot of discussion, as these instructions could definitely change the way OSes are developed (even hobby OSes).
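For anyone following along, the capability checks in the list above can be sketched as plain C predicates over the raw register values. This is a minimal sketch, not the poster's code: the function names are made up for illustration, and the CPUID and RDMSR reads that produce the inputs are assumed to happen elsewhere in ring 0. One detail worth double-checking against the SDM: the VMXON region is expected to be 4 KB aligned, which is stricter than the 16-byte alignment mentioned in the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CPUID1_ECX_VMX        (1u << 5)    /* CPUID leaf 1, ECX bit 5 */
#define FEATURE_CONTROL_LOCK  (1ull << 0)  /* IA32_FEATURE_CONTROL lock bit */
#define FEATURE_CONTROL_VMXON (1ull << 2)  /* VMXON allowed outside SMX */

/* True if the ECX value returned by CPUID leaf 1 reports VMX support. */
static inline bool cpu_has_vmx(uint32_t cpuid1_ecx)
{
    return (cpuid1_ecx & CPUID1_ECX_VMX) != 0;
}

/* True if the MSR is locked with "VMXON outside SMX" enabled. */
static inline bool vmxon_allowed(uint64_t feature_control)
{
    uint64_t need = FEATURE_CONTROL_LOCK | FEATURE_CONTROL_VMXON;
    return (feature_control & need) == need;
}

/* Region size in bytes, taken from IA32_VMX_BASIC bits 44:32. */
static inline uint32_t vmx_region_size(uint64_t vmx_basic)
{
    return (uint32_t)((vmx_basic >> 32) & 0x1FFF);
}
```

Keeping these as pure functions over already-read values makes them easy to exercise outside the kernel; only the thin CPUID/RDMSR wrappers then need ring 0.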
Website: https://joscor.com
Re: VMX operations.
It has to be possible to run it from 32-bit mode, as Virtual PC could make use of those instructions from 32-bit Windows.
Apparently it also doesn't "require" kernel support to be tied in, since Virtual PC didn't need any special modifications to the NT kernel.
Re: VMX operations.
Make sure your fixed bits in CR0 and CR4 are set properly; for example, you might actually need PAE enabled even if paging isn't required. Appendix sections G.7 and G.8 in the Intel SDM volume 3B cover how to read and set the necessary values.
Also try changing your vmxon instruction to this: asm volatile("vmxon (%0);"::"r" (region));
Also make sure to copy over the revision ID to your VMXON region (Section 20.10.4 in the Intel SDM covers this).
Another thing to keep in mind if you do end up using paging is that the region instructions require physical addresses to be passed to them.
With regard to the EFER checks, they're basically there to ensure you're passing full addresses in long mode. If you're running from protected mode and the EFER bit isn't set, it shouldn't be a problem. But you must execute VMX instructions from a 64-bit code segment if long mode is actually enabled. EFER will also come into play if you're changing its value in the guest environment, but that's easy enough to deal with by setting the proper fields and flags in the VMCS.
VMX features are pretty powerful but even basic operations can require a lot of setup with it. The first 3 sections of Chapter 22 in the SDM will likely be your best friend in troubleshooting VM entries. Once you get everything up and running it's a great tool to have, especially for dealing with real mode code in long mode.
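The revision ID and physical-address points above can be sketched like this. It is only an illustration under some assumptions: the helper names are hypothetical, the region is assumed to be 4 KB (the size reported in this thread), memset/memcpy are assumed to be available in the kernel, and the raw IA32_VMX_BASIC value is assumed to have been read with RDMSR already.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define VMX_PAGE_SIZE 4096u  /* assumed region size, as reported in this thread */

/* Revision identifier: bits 30:0 of IA32_VMX_BASIC. */
static inline uint32_t vmx_revision_id(uint64_t vmx_basic)
{
    return (uint32_t)(vmx_basic & 0x7FFFFFFFu);
}

/* The VMXON pointer is expected to be a 4 KB aligned physical address. */
static inline bool vmxon_pointer_aligned(uint64_t phys_addr)
{
    return (phys_addr & (VMX_PAGE_SIZE - 1)) == 0;
}

/* Zero the region and store the revision ID in its first 4 bytes. */
static inline void vmxon_region_init(void *region, uint64_t vmx_basic)
{
    uint32_t rev = vmx_revision_id(vmx_basic);

    assert(region != NULL);
    memset(region, 0, VMX_PAGE_SIZE);
    memcpy(region, &rev, sizeof rev);
}
```

As far as I read the instruction reference, the VMXON operand itself is a 64-bit memory location that holds the region's physical address, so the operand should refer to a variable containing that address rather than to the region's own contents.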
Reserved for OEM use.
Re: VMX operations.
Thanks for the tips.
I caved in (to be on the safe side) and am now using paging with PAE. So I've set the PAE and PG bits now in the control registers. I also changed the assembly routine to your method (even though it really shouldn't matter). I'm still getting a GPF when I execute VMXON.
Any other ideas?
Btw, here are some of the registers before I execute VMXON (using a breakpoint right before it, in the Bochs debugger):
Code: Select all
vmxon qword ptr ds:[ebx]; // EBX = 0x0031c000 (4k aligned)
CR0=0xe0000013: PG CD NW ac wp ne ET ts em MP PE
CR2=0x0000000000000000
CR3=0x0011a000
CR4=0x00002020: osxsave smx VMX osxmmexcpt osfxsr pce pge mce PAE pse de tsd pvi vme
EFER=0x00000000: ffxsr nxe lma lme sce
Website: https://joscor.com
Re: VMX operations.
It still looks like your fixed bits might be off for CR0/CR4. The NE bit in CR0 is set in Bochs after going through that process for me, and there might be some in CR4 as well.
You're right about the asm code too, it doesn't matter. I must have confused that with something else, my apologies.
This is the code I use before initializing the VMXON region to set the necessary fixed bits, followed by the MSR definitions it relies on. In theory VMX could also require you to disable some feature your OS relies on, so complete code would need some checks for that as well.
Code: Select all
//Set CR4.VMXE (bit 13), then capture the current CR4 and CR0 values
asm volatile ( "mov %%cr4, %%rax\n"
"or $0x2000, %%rax\n"
"mov %%rax, %%cr4\n": "=a" ( cr4 ) : );
asm volatile ( "mov %%cr0, %%rax\n": "=a" ( cr0 ) : );
//Read the masks of bits that are allowed to be 1 (FIXED1)
fixedCr0 = readMSR ( MSR_VMX_CR0_FIXED1 );
fixedCr4 = readMSR ( MSR_VMX_CR4_FIXED1 );
//Clear any bit that FIXED1 reports as 0 (it must be 0 in VMX operation)
cr0 &= fixedCr0;
cr4 &= fixedCr4;
//Set any bit that FIXED0 reports as 1 (it must be 1 in VMX operation)
cr0 |= readMSR ( MSR_VMX_CR0_FIXED0 );
cr4 |= readMSR ( MSR_VMX_CR4_FIXED0 );
//Load new CR0/CR4 values
asm volatile ( "mov %%rax, %%cr0\n":: "a" ( cr0 ) );
asm volatile ( "mov %%rax, %%cr4\n":: "a" ( cr4 ) );
Code: Select all
#define MSR_VMX_CR0_FIXED0 0x486
#define MSR_VMX_CR0_FIXED1 0x487
#define MSR_VMX_CR4_FIXED0 0x488
#define MSR_VMX_CR4_FIXED1 0x489
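To see which bits are actually in the way, rather than silently fixing them up, a couple of small helpers can report the offending bits. This is a hedged sketch: the helper names are hypothetical, and the FIXED0/FIXED1 values mentioned in the note below are example values, not something read from hardware in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Bits that FIXED0 requires to be 1 but that are 0 in the register. */
static inline uint64_t vmx_bits_must_set(uint64_t reg, uint64_t fixed0)
{
    return fixed0 & ~reg;
}

/* Bits that FIXED1 requires to be 0 but that are 1 in the register. */
static inline uint64_t vmx_bits_must_clear(uint64_t reg, uint64_t fixed1)
{
    return reg & ~fixed1;
}

/* Register value adjusted to satisfy both constraints. */
static inline uint64_t vmx_apply_fixed(uint64_t reg, uint64_t fixed0,
                                       uint64_t fixed1)
{
    return (reg & fixed1) | fixed0;
}
```

With the CR0 value from the register dump earlier in the thread (0xe0000013) and an assumed CR0 FIXED0 of 0x80000021, `vmx_bits_must_set` returns 0x20, which is bit 5 (NE). Logging the two masks before writing CR0/CR4 also shows whether VMX wants to turn off something the OS depends on.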
Reserved for OEM use.
Re: VMX operations.
Thanks for all your help!
I did finally get it to execute VMXON without a GPF. I checked the fixed bits for CR0/CR4 and found that the "NE" bit was the problem: I was clearing that bit earlier in the OS when initializing the x87 FPU. I may write a small wiki article about this, as it was kind of a pain.
So I'm assuming you have at least partial VMX support in your OS? Could you explain what features it has given you, or what you've been able to accomplish by using the extensions? Now that my code issue has been resolved, I'm very eager to discuss these newer features and what they could offer OS developers.
[edit]Also, if you'd like to contribute, I've started a wiki article: http://wiki.osdev.org/VMX [/edit]
Website: https://joscor.com
Re: VMX operations.
Right now I strictly use VMX extensions for accessing real mode interrupts from long mode. Essentially all my code does is put the processor into a paged protected mode with VM86 and VME enabled. On each call I set up the necessary segment settings (base, limit and access rights), point the guest state to the CS:IP specified in the IVT, save/load the necessary registers and enter the VM. For my kernel it's locked with static structures (stack, TSS, page tables). I don't trap I/O at the VM level, but I believe that VME pretty much requires you to use both an I/O bitmap and an interrupt bitmap of its own. Eventually I'll probably remove the need for VME and just use a more elaborate VM monitor setup to track any software interrupt calls.
However, there's a lot of potential to use it for other things, especially with some of the newer features. I suppose the holy grail for a hobby OS developer would be the ability to nest another OS inside of yours for compatibility reasons and create hardware hooks/customized drivers to enable things like network access. That's basically what VMX extensions were designed for after all, but I think there are a lot of other possibilities. One I've been thinking about implementing is essentially a two-level OS, with hardware services and drivers as well as physical memory management in the VM monitor, and then a smaller microkernel for a task or group of tasks. It'd provide redundancy and reduce the number of variables that have to be dealt with. Certain events like hardware interrupts could trap to the kernel or be directed to the microkernel (things like PIT interrupts, etc.). Then possibly use the preemption timer to control VM/task switches.
The biggest catch with VMX is that it's still being actively developed. Some of the best features like EPT and unrestricted guests aren't supported.
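The per-call segment setup mentioned at the start of this post can be sketched roughly as follows. This is an illustration, not the poster's code: the struct and function names are invented, and the values follow what I understand the SDM to expect for a virtual-8086 guest (base = selector * 16, limit 0xFFFF, access rights 0xF3). Writing the fields into the VMCS with VMWRITE is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for one guest segment register's VMCS fields. */
struct guest_segment {
    uint16_t selector;
    uint64_t base;
    uint32_t limit;
    uint32_t access_rights;
};

/* Segment state for a virtual-8086 guest, derived from the selector alone. */
static inline struct guest_segment vm86_segment(uint16_t selector)
{
    struct guest_segment seg;

    seg.selector = selector;
    seg.base = (uint64_t)selector << 4;  /* real-mode style base */
    seg.limit = 0xFFFF;                  /* 64 KB segment */
    seg.access_rights = 0xF3;            /* present, DPL 3, read/write data, accessed */
    return seg;
}

/* Linear address of a real-mode seg:off pair, e.g. a handler's CS:IP. */
static inline uint32_t real_mode_linear(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}
```

For example, `vm86_segment(0x07C0)` yields a base of 0x7C00, and `real_mode_linear(0xFFFF, 0xFFFF)` gives 0x10FFEF, the highest address reachable this way.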
Reserved for OEM use.
Re: VMX operations.
Wow, neat. Do you set up the host/guest areas of the active VMCS and then execute a VMLAUNCH to place you in RMode? I'm still very shaky on the concepts around this and how the basics work (using VMLAUNCH and such). To me it seems like the host puts the guest (using VMLAUNCH) into PMode at the address specified in the guest EIP fields, but that doesn't seem like a true VM?
I've set up the guest/host state areas of an active VMCS and now I've run into an issue. After setting up those values, I execute a VMLAUNCH only to find that (even though the guest EIP is correct at 0x7C00), it's not a physical 0x7C00 (as the values aren't correct). Is there some guest->host mapping that I'm missing for physical memory? If you'd like to see what I've set up, here's the source code for the VMX code. Btw, the start code is at the bottom of the page in setup_vmx_management().
The Intel manuals are great, but this is a complex subject and the information is spread across a few very distant areas of the series.
Btw, thanks for helping out on the Wiki article, I'll add more once I learn more.
Website: https://joscor.com
Re: VMX operations.
My understanding is that a VMX guest basically has to run with, at minimum, protected mode with paging enabled. The exception is that certain processors can use the unrestricted guest bit in the secondary processor-based controls for actual real mode or unpaged protected mode. You need EPT enabled to use that feature anyway, though, so there's always some form of address translation. So yes, just loading a CR3 value for your guest should help. The guest might not require PAE, though, so make sure you have it enabled in the guest's CR4 if you want to use the same tables.
I'm not sure what you mean by not a real VM, but it looks like your host state isn't saved properly in your code. Basically any triple fault should bounce you right back out to a functional host state. The same goes for any control-based settings and the VMCALL instruction. Make sure you save the host's CR3, GDTR base, IDTR base, RSP and RIP. You'll want the TR base too if you're using one. One thing to keep in mind is that general-purpose registers are not saved during VM transitions and carry across from host to guest state, so they'll need to be maintained manually.
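The host-state fields listed above can be gathered into a table of (field, value) pairs before being written to the VMCS. A hedged sketch: the struct and function names are hypothetical, the encodings are the natural-width host-state field encodings as I know them from the SDM's field-encoding appendix, and real code would issue one VMWRITE per pair. It is not a complete list; host CR0, CR4 and the segment selectors also need to be filled in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* VMCS encodings for the host-state fields named in the post. */
enum {
    HOST_CR3       = 0x6C02,
    HOST_TR_BASE   = 0x6C0A,
    HOST_GDTR_BASE = 0x6C0C,
    HOST_IDTR_BASE = 0x6C0E,
    HOST_RSP       = 0x6C14,
    HOST_RIP       = 0x6C16
};

#define HOST_STATE_WRITES 6

struct host_state { uint64_t cr3, tr_base, gdtr_base, idtr_base, rsp, rip; };
struct vmcs_write { uint32_t field; uint64_t value; };

/* Fill out[] with the pairs to write; returns the number of entries. */
static inline size_t host_state_writes(const struct host_state *hs,
                                       struct vmcs_write out[HOST_STATE_WRITES])
{
    assert(hs != NULL && out != NULL);
    out[0] = (struct vmcs_write){ HOST_CR3,       hs->cr3 };
    out[1] = (struct vmcs_write){ HOST_TR_BASE,   hs->tr_base };
    out[2] = (struct vmcs_write){ HOST_GDTR_BASE, hs->gdtr_base };
    out[3] = (struct vmcs_write){ HOST_IDTR_BASE, hs->idtr_base };
    out[4] = (struct vmcs_write){ HOST_RSP,       hs->rsp };
    out[5] = (struct vmcs_write){ HOST_RIP,       hs->rip };
    return HOST_STATE_WRITES;
}
```

HOST_RIP would normally point at an assembly stub that saves the guest's general-purpose registers first, since those are not part of the VMCS.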
Reserved for OEM use.
Re: VMX operations.
Is your code that uses VMX available to the public? If so, I'd really like to check it out.
I've now added a ton of host/guest state information, and I can enter a VM and execute code just like normal (but within a VM). When a fault happens, the VM exits and I can check out the error codes and such. I'm getting a weird page fault, though, if I enable bit 0 of the pin-based VM-execution controls (external-interrupt exiting). Every interrupt that happens causes the VM to exit (on purpose), and then I execute a VMRESUME to bring it back to where it was before the interrupt. About 2-3 interrupts (fails) in, it generates a #PF. If I don't enable that bit in the pin-based controls, it executes the code fine and my keyboard driver (which it initializes) works fine. I think my host or guest state is not being saved 100% correctly, and when the VM exits, either the host state is wrong and things go off the rails before even getting to the VMRESUME, or the guest state is not restored properly and generates the fault on return.
I updated that same link I posted before with the newer code (it's more aesthetically pleasing this time) if you want to check it out. I basically want to set up a virtual machine that will encapsulate the kernel execution (basically, execute the code as before, but from within a VM). It's a fairly easy task conceptually, but I can't get it to work with interrupt exiting.
So, if I wanted to nest another OS within my own, I'd need to use the secondary processor-based controls and allow the unrestricted guest feature (so that the other OS(es) would start in real mode) and then basically multitask the VMs? I'm thinking that in order to run them "simultaneously" I'd need to make an array of the guest VMs' VMCSs and then switch between them on, say, every PIT tick.
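The "array of VMCSs switched on a timer tick" idea can be sketched as a simple round-robin pick. Everything here is hypothetical naming for illustration: `vm_slot` stands in for whatever per-VM bookkeeping the monitor keeps, and actually switching would involve VMPTRLD on the chosen VMCS, then VMLAUNCH the first time and VMRESUME afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-VM bookkeeping kept by the monitor. */
struct vm_slot {
    uint64_t vmcs_phys;  /* physical address of this VM's VMCS */
    bool runnable;       /* false while the VM is halted or blocked */
};

/*
 * Pick the next runnable VM after `current`, wrapping around.
 * Returns `current` if no other slot is runnable.
 */
static inline size_t next_vm(const struct vm_slot *vms, size_t count,
                             size_t current)
{
    assert(vms != NULL && count > 0 && current < count);
    for (size_t step = 1; step < count; step++) {
        size_t idx = (current + step) % count;
        if (vms[idx].runnable)
            return idx;
    }
    return current;
}
```

The timer-tick exit handler would call `next_vm` and only reload the VMCS pointer when the returned index differs from the current one.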
Website: https://joscor.com
Re: VMX operations.
I haven't specifically used interrupt exiting yet, but I'm fairly sure that the interrupt won't be handled by the host's IDT. So it's very possible an EOI isn't being sent and that's causing some problems. You'll probably want to put in some code before the VMRESUME to check the VM-exit interruption-information field and determine the proper course of action if it's an external interrupt. Depending on the setup you'd like to use, you might also want to check the basic exit information. I'd also imagine your guest state is getting corrupted because the general-purpose registers aren't being saved, given the way your code is written. You'll probably want to write a small assembly stub to save them, similar to your ISR code.
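A decode step like the one suggested above might look as follows. Sketch only: the struct and function names are invented, and the raw value is assumed to come from a VMREAD of the VM-exit interruption-information field. As far as I understand the SDM, that field is only filled in for external interrupts when the "acknowledge interrupt on exit" VM-exit control is set; without it the interrupt stays pending and is delivered through the host IDT once the host re-enables interrupts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Interruption types (bits 10:8 of the field). */
enum {
    INTR_TYPE_EXTERNAL     = 0,
    INTR_TYPE_NMI          = 2,
    INTR_TYPE_HW_EXCEPTION = 3
};

struct exit_intr_info {
    bool    valid;             /* bit 31 */
    uint8_t vector;            /* bits 7:0 */
    uint8_t type;              /* bits 10:8 */
    bool    error_code_valid;  /* bit 11 */
};

/* Split the raw 32-bit field into its documented parts. */
static inline struct exit_intr_info decode_exit_intr_info(uint32_t raw)
{
    struct exit_intr_info info;

    info.valid = ((raw >> 31) & 1) != 0;
    info.vector = (uint8_t)(raw & 0xFF);
    info.type = (uint8_t)((raw >> 8) & 0x7);
    info.error_code_valid = ((raw >> 11) & 1) != 0;
    return info;
}

/* True if the exit reported an acknowledged external interrupt. */
static inline bool is_external_interrupt(struct exit_intr_info info)
{
    return info.valid && info.type == INTR_TYPE_EXTERNAL;
}
```

For an acknowledged external interrupt, the handler would then service `vector` and send the EOI itself before resuming the guest.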
You wouldn't need an unrestricted guest to actually nest a whole OS. I haven't tried it, but my understanding is that you would start out in VM86 mode and mask CR0 to indicate that protected mode is not active. When the guest OS tries to enable it, the VM would exit and indicate that it did so due to a control register write. You might need to actually parse the opcodes to figure out the control register and the bit that caused the exit. After that, you disable VM86 mode in the guest RFLAGS and increase EIP by the value in the instruction-length field, and the guest should be in protected mode. Paging without EPT would likely be a lot messier, though. You'd have to maintain consistency between what the VM sees and where it really is in physical memory. I don't know that there's a great way to do that without a bunch of table crawling, to be honest.
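On the control-register exit: as far as I can tell from the SDM, the exit qualification for control-register accesses already encodes which register, access type and general-purpose register were involved, so opcode parsing may not be needed for the MOV case. A minimal sketch, with invented names, assuming the qualification has been read from the VMCS:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Access type, bits 5:4 of the exit qualification. */
enum cr_access_type {
    CR_MOV_TO_CR   = 0,
    CR_MOV_FROM_CR = 1,
    CR_CLTS        = 2,
    CR_LMSW        = 3
};

struct cr_access {
    uint8_t cr;                /* bits 3:0: control register number */
    enum cr_access_type type;  /* bits 5:4 */
    uint8_t gpr;               /* bits 11:8: register operand for MOV CR */
};

/* Decode the exit qualification of a control-register access exit. */
static inline struct cr_access decode_cr_access(uint64_t qual)
{
    struct cr_access a;

    a.cr = (uint8_t)(qual & 0xF);
    a.type = (enum cr_access_type)((qual >> 4) & 0x3);
    a.gpr = (uint8_t)((qual >> 8) & 0xF);
    return a;
}

/* True if a CR0 write turns protected mode on (PE goes from 0 to 1). */
static inline bool guest_enables_pe(uint64_t old_cr0, uint64_t new_cr0)
{
    return !(old_cr0 & 1) && (new_cr0 & 1);
}
```

The monitor would fetch the new CR0 value from the saved guest register indexed by `gpr`, which is one more reason to save the general-purpose registers on every exit.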
But yes, basically you'd run it like a multitasking system and have the virtual machine monitor intercept timer interrupts, or possibly use the preemption timer, based on whatever kind of time-sharing model you prefer. A full-fledged virtual machine monitor would likely include some emulation for basic PC-style hardware, major devices, and real mode interrupts, but it's a lot less work than the alternative.
I'll try to clean up my code and post a sample on here a bit later. My project isn't really open source, but I have no problem releasing snippets on stuff like this.
Reserved for OEM use.
Re: VMX operations.
More issues. =(
Now I can get the host to enter a VM and continue executing code as normal (but within a virtual machine), and it works great in Bochs, but not so well on real hardware. I even tried setting the usable memory in Bochs to random values (instead of zeroed memory); that revealed a couple of bugs, but it did not fix the issue. The real hardware reports that the guest state is invalid: as soon as I VMLAUNCH, a VM exit occurs with the error (in VMX_EXIT_REASON) "VM-entry failure due to invalid guest state". That's about as vague as they could possibly have made it. I've been combing through this code for a while and can't seem to find the issue. The latest commit of the code is the link in my signature (or the link from earlier). I've added quite a few guest-state checks of my own, but they don't catch any issues. Bochs doesn't report any warnings/errors from the VMLAUNCH (as it would if it found any). As the VMX support in Bochs is fairly new (and I had to report a bug in Bochs the other day), I'm a bit sceptical of its implementation of VMX and wonder whether it's overlooking something that shouldn't be overlooked.
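For reference, the exit reason mentioned above can be picked apart with a couple of helpers. A hedged sketch with hypothetical names: the raw value is assumed to come from a VMREAD of the exit-reason field, bit 31 flags a VM-entry failure, and the low 16 bits carry the basic reason (33 being "invalid guest state" as I read the SDM's exit-reason table).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VMX_EXIT_ENTRY_FAILURE (1u << 31)  /* set when the VM entry itself failed */

/* A few basic exit reasons (low 16 bits of the exit-reason field). */
enum {
    EXIT_EXTERNAL_INTERRUPT  = 1,
    EXIT_TRIPLE_FAULT        = 2,
    EXIT_CR_ACCESS           = 28,
    EXIT_INVALID_GUEST_STATE = 33,
    EXIT_MSR_LOADING         = 34
};

/* Basic exit reason without the flag bits. */
static inline uint16_t vmx_basic_exit_reason(uint32_t raw)
{
    return (uint16_t)(raw & 0xFFFF);
}

/* True if the exit reports a failed VM entry rather than a guest event. */
static inline bool vmx_entry_failed(uint32_t raw)
{
    return (raw & VMX_EXIT_ENTRY_FAILURE) != 0;
}
```

When the entry fails with reason 33, the exit qualification field may narrow the cause down further, so it is worth logging alongside the exit reason.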
Website: https://joscor.com
Re: VMX operations.
There's a lot to go through there, but I think it might be your TR and LDTR for the guest. Try disabling the registers by setting bit 16 in the access rights and see if that helps. Real processors are very fussy about the values they'll accept. As a general rule of thumb, I'd avoid creating junk settings in any of the registers, even though you have access to the individual fields. If you don't think you'll need a register, you're better off disabling it and going from there.
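Setting bit 16 (the "unusable" bit) in a segment's access-rights field can be sketched as below. The names are hypothetical. One caveat, hedged: as far as I read the SDM's guest-state checks, TR has to stay usable (with a busy-TSS type), so marking a register unusable is likely only an option for LDTR and the data segments, not for TR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SEG_AR_UNUSABLE   (1u << 16)  /* bit 16 of the access-rights field */
#define SEG_AR_BUSY_TSS32 0x8Bu       /* present, type 11 (busy TSS); example TR value */

/* Return the access rights with the unusable bit set. */
static inline uint32_t seg_ar_mark_unusable(uint32_t ar)
{
    return ar | SEG_AR_UNUSABLE;
}

/* True if the access rights mark the segment register as unusable. */
static inline bool seg_ar_is_unusable(uint32_t ar)
{
    return (ar & SEG_AR_UNUSABLE) != 0;
}
```

For an unused LDTR, writing `seg_ar_mark_unusable(0)` (that is, 0x10000) to its access-rights field avoids having to invent a plausible descriptor.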
Reserved for OEM use.
Re: VMX operations.
Hey,
I ended up finding the issue. Bochs didn't perform a check on the VMCS link pointer (a 64-bit guest-state field), which led to Bochs allowing execution while real hardware detected a fault. I submitted a bug report (titled: Half-baked VMX Link Pointer state checking.).
I had to write my own guest-state checking routine (it's huge) in order to find the fault. It was a pain, but it was worth it. Now the VMLAUNCH works perfectly in both Bochs and on the real hardware that I have.
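A small piece of such a guest-state checker, covering only the link pointer, might look like this. It is a sketch under assumptions: the names are invented, 0x2800 is the field encoding as I know it from the SDM, and the only value treated as definitely fine is all ones (the "not used" setting). Anything else needs further checks that are not shown here. My understanding is that an exit qualification of 4 on an invalid-guest-state entry failure points at this field.

```c
#include <assert.h>
#include <stdint.h>

#define VMCS_LINK_POINTER            0x2800u     /* 64-bit guest-state field encoding */
#define VMCS_LINK_POINTER_UNUSED     UINT64_MAX  /* all ones: no linked VMCS */
#define ENTRY_FAIL_QUAL_LINK_POINTER 4u          /* exit qualification for a bad link pointer */

enum link_ptr_status {
    LINK_PTR_UNUSED,            /* all ones, passes the check */
    LINK_PTR_MISALIGNED,        /* low 12 bits set, will be rejected */
    LINK_PTR_NEEDS_MORE_CHECKS  /* aligned address; further checks not shown */
};

/* Classify a link pointer value before attempting VM entry. */
static inline enum link_ptr_status vmcs_link_pointer_status(uint64_t value)
{
    if (value == VMCS_LINK_POINTER_UNUSED)
        return LINK_PTR_UNUSED;
    if (value & 0xFFF)
        return LINK_PTR_MISALIGNED;
    return LINK_PTR_NEEDS_MORE_CHECKS;
}
```

Note that a zero-filled VMCS leaves this field at 0, which lands in the last category rather than in "unused"; that may be why a check that is skipped in an emulator still trips on real hardware.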
Next up: either I/O bitmap support (protection) or set up some multiple-VM management.
I'll probably add a bit more to the wiki entry today as I've learned a lot more due to fine-combing. =)
Website: https://joscor.com
Re: VMX operations.
Glad to hear you got everything working. I managed to fubar my VMX code up somehow while restructuring it, but once I fix the bugs I have I'll post it here.
Reserved for OEM use.