Re: nested VM virtualization

Posted: Thu Mar 17, 2016 3:12 am
by cianfa72
Kevin wrote:If you only write data with vmwrite so that you can later read it with vmread, but you never actually start a VM from that VMCS, then there is no point in using those instructions. You can then simply write the data to some memory location with normal non-VMX instructions.
Sorry, maybe what was unclear to me is the following point: the VMCS structure, apart from the revision ID and VMX-abort fields, is an *opaque* collection of six subgroups (i.e. the internal layout is not known to, or needed by, the user), and the only way to access them is with the VMREAD/VMWRITE instructions (which trap to VMX root mode if executed in VMX non-root mode).

As a consequence, the L1 hypervisor basically allocates a 4 KiB-aligned memory region in its (guest) physical address space for VMCS1-2. Considering that, as you said, the VMCS1-2 physical address will never be loaded as the current-VMCS pointer by the processor, the L0 hypervisor is free to choose whatever internal structure it likes for VMCS1-2 (apart from the first 8 bytes: revision ID + VMX abort) and then emulate the L1 hypervisor's VMREAD/VMWRITE.

The L0 hypervisor will use VMREAD/VMWRITE (actually executed by the processor) only to manage the "real" VMCSs that will actually be loaded by the processor as the current-VMCS pointer to run the L1 hypervisor (VMCS0-1) and the L2 guest (VMCS0-2) respectively.
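
To make the idea concrete, here is a minimal C sketch of how L0 might keep VMCS1-2 in a private software layout and serve L1's trapped VMREAD/VMWRITE from it. The `struct vmcs12` layout, the table size and the function names `emulate_vmwrite`/`emulate_vmread` are all hypothetical, invented for this illustration; only the first 8 bytes (revision ID + VMX abort) follow the architectural layout. Returning 0 for a field that was never written is also just an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VMCS12_MAX_FIELDS 64   /* arbitrary limit chosen for this sketch */

/* Hypothetical L0-private layout of VMCS1-2. Only the first two members
 * mirror the architecturally defined first 8 bytes of a VMCS region. */
struct vmcs12 {
    uint32_t revision_id;
    uint32_t abort_indicator;
    size_t   count;
    struct { uint32_t encoding; uint64_t value; } fields[VMCS12_MAX_FIELDS];
};

/* Called from L0's VM-exit handler when L1 executed VMWRITE.
 * Returns 0 on success, -1 if the table is full. */
static int emulate_vmwrite(struct vmcs12 *v, uint32_t encoding, uint64_t value)
{
    assert(v != NULL);
    for (size_t i = 0; i < v->count; i++) {
        if (v->fields[i].encoding == encoding) {
            v->fields[i].value = value;      /* overwrite existing entry */
            return 0;
        }
    }
    if (v->count == VMCS12_MAX_FIELDS)
        return -1;
    v->fields[v->count].encoding = encoding; /* first write to this field */
    v->fields[v->count].value = value;
    v->count++;
    return 0;
}

/* Called from L0's VM-exit handler when L1 executed VMREAD.
 * Returns 0 if the field was written before, -1 otherwise (*out = 0). */
static int emulate_vmread(const struct vmcs12 *v, uint32_t encoding, uint64_t *out)
{
    assert(v != NULL && out != NULL);
    for (size_t i = 0; i < v->count; i++) {
        if (v->fields[i].encoding == encoding) {
            *out = v->fields[i].value;
            return 0;
        }
    }
    *out = 0;
    return -1;
}
```

For example, after `emulate_vmwrite(&v, 0x681E, 0x1000)` (0x681E is the guest RIP field encoding), a later `emulate_vmread(&v, 0x681E, &out)` hands 0x1000 back to L1 without any real VMREAD/VMWRITE touching hardware.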

Can you confirm my understanding ?

Re: nested VM virtualization

Posted: Thu Mar 17, 2016 3:28 am
by Kevin
Correct, that's what I meant.

Re: nested VM virtualization

Posted: Fri Mar 18, 2016 4:07 am
by feryno
Hi cianfa72,
as Kevin pointed out, the most efficient way for the L0 VM-exit handler to handle an L1 vmwrite is to store the value in L0 internal memory, and on an L1 vmread to return it from there. That applies when VMCS shadowing is not available or not implemented.
Only L0 is allowed to execute real VMX instructions on the hardware, and it must emulate all of them for L1. Their execution in L1 causes VM exits to L0.
Later, L0 will have to vmwrite the child hypervisor's VMCS fields (from L0 internal memory into VMCS02) when emulating L1's execution of VMLAUNCH/VMRESUME, and vmread them on VM exits from L2 to L0 (from VMCS02 into L0 internal memory).
After a VM exit from L2 to L0, the parent hypervisor (L0) must switch from L0 to the L1 VM-exit handler, so L1 runs under the illusion of root mode while in fact it runs in non-root mode.
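
A rough C sketch of the two copy directions described above, under heavy simplification. Real VMREAD/VMWRITE cannot run in a user-space example, so `hw_vmwrite`/`hw_vmread` operate on a mock structure standing in for the hardware VMCS02. The names `struct vmcs12`, `struct mock_vmcs`, `enter_l2` and `exit_l2` are hypothetical, and only three fields are copied. A real L0 would also have to merge control fields and translate addresses rather than copy them verbatim.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural field encodings used in this sketch. */
enum { VM_EXIT_REASON = 0x4402, GUEST_RSP = 0x681C, GUEST_RIP = 0x681E };

/* Hypothetical L0-internal copy of what L1 wrote for its guest. */
struct vmcs12 { uint64_t guest_rsp, guest_rip; uint32_t exit_reason; };

/* Mock standing in for the opaque hardware VMCS02. */
#define MOCK_SLOTS 8
struct mock_vmcs { uint32_t enc[MOCK_SLOTS]; uint64_t val[MOCK_SLOTS]; size_t n; };

/* Stand-in for a real VMWRITE executed by L0 with VMCS02 current. */
static void hw_vmwrite(struct mock_vmcs *v, uint32_t enc, uint64_t val)
{
    size_t i;
    for (i = 0; i < v->n && v->enc[i] != enc; i++)
        ;
    if (i == v->n) {                 /* field not present yet: append it */
        assert(v->n < MOCK_SLOTS);
        v->enc[v->n++] = enc;
    }
    v->val[i] = val;
}

/* Stand-in for a real VMREAD; unknown fields read as 0 in this mock. */
static uint64_t hw_vmread(const struct mock_vmcs *v, uint32_t enc)
{
    for (size_t i = 0; i < v->n; i++)
        if (v->enc[i] == enc)
            return v->val[i];
    return 0;
}

/* On emulated VMLAUNCH/VMRESUME from L1: L0 memory -> VMCS02. */
static void enter_l2(const struct vmcs12 *v12, struct mock_vmcs *v02)
{
    hw_vmwrite(v02, GUEST_RSP, v12->guest_rsp);
    hw_vmwrite(v02, GUEST_RIP, v12->guest_rip);
}

/* On VM exit from L2 to L0: VMCS02 -> L0 memory, so that L1's exit
 * handler later sees these values through its emulated VMREADs. */
static void exit_l2(struct vmcs12 *v12, const struct mock_vmcs *v02)
{
    v12->guest_rsp   = hw_vmread(v02, GUEST_RSP);
    v12->guest_rip   = hw_vmread(v02, GUEST_RIP);
    v12->exit_reason = (uint32_t)hw_vmread(v02, VM_EXIT_REASON);
}
```

After `exit_l2`, L0 would resume L1 at its VM-exit handler entry point, which is the "switch worlds" step mentioned above and is not shown here.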

Re: nested VM virtualization

Posted: Mon Mar 21, 2016 9:49 am
by cianfa72
OK, good. And now, what if VMCS shadowing is supported by the processor/L0 hypervisor and enabled?

As far as I understand, when L1 hypervisor code is running (e.g. the L1 VM-exit handler), the processor's current-VMCS pointer points to the machine physical address of VMCS0-1 (the VMCS used by the L0 hypervisor to run L1), which in turn holds a pointer to the machine physical address of the shadow VMCS (basically a shadow version of VMCS1-2, used by the L1 hypervisor to run L2).
The L1 hypervisor's VMREADs/VMWRITEs are then actually executed by the processor, accessing the shadow VMCS region in machine physical memory. Additional logic is needed in the L0 hypervisor to sync the shadow VMCS with VMCS02 (used by L0 to directly support the L2 guest).

Does it make sense? Thanks!

Re: nested VM virtualization

Posted: Tue Mar 22, 2016 12:44 am
by feryno
Hi cianfa72.
Yes, when the CPU supports VMCS shadowing and it is enabled, L0 does not need to care about VM-exit handlers for vmread/vmwrite executed by L1.
But then L0 has other things to do when emulating vmlaunch/vmresume executed by L1 and on VM exits from L2 to L0: it has to access the shadow VMCS and keep it synced.
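
A small C sketch of that syncing, again with mock structures in place of real VMCS regions (on hardware, L0 would have to VMPTRLD each region and use VMREAD/VMWRITE on it). The names `struct mock_vmcs`, `slot`, `shadowed`, `sync_fields` and `l1_vmwrite` are hypothetical, and the list of shadowed fields is reduced to two for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural field encodings used in this sketch. */
enum { GUEST_RSP = 0x681C, GUEST_RIP = 0x681E };

/* Mock of an opaque VMCS region (used for both shadow VMCS and VMCS02). */
struct mock_vmcs { uint64_t guest_rsp, guest_rip; };

/* Map a field encoding to its storage in the mock; NULL if unsupported. */
static uint64_t *slot(struct mock_vmcs *v, uint32_t enc)
{
    switch (enc) {
    case GUEST_RSP: return &v->guest_rsp;
    case GUEST_RIP: return &v->guest_rip;
    default:        return NULL;
    }
}

/* Fields that L0 decided to let L1 access without a VM exit. */
static const uint32_t shadowed[] = { GUEST_RSP, GUEST_RIP };

/* Copy every shadowed field from src to dst. L0 would call this with
 * (VMCS02, shadow) when emulating VMLAUNCH/VMRESUME, and with
 * (shadow, VMCS02) after a VM exit from L2. */
static void sync_fields(struct mock_vmcs *dst, struct mock_vmcs *src)
{
    for (size_t i = 0; i < sizeof shadowed / sizeof shadowed[0]; i++) {
        uint64_t *d = slot(dst, shadowed[i]);
        uint64_t *s = slot(src, shadowed[i]);
        assert(d != NULL && s != NULL);
        *d = *s;
    }
}

/* Models what the CPU does for an L1 VMWRITE with shadowing enabled:
 * the value goes straight into the shadow VMCS, L0 is not involved. */
static void l1_vmwrite(struct mock_vmcs *shadow, uint32_t enc, uint64_t val)
{
    uint64_t *p = slot(shadow, enc);
    assert(p != NULL);
    *p = val;
}
```

The point of the sketch is only the direction of the copies: L0 no longer sees individual vmread/vmwrite exits, so it has to pull the whole set of shadowed fields at the two synchronization points instead.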
As a first approach it is easier to implement nesting without VMCS shadowing, and to implement and enable shadowing later. That way you also support CPUs without VMCS shadowing, and there is less code, so problems are easier to find and solve. E.g. it is easy to program incrementally and test whether these VMX instructions behave correctly: vmxon, vmxoff, vmclear, vmptrld, vmptrst, vmread, vmwrite, invept, invvpid. Once those instructions are emulated correctly by L0, nesting becomes very complicated: you need VM-exit handlers in L0 for vmlaunch/vmresume executed by L1, and then L0 code for switching worlds (after a VM exit from L2 to L0 you have to switch from L0 to L1 and support L1 running its VM-exit handler under an illusion of root mode).
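
One way to organize that incremental work is a dispatcher in L0 keyed on the VM-exit reason, where not-yet-written handlers report "not implemented" so they are easy to spot during testing. The C sketch below is only an illustration: the exit-reason numbers are the architectural ones, while `struct l1_state`, `enum result`, `NO_VMCS` and `handle_vmx_exit` are hypothetical names, and the handlers ignore most of the checks a real emulation needs (CPL, operand alignment, VMfail reporting and so on).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Basic VM-exit reasons for the VMX instructions executed by L1. */
enum {
    EXIT_VMCLEAR = 19, EXIT_VMLAUNCH = 20, EXIT_VMPTRLD = 21,
    EXIT_VMPTRST = 22, EXIT_VMREAD = 23, EXIT_VMRESUME = 24,
    EXIT_VMWRITE = 25, EXIT_VMXOFF = 26, EXIT_VMXON = 27,
    EXIT_INVEPT = 50, EXIT_INVVPID = 53
};

/* An invalid current-VMCS pointer is all ones. */
#define NO_VMCS UINT64_C(0xFFFFFFFFFFFFFFFF)

/* Hypothetical per-L1 virtual VMX state kept by L0. */
struct l1_state { int vmxon; uint64_t current_vmcs12; };

enum result { HANDLED, INJECT_UD, NOT_IMPLEMENTED };

/* Emulate one VMX instruction that trapped from L1 into L0.
 * 'operand' is the already-decoded guest-physical VMCS address, if any. */
static enum result handle_vmx_exit(struct l1_state *s, uint32_t reason,
                                   uint64_t operand)
{
    assert(s != NULL);
    if (reason == EXIT_VMXON) {
        s->vmxon = 1;
        s->current_vmcs12 = NO_VMCS;
        return HANDLED;
    }
    if (!s->vmxon)
        return INJECT_UD;            /* VMX instruction outside VMX operation */

    switch (reason) {
    case EXIT_VMXOFF:
        s->vmxon = 0;
        s->current_vmcs12 = NO_VMCS;
        return HANDLED;
    case EXIT_VMPTRLD:
        s->current_vmcs12 = operand;
        return HANDLED;
    case EXIT_VMCLEAR:
        if (s->current_vmcs12 == operand)
            s->current_vmcs12 = NO_VMCS;   /* clearing the current VMCS */
        return HANDLED;
    default:
        /* vmptrst, vmread, vmwrite, invept, invvpid, vmlaunch, vmresume:
         * left for later stages in this sketch. */
        return NOT_IMPLEMENTED;
    }
}
```

With this shape you can bring up an L1 hypervisor step by step and see exactly which instruction it reaches first that L0 cannot emulate yet.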