Exception handling outside of guest VM
Posted: Wed Feb 17, 2021 5:28 am
by gonzo
Hey all, I am looking to rubber duck a little and maybe get some good advice from you.
I am creating a micro VM in KVM for <several good reasons>, and not really emulating a PC.
So, one thing I more or less remember from my osdev days is that CPU exceptions are handled by jumping through an IDT entry, which specifies a code segment selector (into the GDT), an IST index, and finally a handler address. Now, I have a VM with no CPL=0, that is, an unprivileged VM.
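For reference, here is a minimal sketch of how one long-mode IDT entry packs those three pieces (selector, IST index, handler address) into 16 bytes. The names `idt_gate` and `make_gate` are made up for illustration, and the sketch assumes a plain 64-bit interrupt gate (type 0xE); it is not taken from any particular codebase.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* One 16-byte long-mode IDT gate descriptor (hypothetical type name). */
struct idt_gate {
    uint16_t offset_lo;   /* handler address bits 0..15 */
    uint16_t selector;    /* code segment selector, looked up in the GDT */
    uint8_t  ist;         /* bits 0..2: IST index, 0 means no stack switch via IST */
    uint8_t  type_attr;   /* bit 7: present, bits 5..6: DPL, bits 0..3: gate type */
    uint16_t offset_mid;  /* handler address bits 16..31 */
    uint32_t offset_hi;   /* handler address bits 32..63 */
    uint32_t reserved;    /* must be zero */
};

_Static_assert(sizeof(struct idt_gate) == 16, "long-mode gates are 16 bytes");

/* Build a present 64-bit interrupt gate (hypothetical helper). */
static struct idt_gate make_gate(uint64_t handler, uint16_t selector,
                                 uint8_t ist, uint8_t dpl)
{
    struct idt_gate g;

    assert(ist <= 7 && dpl <= 3);
    memset(&g, 0, sizeof g);
    g.offset_lo  = (uint16_t)(handler & 0xFFFF);
    g.selector   = selector;
    g.ist        = ist;
    g.type_attr  = (uint8_t)(0x80 | (dpl << 5) | 0x0E); /* present, type 0xE */
    g.offset_mid = (uint16_t)((handler >> 16) & 0xFFFF);
    g.offset_hi  = (uint32_t)(handler >> 32);
    return g;
}
```

A host could fill an array of these, copy it into guest memory, and point the guest IDT base and limit at it.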
Do you need to implement exceptions inside the guest for KVM at all? I think yes, but it would be much better if I didn't have to. I realize that it gets weird if you can't triple fault to reboot, but I absolutely hated that "feature".
How would you sneak the descriptors into the guest? Would you generate the IDT entries as well as the functions (simple write to MMIO handler to exit the guest)?
I absolutely do not want to have any CPL=0 in the guest, but if that is not possible I would want to know where I would have some issues with that.
I currently don't see any problems. System calls are just writes to an MMIO address, and I'm using the offset into the page as the system call number to save registers. I guess I'll be benching this to figure out which method is better, but it would be even better if I could use the OUT instruction - however as far as I can tell that's not possible.
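To make the MMIO scheme concrete, here is a rough sketch of the host-side decode step. The page address `SYSCALL_MMIO_BASE` and the helper `decode_syscall` are assumptions for illustration only. In a real KVM run loop the two inputs would come from the `phys_addr` and `is_write` fields of `run->mmio` after a `KVM_EXIT_MMIO` exit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guest-physical page left unbacked so that writes trap. */
#define SYSCALL_MMIO_BASE 0xFFFFF000ULL
#define SYSCALL_MMIO_SIZE 0x1000ULL

/*
 * Turn a trapped MMIO access into a system call number.
 * Returns true and stores the number in *nr when the access is a write
 * inside the syscall page; returns false for anything else.
 */
static bool decode_syscall(uint64_t phys_addr, bool is_write, uint32_t *nr)
{
    assert(nr != NULL);

    if (!is_write)
        return false;                       /* reads are not system calls */
    if (phys_addr < SYSCALL_MMIO_BASE)
        return false;                       /* below the syscall page */
    if (phys_addr - SYSCALL_MMIO_BASE >= SYSCALL_MMIO_SIZE)
        return false;                       /* above the syscall page */

    /* The offset into the page is the system call number. */
    *nr = (uint32_t)(phys_addr - SYSCALL_MMIO_BASE);
    return true;
}
```

Since the number travels in the address, the registers (and the written value) stay free for arguments.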
The plan is to support AMD64 with maybe ARM64 support on the multi-year horizon.
Re: Exception handling outside of guest VM
Posted: Wed Feb 17, 2021 11:57 am
by nullplan
gonzo wrote:Hey all, I am looking to rubber duck a little and maybe get some good advice from you.
I am unfamiliar with that idiom. I know several meanings of "rubber duck", none of which have anything to do with asking questions.
gonzo wrote:Do you need to implement exceptions inside the guest for KVM at all?
Yes, of course. What if there is a faulty application running inside the guest OS? You need to tell the guest OS that the application is faulty. Unless you would like to just crash the VM.
gonzo wrote:How would you sneak the descriptors into the guest?
I should think that's the guest's job. When the guest executes "lgdt" or "lidt", since you are not running at CPL 0, these will fault. You need to catch that fault in your VM and then you just do what the CPU would, noting down base address and length. Then if something weird happens you dispatch the fault like the CPU would as well. You also need the TSS base address if ISTs are used.
Re: Exception handling outside of guest VM
Posted: Wed Feb 17, 2021 12:17 pm
by gonzo
Hey, thanks for the response. This is a micro KVM guest and I have already snuck IDT entries into the guest memory using the KVM API. I also implemented CPL=3 system calls using MMIO traps. I'm still working on CPL=3 exception handling, but I don't see anything that would prevent it from working. I also don't have need for interrupts, nor are they enabled, so I don't technically need IST, but we will see. No multi-processing. I can think of stack overflow causing faulty reports, but I am severely limited on memory so all these things are nice-to-have.
I already have this solution in place using an instruction set emulator, but binary translation only gives you so much speed. I can fork this emulator in ~1 microsecond for an enterprise product, and it has a 64kb working memory. Copy-on-write is used on everything. The problem with this is it doesn't scale well and it might be worth it to take the 100+ microsecond hit to get a KVM guest up and running. Now, naturally the guest environment is not an OS in this case. It's just a very special userspace program that uses a custom API for everything. Easily achieved by building with newlib and overriding everything.
So, I don't actually want to handle anything in the guest. But I might have to handle exceptions - which is what this thread is all about. KVM does seem to shut down the guest when it faults in a loop. Maybe it has some kind of triple-fault loop detection technology! Regardless of where the exceptions are handled, we will immediately exit the guest and do a remote backtrace in the host. That is, regardless of where they are handled, the guest is very untrusted.
Re: Exception handling outside of guest VM
Posted: Wed Feb 17, 2021 12:54 pm
by feryno
If you start the hypervisor first and then let the guest (a VM with an OS) start, you only need to set up the host IDT and host exception handlers (the hypervisor part) and let the guest OS set up its own exception handling (guest IDT base, guest IDT limit, and all the necessary vectors).
KVM intercepts triple faults (a VM exit from the guest) and shuts down the faulting VM. This is the same as when a triple fault occurs on a bare-metal CPU, where the hardware puts the CPU into the shutdown state.
You can intercept guest attempts to execute LIDT if your CPU and its virtualization extensions are capable of that (older Intel CPUs such as the Core 2 Duo were not).
You can also inspect the guest IDT base and guest IDT limit.
From the hypervisor, you can inject fake interrupts into the guest, and you can also erase exceptions that occurred while the guest was running (so the exception won't be delivered to the guest OS via its guest IDT).
You can intercept any guest exceptions you are interested in by enabling the corresponding bits (the exception bitmap in the Intel VMCS; on AMD, the VMCB field at offset +8, plus bit 1 at offset +C for NMI). On Intel, the triple fault you mentioned always causes a VM exit (unconditionally), while for the other exceptions you can enable or disable interception (conditional VM exits).
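As a rough illustration of the per-vector intercept bits, the sketch below builds a 32-bit mask in which bit n requests a VM exit for exception vector n. This is the layout shared by the Intel exception bitmap and the AMD field at VMCB offset +8. The names `intercept_mask` and `VEC_*` are made up for this example. The mask is something a hypervisor writes into its own VMCS/VMCB; the sketch does not show a KVM userspace call.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural x86 exception vector numbers (enum names are hypothetical). */
enum {
    VEC_DE = 0,   /* divide error */
    VEC_DB = 1,   /* debug */
    VEC_BP = 3,   /* breakpoint */
    VEC_UD = 6,   /* invalid opcode */
    VEC_DF = 8,   /* double fault */
    VEC_GP = 13,  /* general protection */
    VEC_PF = 14   /* page fault */
};

/* Set bit n for every exception vector n that should cause a VM exit. */
static uint32_t intercept_mask(const uint8_t *vectors, unsigned count)
{
    uint32_t mask = 0;

    for (unsigned i = 0; i < count; i++) {
        assert(vectors[i] < 32);            /* only vectors 0..31 have a bit */
        mask |= UINT32_C(1) << vectors[i];
    }
    return mask;
}
```

For example, passing `{ VEC_UD, VEC_GP, VEC_PF }` yields 0x6040, which would make invalid opcodes, general protection faults, and page faults exit to the hypervisor instead of going through the guest IDT.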
Re: Exception handling outside of guest VM
Posted: Wed Feb 17, 2021 2:47 pm
by gonzo
Thanks feryno, that was very helpful. Is it possible to manipulate the VMCB from userspace using the KVM API?
If not, I'll just make every exception handler cause a VMEXIT.
Re: Exception handling outside of guest VM
Posted: Thu Feb 18, 2021 12:17 pm
by feryno
Hi gonzo, I do not know exactly. KVM is too big a project; I developed my own lightweight hypervisor (a UEFI binary below 50 kB). But I expect that, for security reasons, KVM does not allow manipulating the Intel VMCS / AMD VMCB from user mode (I mean VM ring 3). Maybe KVM has an interface to access them from VM ring 0: something must cause a VM exit, KVM must recognize that this VM exit was a request to read or write some VMCS/VMCB field, then KVM must handle the request and finally resume the guest. I know that MS Hyper-V has such hypercalls using the Intel VMCALL / AMD VMMCALL instructions, but I do not know KVM so well.
On Intel, accessing the VMCS memory directly is discouraged, and VMREAD/VMWRITE are dedicated to that purpose (the offsets of VMCS fields change among VMX versions, and the CPU may use hardware-cached values instead of inspecting VMCS memory). On AMD, the VMCB is read and written using ordinary memory reads and writes, but there is no direct way to find the VMCB memory page (and again the CPU may use fast internal caches and not look into the VMCB memory page).
There are projects like Actaeon that scan memory to find the VMCS/VMCB, but it is trivial for a hypervisor to use memory virtualization and not expose the page holding the VMCS/VMCB in guest physical memory. Even if you are able to find and manipulate the memory holding the VMCB/VMCS, the CPU could use its internal caches (fast) and not look into the memory page (slower). I just wanted to say: do not try the hacker's method. Even if you find a security hole, it will be patched soon if you publish it. Try to find an official way in the KVM documentation. If you don't find a suitable way in the documentation, then you have to modify the KVM source code and adapt everything to your purposes.