How does VMWare work?

Anton · Post by **Anton** » Thu Feb 26, 2004 12:00 am

How does VMWare emulate the CPU (virtual addresses, ...)?

EyeScream · Post by **EyeScream** » Thu Feb 26, 2004 12:00 am

I don't get the question. It just creates a virtual memory space and does exactly what the CPU would do in each case... There's nothing special about it...

carbonBased · Post by **carbonBased** » Thu Feb 26, 2004 12:00 am

Indeed.

More to the point, it creates a VM86 task and emulates a basic (well... not so basic) x86 platform. If you look up some docs on VM86 mode, it'll prob'ly make more sense.

Essentially all it does is execute your OS within a special environment (vm86) where every privileged instruction causes an exception... that exception handler (the general protection fault, vector 13, i.e. 0x0D) can be used to emulate those privileged instructions.

That's a (very) simplified version of it, but really... that's how it works
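
To make the trap-and-emulate idea concrete, here is a minimal sketch in C of what a #GP handler for a VM86 guest might do. It only handles CLI and STI, which fault in VM86 mode when IOPL is below 3. The `vm86_guest` structure and the `emulate_on_gp` function are hypothetical names made up for illustration; this is not how VMWare is known to do it.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical state a monitor might keep for one VM86 guest. */
struct vm86_guest {
    uint16_t cs, ip;      /* guest code pointer, real-mode style */
    bool     virtual_if;  /* the interrupt flag the guest believes it has */
    uint8_t *mem;         /* the guest's 1 MiB address space, as seen by the monitor */
};

/*
 * Called from the general protection fault handler (vector 13).
 * Returns true if the faulting instruction was emulated, false if the
 * monitor does not know it and must deal with the fault some other way.
 */
bool emulate_on_gp(struct vm86_guest *g)
{
    /* VM86 addresses are segment * 16 + offset, wrapped to 1 MiB here. */
    uint32_t linear = (((uint32_t)g->cs << 4) + g->ip) & 0xFFFFFu;
    uint8_t opcode = g->mem[linear];

    switch (opcode) {
    case 0xFA:                    /* CLI: clear only the guest's virtual IF */
        g->virtual_if = false;
        g->ip += 1;               /* skip the one-byte instruction */
        return true;
    case 0xFB:                    /* STI: set only the guest's virtual IF */
        g->virtual_if = true;
        g->ip += 1;
        return true;
    default:
        return false;             /* not handled in this sketch */
    }
}
```

The guest never touches the real interrupt flag; it only ever sees the copy the monitor maintains for it.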

Cheers,
Jeff

ziggy · Post by **ziggy** » Thu Feb 26, 2004 12:00 am

you might find some insight @ http://www.kernelthread.com/publication ... alization/
it is a general high level overview of virtualization

rexlunae · Post by **rexlunae** » Thu Feb 26, 2004 12:00 am

Either I am missing something major about vm86 mode, or VMWare cannot possibly work like carbonBased says. As I understand it, the segment registers in vm86 mode are interpreted much as they are in real mode, meaning that there is no way for the task to specify that it wants to run 32-bit code without adding the overrides manually into the code. How, then, can it execute 32-bit code?
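
For reference, the real-mode style address calculation that vm86 mode uses can be written in a couple of lines of C. The function name `vm86_linear` is made up for this sketch.

```c
#include <stdint.h>

/*
 * In vm86 mode (as in real mode) a segment register holds a paragraph
 * number, not a selector: linear address = segment * 16 + offset.
 * The result can reach at most 0x10FFEF, just past 1 MiB.
 */
uint32_t vm86_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

Since the segment value is never looked up in a descriptor table, there is no descriptor from which a 32-bit default operand size could come, which is the limitation described above.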

Anton · Post by **Anton** » Fri Feb 27, 2004 12:00 am

>where every privileged instruction causes an exception
I agree that if an instruction is privileged, it can be caught and executed in a special way, but there are some instructions which are not privileged yet play a major role in CPU execution:
mov ds,selector is not a privileged instruction, so there is no way to trap it unless some trick is used.
Moreover, it is a known fact (Intel does not argue with this) that Intel's processors are not truly virtualizable (even the new Itanium) unless some patented technology is used, and that's what VMWare does.
So do you know how to handle instructions like mov ds,selector?
Anton.

Anton · Post by **Anton** » Fri Feb 27, 2004 12:00 am

Thanks, I know the basics; I am more interested in the virtualization of Intel's CPUs.
Anton.

Anton · Post by **Anton** » Fri Feb 27, 2004 12:00 am

There are other problematic instructions, like reading a system register: it is not a privileged instruction, so you can't trap it :)
Anton.

carbonBased · Post by **carbonBased** » Fri Feb 27, 2004 12:00 am

Ahh, okay, now I understand.

I would imagine vm86 is still used for much/all of the operation in real mode (i.e., the BIOS and boot time, and DOS, etc.).

The protected mode operation would require some assistance from the operating system (hence the modules loaded into the kernel for Linux).  The environment can be faked through the paging mechanism, however.

This is just off the top of my head, and I'm merely guessing, here, but the following would/should work:

For your example, 'mov ds, selector,' you'd have to trace back right to where the OS began.  lgdt is a privileged instruction, and would be emulated by VMWare, which would take the memory referenced by the GDT and cache it somewhere.  From this, it can tell exactly what ranges of memory the OS is going to be using.

The actual moving of values into a register is irrelevant... the OS can move anything into DS.  The virtualization comes in when the OS tries to access memory in DS.  If paged properly, this will create a fault, which the OS can emulate, if need be.  VMWare would probably use this operation for any special ranges of memory, such as 0xA0000, and properly display that memory on the virtual display.

Much of the time, though, there would be no reason.  VMWare can safely map the host OS's memory into ranges that the virtual OS expects without having to worry about emulating anything.

Essentially VMWare simply arranges things so that exceptions are generated on anything that might be suspect.  The host OS would be responsible for directing those exceptions over to the VMWare application, rather than trying to interpret them itself.
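
As a rough sketch of the page-fault routing guessed at above: the fault handler could look at the faulting guest address and decide whether the access hits a "special" range that needs device emulation (such as the legacy VGA window at 0xA0000 to 0xBFFFF) or ordinary memory that can simply be backed by a host page. The names `classify_guest_fault`, `EMULATE_DEVICE` and `MAP_HOST_PAGE` are invented for this example.

```c
#include <stdint.h>

#define VGA_WINDOW_BASE 0xA0000u
#define VGA_WINDOW_END  0xC0000u   /* exclusive upper bound */

enum fault_action {
    MAP_HOST_PAGE,    /* plain RAM: back it with host memory and resume */
    EMULATE_DEVICE    /* special range: hand the access to device emulation */
};

/* Decide what to do with a fault on a page deliberately left non-present. */
enum fault_action classify_guest_fault(uint32_t guest_phys)
{
    if (guest_phys >= VGA_WINDOW_BASE && guest_phys < VGA_WINDOW_END)
        return EMULATE_DEVICE;
    return MAP_HOST_PAGE;
}
```

Once a plain RAM page has been mapped, later accesses to it run at native speed with no fault at all, which fits the point that most of the time nothing needs emulating.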

Cheers,
Jeff

Anton · Post by **Anton** » Mon Mar 01, 2004 12:00 am

OK. But the instruction to read the descriptor table register is not privileged, so the running OS can always check whether the DT in use is the one it loaded. This means that we have to let the OS load its own DT, but then how do we survive this? I was thinking about loading this OS in its own task segment, but then this segment has to be in the DT, which means that we would have to alter it, and we can't do that (since the OS can always check this by reading (storing) the DT).

I could also read EFLAGS, which means that I can check my privilege level (and it had better not be 3).

From these points, I conclude that the binary file has to be somehow compiled: all system instructions which are not privileged should be replaced by some other code. What are your ideas about solving these problems?

Anton.

carbonBased · Post by **carbonBased** » Wed Mar 03, 2004 12:00 am

The instruction to read the DT isn't privileged, no, but it can be made to result in a page fault by mapping that page to a non-present page.  In this respect, the VM can emulate the DT completely.

Hmm... yes, nevermind... you must obviously mean sidt and sgdt?  Hmm... I'm not really sure how this would work.  Like I said, however, there must be some co-operation between VMWare and the OS, so it's possible that the VM task is allowed to tell the OS to make all descriptor tables inaccessible to ring 3 applications (either through segmentation or the pager).

In this instance, any attempt to read from the DT will result in a page fault.  The OS can interpret this however it wants for its own applications, but for applications under the control of the VM, it will ignore it and pass the fault on to the VM, which can then properly return values from the fake descriptor table, rather than the real one.

Again, simply a guess.  The values returned from sidt and sgdt will still be wrong; they'll be the values of the _real_ IDT and GDT of the OS... and so if an OS so chose, it would be able to tell if it were under a VM if this method were to be used.

Of course... at the same time, why would you call SIDT if you actually knew where the IDT was supposed to be?

I dunno... it could work, anyway, but it's certainly not perfect.  I don't know what VMWare actually does.

Cheers,
Jeff

EyeScream · Post by **EyeScream** » Sat Mar 06, 2004 12:00 am

Well, I haven't been to the board for a while =)

I'm not sure about how _exactly_ VMWare works after all but I do believe that it has nothing to do with VM86 or any other native mode. It's quite understandable if you try to emulate something. I'll try to explain my vision of emulation.

First, let us declare the term "address space" as a contiguous array of memory which has addresses starting from 0 and onwards to its limit. Let us also address certain fixed locations in this memory as "registers".

Now, imagine that we have two address spaces (I am simplifying the things a bit but I guess it's OK in our case): address space M (with a limit of 4G) and address space R (the limit of which I am too lazy to calculate).

Now let's declare R[0] as a 32-bit register EAX, and at the same time as a 16-bit register AX and at the same time as an 8-bit register AL; and let's place an 8-bit AH register in R[1]. Continuing with all our IA-32 registers (EBX, ECX, EDX, EDI, ESI, EBP, ESP, LDTR, GDTR, IDTR, EIP, etc.) we may fill R with the complete set of them. The troubles begin when we want to declare segment registers along with their so-called "shadow" parts. The problem is that we don't know the actual shadow part format, so we can't declare it the same way we have it in a real CPU. But then, the solution is simple: who cares? Actually there are _no_ processor instructions that operate on those shadow registers directly; these parts are updated automatically by the processor when loading selectors. What we know about them is that at least the descriptor info is copied there. And that's all we need. So, we extend CS, DS, ES, FS, GS and SS to pseudo-registers ECS, EDS, EES, EFS, EGS and ESS which now will contain _both_ the selector and the descriptor part (which is our "guess" of shadow format).
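
A minimal C sketch of such a register space, assuming a flat byte array for R and an invented layout for the "guessed" shadow part of a segment register. The offsets, the `seg_reg` fields and the accessor names are all hypothetical; only the overlap of EAX/AX/AL/AH follows from the description above.

```c
#include <stdint.h>

/* Hypothetical byte offsets into R; only the first few registers are shown. */
enum { OFF_EAX = 0, OFF_ECX = 4, OFF_EDX = 8, OFF_EBX = 12, REGFILE_BYTES = 16 };

/* Selector plus our own guess at the hidden descriptor part. */
struct seg_reg {
    uint16_t selector;
    uint32_t base;
    uint32_t limit;
    uint16_t attrs;
};

struct regfile {
    uint8_t        R[REGFILE_BYTES]; /* general registers, stored little-endian */
    struct seg_reg seg[6];           /* ECS, EDS, EES, EFS, EGS, ESS */
};

/* EAX-sized access: four bytes starting at off. */
void reg_write32(struct regfile *rf, unsigned off, uint32_t v)
{
    rf->R[off]     = (uint8_t)v;
    rf->R[off + 1] = (uint8_t)(v >> 8);
    rf->R[off + 2] = (uint8_t)(v >> 16);
    rf->R[off + 3] = (uint8_t)(v >> 24);
}

uint32_t reg_read32(const struct regfile *rf, unsigned off)
{
    return (uint32_t)rf->R[off]
         | ((uint32_t)rf->R[off + 1] << 8)
         | ((uint32_t)rf->R[off + 2] << 16)
         | ((uint32_t)rf->R[off + 3] << 24);
}

/* AX-sized access: the low two bytes of the same storage. */
uint16_t reg_read16(const struct regfile *rf, unsigned off)
{
    return (uint16_t)(rf->R[off] | (rf->R[off + 1] << 8));
}

/* AL is R[OFF_EAX], AH is R[OFF_EAX + 1]. */
uint8_t reg_read8(const struct regfile *rf, unsigned off)
{
    return rf->R[off];
}
```

Writing the bytes out explicitly keeps the aliasing independent of the byte order of the machine the emulator itself runs on.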

But now, when we have those two address spaces, we are absolutely free to declare _any_ processor instruction in the terms of address space manipulation. Of course, this doesn't include various devices emulation but that's not the topic of our erm... discussion =)

For example, here's how we can define the SIDT instruction in a pseudo-language (still, M and R are our address spaces):

SIDT(r32 a)
{
a = R[IDTR];
}

Things are a bit harder when we do LIDT. We have to check for different security violations like the CPL value and so on, but it's all possible to do. Actually, Intel itself provides us with the solution: if you read its IA-32 Manual (which is freely downloadable and is a must-have for all programmers), each instruction is explained both in words and in a pseudolanguage there. In terms of our address spaces, we _can_ implement this pseudolanguage.
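
Here is one way the SIDT/LIDT pair might look in C, as a sketch under a few assumptions: 32-bit operand size (a 16-bit limit followed by a 32-bit base, six bytes in memory), a flat guest memory array, and a boolean return where `false` stands for "raise a fault in the guest". The `vcpu` structure and the `emu_sidt`/`emu_lidt` names are invented here. Note that the real SIDT stores to a memory operand rather than a register.

```c
#include <stdint.h>
#include <stdbool.h>

struct dtr { uint16_t limit; uint32_t base; };

/* Hypothetical virtual CPU: just the pieces these two instructions need. */
struct vcpu {
    struct dtr idtr;
    unsigned   cpl;       /* current privilege level, 0..3 */
    uint8_t   *mem;       /* guest memory (address space M) */
    uint32_t   mem_size;
};

/* True if the n bytes at addr lie inside guest memory (overflow-safe). */
static bool range_ok(const struct vcpu *c, uint32_t addr, uint32_t n)
{
    return n <= c->mem_size && addr <= c->mem_size - n;
}

/* SIDT m: allowed at any CPL; stores limit, then base, little-endian. */
bool emu_sidt(struct vcpu *c, uint32_t addr)
{
    if (!range_ok(c, addr, 6))
        return false;
    c->mem[addr]     = (uint8_t)c->idtr.limit;
    c->mem[addr + 1] = (uint8_t)(c->idtr.limit >> 8);
    for (int i = 0; i < 4; i++)
        c->mem[addr + 2 + i] = (uint8_t)(c->idtr.base >> (8 * i));
    return true;
}

/* LIDT m: privileged, so anything but CPL 0 must fault. */
bool emu_lidt(struct vcpu *c, uint32_t addr)
{
    if (c->cpl != 0 || !range_ok(c, addr, 6))
        return false;
    c->idtr.limit = (uint16_t)(c->mem[addr] | (c->mem[addr + 1] << 8));
    c->idtr.base = 0;
    for (int i = 0; i < 4; i++)
        c->idtr.base |= (uint32_t)c->mem[addr + 2 + i] << (8 * i);
    return true;
}
```

Because both functions work on the virtual IDTR only, the guest always reads back exactly what it loaded, whatever the host's real IDT looks like.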

So, we _can_ emulate an IA-32 CPU. Cool, ain't it? =)

This is my vision of how CPU emulation can be done. I believe that to some extent it's the way VMWare does it (and as for Bochs, I am sure the scheme is _quite_ similar). The actual problem is the implementation: we have to optimize the emulator code as well as possible because the emulation itself requires lots of CPU and we don't have much left for the guest OS. VMWare does a good optimization job here, as the guest OS runs quite well even with just 128M RAM.
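
To show what this style of emulation boils down to, here is a toy fetch-decode-execute loop in C that understands just three one-operand-free IA-32 encodings: `B8 imm32` (MOV EAX, imm32), `40` (INC EAX in 32-bit mode) and `F4` (HLT). It is a sketch only: flags, prefixes, ModR/M decoding and every other instruction are left out, and the `mini_cpu` names are made up. It is not taken from Bochs or VMWare.

```c
#include <stdint.h>
#include <stddef.h>

struct mini_cpu {
    uint32_t eax;
    uint32_t eip;
    int      halted;
};

/* Execute one instruction. Returns 0 on success, -1 on anything unknown. */
int mini_step(struct mini_cpu *c, const uint8_t *mem, size_t size)
{
    if (c->eip >= size)
        return -1;

    switch (mem[c->eip]) {
    case 0xB8:                                  /* MOV EAX, imm32 */
        if (size - c->eip < 5)
            return -1;
        c->eax = (uint32_t)mem[c->eip + 1]
               | ((uint32_t)mem[c->eip + 2] << 8)
               | ((uint32_t)mem[c->eip + 3] << 16)
               | ((uint32_t)mem[c->eip + 4] << 24);
        c->eip += 5;
        return 0;
    case 0x40:                                  /* INC EAX (flags ignored here) */
        c->eax += 1;
        c->eip += 1;
        return 0;
    case 0xF4:                                  /* HLT */
        c->halted = 1;
        c->eip += 1;
        return 0;
    default:
        return -1;                              /* not implemented in this sketch */
    }
}

/* Run until HLT or until an unknown opcode is hit. */
int mini_run(struct mini_cpu *c, const uint8_t *mem, size_t size)
{
    while (!c->halted)
        if (mini_step(c, mem, size) != 0)
            return -1;
    return 0;
}
```

Every guest instruction costs a load, a switch and several host instructions, which is why a pure interpreter of this kind is so much slower than running the code natively.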

Best regards,
EyeScream.

anton · Post by **anton** » Sun Mar 07, 2004 12:00 am

VMWare is not an emulator: you can't recompile VMWare for Sun systems and run Windows under it. Bochs is: you can recompile it for Sun and run x86 programs. So it is not optimization that makes VMWare run faster, it's the (patented) virtualization technology.

Now about:
//For example, here's how we can define the SIDT instruction in a pseudo-language (still, M and R are our address spaces):
//
//SIDT(r32 a)
//{
// a = R[IDTR];
//}
So I guess the binary is first scanned, and every such instruction (system, but not privileged) is replaced with equivalent code.
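
A very naive sketch in C of what such a scan could look like, restricted to SGDT (`0F 01 /0`) and SIDT (`0F 01 /1`) with a memory operand. The function name `find_sgdt_sidt` is made up, and the byte-by-byte search is a big simplification: it ignores instruction boundaries, so it can report byte sequences that are really data or the tail of another instruction. A real scanner would have to decode instructions properly.

```c
#include <stdint.h>
#include <stddef.h>

/*
 * Look for the byte pattern 0F 01 followed by a ModR/M byte whose reg
 * field is 0 (SGDT) or 1 (SIDT) and whose mod field is not 3 (both
 * instructions require a memory operand).
 * Stores up to max offsets in out and returns the total number found.
 */
size_t find_sgdt_sidt(const uint8_t *code, size_t len, size_t *out, size_t max)
{
    size_t found = 0;

    for (size_t i = 0; i + 2 < len; i++) {
        if (code[i] != 0x0F || code[i + 1] != 0x01)
            continue;

        unsigned mod = code[i + 2] >> 6;
        unsigned reg = (code[i + 2] >> 3) & 7;

        if (mod != 3 && (reg == 0 || reg == 1)) {
            if (found < max)
                out[found] = i;
            found++;
        }
    }
    return found;
}
```

Each reported offset would then be a candidate for patching with a jump or trap into the monitor.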
Anton.

EyeScream · Post by **EyeScream** » Sun Mar 07, 2004 12:00 am

Well, perhaps it's just about VMWare. Then I guess the kernel modules that come with the Linux version of VMWare do make sense.

Best regards,
EyeScream.

Anton · Post by **Anton** » Sun Mar 07, 2004 12:00 am

//Then I guess the kernel modules that come with the Linux version of VMWare
//do make sense.
Kernel modules let VMWare execute code at ring 0; that's all there is to it. VMWare consists of a monitor (a regular application) running in ring 3 and a driver (a module on Linux) running in ring 0. The two communicate, and the driver executes all the privileged instructions the monitor needs.
Anton.

OSDev.org
