8086 emulator - How do I implement Instructions?

tree5673 · Post by **tree5673** » Thu Jan 28, 2021 4:06 pm

I have been working on emulators and interpreters for a long time now.
My current project is an Intel 8086 emulator.
It is mainly instruction set oriented and I do not plan to add virtual devices and interrupts.

At this point, I have implemented register and memory I/O.

My recent projects used a while loop that ran until a halt code.
The loop would contain if statements that would use bit-wise operations as conditions to
read the opcode and find out which instruction to execute. The body of the if statements
would be the instructions themselves. The body could further decode the instruction.

I used this model in my 8080 emulator, which was successful.

For my 8086 project, this method seems too difficult and error prone.

How should I implement the instruction set?

8086 project
https://repl.it/@tree5673/x86-2

8080 project
https://repl.it/@tree5673/basicvm-1222#main.c

nullplan · Post by **nullplan** » Sat Jan 30, 2021 1:07 am

x86 instructions are byte based. There are a number of optional prefixes, then an opcode byte (or two), and then an operand encoding, typically a ModR/M byte that might be followed by an explicit displacement. I would probably go with a decoding buffer: continuously read data into it until you have a complete instruction. Then interpret the instruction and move IP past it. That also makes it simpler if you want to implement interrupts later on, as you will not have IP set to in-between values in the middle of decoding. An x86 instruction is at most 15 bytes long, but that is the modern definition from AMD64. For 8086 I don't know the limit, it is likely smaller.
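
A minimal sketch of the decoding-buffer idea in C, covering only three opcodes (NOP, HLT, and MOV r16, imm16). The `Cpu` struct, `fetch_insn`, and `step` are hypothetical names invented for this illustration, and memory is assumed to be a flat 64 KiB array with segmentation ignored. IP is only advanced once the whole instruction has been fetched and interpreted.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t regs[8];    /* AX, CX, DX, BX, SP, BP, SI, DI */
    uint16_t ip;
    int      halted;
    uint8_t  mem[65536]; /* flat memory, segmentation ignored in this sketch */
} Cpu;

/* Copy one instruction into buf without touching cpu->ip.
   Returns its length, or 0 for an opcode this sketch does not handle. */
static size_t fetch_insn(const Cpu *cpu, uint8_t *buf)
{
    uint16_t p = cpu->ip;
    buf[0] = cpu->mem[p];
    if (buf[0] == 0x90 || buf[0] == 0xF4)      /* NOP, HLT: 1 byte */
        return 1;
    if (buf[0] >= 0xB8 && buf[0] <= 0xBF) {    /* MOV r16, imm16: 3 bytes */
        buf[1] = cpu->mem[(uint16_t)(p + 1)];
        buf[2] = cpu->mem[(uint16_t)(p + 2)];
        return 3;
    }
    return 0;
}

/* Execute one instruction. Returns 0 on success, -1 if it is unknown. */
static int step(Cpu *cpu)
{
    uint8_t buf[8];
    size_t len = fetch_insn(cpu, buf);
    if (len == 0)
        return -1;                 /* IP still points at the instruction start */
    if (buf[0] == 0xF4)
        cpu->halted = 1;
    else if (buf[0] >= 0xB8 && buf[0] <= 0xBF)
        cpu->regs[buf[0] & 7] = (uint16_t)(buf[1] | (buf[2] << 8));
    cpu->ip = (uint16_t)(cpu->ip + len);   /* commit IP last */
    return 0;
}
```

Because `cpu->ip` is written in exactly one place, an interrupt check placed between calls to `step` never sees a half-decoded instruction.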

alexfru · Post by **alexfru** » Sat Jan 30, 2021 2:49 am

Learn the 8086 instruction encoding. If you stick to 8086 and avoid 80186 and newer, it's not too bad.

There are at most 256 opcodes (all opcodes are single-byte) on the 8086.
Every opcode has an encoding and operation associated with it.

So, you can define a table of 256 elements, where you put some description of how to decode every instruction and how to emulate it (in principle, you could have the table simply point to 256 functions, each of which would then decode and emulate one opcode).
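
As a rough sketch of the "256 functions" variant in C: the table below maps every opcode byte to a handler and defaults to a catch-all. All names (`Cpu`, `OpHandler`, `op_table`, `init_table`, `run`) are hypothetical, only INC r16, NOP and HLT are filled in, and flag updates are left out.

```c
#include <stdint.h>

typedef struct Cpu Cpu;
typedef void (*OpHandler)(Cpu *cpu, uint8_t opcode);

struct Cpu {
    uint16_t regs[8];    /* AX, CX, DX, BX, SP, BP, SI, DI */
    uint16_t ip;
    int      halted;
    uint8_t  mem[65536];
};

static uint8_t fetch8(Cpu *cpu) { return cpu->mem[cpu->ip++]; }

/* Catch-all for opcodes this sketch does not implement: just stop. */
static void op_unimplemented(Cpu *cpu, uint8_t opcode)
{
    (void)opcode;
    cpu->halted = 1;
}

static void op_nop(Cpu *cpu, uint8_t opcode) { (void)cpu; (void)opcode; }

static void op_hlt(Cpu *cpu, uint8_t opcode) { (void)opcode; cpu->halted = 1; }

/* INC r16 is 0x40 + register number; flag updates are omitted here. */
static void op_inc_r16(Cpu *cpu, uint8_t opcode) { cpu->regs[opcode & 7]++; }

static OpHandler op_table[256];

static void init_table(void)
{
    for (int i = 0; i < 256; i++)
        op_table[i] = op_unimplemented;
    for (int r = 0; r < 8; r++)
        op_table[0x40 + r] = op_inc_r16;
    op_table[0x90] = op_nop;
    op_table[0xF4] = op_hlt;
}

static void run(Cpu *cpu)
{
    while (!cpu->halted) {
        uint8_t op = fetch8(cpu);
        op_table[op](cpu, op);   /* one indirect call replaces the if chain */
    }
}
```

Passing the opcode byte to the handler lets one function serve a whole row of encodings such as `0x40..0x47`.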

But before you get to use that table, you need to consume and record all the instruction prefixes that come before the opcode byte (an instruction may be prefixed with multiple prefixes; every prefix is single-byte). They will be used during the opcode emulation.
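
Prefix handling could look roughly like this in C. The 8086 prefix bytes are the four segment overrides (0x26, 0x2E, 0x36, 0x3E), LOCK (0xF0), and REPNE/REP (0xF2/0xF3). The `Prefixes` struct and `consume_prefixes` are hypothetical names, and letting a later prefix of the same kind overwrite an earlier one is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

enum { SEG_NONE = -1, SEG_ES = 0, SEG_CS = 1, SEG_SS = 2, SEG_DS = 3 };

typedef struct {
    int seg_override;  /* SEG_NONE or one of SEG_ES..SEG_DS */
    int rep;           /* 0, or the prefix byte seen: 0xF2 (REPNE) / 0xF3 (REP) */
    int lock;          /* nonzero if a LOCK prefix was seen */
} Prefixes;

/* Scan prefix bytes at the start of code[0..len).
   Returns how many bytes were prefixes, so code[result] is the opcode. */
static size_t consume_prefixes(const uint8_t *code, size_t len, Prefixes *out)
{
    size_t i;
    out->seg_override = SEG_NONE;
    out->rep = 0;
    out->lock = 0;
    for (i = 0; i < len; i++) {
        uint8_t b = code[i];
        switch (b) {
        case 0x26: case 0x2E: case 0x36: case 0x3E:
            out->seg_override = (b >> 3) & 3;  /* bits 3-4 select ES/CS/SS/DS */
            break;
        case 0xF0:
            out->lock = 1;
            break;
        case 0xF2: case 0xF3:
            out->rep = b;
            break;
        default:
            return i;   /* first non-prefix byte is the opcode */
        }
    }
    return i;
}
```

For the byte sequence `F3 26 A4` (REP MOVSB with an ES override) this reports two prefix bytes and leaves the opcode `A4` for the table lookup.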

Following the opcode byte may be various operands. The CPU manual tells what operands (if any) every opcode has.

Many opcodes are immediately followed by the MOD R/M byte, which can encode the following kinds of operand pairs:
register, register
register, memory
memory, register
Depending on the kind of the memory operand in the MOD R/M byte, a displacement (1 or 2 bytes) may follow the MOD R/M byte.
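
The MOD R/M rules above can be sketched as one helper in C. The `ModRM` struct and `decode_modrm` are hypothetical names; the helper returns only the 16-bit offset and leaves segment selection (for example SS as the default for BP-based modes) to the caller.

```c
#include <stddef.h>
#include <stdint.h>

enum { AX, CX, DX, BX, SP, BP, SI, DI };

typedef struct {
    uint8_t  mod, reg, rm;
    int      is_reg;  /* mod == 3: rm names a register, ea is unused */
    uint16_t ea;      /* effective offset when !is_reg */
    size_t   len;     /* bytes consumed: MOD R/M byte plus displacement */
} ModRM;

/* p points at the MOD R/M byte; regs holds the current 16-bit registers. */
static ModRM decode_modrm(const uint8_t *p, const uint16_t regs[8])
{
    ModRM m;
    uint16_t base;

    m.mod = p[0] >> 6;
    m.reg = (p[0] >> 3) & 7;
    m.rm  = p[0] & 7;
    m.is_reg = (m.mod == 3);
    m.ea = 0;
    m.len = 1;
    if (m.is_reg)
        return m;

    switch (m.rm) {
    case 0:  base = (uint16_t)(regs[BX] + regs[SI]); break;
    case 1:  base = (uint16_t)(regs[BX] + regs[DI]); break;
    case 2:  base = (uint16_t)(regs[BP] + regs[SI]); break;
    case 3:  base = (uint16_t)(regs[BP] + regs[DI]); break;
    case 4:  base = regs[SI]; break;
    case 5:  base = regs[DI]; break;
    case 6:  base = (m.mod == 0) ? 0 : regs[BP]; break; /* mod 0: direct address */
    default: base = regs[BX]; break;
    }

    if (m.mod == 1) {                       /* 8-bit displacement, sign-extended */
        int disp = p[1];
        if (disp & 0x80)
            disp -= 0x100;
        base = (uint16_t)(base + disp);
        m.len = 2;
    } else if (m.mod == 2 || m.rm == 6) {   /* 16-bit displacement or direct address */
        base = (uint16_t)(base + (p[1] | (p[2] << 8)));
        m.len = 3;
    }
    m.ea = base;
    return m;
}
```

Returning `len` lets the caller find the immediate operand, if any, right after the displacement.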

In some instructions the MOD R/M byte isn't used to encode two operands. It's used to encode just one operand (either register or memory), and the "spared" 3 bits of the MOD R/M byte are essentially an extension of the opcode byte. That is, until you decode the MOD R/M byte, you won't know precisely what operation is behind the opcode.
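
One concrete case is opcode 0x81, where the middle 3 bits of the MOD R/M byte pick among ADD, OR, ADC, SBB, AND, SUB, XOR and CMP with a 16-bit immediate. The C sketch below handles only the register form (mod = 3) and skips flag updates; `exec_81` and its `carry_in` parameter are hypothetical.

```c
#include <stdint.h>

/* Execute opcode 0x81 (r/m16, imm16) for a register operand.
   p points at the opcode byte. Returns the instruction length,
   or 0 for the memory forms this sketch does not handle. */
static unsigned exec_81(const uint8_t *p, uint16_t regs[8], int carry_in)
{
    uint8_t  modrm = p[1];
    unsigned ext   = (modrm >> 3) & 7;   /* opcode extension field */
    uint16_t *dst  = &regs[modrm & 7];
    uint16_t imm   = (uint16_t)(p[2] | (p[3] << 8));
    unsigned c     = carry_in ? 1u : 0u;

    if ((modrm >> 6) != 3)
        return 0;

    switch (ext) {
    case 0: *dst = (uint16_t)(*dst + imm);     break;  /* ADD */
    case 1: *dst = (uint16_t)(*dst | imm);     break;  /* OR  */
    case 2: *dst = (uint16_t)(*dst + imm + c); break;  /* ADC */
    case 3: *dst = (uint16_t)(*dst - imm - c); break;  /* SBB */
    case 4: *dst = (uint16_t)(*dst & imm);     break;  /* AND */
    case 5: *dst = (uint16_t)(*dst - imm);     break;  /* SUB */
    case 6: *dst = (uint16_t)(*dst ^ imm);     break;  /* XOR */
    case 7: /* CMP only sets flags, which this sketch omits */ break;
    }
    return 4;
}
```

In a table-driven design the entry for 0x81 would point at a routine like this, which then performs the second-level dispatch.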

At the end of the instruction may be an immediate operand (another 1 or 2 bytes).

So, you need to be able to consume, record and decode all these common instruction parts:
prefixes, opcode, MOD R/M byte, displacement, immediate.
Write helper routines for this and use and reuse them.

rdos · Post by **rdos** » Sat Jan 30, 2021 2:58 am

I wrote an x86 emulator in 386 assembly and then executed the instructions themselves with modified addressing modes to provide the correct effects on flags, which might otherwise be tricky (and slow) to achieve with C. Source: http://rdos.net/vc/viewvc.cgi/rdos/trun ... b/emulate/
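
For comparison, computing the flags by hand in C is possible but fiddly. The sketch below is not taken from the linked source; it is a hypothetical `add8` helper showing the six arithmetic flags for an 8-bit ADD, using the standard FLAGS bit positions.

```c
#include <stdint.h>

enum {
    CF = 0x0001,  /* carry */
    PF = 0x0004,  /* parity of the low result byte */
    AF = 0x0010,  /* carry out of bit 3 */
    ZF = 0x0040,  /* zero */
    SF = 0x0080,  /* sign */
    OF = 0x0800   /* signed overflow */
};

/* 8-bit ADD: returns a + b and updates the arithmetic flags in *flags. */
static uint8_t add8(uint8_t a, uint8_t b, uint16_t *flags)
{
    unsigned wide = (unsigned)a + b;
    uint8_t  r = (uint8_t)wide;
    uint8_t  p = r;
    uint16_t f = (uint16_t)(*flags & ~(CF | PF | AF | ZF | SF | OF));

    if (wide & 0x100)          f |= CF;
    if ((a ^ b ^ r) & 0x10)    f |= AF;
    if (r == 0)                f |= ZF;
    if (r & 0x80)              f |= SF;
    /* Overflow: operands share a sign that the result does not. */
    if (~(a ^ b) & (a ^ r) & 0x80) f |= OF;

    /* Fold the byte onto bit 0; PF is set for an even number of 1 bits. */
    p ^= p >> 4;
    p ^= p >> 2;
    p ^= p >> 1;
    if (!(p & 1))              f |= PF;

    *flags = f;
    return r;
}
```

SUB, ADC, SBB, INC, DEC and the 16-bit variants each need their own small variation of these rules, which is where the error-proneness comes from.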

The emulator is also used in the post-panic debugger of my OS, but then is linked with the live per-core register & memory context of the physical machine.

moonchild · Post by **moonchild** » Sat Jan 30, 2021 4:39 am

nullplan wrote:Then interpret the instruction and move IP past it.

Correction: move the IP, then interpret the instruction :)

rdos · Post by **rdos** » Sat Jan 30, 2021 8:49 am

moonchild wrote:
nullplan wrote:Then interpret the instruction and move IP past it.
Correction: move the IP, then interpret the instruction

Nope. If an exception occurs during execution of the instruction, the fault handler must push the start of the instruction, not the next one.

alexfru · Post by **alexfru** » Sat Jan 30, 2021 4:47 pm

rdos wrote:
moonchild wrote:
nullplan wrote:Then interpret the instruction and move IP past it.
Correction: move the IP, then interpret the instruction
Nope. If an exception occurs during execution of the instruction, the fault handler must push the start of the instruction, not the next one.

You need both (IP and IP+instruction length) for different purposes. Relative jumps and calls need to have the address of the next instruction.
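
A small C sketch of keeping both values around: `start` is what gets reported when something goes wrong, and `next` is the base for a relative jump. The `Cpu` fields `fault_ip` and `faulted` are hypothetical, and the "fault" here is simply an opcode the sketch does not know.

```c
#include <stdint.h>

typedef struct {
    uint16_t ip;
    uint16_t fault_ip;   /* start of the instruction that could not be executed */
    int      faulted;
    uint8_t  mem[65536];
} Cpu;

static void step(Cpu *cpu)
{
    uint16_t start = cpu->ip;           /* needed if the instruction faults */
    uint8_t  op = cpu->mem[start];

    if (op == 0xEB) {                   /* JMP rel8 */
        uint16_t next = (uint16_t)(start + 2);   /* address after this instruction */
        int disp = cpu->mem[(uint16_t)(start + 1)];
        if (disp & 0x80)
            disp -= 0x100;              /* sign-extend the displacement */
        cpu->ip = (uint16_t)(next + disp);       /* target is relative to next */
    } else if (op == 0x90) {            /* NOP */
        cpu->ip = (uint16_t)(start + 1);
    } else {
        cpu->faulted = 1;
        cpu->fault_ip = start;          /* ip is left at the instruction start */
    }
}
```

With this layout `EB FE` jumps to itself, because the displacement of -2 is applied to the address after the two-byte instruction.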

Octocontrabass · Post by **Octocontrabass** » Sat Jan 30, 2021 7:27 pm

nullplan wrote:An x86 instruction is at most 15 bytes long, but that is the modern definition from AMD64. For 8086 I don't know the limit, it is likely smaller.

For the 8086 and 80186, there is no limit. Instructions may be infinitely long if you keep adding prefixes. I suspect these CPUs treat prefixes as separate instructions that set internal flags to be used by the following "real" instruction. The 80186 manual notes that an interrupt during a repeated string instruction may not push the correct return address if there are redundant prefixes, which suggests that these CPUs only track whether or not they have seen the prefix and not where the first byte of the instruction is located.

moonchild · Post by **moonchild** » Sat Jan 30, 2021 7:38 pm

rdos wrote:
moonchild wrote:
nullplan wrote:Then interpret the instruction and move IP past it.
Correction: move the IP, then interpret the instruction :)
Nope. If an exception occurs during execution of the instruction, the fault handler must push the start of the instruction, not the next one.

Exceptions are likely to be much rarer than calls and jumps. (They are, after all, exceptional.) So, better to be optimistic and, in the case when an exception does happen, roll back the ip.

rdos · Post by **rdos** » Thu Feb 04, 2021 5:03 am

moonchild wrote: Exceptions are likely to be much rarer than calls and jumps. (They are, after all, exceptional.) So, better to be optimistic and, in the case when an exception does happen, roll back the ip.

Exceptions are defined to save the ip of the faulting instruction and not the next one because then the exception handler can reexecute the instruction. If the next one were saved instead, the exception handler could not reliably reexecute it, since there is no simple way of rolling back the ip.

Also, some exceptions, like page faults caused by lazy allocations or copy-on-write, are not really exceptional.

xeyes · Post by **xeyes** » Thu Feb 11, 2021 6:25 am

tree5673 wrote: My recent projects used a while loop that ran until a halt code.
The loop would contain if statements that would use bit-wise operations as conditions to
read the opcode and find out which instruction to execute. The body of the if statements
would be the instructions themselves. The body could further decode the instruction.

I used this model in my 8080 emulator, which was successful.

That's always the idea (loop till halt).

An obvious way to optimize is to plan for a decoder cache, as you don't want to repeatedly decode the same instructions, especially for something as complex as x86. AFAIK even real hardware makes use of this method, i.e. it decodes the architectural instructions and caches the resulting uops.
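
A decoder cache might look roughly like this in C: a direct-mapped table keyed by instruction address, with a counter to show when the slow path runs. Every name here (`Decoded`, `CacheEntry`, `decode_slow`, `decode_cached`, `write8`) is hypothetical, and flushing the whole cache on every memory write is a deliberately crude way to stay correct with self-modifying code.

```c
#include <stdint.h>
#include <string.h>

typedef struct { uint8_t opcode; uint8_t length; uint16_t imm; } Decoded;
typedef struct { int valid; uint16_t addr; Decoded insn; } CacheEntry;

#define CACHE_SIZE 256
static CacheEntry cache[CACHE_SIZE];
static unsigned decode_count;   /* how often the slow decoder ran */

/* Stand-in for the full decoder: knows MOV r16, imm16, treats the rest as 1 byte. */
static Decoded decode_slow(const uint8_t *mem, uint16_t addr)
{
    Decoded d = { mem[addr], 1, 0 };
    decode_count++;
    if (d.opcode >= 0xB8 && d.opcode <= 0xBF) {
        d.length = 3;
        d.imm = (uint16_t)(mem[(uint16_t)(addr + 1)] |
                           (mem[(uint16_t)(addr + 2)] << 8));
    }
    return d;
}

/* Return the decoded form of the instruction at addr, decoding only on a miss. */
static const Decoded *decode_cached(const uint8_t *mem, uint16_t addr)
{
    CacheEntry *e = &cache[addr % CACHE_SIZE];
    if (!e->valid || e->addr != addr) {
        e->insn = decode_slow(mem, addr);
        e->addr = addr;
        e->valid = 1;
    }
    return &e->insn;
}

/* All emulated memory writes go through here so stale entries never survive. */
static void write8(uint8_t *mem, uint16_t addr, uint8_t value)
{
    mem[addr] = value;
    memset(cache, 0, sizeof cache);
}
```

A less blunt design would invalidate only the entries that overlap the written byte, or track which regions of memory currently hold cached code.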

tree5673 wrote: For my 8086 project, this method seems too difficult and error prone.

How should I implement the instruction set?

Why not steal another idea from the real hardware? Implement uops (aka simple helper functions that aren't too difficult and error prone), decode x86 instructions into them and run them instead. Would fit nicely with the cache idea as well later if you want to take that route.
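
To make the uop idea concrete, here is a hedged C sketch that splits `ADD [disp16], r16` (opcode 0x01 with a direct-address MOD R/M byte) into load, add and store steps. The uop set, the `tmp` scratch register and all function names are invented for this example, and flags are ignored.

```c
#include <stdint.h>

typedef enum { UOP_LOAD16, UOP_ADD16, UOP_STORE16 } UopKind;

typedef struct {
    UopKind  kind;
    uint8_t  reg;    /* source register for UOP_ADD16 */
    uint16_t addr;   /* memory offset for UOP_LOAD16 / UOP_STORE16 */
} Uop;

typedef struct {
    uint16_t regs[8];
    uint16_t tmp;        /* internal scratch register, invisible to the guest */
    uint8_t  mem[65536];
} Cpu;

/* Decode "ADD [disp16], r16" into three uops.
   Returns the uop count, or 0 for anything else. */
static int decode_to_uops(const uint8_t *p, Uop out[3])
{
    if (p[0] != 0x01 || (p[1] & 0xC7) != 0x06)  /* need mod=00, rm=110 */
        return 0;
    uint8_t  reg  = (p[1] >> 3) & 7;
    uint16_t addr = (uint16_t)(p[2] | (p[3] << 8));
    out[0] = (Uop){ UOP_LOAD16,  0,   addr };
    out[1] = (Uop){ UOP_ADD16,   reg, 0    };
    out[2] = (Uop){ UOP_STORE16, 0,   addr };
    return 3;
}

static void run_uop(Cpu *cpu, const Uop *u)
{
    uint16_t hi = (uint16_t)(u->addr + 1);
    switch (u->kind) {
    case UOP_LOAD16:
        cpu->tmp = (uint16_t)(cpu->mem[u->addr] | (cpu->mem[hi] << 8));
        break;
    case UOP_ADD16:
        cpu->tmp = (uint16_t)(cpu->tmp + cpu->regs[u->reg]);
        break;
    case UOP_STORE16:
        cpu->mem[u->addr] = (uint8_t)(cpu->tmp & 0xFF);
        cpu->mem[hi]      = (uint8_t)(cpu->tmp >> 8);
        break;
    }
}
```

Each uop is simple enough to test in isolation, and the decoded `Uop` arrays are exactly what a decoder cache would store.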

Also take a look here: https://codegolf.stackexchange.com/ques ... l-8086-cpu

There are many fancy ways to "abuse" various language features; most answers are only a few hundred SLOC with proper spacing and comments.

But I decided to cheat instead

Schol-R-LEA · Post by **Schol-R-LEA** » Sat Feb 13, 2021 2:38 pm

Are you intent on strict emulation by interpretation, or have you considered dynamic binary translation (which is how many high-performance emulators work, IIUC including QEMU in cross-emulation mode)? This post discusses the topic at length.

Thomas · Post by **Thomas** » Tue Feb 16, 2021 5:29 am

Hi tree5673,

There are many ways you could do this. A simple way would be using a simple fetch, decode, execute loop. This is slow but relatively easy to implement. Another approach would be to use binary translation. Yet another approach would be to implement the processor logic in an FPGA.

You may use fake86 and 8086 tiny as reference implementations. They use the simple fetch, decode, execute loop approach and get reasonable performance.

--Thomas

bzt · Post by **bzt** » Tue Feb 16, 2021 8:14 am

tree5673 wrote:I have been working on emulators and interpreters for a long time now.

tree5673 wrote:How should I implement the instruction set?

What do you mean? If you've really been working on emulators and interpreters for a long time, this shouldn't be an issue... You write an interpreter which happens to interpret x86 bytecode. The instructions are listed and explained in great detail in the Intel manuals. Start with chapter "Vol. 2A 2-1 CHAPTER 2 INSTRUCTION FORMAT".

I'd recommend to first write a disassembler, then you'll see how instruction encoding goes.
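
A disassembler can start very small. The C sketch below recognises four encodings and falls back to a `db` line for everything else; `disasm_one` and the output format are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

static const char *const reg16[8] = {
    "ax", "cx", "dx", "bx", "sp", "bp", "si", "di"
};

/* Write the text of the instruction at p into out.
   Returns how many bytes the instruction occupies. */
static size_t disasm_one(const uint8_t *p, char *out, size_t outsz)
{
    uint8_t op = p[0];

    if (op == 0x90) { snprintf(out, outsz, "nop"); return 1; }
    if (op == 0xF4) { snprintf(out, outsz, "hlt"); return 1; }
    if ((op & 0xF8) == 0x40) {                 /* 0x40..0x47: INC r16 */
        snprintf(out, outsz, "inc %s", reg16[op & 7]);
        return 1;
    }
    if ((op & 0xF8) == 0xB8) {                 /* 0xB8..0xBF: MOV r16, imm16 */
        snprintf(out, outsz, "mov %s, 0x%04x", reg16[op & 7],
                 (unsigned)(p[1] | (p[2] << 8)));
        return 3;
    }
    snprintf(out, outsz, "db 0x%02x", (unsigned)op);  /* not decoded yet */
    return 1;
}
```

Growing this function opcode by opcode forces you through prefixes, MOD R/M and immediates, and the same decoding helpers can later feed the emulator.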

But FYI, others have done this (many times actually), here's one with full source code: https://github.com/wfeldt/libx86emu, and another (very compact one) https://github.com/stephenrkell/libx86emulate.

Cheers,
bzt

OSDev.org
