BIOS Calls in Emulators
Hi,
I'm wondering how this works:
You write a real mode emulator and let it run in protected mode, you tell it to execute int 10h, and somehow it magically does a VBE mode switch.
Can somebody explain this magic part to me?
Also, what is the minimum set of instructions I would have to emulate in order to execute the int 10h interrupt handling code?
Why am I doing this? Well, this topic interests me and sounds like a fun project (x86 Real Mode Emulator for Mode Switching).
OS: Basic OS
About: 32 Bit Monolithic Kernel Written in C++ and Assembly, Custom FAT 32 Bootloader
- Combuster
Re: BIOS Calls in Emulators
An emulator - such as libx86emu which was specifically written for just this - executes code pretending to be a real machine. It is a closed system - if you want it to affect the real machine, then the emulated one has to forward to the real one, often by sharing I/O ports, BIOS and video card memory ranges.
The instructions you need? No two video cards are equal. Be prepared to support at least everything a 486 has, protected mode and everything.
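To make the "forward to the real one" part concrete, here is a minimal sketch in C of how an emulator might pass port I/O through to the host. Everything named here (`host_io_t`, `cpu_t`, `emu_out_dx_al`, `emu_in_al_dx`, the `fake_*` backend) is hypothetical and only for illustration; it is not how libx86emu does it. In a kernel the two hooks would wrap the real `in`/`out` instructions, while the fake backend below just lets the sketch run in user space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host hooks: in a kernel these would wrap the real
   in/out instructions; here they can be pointed at a mock. */
typedef struct {
    uint8_t (*inb)(uint16_t port);
    void    (*outb)(uint16_t port, uint8_t value);
} host_io_t;

/* Minimal emulated CPU state (only what this sketch needs). */
typedef struct {
    uint16_t ax, dx;
    const host_io_t *io;
} cpu_t;

/* Emulate "out dx, al" (opcode 0xEE): forward AL to the host port in DX. */
static void emu_out_dx_al(cpu_t *cpu)
{
    assert(cpu != NULL && cpu->io != NULL);
    cpu->io->outb(cpu->dx, (uint8_t)(cpu->ax & 0xFF));
}

/* Emulate "in al, dx" (opcode 0xEC): read the host port in DX into AL. */
static void emu_in_al_dx(cpu_t *cpu)
{
    assert(cpu != NULL && cpu->io != NULL);
    cpu->ax = (uint16_t)((cpu->ax & 0xFF00) | cpu->io->inb(cpu->dx));
}

/* Stand-in backend for trying the sketch in user space: a fake port array. */
static uint8_t fake_ports[0x10000];
static uint8_t fake_inb(uint16_t port) { return fake_ports[port]; }
static void fake_outb(uint16_t port, uint8_t value) { fake_ports[port] = value; }
static const host_io_t fake_io = { fake_inb, fake_outb };
```

The same idea applies to memory: reads and writes that hit the video BIOS or the video memory window would be mapped onto the real ranges instead of a private buffer.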
Re: BIOS Calls in Emulators
Hi Octacone,
I remember reading this older post, which contains a lot of explanation and useful information.
viewtopic.php?f=1&t=22363
Re: BIOS Calls in Emulators
Combuster wrote:
> An emulator - such as libx86emu which was specifically written for just this - executes code pretending to be a real machine. It is a closed system - if you want it to affect the real machine, then the emulated one has to forward to the real one, often by sharing I/O ports, BIOS and video card memory ranges.
> The instructions you need? No two video cards are equal. Be prepared to support at least everything a 486 has, protected mode and everything.
Yup, after some deeper digging, it looks like I/O ports are the main thing used for mode setting internally. The interrupt code itself just generates the values that need to be forwarded through the ports.
I also found out that protected mode support is not required and that all the code is 16 bit for compatibility reasons.
zity wrote:
> Hi Octacone,
> I remember reading this older post, which contains a lot of explanation and useful information.
> viewtopic.php?f=1&t=22363
That was super useful.
I've started working on my emulator since. The thing that bothers me is: how do I handle a single opcode that maps to multiple instructions?
E.g. 0x80 can be add, adc, and, xor, or, sbb, sub, cmp.
Edit: I just discovered that opcodes are not just "randomly assigned numbers by Intel"; there is a whole lot going on. Prefix bytes, the ModR/M byte, SIB... a lot more than I initially thought.
Re: BIOS Calls in Emulators
mariuszp wrote:
> Each opcode is one instruction (though it might have different mnemonics in assembly to make things more clear). Sometimes prefix + opcode is a different instruction; that's specified explicitly where necessary.
> As for the 0x80 thing: the ModR/M byte following the 0x80 opcode has 3 unused bits (the reg field, where normally you would specify a register operand) because it doesn't need 2 register operands. For example, for "XOR r/m8, imm8" the encoding is "80 /6 ib", meaning the byte 0x80 is followed by a ModR/M byte where those 3 bits are assigned the value "6", and then an immediate byte. The other instructions have different values in those 3 bits. As such, these 3 bits are actually an extension of the opcode, and that's how you differentiate them.
> (I wrote an x86 assembler as part of a project once, these things do get quite confusing. Trying to actually emulate these instructions is a whole new level of complex altogether)
It is a real pain, so many variations. I've been reading on this topic for days and I'm still struggling to catch up.
There are just not many resources on 8086 instruction decoding.
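The "/digit" opcode extension described above can be sketched in a few lines of C. This is only an illustration under my own naming (`grp1_op_t`, `grp1_decode`, `grp1_exec8` and the `modrm_*` helpers are hypothetical), and it leaves out flag updates entirely, which a real emulator cannot do. For opcode 0x80 the reg field of the ModR/M byte selects the operation, in the order /0 ADD, /1 OR, /2 ADC, /3 SBB, /4 AND, /5 SUB, /6 XOR, /7 CMP.

```c
#include <assert.h>
#include <stdint.h>

/* Group 1 operations, numbered by the reg field (bits 5..3) of ModR/M. */
typedef enum {
    OP_ADD, OP_OR, OP_ADC, OP_SBB, OP_AND, OP_SUB, OP_XOR, OP_CMP
} grp1_op_t;

/* Split a ModR/M byte into its three fields. */
static unsigned modrm_mod(uint8_t m) { return (m >> 6) & 3u; }
static unsigned modrm_reg(uint8_t m) { return (m >> 3) & 7u; }
static unsigned modrm_rm(uint8_t m)  { return m & 7u; }

/* For opcodes 0x80..0x83 the reg field is an opcode extension. */
static grp1_op_t grp1_decode(uint8_t modrm)
{
    return (grp1_op_t)modrm_reg(modrm);
}

/* Apply an 8-bit group 1 operation and return the new destination value.
   carry_in is the current carry flag (0 or 1). Flags are NOT computed here. */
static uint8_t grp1_exec8(grp1_op_t op, uint8_t dst, uint8_t imm, unsigned carry_in)
{
    assert(carry_in <= 1u);
    switch (op) {
    case OP_ADD: return (uint8_t)(dst + imm);
    case OP_OR:  return (uint8_t)(dst | imm);
    case OP_ADC: return (uint8_t)(dst + imm + carry_in);
    case OP_SBB: return (uint8_t)(dst - imm - carry_in);
    case OP_AND: return (uint8_t)(dst & imm);
    case OP_SUB: return (uint8_t)(dst - imm);
    case OP_XOR: return (uint8_t)(dst ^ imm);
    case OP_CMP: return dst; /* CMP only sets flags; destination is unchanged */
    }
    return dst;
}
```

For example, the bytes 80 F3 0F decode as ModR/M 0xF3 = mod 3, reg 6, rm 3, i.e. "xor bl, 0x0F".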
Re: BIOS Calls in Emulators
Octacone wrote:
> It is a real pain, so many variations. I've been reading on this topic for days and I'm still struggling to catch up.
> There are just not many resources on 8086 instruction decoding.
The A86 assembler has a file called A86MANU.TXT; the section "The 86 Instruction Set" has always been the simplest to understand in my opinion.
Additionally, sandpile.org is indispensable (specifically look at their opcode encoding and opcode groups).
Re: BIOS Calls in Emulators
Octacone wrote:
> It is a real pain, so many variations. I've been reading on this topic for days and I'm still struggling to catch up.
> There are just not many resources on 8086 instruction decoding.
You don't need many. Just the CPU manual (have both, Intel and AMD) and a way to experiment and check your understanding of the manual. For the latter use an assembler and a disassembler. NASM (and its NDISASM) will work perfectly here. You may also want a hex file viewer and a programmer's calculator. That's all.
Start with e.g. the add instruction. Write a bunch of different variants of it like so:
Code:
; assemble: nasm -fbin file.asm -o file.bin
bits 16
add ax, bx
add ax, [bx]
add [bx], ax
add ax, [bx+di]
add [bx+di+2], ax
add [bx+di-2], ax
add [bx+di+1024], ax
add [bx+di-1024], ax
add word [bx+di-1024], 0x1234
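Observe (from the assembly listing or from disassembly of the binary (e.g. "ndisasm -b 16 file.bin")) how they're encoded.
For fun, try to do the reverse. Given an instruction description/encoding, try to encode it by hand and see that the disassembly of your bytes gives you the expected instruction.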
Extend this to 32 bits, throw in segment override prefixes, etc.
Beware, some instructions may have alternative encodings.
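Once the encodings of those add variants are on screen, the memory operand part can be checked against a small decoder. The following C sketch is only an assumption of how one might structure it (`regs16_t`, `ea16_t` and `decode_ea16` are made-up names); it handles 16-bit addressing only and ignores segment selection, including the rule that BP-based forms default to SS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Only the registers that 16-bit addressing modes can use. */
typedef struct {
    uint16_t bx, bp, si, di;
} regs16_t;

typedef struct {
    int      is_register; /* mod == 3: r/m names a register, no memory operand */
    uint16_t offset;      /* effective offset (valid when !is_register) */
    size_t   length;      /* bytes consumed: ModR/M byte plus displacement */
} ea16_t;

/* p points at the ModR/M byte; any displacement bytes follow it. */
static ea16_t decode_ea16(const uint8_t *p, const regs16_t *r)
{
    ea16_t ea = { 0, 0, 1 };
    unsigned mod, rm;
    uint16_t base;

    assert(p != NULL && r != NULL);
    mod = (p[0] >> 6) & 3u;
    rm  = p[0] & 7u;

    if (mod == 3) {                 /* register operand */
        ea.is_register = 1;
        return ea;
    }
    if (mod == 0 && rm == 6) {      /* special case: plain disp16, no base */
        ea.offset = (uint16_t)(p[1] | (p[2] << 8));
        ea.length = 3;
        return ea;
    }
    switch (rm) {
    case 0:  base = (uint16_t)(r->bx + r->si); break;
    case 1:  base = (uint16_t)(r->bx + r->di); break;
    case 2:  base = (uint16_t)(r->bp + r->si); break;
    case 3:  base = (uint16_t)(r->bp + r->di); break;
    case 4:  base = r->si; break;
    case 5:  base = r->di; break;
    case 6:  base = r->bp; break;
    default: base = r->bx; break;
    }
    if (mod == 1) {                 /* sign-extended 8-bit displacement */
        base = (uint16_t)(base + (int8_t)p[1]);
        ea.length = 2;
    } else if (mod == 2) {          /* 16-bit displacement, wraps at 64 KiB */
        base = (uint16_t)(base + (p[1] | (p[2] << 8)));
        ea.length = 3;
    }
    ea.offset = base;
    return ea;
}
```

As a cross-check against the listing above: "add [bx+di+2], ax" assembles to 01 41 02, and feeding the bytes 41 02 to this function should give BX+DI+2 with a length of 2.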
Re: BIOS Calls in Emulators
I would recommend the disassembly method mentioned above for all assembly language programmers. It helps to get a grasp of instructions and their logic. No need to spend too much time inspecting the bytes, but maybe a few hours or a day? In addition to that, trying labels and seeing what values the assembler assigns to them could be enlightening, e.g. how "mov ax, my_label" or "jmp my_label" translate to bytes and how the values change when code is modified or assembler directives are used. As a more advanced topic, check how object files handle relocations.
Re: BIOS Calls in Emulators
azblue wrote:
> The A86 assembler has a file called A86MANU.TXT; the section "The 86 Instruction Set" has always been the simplest to understand in my opinion.
> Additionally, sandpile.org is indispensable (specifically look at their opcode encoding and opcode groups).
+1, that file contains a metric ton of useful data, I was looking for something like that.
alexfru wrote:
> You don't need many. Just the CPU manual (have both, Intel and AMD) and a way to experiment and check your understanding of the manual. For the latter use an assembler and a disassembler. NASM (and its NDISASM) will work perfectly here. You may also want a hex file viewer and a programmer's calculator. That's all.
> Start with e.g. the add instruction. Write a bunch of different variants of it like so:
> Code:
> ; assemble: nasm -fbin file.asm -o file.bin
> bits 16
> add ax, bx
> add ax, [bx]
> add [bx], ax
> add ax, [bx+di]
> add [bx+di+2], ax
> add [bx+di-2], ax
> add [bx+di+1024], ax
> add [bx+di-1024], ax
> add word [bx+di-1024], 0x1234
> Observe (from the assembly listing or from disassembly of the binary (e.g. "ndisasm -b 16 file.bin")) how they're encoded.
> For fun try to do the reverse. Given an instruction description/encoding, try to encode it by hand and see that the disassembly of your bytes gives you the expected instruction.
> Extend this to 32 bits, throw in segment override prefixes, etc.
> Beware, some instructions may have alternative encodings.
That's a smart idea. I didn't know ndisasm existed. Although it would be useful to have a program that could differentiate between prefixes, opcodes and other bytes, instead of having them all written together.
Antti wrote:
> I would recommend the disassembly method mentioned above for all assembly language programmers. It helps to get a grasp of instructions and their logic. No need to spend too much time inspecting the bytes, but maybe a few hours or a day? In addition to that, trying labels and seeing what values the assembler assigns to them could be enlightening, e.g. how "mov ax, my_label" or "jmp my_label" translate to bytes and how the values change when code is modified or assembler directives are used. As a more advanced topic, check how object files handle relocations.
I'll definitely have to address jumps and calls sooner or later, since VBE code jumps around a lot.
Getting my code to recognize the instructions is the hardest part; emulating them is easy. After all, I don't need all of them.
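On separating prefixes from the opcode: a decoder usually peels off the legacy prefix bytes first and only then looks at the opcode. A rough C sketch of that first step might look like the following; `is_prefix` and `count_prefixes` are hypothetical names, and the list covers the legacy prefixes only (segment overrides, operand/address size, LOCK and REP).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero if b is one of the legacy x86 prefix bytes. */
static int is_prefix(uint8_t b)
{
    switch (b) {
    case 0x26: case 0x2E: case 0x36: case 0x3E: /* ES, CS, SS, DS overrides */
    case 0x64: case 0x65:                       /* FS, GS overrides (386+) */
    case 0x66: case 0x67:                       /* operand/address size (386+) */
    case 0xF0: case 0xF2: case 0xF3:            /* LOCK, REPNE, REP/REPE */
        return 1;
    default:
        return 0;
    }
}

/* Counts the leading prefix bytes; if the result is < len,
   code[result] is the first opcode byte. */
static size_t count_prefixes(const uint8_t *code, size_t len)
{
    size_t n = 0;
    assert(code != NULL);
    while (n < len && is_prefix(code[n]))
        n++;
    return n;
}
```

For instance, the 16-bit code bytes 26 66 01 07 ("add [es:bx], eax") split into two prefix bytes (26, 66), the opcode 01 and the ModR/M byte 07. A real decoder would also record which prefixes it saw, since they change how the rest of the instruction is interpreted.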
Re: BIOS Calls in Emulators
Near and short direct jumps and calls are encoded relative to the end of the instruction. You get an opcode and then an offset, and you first set IP to the end of the instruction and then add the sign-extended operand to get the new IP.
So for instance, the following snippet:
Code:
hltloop:
hlt
jmp hltloop
is encoded:
Code:
F4 EB FD
That last FD being -3 when sign-extended.
Far and indirect calls and jumps encode their target absolutely. So for instance, in 16-bit mode, the code bytes
Code:
FF 27
mean
Code:
jmp [bx]
And that means: Look in memory at the word BX is pointing to and copy that into IP.
VBE code might also use software interrupts. If you don't know the function called in that case, you might also just emulate that as an indirect far call that pushes flags.
Carpe diem!
Re: BIOS Calls in Emulators
I found this online tool quite useful for encoding / decoding x86 assembly. Works with both 32/64 bit code.
https://defuse.ca/online-x86-assembler.htm