x86 Emulator in NASM (emu86/Z86Emu/x86emu)
Posted: Wed Dec 18, 2024 9:19 am
I'm about to add protected mode support to my x86 emulator, so that I can practice loading existing, official drivers instead of writing my own for my low-level tests.
For code/data selectors, is it OK if in real/unreal mode the emulator honors the limit cached in the segment register (so that 4GB can be accessed) but not the cached base address, so that in real mode every segment always has base address 0 + segment register value*16?
Or is it that the segment base/limit is ONLY used by the CPU/emulator when an instruction carries an address-size and/or operand-size override prefix, which forces the 32-bit form of the ModR/M byte and probably the 32-bit-only SIB byte? Those need separate code from the simpler 16-bit ModR/M byte logic. And is it the 16-bit logic that doesn't check the segment base address on a normal far jump or RETF back to unreal mode, so that the CPU would lock up if you used a 32-bit ModR/M-dependent instruction while the code segment base address had been changed from 0?
If the first option is acceptable, the emulator will behave like this:
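To make the 16-bit versus 32-bit ModR/M distinction concrete, here is a minimal C sketch of the two effective-address calculations. The `Regs` type and the `ea16`/`ea32` helpers are hypothetical names, not part of any existing emulator, and the decoder is assumed to have already fetched and sign-extended the displacement. The sketch only produces the offset; choosing the default segment (for example SS for BP-based forms) is left out.

```c
#include <stdint.h>

/* Hypothetical register file, indexed in x86 encoding order. */
enum { R_AX, R_CX, R_DX, R_BX, R_SP, R_BP, R_SI, R_DI };
typedef struct { uint32_t r[8]; } Regs;

/* 16-bit ModR/M: fixed register pairs, result wraps at 64 KiB. */
static uint32_t ea16(const Regs *g, unsigned mod, unsigned rm, int32_t disp) {
    uint32_t bx = g->r[R_BX], bp = g->r[R_BP];
    uint32_t si = g->r[R_SI], di = g->r[R_DI];
    uint32_t sum;
    switch (rm & 7) {
    case 0: sum = bx + si; break;
    case 1: sum = bx + di; break;
    case 2: sum = bp + si; break;
    case 3: sum = bp + di; break;
    case 4: sum = si; break;
    case 5: sum = di; break;
    case 6: sum = (mod == 0) ? 0 : bp; break; /* mod=0: disp16 only */
    default: sum = bx; break;
    }
    return (sum + (uint32_t)disp) & 0xFFFF;
}

/* 32-bit ModR/M: any register as base; rm=4 means a SIB byte follows. */
static uint32_t ea32(const Regs *g, unsigned mod, unsigned rm,
                     unsigned sib, int32_t disp) {
    uint32_t sum;
    if ((rm & 7) == 4) {
        unsigned scale = (sib >> 6) & 3;
        unsigned index = (sib >> 3) & 7;
        unsigned base = sib & 7;
        sum = (base == 5 && mod == 0) ? 0 : g->r[base]; /* disp32, no base */
        if (index != 4)                                 /* index 4 = none */
            sum += g->r[index] << scale;
    } else if ((rm & 7) == 5 && mod == 0) {
        sum = 0;                                        /* disp32 only */
    } else {
        sum = g->r[rm & 7];
    }
    return sum + (uint32_t)disp;
}
```

In this model the address-size prefix only decides which of the two functions produces the offset. Whatever offset comes out is then combined with the segment's cached base and checked against its cached limit in the same way, so the base/limit handling does not need to be duplicated per addressing form.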
- I can do far jumps to move back and forth between unreal and protected mode. I can jump to SELCod16:pm16 with a selector whose base address corresponds to the real mode program's CS. When I return by clearing the PE bit in CR0 and doing a RETF to flush back to real mode code, the emulator will ignore the rearranged base address (CS*16) of the protected mode selector. So when returning from protected to unreal mode, the code segment's base address will always be 0, starting at the start of RAM, even if the protected mode selector had base address 0x7FD0, for example, because real mode CS was 0x07FD.
- For the data segments in real mode I also ignore the base address of the segment registers, but I apply the limit so that 4GB can be accessed.
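For comparison with the behavior described in the list above, here is a minimal C sketch of the commonly documented hardware model, in which each segment register has a hidden cached base and limit. It rests on one assumption, stated here and in the comments: on a 386 or later, a real mode segment load sets the cached base to value*16 and leaves the cached limit untouched, which is what makes unreal mode possible. `SegCache`, `Descriptor`, `load_protected`, `load_real` and `linear_address` are hypothetical names, and access rights are ignored.

```c
#include <stdint.h>

/* Hypothetical model of one segment register: visible selector plus
   the hidden (cached) base and limit. Access rights are omitted. */
typedef struct {
    uint16_t selector;
    uint32_t base;
    uint32_t limit;
} SegCache;

/* Hypothetical, already-parsed GDT entry. */
typedef struct {
    uint32_t base;
    uint32_t limit;
} Descriptor;

/* Protected mode load: base and limit are copied from the descriptor. */
static void load_protected(SegCache *s, uint16_t selector,
                           const Descriptor *d) {
    s->selector = selector;
    s->base = d->base;
    s->limit = d->limit;
}

/* Real mode load (assumed 386+ behavior): base becomes value * 16,
   the cached limit is deliberately left as it was. */
static void load_real(SegCache *s, uint16_t value) {
    s->selector = value;
    s->base = (uint32_t)value << 4;
}

/* The cached base is added on every access, in every mode. */
static uint32_t linear_address(const SegCache *s, uint32_t offset) {
    return s->base + offset;
}
```

Under this model, loading DS from a flat 4GB descriptor in protected mode and then reloading DS with 0x07FD in real mode gives base 0x7FD0 with the limit still at 4GB. A far jump or RETF executed after PE is cleared reloads CS through the real mode path, so the code segment base becomes CS*16 again rather than 0.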
If the code segment base has been altered, correct code would return to real mode with a 32-bit-only far jump or RETF to 16-bit-only real mode code, using a code selector with base address 0, so that the code segment is left properly prepared for any 32-bit override instructions that involve 32-bit instruction addresses in pure real mode.
That's why the CPU locks up when you use address-size or operand-size overridden instructions that go beyond 64KB if you haven't entered Unreal Mode first: the 32-bit logic of the CPU still works as in protected mode (probably in the ring you chose from protected mode, too), and when it sees that you exceed the 64KB limit with ESI, EDI, EAX, etc., it faults and locks up as long as you haven't extended the data segment limits to 4GB.
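The limit check behind that fault can be sketched in a few lines of C. This is a minimal illustration, not a full implementation: `check_limit` and the `FAULT_*` names are hypothetical, expand-down segments are ignored, and the vector numbers follow the documented x86 convention of #SS (12) for accesses through SS and #GP (13) for everything else.

```c
#include <stdint.h>

/* Hypothetical result codes: -1 for "access is fine", otherwise the
   exception vector an emulator would raise. */
enum { FAULT_NONE = -1, FAULT_SS = 12, FAULT_GP = 13 };

/* Check an access of `size` bytes at `offset` against a cached limit.
   The limit is the last valid offset, so the final byte of the access
   (offset + size - 1) must not exceed it. Written to avoid overflow. */
static int check_limit(uint32_t limit, uint32_t offset, uint32_t size,
                       int is_stack) {
    if (size == 0)
        return FAULT_NONE;
    if (offset > limit || size - 1 > limit - offset)
        return is_stack ? FAULT_SS : FAULT_GP;
    return FAULT_NONE;
}
```

With the reset-time limit of 0xFFFF, `check_limit(0xFFFF, 0x10000, 1, 0)` reports a #GP, which matches an access through ESI or EDI just past 64KB. With the limit extended to 0xFFFFFFFF the same access passes.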