Can I safely ignore the address-size prefix in real mode?

newanabe · Post by **newanabe** » Mon Sep 09, 2013 12:13 pm

I am rewriting an old, basic emulator that I wrote some time ago. The old version ignored address-size prefixes, and I had no problems. Now, while rewriting everything, I tried to add support for that prefix, and I think NDISASM has gone crazy. I was trying to work out the behavior for a special case of the ModR/M byte when the address-size prefix is used, and this is what I got in ndisasm. I don't understand any of it. Can I keep ignoring that prefix? Does that prefix have any meaning in real mode?

h0bby1 · Post by **h0bby1** » Mon Sep 09, 2013 1:09 pm

Many assemblers and compilers use the size-override prefixes by default in 16-bit mode.

You can find all the ModR/M bit details here:

http://ref.x86asm.net/coder32.html

and more information about opcode encoding here:

http://www.sandpile.org/x86/opc_enc.htm

The size overrides can change a lot of default behavior regarding operand size and how the ModR/M bits must be interpreted.

newanabe · Post by **newanabe** » Mon Sep 09, 2013 1:26 pm

Thank you!
In the link I found this information (attachment).
So I think it is safe to ignore the address-size prefix for now. After all, if I have to code the emulation for all those weird code combinations, I will never finish.

~ · Post by ~ » Mon Sep 09, 2013 1:54 pm

newanabe wrote:Thank you!
In the link I found this information (attachment).
So I think it is safe to ignore the address-size prefix for now. After all, if I have to code the emulation for all those weird code combinations, I will never finish.

I have an emulator. I have implemented data segment override prefix detection, repeat prefix, and operand-size override with the many ModR/M combinations.

If you want to be able to access 32-bit data in Real/Unreal Mode, you need to implement the 32-bit forms of the ModR/M and SIB bytes, at minimum the subset that is usable from Real Mode.

Don't think that you'll never finish. You only need to pack the different ModR/M and SIB functions in a function table from which you can select the right entries using those byte values directly. Once you implement and pack them properly, any instruction can reuse them without duplicating that code. It will take just as much time as the ModR/M byte for 16-bit, and you'll need some test code, but that's all.
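
A rough sketch of that function-table idea for the 16-bit ModR/M forms follows. It is not taken from the emulator discussed here: `regs`, `ea16Table` and `decodeEA16` are hypothetical names, and the displacement is assumed to have been fetched already by the caller.

```javascript
// Hypothetical register file for the sketch; values are 16-bit.
const regs = { BX: 0x1000, BP: 0x2000, SI: 0x0010, DI: 0x0020 };

// One base-address function per r/m value (0..7) for 16-bit addressing.
// Entry 6 is [BP+disp]; the mod=00 special case ([disp16]) is handled below.
const ea16Table = [
  (r) => r.BX + r.SI, // r/m = 000
  (r) => r.BX + r.DI, // r/m = 001
  (r) => r.BP + r.SI, // r/m = 010
  (r) => r.BP + r.DI, // r/m = 011
  (r) => r.SI,        // r/m = 100
  (r) => r.DI,        // r/m = 101
  (r) => r.BP,        // r/m = 110
  (r) => r.BX,        // r/m = 111
];

// Returns the 16-bit effective offset for a memory-form ModR/M byte.
// `disp` is the already-fetched displacement as a signed number (0 if none).
function decodeEA16(modrm, disp, r) {
  const mod = (modrm >> 6) & 3;
  const rm = modrm & 7;
  if (mod === 3) throw new Error("register form, no memory operand");
  if (mod === 0 && rm === 6) return disp & 0xFFFF; // [disp16] only
  const base = ea16Table[rm](r);
  return (base + (mod === 0 ? 0 : disp)) & 0xFFFF; // wrap at 64 KB
}
```

Every instruction handler can then call `decodeEA16(modrm, disp, regs)` instead of repeating the address arithmetic; a 32-bit table with SIB support could be added alongside it in the same way.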

h0bby1 · Post by **h0bby1** » Mon Sep 09, 2013 1:55 pm

The size prefix is used a lot in 16-bit code. A 16-bit OS (DOS) ran on 32-bit Intel CPUs from very early on, so for a very long time the size prefix has been used by real-mode programs running on 32-bit CPUs to get at the 32-bit registers and instruction set. Many Intel opcodes have two versions depending on the operand size, so it's rather common. GCC even systematically compiles real-mode code using the 32-bit versions of the opcodes with the prefix; it doesn't handle true 16-bit real mode at all. It just assumes the segment registers hold fixed values and uses the 32-bit version of the code whenever possible. Most assemblers and compilers behave like that.

~ · Post by ~ » Mon Sep 09, 2013 2:05 pm

h0bby1 wrote:The size prefix is used a lot in 16-bit code. A 16-bit OS (DOS) ran on 32-bit Intel CPUs from very early on, so for a very long time the size prefix has been used by real-mode programs running on 32-bit CPUs to get at the 32-bit registers and instruction set. Many Intel opcodes have two versions depending on the operand size, so it's rather common. GCC even systematically compiles real-mode code using the 32-bit versions of the opcodes with the prefix; it doesn't handle true 16-bit real mode at all. It just assumes the segment registers hold fixed values and uses the 32-bit version of the code whenever possible. Most assemblers and compilers behave like that.

There is an operand-size prefix and an address-size prefix, and they are separate. If an instruction only works on registers and not on memory, or otherwise isn't affected by the address size, it can behave as pure 32-bit code by overriding only the operand size.

Note that if the instruction needs to use 32-bit registers as memory indexes, you need the 32-bit versions of ModR/M and SIB, which are selected by the address-size prefix. Also note that you can perfectly well use 32-bit registers with the 16-bit version of ModR/M by using only the operand-size prefix: it automatically selects 16-bit registers in 32-bit code or 32-bit registers in 16-bit code, since that's what it's for. You also need to be aware of the effects of those prefixes on each instruction, since there can be tiny variations in behavior that you need to account for.
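
To make the independence of the two prefixes concrete, here is a minimal sketch. The function name `effectiveSizes` and its parameters are made up for illustration; the prefix bytes are 66h for operand size and 67h for address size.

```javascript
// defaultBits is 16 in real mode (and in 16-bit code segments),
// or 32 in 32-bit code segments.
function effectiveSizes(defaultBits, hasOperandPrefix, hasAddressPrefix) {
  const other = defaultBits === 16 ? 32 : 16;
  return {
    // 66h toggles the operand size (register and data width).
    operandBits: hasOperandPrefix ? other : defaultBits,
    // 67h toggles the address size (which ModR/M and SIB forms apply).
    addressBits: hasAddressPrefix ? other : defaultBits,
  };
}
```

For example, `effectiveSizes(16, true, false)` describes a real-mode instruction that works on 32-bit registers but still uses the 16-bit ModR/M addressing forms, while `effectiveSizes(16, false, true)` keeps 16-bit operands but needs the 32-bit ModR/M and SIB decoding.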

newanabe · Post by **newanabe** » Mon Sep 09, 2013 2:15 pm

~ wrote:I have an emulator. I have implemented data segment override prefix detection, repeat prefix, and operand-size override with the many ModR/M combinations.

I had those implemented in my old code too, but I never implemented the address-size override prefix (67h).

~ wrote:Note that if the instruction needs things like using 32-bit registers for the memory indexes, you need to use the 32-bit versions of ModR/M and SIB via the use of address-size prefix.

Actually, my emulator never encountered that prefix in the emulated code. I know because I had it coded to print "not coded: 67h" on screen, and I never saw that message. Now I'm rewriting the whole code and I thought it would be good to add more things, but after seeing that large instruction that raises an exception, I think I will skip that extra work. In any case, I will log that prefix if it occurs in the code.

~ · Post by ~ » Mon Sep 09, 2013 2:23 pm

newanabe wrote:
~ wrote:I have an emulator. I have implemented data segment override prefix detection, repeat prefix, and operand-size override with the many ModR/M combinations.
I had those implemented in my old code too, but I never implemented the address-size override prefix (67h).

~ wrote:Note that if the instruction needs things like using 32-bit registers for the memory indexes, you need to use the 32-bit versions of ModR/M and SIB via the use of address-size prefix.
Actually, my emulator never encountered that prefix in the emulated code. I know because I had it coded to print "not coded: 67h" on screen, and I never saw that message. Now I'm rewriting the whole code and I thought it would be good to add more things, but after seeing that large instruction that raises an exception, I think I will skip that extra work. In any case, I will log that prefix if it occurs in the code.

I'd suggest starting simple: run only tiny programs that exercise one or a few similar ModR/M cases at a time. This is how I managed to do it. Leave the 32-bit ModR/M and SIB bytes (which involve the address-size prefix) until you have finished the 16-bit ModR/M and operand-size prefix functionality AND have some tiny 32-bit code to run. This is also how I'm doing my emulator right now.

And for your original question, about how to handle the disp16 case:

This is what I do for LEA:

    function CPU_ModRM_LEA16_disp16(CPU_n, ModRM, ModRM_Default_Seg, ModRM_Base_Seg, ModRM_Base_Off, R_or_W, dataType, dataSize, writeValue)
    {
     //LEA only needs the disp16 value itself, not the data it points to.
     //It is read from the byte after ModRM_Base_Off, wrapping at 64 KB:
     ///
      var _offOff=(ModRM_Base_Off+1)&0xFFFF;
      return Mem_Controller_Access(
                                   CPU16_Segment_Offset_to_Abs(ModRM_Base_Seg, _offOff),    //RAM_Location
                                   0,  //R_or_W (0 = read)
                                   0,  //dataType
                                   2,  //dataSize (disp16 is 2 bytes)
                                   0   //writeData (unused for reads)
                                  );
    }

And this is what I do for the proper ModR/M case:

    function CPU_ModRM_Common16_disp16__R(CPU_n, ModRM, ModRM_Default_Seg, ModRM_Base_Seg, ModRM_Base_Off, R_or_W, dataType, dataSize, writeValue)
    {
     //First get the disp16 value for our [SReg:disp16] access.
     //It is read from the byte after ModRM_Base_Off, wrapping at 64 KB:
     ///
      var _offOff=(ModRM_Base_Off+1)&0xFFFF;
      var _disp16=Mem_Controller_Access(
                                        CPU16_Segment_Offset_to_Abs(ModRM_Base_Seg, _offOff),    //RAM_Location
                                        0,  //R_or_W (0 = read)
                                        0,  //dataType
                                        2,  //dataSize (disp16 is 2 bytes)
                                        0   //writeData (unused for reads)
                                       );

     //[disp16] defaults to DS, so read DS unless a segment-override
     //prefix was seen (overridedSegment is -1 when there is none):
     ///
      var _segment=new Uint16Array(1);
      if(CPU_Core[CPU_n].overridedSegment==-1)
      {
       _segment[0]=CPU_SReg16_ReadRegisterFileOffset(CPU_n, CPU_Regs_DS);
      }
      else
      {
       _segment[0]=CPU_SReg16_ReadRegisterFileOffset(CPU_n, CPU_Core[CPU_n].overridedSegment);
      }

     //Now read the data at [SReg:disp16];
     //it may be a byte, word or doubleword:
     ///
      return Mem_Controller_Access(
                                   CPU16_Segment_Offset_to_Abs(_segment[0], _disp16),    //RAM_Location
                                   0,         //R_or_W (0 = read)
                                   dataType,  //dataType
                                   dataSize,  //dataSize
                                   0          //writeData (unused for reads)
                                  );
    }

As you can see, the first step is enough for LEA, which simply involves reading the unsigned 16-bit value right after the instruction opcode and the ModR/M byte.

For the proper case, we must also determine the data segment, whether it was overridden or not, and use it to read or write the value at [Segment:disp16], which can in turn be 8, 16 or 32 bits in size.

~ · Post by ~ » Mon Sep 09, 2013 2:41 pm

I forgot: Your emulator seems to have a limitation in how it processes consecutive prefixes, especially ones it doesn't know.

If the code is using an address-size override prefix, then you cannot ignore it and expect that instruction to function properly.

You cannot ignore any prefix unless you know that the result will always be the same with or without it for your current CPU state.

You should have a loop that iterates as long as it finds bytes that are prefixes. Then, for every recognized prefix byte, you should apply its effects (e.g., mark that the following instruction must be repeated, record the overriding segment, or change the register or memory data size from 16 to 32 bits for the operand-size prefix).
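
Such a prefix loop might look roughly like the sketch below. The names `scanPrefixes` and `SEGMENT_PREFIXES` and the shape of the returned state are assumptions made for this example, and `code` is assumed to be a plain array of instruction bytes; the prefix byte values themselves are the standard ones.

```javascript
// Segment-override prefix bytes and the segment register each selects.
const SEGMENT_PREFIXES = {
  0x26: "ES", 0x2E: "CS", 0x36: "SS", 0x3E: "DS", 0x64: "FS", 0x65: "GS",
};

// Consumes consecutive prefix bytes starting at `pos` in `code`.
// Returns the collected prefix state and the position of the opcode byte.
function scanPrefixes(code, pos) {
  const state = {
    segment: null,      // overriding segment name, or null for the default
    operandSize: false, // 66h seen
    addressSize: false, // 67h seen
    rep: null,          // "REP" (F3h) or "REPNE" (F2h)
    lock: false,        // F0h seen
  };
  for (;;) {
    const b = code[pos];
    if (b in SEGMENT_PREFIXES) state.segment = SEGMENT_PREFIXES[b];
    else if (b === 0x66) state.operandSize = true;
    else if (b === 0x67) state.addressSize = true;
    else if (b === 0xF0) state.lock = true;
    else if (b === 0xF2) state.rep = "REPNE";
    else if (b === 0xF3) state.rep = "REP";
    else break; // first non-prefix byte: this is the opcode
    pos++;
  }
  return { state, opcodePos: pos };
}
```

The instruction handler then receives `state` and can refuse to run (or log a message) when a flag it does not support yet, such as `addressSize`, is set, instead of silently ignoring it.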

If a program needs a prefix for which you still lack too much functionality, then you simply cannot run that program at this point, and you need to complete the emulator. Meanwhile, you should create several tiny equivalent programs that check whether your algorithms are correct, both from the program's side and from the emulator's side.

newanabe · Post by **newanabe** » Mon Sep 09, 2013 2:45 pm

Thank you very much for the help!
I'm not a professional coder. I don't mean that I don't try to code cool things; it's just that I don't code all the time. I'm in the mood now, and after a month or two I will lose the mood for coding and maybe do 3D design for six months. That's why I try to code only the minimum needed. The second reason is that I'm not planning to make my work public. It's for personal use, and if it raises a warning like "opcode xxh unattended" and halts, that's not really a problem: I will code the missing opcode. It's like having a menu for changing the desktop background color: I don't need that, I go to the source code, change it there, and compile again with the new color. I mean I don't code for the public.
I appreciate your help. Could you very briefly tell me what default segment I use for the 16-bit displacement in the case of ModRM 00xxx110b? It says it is added to the index (note 2). What is the index in this case? The IP, the segment, or is it a flat address? It would help a lot if you could answer that briefly. I don't want to take up more of your time.
Thanks again.

PS:
Making the emulator count the prefixes sounds clever. I will try to implement it.

~ · Post by ~ » Mon Sep 09, 2013 2:55 pm

newanabe wrote:Could you very briefly tell me what default segment I use for the 16-bit displacement in the case of ModRM 00xxx110b? It says it is added to the index (note 2). What is the index in this case? The IP, the segment, or is it a flat address? It would help a lot if you could answer that briefly. I don't want to take up more of your time.
Thanks again.

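It's always DS, except for the forms that use BP in the address (which use SS instead). In the 00xxx110b case there is no base or index register at all: the disp16 is the whole offset, and it is relative to DS. Each instruction's description lists its default data segments, if that applies.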
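
As a small sketch of that rule for the 16-bit ModR/M forms (the function names `defaultSegment16` and `effectiveSegment` are hypothetical, and segment registers are represented by their names for simplicity):

```javascript
// Returns the default segment register name for a 16-bit memory-form
// ModR/M byte (mod != 3), before any segment-override prefix is applied.
function defaultSegment16(modrm) {
  const mod = (modrm >> 6) & 3;
  const rm = modrm & 7;
  // mod=00, r/m=110 is plain [disp16]: BP is not involved, so DS.
  if (mod === 0 && rm === 6) return "DS";
  // r/m 010 = [BP+SI], 011 = [BP+DI], 110 = [BP+disp]: these use SS.
  return (rm === 2 || rm === 3 || rm === 6) ? "SS" : "DS";
}

// A segment-override prefix, if present, replaces the default.
function effectiveSegment(modrm, overrideSegment) {
  return overrideSegment !== null ? overrideSegment : defaultSegment16(modrm);
}
```

So `defaultSegment16(0x06)` (mod=00, r/m=110) gives DS, while `defaultSegment16(0x46)` (the same r/m with mod=01, i.e. [BP+disp8]) gives SS.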

newanabe · Post by **newanabe** » Mon Sep 09, 2013 2:59 pm

Thank you!

h0bby1 · Post by **h0bby1** » Mon Sep 09, 2013 3:39 pm

newanabe wrote:Thank you very much for the help!
I'm not a professional coder. I don't mean that I don't try to code cool things; it's just that I don't code all the time. I'm in the mood now, and after a month or two I will lose the mood for coding and maybe do 3D design for six months. That's why I try to code only the minimum needed. The second reason is that I'm not planning to make my work public. It's for personal use, and if it raises a warning like "opcode xxh unattended" and halts, that's not really a problem: I will code the missing opcode. It's like having a menu for changing the desktop background color: I don't need that, I go to the source code, change it there, and compile again with the new color. I mean I don't code for the public.
I appreciate your help. Could you very briefly tell me what default segment I use for the 16-bit displacement in the case of ModRM 00xxx110b? It says it is added to the index (note 2). What is the index in this case? The IP, the segment, or is it a flat address? It would help a lot if you could answer that briefly. I don't want to take up more of your time.
Thanks again.

PS:
Making the emulator count the prefixes sounds clever. I will try to implement it.

It's all here: http://ref.x86asm.net/coder32.html#modrm_byte_32

Owen · Post by **Owen** » Mon Sep 09, 2013 8:19 pm

Address size overrides in real mode code are rare, but not non-existent.

I've used them in the past in order to get around the crippled addressing of 16-bit mode, after ensuring that the upper half of the registers is zeroed. Note that the correct behavior is for resulting addresses above 65535 (0FFFFh) to cause a general protection fault, IIRC; this follows directly from the Protected Mode rules.

~ · Post by ~ » Tue Sep 10, 2013 7:03 am

Owen wrote:Address size overrides in real mode code are rare, but not non-existent.

I've used them in the past in order to get around the crippled addressing of 16-bit mode, after ensuring that the upper half of the registers is zeroed. Note that the correct behavior is for resulting addresses above 65535 (0FFFFh) to cause a general protection fault, IIRC; this follows directly from the Protected Mode rules.

When using Unreal Mode, those address-size overrides are required if you really want to be able to access the 4 GB address space from 16-bit code. For things like testing Super VGA modes from 16-bit code (whose frame buffers can be much bigger than the normal 65536-byte window), and for using all of the installed memory in the 32-bit address space without Protected Mode, those address-size overrides will always be present.
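
One way an emulator could model both points is to keep a limit per segment and check every 32-bit offset against it. This is only a sketch under stated assumptions: `linearAddress`, `REAL_MODE_LIMIT` and `UNREAL_MODE_LIMIT` are hypothetical names, Unreal Mode is modeled simply as a 4 GB limit, and the fault is represented by a thrown `Error` rather than a real exception dispatch.

```javascript
const REAL_MODE_LIMIT = 0xFFFF;       // ordinary real-mode segment limit
const UNREAL_MODE_LIMIT = 0xFFFFFFFF; // assumed limit after Unreal Mode setup

// Returns segment base + offset for an access of `size` bytes, or throws
// if any byte of the access lies beyond `limit`.
function linearAddress(segmentValue, offset, size, limit) {
  if (offset + size - 1 > limit) {
    throw new Error("#GP: offset beyond segment limit");
  }
  // The base is still segment * 16; wrap the sum to 32 bits.
  return (segmentValue * 16 + offset) >>> 0;
}
```

With the real-mode limit, a 32-bit offset produced by an address-size override is accepted only while it stays within 0FFFFh; with the larger limit, the same code path reaches memory above 1 MB, for example `linearAddress(0, 0x100000, 4, UNREAL_MODE_LIMIT)`.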

OSDev.org
