Writing an Assembler - How?
Posted: Fri Nov 18, 2011 2:38 pm
by mioline
Hi there!
This is my first post here, so if I ask something lame... Do whatever you want with me.
I want to write an Assembler for the i8086 CPU, the parsing is OK, but I don't understand, how the machine-code instructions and a plain (for example a *.COM) binary file seem. I would appreciate if somebody could help me.
Thanks for the answers in advance!
mioline
Re: Writing an Assembler - How?
Posted: Fri Nov 18, 2011 3:23 pm
by DavidCooper
mioline wrote:I want to write an Assembler for the i8086 CPU, the parsing is OK, but I don't understand, how the machine-code instructions and a plain (for example a *.COM) binary file seem.
You don't understand how they seem? You need to state that more clearly so that people have a chance of working out what you mean. A .com file (http://en.wikipedia.org/wiki/COM_file) has no header, so all you have to do is fill it with binary code, the first byte of which will be loaded at offset 100h within a 64K segment (DOS puts the 256-byte Program Segment Prefix in front of it). If you're capable of writing an assembler, I can't imagine what you're now having difficulty with, so you need to spell out the problem clearly.
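To make the "no header" point concrete, here is a minimal Python sketch that builds a complete .COM image by hand. It assumes the usual DOS convention that the image is loaded at offset 100h of its segment (which is why the message address is computed from COM_ORIGIN), and it uses the DOS int 21h services 09h (print a $-terminated string) and 4Ch (exit). The name build_hello_com is made up for this illustration.

```python
import struct

# Assumption: DOS loads a .COM image at offset 100h, right after the PSP.
COM_ORIGIN = 0x100


def build_hello_com(message: bytes = b"Hi$") -> bytes:
    """Return the raw bytes of a tiny .COM program that prints `message`.

    The message must end with '$', the terminator used by int 21h / AH=09h.
    """
    code_len = 12                      # total size of the instructions below
    msg_addr = COM_ORIGIN + code_len   # the message sits right after the code

    code = bytes([0xBA]) + struct.pack("<H", msg_addr)  # mov dx, msg
    code += bytes([0xB4, 0x09])                         # mov ah, 09h
    code += bytes([0xCD, 0x21])                         # int 21h
    code += bytes([0xB8, 0x00, 0x4C])                   # mov ax, 4C00h
    code += bytes([0xCD, 0x21])                         # int 21h
    assert len(code) == code_len
    return code + message


image = build_hello_com()
print(image.hex(" "))
```

Writing those bytes to disk unchanged, for example with open("hello.com", "wb").write(image), is all it takes to produce the file; there is nothing else in it.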
Re: Writing an Assembler - How?
Posted: Fri Nov 18, 2011 4:27 pm
by mioline
DavidCooper wrote:You don't understand how they seem? You need to state that more clearly so that people have a chance of working out what you mean. A .com file (http://en.wikipedia.org/wiki/COM_file) has no header, so all you have to do is fill it with binary code, the first byte of which will be loaded at offset 100h within a 64K segment (DOS puts the 256-byte Program Segment Prefix in front of it). If you're capable of writing an assembler, I can't imagine what you're now having difficulty with, so you need to spell out the problem clearly.
OK, I was wrong, sorry... I should have leave the "COM-part" out from my previous message... So, the main question is: could you tell me, how do the structure of the i8086 instruction codes seem?
Re: Writing an Assembler - How?
Posted: Fri Nov 18, 2011 4:48 pm
by Combuster
Tried the processor manual? The current one shows exactly how instructions are encoded. The only thing you need beyond that is a list of the instructions valid on the 8086/8088, since that set is far smaller than the current instruction set.
Re: Writing an Assembler - How?
Posted: Fri Nov 18, 2011 4:49 pm
by DavidCooper
mioline wrote:OK, I was wrong, sorry... I should have leave the "COM-part" out from my previous message...
No, you weren't wrong and have nothing to apologise for. The problem is that you don't seem to know what the word "seem" means, so it's very hard to make sense of your question. If I try to answer it literally, this happens:-
mioline wrote:So, the main question is: could you tell me, how do the structure of the i8086 instruction codes seem?
They seem very attractive to me. I love the way they've been organised into gorgeous groups of similar numbers with similar functions.
Are you trying to ask for a map of them showing how they are arranged?
Edit: I've had a look round, and it's hard to find anything that sets it out in a friendly way. This site isn't too bad as an introduction: http://courses.engr.illinois.edu/ece390 ... codes.html - I started out with this document over a decade ago and built up my own map of the instructions from it. Another version of it can be found here (http://www.scribd.com/doc/67624438/8086-OPCODE), but with the addition of an incomplete map at the top. If you need more detail than that document provides, that's when you should turn to the Intel processor manuals.
Re: Writing an Assembler - How?
Posted: Fri Nov 18, 2011 5:29 pm
by mioline
DavidCooper wrote:So, the main question is: could you tell me, how do the structure of the i8086 instruction codes seem?
They seem very attractive to me. I love the way they've been organised into gorgeous groups of similar numbers with similar functions.
Are you trying to ask for a map of them showing how they are arranged?
Oh my God! I should learn more English grammar. (And maybe semantics?)
So, I used the google and I found a lot of x86 code tables and I saw the official manual too, but I don't understand how the machine code instructions are built (for example: I can't understand what the SIB/Mod/RM byte is).
Re: Writing an Assembler - How?
Posted: Fri Nov 18, 2011 5:33 pm
by gerryg400
The Intel manuals have it all.
Re: Writing an Assembler - How?
Posted: Fri Nov 18, 2011 5:50 pm
by ACcurrent
It might be a good idea for you to start small and implement two or three instructions first. The Intel manuals can be difficult to read if English is not your first language.
Re: Writing an Assembler - How?
Posted: Fri Nov 18, 2011 5:59 pm
by DavidCooper
mioline wrote:So, I used the google and I found a lot of x86 code tables and I saw the official manual too, but I don't understand how the machine code instructions are built (for example: I can't understand what the SIB/Mod/RM byte is).
I added a bit to the end of my previous post pointing towards a particular site which probably gives you the easiest way in: http://courses.engr.illinois.edu/ece390 ... codes.html. Even so, it takes a lot of re-reading to understand it all.
That document writes the Mod/RM byte as oorrrmmm: oo is the mode, rrr is a register, and mmm is the register/memory field. If you're using the 32-bit addressing mode, most oorrrmmm bytes use the mmm part to select the register which is pointing to the memory to be used. But when the oorrrmmm byte has its mmm part set to 100 and the oo part is less than 11, an extra instruction byte (the SIB byte) has to follow to specify which register is used to point at memory, and there is room left over in that byte for another register to be used as a scaled index, as well as a scale factor. The top two bits are the scale factor (00=x1, 01=x2, 10=x4, 11=x8), the next three bits are the register containing the index to be multiplied by 1/2/4/8, and the lowest three bits indicate the register which would have been selected by the mmm part of oorrrmmm if it hadn't been set to 100 instead.
Eg: mov EAX,[EBX+2*ECX] becomes the three bytes 139, 4, 75 (decimal values - sorry, but that's the way I write machine code; in hex they are 8B 04 4B). The 75 in binary is 01,001,011, so the scale factor is 2 (the 01 part), the index to be scaled by that is ECX (the 001), and the result has to be added to the value in EBX (the 011) to get the required memory address.
[Edit: crucial typing error corrected - I'd typed EBX instead of ECX at the top.]
And SIB stands for Scale-Index-Base, after the three fields packed into that byte.
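The worked example above can be checked with a few lines of Python. This is only a sketch under stated assumptions: the code runs with 32-bit operand and address sizes as the default (so no 66h/67h prefixes are needed), there is no displacement, and the helper names modrm, sib and mov_reg_from_base_index are invented here. Opcode 8Bh is the "mov register, register/memory" form.

```python
# Register numbers as they appear in the rrr / mmm / SIB fields.
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}
SCALE_BITS = {1: 0b00, 2: 0b01, 4: 0b10, 8: 0b11}


def modrm(mod: int, reg: int, rm: int) -> int:
    """Pack the oo rrr mmm fields into one byte."""
    return (mod << 6) | (reg << 3) | rm


def sib(scale: int, index: str, base: str) -> int:
    """Pack scale, index register and base register into the SIB byte."""
    return (SCALE_BITS[scale] << 6) | (REG32[index] << 3) | REG32[base]


def mov_reg_from_base_index(dest: str, base: str, index: str, scale: int) -> bytes:
    """Encode mov dest, [base + scale*index] with no displacement."""
    if index == "esp":
        raise ValueError("index field 100 means 'no index', so ESP cannot be an index")
    if base == "ebp":
        raise ValueError("with oo=00, base field 101 means 'disp32, no base'")
    # oo=00 (no displacement), mmm=100 (an SIB byte follows)
    return bytes([0x8B, modrm(0b00, REG32[dest], 0b100), sib(scale, index, base)])


encoded = mov_reg_from_base_index("eax", "ebx", "ecx", 2)
print(list(encoded))  # [139, 4, 75]
```

The two rejected cases are the special encodings that make a naive table lookup go wrong, so it seemed worth refusing them here instead of emitting bytes that mean something else.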
Re: Writing an Assembler - How?
Posted: Sat Nov 19, 2011 10:08 am
by neon
This is what we found useful with ours:
http://www.sandpile.org/
http://ref.x86asm.net/
Recognize the patterns in the ModRM and SIB tables and you will see that they really aren't that complex. The ModRM byte stores the operands and addressing mode. In the case an SIB byte follows, the ModRM addressing mode is [sib+displacement]. This allows you to combine an SIB addressing scheme with a displacement, if any. SIB is only available with 32-bit addressing (386 and later), however, so it does not exist on the 8086.
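One way to see those patterns is to go in the other direction and pull a ModRM byte apart. The Python sketch below assumes 32-bit addressing; the function names are made up for illustration, and only the special cases mentioned in the comments are handled.

```python
def split_modrm(byte: int) -> tuple:
    """Return the (mod, reg, rm) fields of a ModRM byte."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111


def needs_sib(modrm_byte: int) -> bool:
    """With 32-bit addressing, rm=100 selects an SIB byte unless mod=11."""
    mod, _, rm = split_modrm(modrm_byte)
    return mod != 0b11 and rm == 0b100


def displacement_size(modrm_byte: int, sib_byte=None) -> int:
    """Number of displacement bytes that follow ModRM (and SIB, if present)."""
    mod, _, rm = split_modrm(modrm_byte)
    if mod == 0b01:
        return 1                      # disp8
    if mod == 0b10:
        return 4                      # disp32
    if mod == 0b00:
        if rm == 0b101:
            return 4                  # [disp32] with no base register
        if rm == 0b100 and sib_byte is not None and (sib_byte & 0b111) == 0b101:
            return 4                  # SIB with 'no base': disp32 follows
    return 0                          # mod=11 (register) or plain [reg]


print(split_modrm(0x04), needs_sib(0x04), displacement_size(0x04, 0x4B))
```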
Re: Writing an Assembler - How?
Posted: Mon Nov 21, 2011 7:57 am
by azblue
mioline wrote:
So, I used the google and I found a lot of x86 code tables and I saw the official manual too, but I don't understand how the machine code instructions are built (for example: I can't understand what the SIB/Mod/RM byte is).
When the processor encounters a command like "ADD", it needs to know "add what?". That's what the mod/rm byte is; it specifies what data the preceding command works on.
The SIB byte does the same thing but if you're just writing for the 8086 you can ignore it, as that's only 386+.
You may want to try using an existing assembler and viewing the output from a few simple commands with a hex editor. Look at how it's making the mod/rm byte in relation to the commands used and compare it to the information on the mod/rm byte. Doing it like that kinda helps it all make sense.
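For the 8086 itself, the mod/rm byte uses the 16-bit addressing table, which only allows a few register combinations. Here is a rough Python sketch of encoding ADD with it, so there is something to compare against a real assembler's output in a hex editor. It assumes the "ADD r16, r/m16" form (opcode 03h) and only handles register operands and memory operands without a displacement; add_r16_rm16 and the table names are invented for this example.

```python
REG16 = {"ax": 0, "cx": 1, "dx": 2, "bx": 3,
         "sp": 4, "bp": 5, "si": 6, "di": 7}

# 16-bit memory forms for mod=00. The value 110 is left out on purpose:
# with mod=00 it means a direct 16-bit address, not [bp].
RM16 = {"bx+si": 0, "bx+di": 1, "bp+si": 2, "bp+di": 3,
        "si": 4, "di": 5, "bx": 7}


def add_r16_rm16(dest: str, src: str) -> bytes:
    """Encode ADD dest, src where dest is a 16-bit register."""
    reg = REG16[dest]
    if src in REG16:
        # mod=11: the rm field names a register
        modrm = (0b11 << 6) | (reg << 3) | REG16[src]
    else:
        # mod=00: memory operand, no displacement
        key = src.strip("[]").replace(" ", "")
        modrm = (0b00 << 6) | (reg << 3) | RM16[key]
    return bytes([0x03, modrm])


for operands in [("ax", "[bx+si]"), ("cx", "[bx]"), ("dx", "bx")]:
    print(operands, add_r16_rm16(*operands).hex(" "))
```

For the register-to-register case, an assembler may legitimately emit opcode 01h with the two register fields swapped instead, so a byte-for-byte mismatch there is not necessarily a bug.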
Re: Writing an Assembler - How?
Posted: Mon Nov 21, 2011 8:15 am
by Solar
I recommend The Art of Assembly, Chapter 4.7, The 80x86 MOV Instruction.
Actually I recommend the whole book, as its introductory chapters go into great detail about the underlying logic, and how the x86 opcodes came to be.
Re: Writing an Assembler - How?
Posted: Thu Nov 24, 2011 4:10 am
by bitshifter
I put my reply in this attachment due to formatting problems...
Re: Writing an Assembler - How?
Posted: Fri Nov 25, 2011 2:12 pm
by Love4Boobies
bitshifter wrote:I put my reply in this attachment due to formatting problems...
You should have used the [code] tag.
(This thread was too long for me to read right now so that's all I have to say :-))