Writing an Assembler - How?
Hi there!
This is my first post here, so if I ask something lame... Do whatever you want with me.
I want to write an Assembler for the i8086 CPU, the parsing is OK, but I don't understand, how the machine-code instructions and a plain (for example a *.COM) binary file seem. I would appreciate if somebody could help me.
Thanks for the answers in advance!
mioline
- DavidCooper
- Member
- Posts: 1150
- Joined: Wed Oct 27, 2010 4:53 pm
- Location: Scotland
Re: Writing an Assembler - How?
mioline wrote: I want to write an Assembler for the i8086 CPU, the parsing is OK, but I don't understand, how the machine-code instructions and a plain (for example a *.COM) binary file seem.
You don't understand how they seem? You need to state that more clearly so that people have a chance of working out what you mean. It appears that a .com file (http://en.wikipedia.org/wiki/COM_file) has no header, so all you have to do is fill it with binary code, the first byte of which will be loaded at offset 100h within a 64K segment, directly after the 256-byte PSP that DOS sets up. If you're capable of writing an assembler, I can't imagine what you're now having difficulty with, so you need to spell out the problem clearly.
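To make the "no header" point concrete, here is a minimal sketch in Python that builds a tiny .COM image by hand. It assumes the usual DOS convention that the image is loaded at offset 100h of its segment, so the address of the string is computed relative to that. The helper name build_com is made up for illustration.

```python
import struct

ORG = 0x100  # assumed load offset of a .COM image under DOS


def build_com(message: bytes) -> bytes:
    """Return a raw .COM image that prints `message` via DOS int 21h, AH=09h."""
    text = message + b"$"        # AH=09h prints up to a '$' terminator
    code_size = 2 + 3 + 2 + 1    # sizes of the four instructions below
    msg_addr = ORG + code_size   # the string sits directly after the code
    code = (
        b"\xB4\x09"                              # mov ah, 09h
        + b"\xBA" + struct.pack("<H", msg_addr)  # mov dx, msg_addr (little-endian)
        + b"\xCD\x21"                            # int 21h
        + b"\xC3"                                # ret (near return ends the program)
    )
    assert len(code) == code_size
    return code + text


if __name__ == "__main__":
    # The whole file is just these bytes: no header, no sections.
    print(build_com(b"Hi").hex(" "))
```

Writing the returned bytes to a file opened in binary mode should give something DOS (or an emulator such as DOSBox) can run directly.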
Help the people of Laos by liking - https://www.facebook.com/TheSBInitiative/?ref=py_c
MSB-OS: http://www.magicschoolbook.com/computing/os-project - direct machine code programming
Re: Writing an Assembler - How?
DavidCooper wrote: You don't understand how they seem? You need to state that more clearly so that people have a chance of working out what you mean. It appears that a .com file (http://en.wikipedia.org/wiki/COM_file) has no header, so all you have to do is fill it with binary code, the first byte of which will be loaded at offset 100h within a 64K segment, directly after the 256-byte PSP that DOS sets up. If you're capable of writing an assembler, I can't imagine what you're now having difficulty with, so you need to spell out the problem clearly.
OK, I was wrong, sorry... I should have leave the "COM-part" out from my previous message... So, the main question is: could you tell me, how do the structure of the i8086 instruction codes seem?
- Combuster
- Member
- Posts: 9301
- Joined: Wed Oct 18, 2006 3:45 am
- Libera.chat IRC: [com]buster
- Location: On the balcony, where I can actually keep 1½m distance
Re: Writing an Assembler - How?
Have you tried the processor manual? The current one shows exactly how instructions are encoded. The only thing you need beyond that is a list of the instructions valid on the 8086/8088, since that set is far smaller than the current spectrum of instructions.
- DavidCooper
- Member
- Posts: 1150
- Joined: Wed Oct 27, 2010 4:53 pm
- Location: Scotland
Re: Writing an Assembler - How?
mioline wrote: OK, I was wrong, sorry... I should have leave the "COM-part" out from my previous message...
No, you weren't wrong and have nothing to apologise for. The problem is that you don't seem to know what the word "seem" means, so it's very hard to make sense of your question. If I try to answer it literally, this happens:
mioline wrote: So, the main question is: could you tell me, how do the structure of the i8086 instruction codes seem?
They seem very attractive to me. I love the way they've been organised into gorgeous groups of similar numbers with similar functions.
Are you trying to ask for a map of them showing how they are arranged?
Edit: I've had a look round, and it's hard to find anything that sets it out in a friendly way. This site isn't too bad as an introduction: http://courses.engr.illinois.edu/ece390 ... codes.html - I started out with this document over a decade ago and built up my own map of the instructions from it. Another version of it can be found here (http://www.scribd.com/doc/67624438/8086-OPCODE), but with the addition of an incomplete map at the top. If you need more detail than that document provides, that's when you should turn to the Intel processor manuals.
Last edited by DavidCooper on Fri Nov 18, 2011 5:42 pm, edited 1 time in total.
Re: Writing an Assembler - How?
DavidCooper wrote: [mioline wrote: So, the main question is: could you tell me, how do the structure of the i8086 instruction codes seem?] They seem very attractive to me. I love the way they've been organised into gorgeous groups of similar numbers with similar functions. Are you trying to ask for a map of them showing how they are arranged?
Oh my God! I should learn more English grammar. (And maybe semantics?)
So, I used the google and I found a lot of x86 code tables and I saw the official manual too, but I don't understand how the machine code instructions are built (for example: I can't understand what the SIB/Mod/RM byte is).
Re: Writing an Assembler - How?
The Intel manuals have it all.
If a trainstation is where trains stop, what is a workstation ?
Re: Writing an Assembler - How?
It might be a good idea for you to start small and implement just two or three instructions first. The Intel manuals can be difficult to read if English is not your first language.
Get back to work!
Github
- DavidCooper
- Member
- Posts: 1150
- Joined: Wed Oct 27, 2010 4:53 pm
- Location: Scotland
Re: Writing an Assembler - How?
mioline wrote: So, I used the google and I found a lot of x86 code tables and I saw the official manual too, but I don't understand how the machine code instructions are built (for example: I can't understand what the SIB/Mod/RM byte is).
I added a bit to the end of my previous post pointing towards a particular site which probably gives you the easiest way in: http://courses.engr.illinois.edu/ece390 ... codes.html. Even so, it takes a lot of re-reading to understand it all.
If you're using the 32-bit addressing mode, most oorrrmmm bytes (the Mod/RM byte: oo = mode, rrr = register, mmm = register/memory) use the mmm part to select the register which points to the memory to be used. But when the oorrrmmm byte has its mmm part set to 100 and the oo part is less than 11, an extra instruction byte (the SIB byte) has to follow to specify which register is used to point at memory, and there is room left over in that byte for another register to be used as a scaled index, as well as a scale factor. The top two bits are the scale factor (00=x1, 01=x2, 10=x4, 11=x8), the next three bits are the register containing the index to be multiplied by 1/2/4/8, and the bottom three bits indicate the base register, the one which would have been selected by the mmm part of oorrrmmm if it hadn't been set to 100 instead.
Eg: mov EAX,[EBX+2*ECX] becomes the three bytes 139, 4, 75 (decimal values - sorry, but that's the way I write machine code). The 75 in binary is 01,001,011, so the scale factor is 2 (the 01 part), the index to be scaled by that is ECX (the 001), and the result has to be added to the value in EBX (the 011) to get the required memory address.
[Edit: crucial typing error corrected - I'd typed EBX instead of ECX at the top.]
And SIB stands for Scale-Index-Base, after the three fields packed into that byte.
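The bit packing described above can be sketched in a few lines of Python. This is only an illustration under some assumptions: 32-bit code (so no address-size prefix is needed), no displacement, and opcode 8Bh for the register-from-memory form of MOV. The function and table names are made up for the example.

```python
# Register numbers as used in the rrr / mmm / index / base fields
REG32 = {"EAX": 0, "ECX": 1, "EDX": 2, "EBX": 3,
         "ESP": 4, "EBP": 5, "ESI": 6, "EDI": 7}
SCALE_BITS = {1: 0b00, 2: 0b01, 4: 0b10, 8: 0b11}


def modrm(mod: int, reg: int, rm: int) -> int:
    """Pack the oo, rrr and mmm fields into one byte."""
    return (mod << 6) | (reg << 3) | rm


def sib(scale: int, index: str, base: str) -> int:
    """Pack scale factor, index register and base register into one byte."""
    return (SCALE_BITS[scale] << 6) | (REG32[index] << 3) | REG32[base]


def mov_r32_base_index(dest: str, base: str, index: str, scale: int) -> list:
    """Encode mov dest,[base+scale*index] with no displacement."""
    if index == "ESP":
        raise ValueError("ESP cannot be used as an index register")
    if base == "EBP":
        raise ValueError("with oo=00, base 101 means disp32 rather than EBP")
    # oo=00 (no displacement), mmm=100 (an SIB byte follows)
    return [0x8B, modrm(0b00, REG32[dest], 0b100), sib(scale, index, base)]


if __name__ == "__main__":
    print(mov_r32_base_index("EAX", "EBX", "ECX", 2))  # [139, 4, 75]
```

The two special cases rejected here (ESP as index, EBP as base without a displacement) are exactly the kind of exception the tables in the manuals spell out, so a real assembler would handle them rather than raise.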
Re: Writing an Assembler - How?
This is what we found useful with ours:
http://www.sandpile.org/
http://ref.x86asm.net/
Recognize the patterns in the ModRM and SIB tables and you will see that they really aren't that complex. The ModRM byte stores the operands and the addressing mode. When an SIB byte follows, the ModRM addressing mode is [sib+displacement], which allows you to combine an SIB addressing scheme with a displacement, if any. SIB is only available with 32-bit addressing (386 and later), however, so it does not apply to the 8086.
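One way to see those patterns is to decode a few ModRM bytes by hand. The sketch below does this in Python for the 16-bit addressing forms that the 8086 uses. It assumes word-sized operands (for byte operations the register names would be AL, CL and so on), and the function names are invented for the example.

```python
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]
# Memory forms selected by the r/m field when mod is not 11
RM16 = ["BX+SI", "BX+DI", "BP+SI", "BP+DI", "SI", "DI", "BP", "BX"]


def split_modrm(byte: int) -> tuple:
    """Split a ModRM byte into its (mod, reg, rm) fields."""
    return byte >> 6, (byte >> 3) & 7, byte & 7


def describe_modrm16(byte: int) -> str:
    """Describe a ModRM byte under 16-bit addressing, word operands assumed."""
    mod, reg, rm = split_modrm(byte)
    if mod == 0b11:
        operand = REG16[rm]              # r/m names a register directly
    elif mod == 0b00 and rm == 0b110:
        operand = "[disp16]"             # special case: plain 16-bit address
    else:
        disp = ["", "+disp8", "+disp16"][mod]
        operand = f"[{RM16[rm]}{disp}]"
    return f"reg={REG16[reg]}, r/m={operand}"


if __name__ == "__main__":
    for b in (0xD8, 0x06, 0x47):
        print(f"{b:02X}: {describe_modrm16(b)}")
```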
OS Development Series | Wiki | os | ncc
char c[2]={"\x90\xC3"};int main(){void(*f)()=(void(__cdecl*)(void))(void*)&c;f();}
Re: Writing an Assembler - How?
mioline wrote: So, I used the google and I found a lot of x86 code tables and I saw the official manual too, but I don't understand how the machine code instructions are built (for example: I can't understand what the SIB/Mod/RM byte is).
When the processor encounters a command like "ADD", it needs to know "add what?". That's what the mod/rm byte is; it specifies what data the preceding command works on.
The SIB byte does the same thing but if you're just writing for the 8086 you can ignore it, as that's only 386+.
You may want to try using an existing assembler and viewing the output from a few simple commands with a hex editor. Look at how it's making the mod/rm byte in relation to the commands used and compare it to the information on the mod/rm byte. Doing it like that kinda helps it all make sense.
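As a starting point for that kind of comparison, here is a rough Python sketch of an "assembler" that handles only three register-to-register instructions on 16-bit registers. It assumes the "r/m16, r16" opcode form (01h for ADD, 29h for SUB, 89h for MOV); an existing assembler may legitimately pick the other direction form, so its bytes can differ while meaning the same thing. The function name assemble_reg_reg is made up.

```python
REG16 = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}
# Opcodes for the "r/m16, r16" form: destination in r/m, source in reg
OPCODES = {"add": 0x01, "sub": 0x29, "mov": 0x89}


def assemble_reg_reg(line: str) -> bytes:
    """Assemble e.g. 'add ax, bx' into an opcode byte plus a mod/rm byte."""
    mnemonic, operands = line.lower().split(None, 1)
    dest, src = (op.strip() for op in operands.split(","))
    # mod=11 means both operands are registers
    modrm = (0b11 << 6) | (REG16[src] << 3) | REG16[dest]
    return bytes([OPCODES[mnemonic], modrm])


if __name__ == "__main__":
    for source in ("add ax, bx", "sub cx, dx", "mov si, di"):
        print(f"{source:12} -> {assemble_reg_reg(source).hex(' ')}")
```

Comparing these bytes against a hex dump from a real assembler, one instruction at a time, shows quickly which field of the mod/rm byte each operand ends up in.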
Re: Writing an Assembler - How?
I recommend The Art of Assembly, Chapter 4.7, The 80x86 MOV Instruction.
Actually, I recommend the whole book, as its introductory chapters go into great detail about the underlying logic and how the x86 opcodes came to be.
Every good solution is obvious once you've found it.
- bitshifter
- Member
- Posts: 50
- Joined: Sun Sep 20, 2009 4:03 pm
Re: Writing an Assembler - How?
I put my reply in this attachment due to formatting problems...
- Attachment: encode.txt (1.2 KiB)
- Love4Boobies
- Member
- Posts: 2111
- Joined: Fri Mar 07, 2008 5:36 pm
- Location: Bucharest, Romania
Re: Writing an Assembler - How?
bitshifter wrote: I put my reply in this attachment due to formatting problems...
You should have used the [code] tag.
(This thread was too long for me to read right now so that's all I have to say :-))
"Computers in the future may weigh no more than 1.5 tons.", Popular Mechanics (1949)
[ Project UDI ]