
Posted: Thu Nov 08, 2007 1:01 am
by Candy
JAAman wrote:i have done that... wrote a section in ASM, then translated it into hex -- it's quite fun actually, and after a while, you find patterns which give incite into the instruction-set design

quite enjoyable, although quite time-consuming also...
I would prefer if it incited insight.

Posted: Thu Nov 08, 2007 6:51 am
by crazygray
I know it would take way too long to write an entire OS in hex; it's just the bootloader that I am going to write that way.

Posted: Thu Nov 08, 2007 3:39 pm
by Tyler
Am I the only one who thinks "writing in hex" sounds as stupid as the classic "computers speak binary"? It's like saying you speak another language if you use a different set of symbols to represent the same alphabet.

Posted: Thu Nov 08, 2007 4:30 pm
by crazygray
It is not exactly the same; there are some instructions I haven't been able to use with an assembler. I enjoy it anyway, it's an interesting thing to do. :D

Posted: Fri Nov 09, 2007 12:31 am
by os64dev
crazygray wrote:It is not exactly the same; there are some instructions I haven't been able to use with an assembler. I enjoy it anyway, it's an interesting thing to do. :D
I use GAS and have yet to find an instruction I couldn't make.

Posted: Fri Nov 09, 2007 10:13 am
by exkor
There are some undocumented instructions such as SALC (Set AL on Carry).
Some assemblers don't support long jumps; I don't know how it is with GAS.

But the real use of hex is still in self-modifying code or in patching code (which is about the same thing). It will cost you some execution speed, though (the Pentium 4, for instance, invalidates several code cache lines).

Posted: Fri Nov 09, 2007 10:29 am
by Brynet-Inc
Most assemblers allow you to use SALC...

With GAS you can use .byte 0xd6; YASM/NASM and FASM should all support the opcode mnemonic salc.

Later versions of GAS may support the SALC mnemonic, but I'm using the version bundled with OpenBSD.

I still don't see any benefits of writing out opcodes manually though, even assembly code is barely maintainable :P
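
For anyone who has not met SALC: as far as I know it is a one-byte undocumented opcode (0xD6) that sets AL to 0xFF when the carry flag is set and to 0x00 otherwise. A tiny Python model of that behaviour; the names `SALC_OPCODE` and `salc` are made up for illustration:

```python
SALC_OPCODE = 0xD6  # the same byte that `.byte 0xd6` emits


def salc(carry_flag: bool) -> int:
    """Model of SALC: return the new value of AL for a given carry flag."""
    return 0xFF if carry_flag else 0x00
```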

Posted: Fri Nov 09, 2007 3:52 pm
by crazygray
I don't think there is really a benefit to writing the code in hex, but I think it is sort of a learning experience.

Posted: Sat Nov 10, 2007 3:58 pm
by crazygray
I wouldn't know about GAS; I've never used it before.

Posted: Sun Nov 11, 2007 11:32 am
by JAAman
it's not that there is a benefit to doing it, it's simply fun! have you never done anything just because it's fun?

it is also a good exercise, as i mentioned, you can learn from it as well, but the real reason to do it is because it's fun

Posted: Sun Nov 11, 2007 11:40 am
by cyr1x
If you can remember all the hex numbers then it should be as easy as ASM.

Posted: Sun Nov 11, 2007 1:57 pm
by JAAman
not really... since there aren't numbers to represent the ASM -- instead you have partial numbers, with bitfields to represent registers and methods of addressing memory -- so you don't have to memorize numbers, but patterns, and unless you are really skilled at arithmetic and binary/hex conversion in your head, you will need to do it with a calculator in your hand (or running on the computer...)

first you have a selection of override codes, then each instruction opcode can be 1, 2 or 3 bytes in length (longer with certain escape codes, but those aren't common), plus some instructions will have a SIB byte, some will have a mod/rm byte, and then some will have immediate/offset/displacement data which (depending on the formation of the mod/rm & SIB bytes, and the particular instruction) can be 1, 2, 4, 6, 8, or 10 bytes -- unless there are some which require more than that...

the overrides are easy, there are only a few of them, and they are always the same, so all you have to do is remember how they affect each instruction (some are not exactly obvious -- such as a16 affecting the size of eCX in loop instructions...)
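
A rough Python sketch of that layout, gluing the parts together in the order they appear in memory. The helper `assemble` and its parameter names are hypothetical; the example bytes are for `add cx, 0x1234` in 32-bit code (operand-size override 66, opcode 81 with /0 meaning ADD, mod/rm C1 selecting CX, then a 16-bit immediate):

```python
def assemble(prefixes=b"", opcode=b"", modrm=None, sib=None, disp=b"", imm=b""):
    """Concatenate the parts of one x86 instruction in encoding order."""
    out = bytearray(prefixes)   # override/prefix bytes come first
    out += opcode               # 1 to 3 opcode bytes
    if modrm is not None:       # optional mod/rm byte
        out.append(modrm)
    if sib is not None:         # optional SIB byte
        out.append(sib)
    out += disp                 # optional displacement, little-endian
    out += imm                  # optional immediate, little-endian
    return bytes(out)


# add cx, 0x1234 assembled for 32-bit code
code = assemble(prefixes=b"\x66", opcode=b"\x81", modrm=0xC1,
                imm=(0x1234).to_bytes(2, "little"))
print(code.hex(" "))
```

Not every instruction uses every slot, which is why most of the parameters default to empty.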

many of the opcodes, however, contain bitfields, which must be filled in with the proper size, direction, register, etc fields, for the particular instruction
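
As one concrete case of such bitfields: in the classic "op r/m, reg" family (ADD at 00-03, XOR at 30-33, MOV at 88-8B and so on), bit 0 is the size bit and bit 1 is the direction bit. A small sketch that pulls them apart; `split_dw` is a made-up helper and it only makes sense for opcodes in that family:

```python
def split_dw(opcode: int):
    """Split an 'op r/m, reg' style opcode into (base, d, w)."""
    w = opcode & 1         # 0 = 8-bit operands, 1 = 16/32-bit operands
    d = (opcode >> 1) & 1  # 0 = reg field is the source, 1 = reg field is the destination
    base = opcode & 0xFC   # the opcode with both bits cleared
    return base, d, w


print(split_dw(0x31))  # xor r/m16, r16
print(split_dw(0x8B))  # mov r16, r/m16
```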

then the mod/rm byte is very complicated, with 3 bit fields, some of which change meaning based on the specific entries in other fields, and others are not permitted on specific instructions, and some instructions only use some of the fields, with the others containing instruction-specific entries -- and not all instructions have a mod/rm byte at all...
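
The three fields are mod (top 2 bits), reg (middle 3 bits) and r/m (low 3 bits). A minimal sketch for packing and unpacking them; the function names are only for illustration:

```python
def split_modrm(byte: int):
    """Return the (mod, reg, rm) fields of a mod/rm byte."""
    mod = byte >> 6        # 3 = r/m is a register, 0..2 = memory addressing forms
    reg = (byte >> 3) & 7  # a register number, or an opcode extension for some opcodes
    rm = byte & 7          # register number or addressing form, depending on mod
    return mod, reg, rm


def make_modrm(mod: int, reg: int, rm: int) -> int:
    """Pack mod, reg and rm back into a single byte."""
    return (mod << 6) | (reg << 3) | rm
```

For example, `split_modrm(0xC0)` gives mod=3, reg=0, rm=0, which is the "ax, ax" in the 16-bit `xor ax, ax` encoding 31 C0.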

i don't have enough experience with the SIB byte to say much, but it's not always there -- its presence is dependent on the specific instruction, and the entries in the mod/rm byte... and it contains more variable bitfields

then the immediate data -- this isn't present for every combination of opcode/mod/rm/SIB -- whether it is present, how large it is, how it is encoded, and what it means all depend on the specific combination of these

in all, there are a lot of bitfields, and many contain the same information, encoded in different ways (for example, there are 2 separate bitfields for encoding segment registers -- some instructions use a 2-bit bitfield, which can only encode SS/DS/CS/ES, and others use a 3-bit bitfield which can encode SS/DS/CS/ES/FS/GS and 2 reserved combinations)

so it's a lot more complicated than just memorizing the opcodes...
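
For what it's worth, the SIB byte has the same 2-3-3 shape as mod/rm: scale, index, base. In 32-bit addressing it follows the mod/rm byte when mod is not 3 and r/m is 100. A sketch with a hypothetical `make_sib` helper, hand-assembling `mov eax, [ebx + ecx*4 + 8]`:

```python
EAX, ECX, EDX, EBX = 0, 1, 2, 3  # 32-bit register numbers


def make_sib(scale: int, index: int, base: int) -> int:
    """Pack a SIB byte; scale is the multiplier 1, 2, 4 or 8."""
    ss = {1: 0, 2: 1, 4: 2, 8: 3}[scale]
    return (ss << 6) | (index << 3) | base


# mov eax, [ebx + ecx*4 + 8]
modrm = (1 << 6) | (EAX << 3) | 4  # mod=01 (8-bit displacement), reg=EAX, r/m=100 (SIB follows)
sib = make_sib(4, ECX, EBX)
code = bytes([0x8B, modrm, sib, 0x08])
print(code.hex(" "))
```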
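
To make the segment register example concrete, here is a sketch of both encodings. The one-byte PUSH ES/CS/SS/DS forms carry the 2-bit field inside the opcode, while `mov Sreg, ax` (opcode 8E) puts the 3-bit number in the reg field of the mod/rm byte. The table and function names are invented for the example:

```python
SREG3 = {"ES": 0, "CS": 1, "SS": 2, "DS": 3, "FS": 4, "GS": 5}  # 6 and 7 are reserved
SREG2 = {name: n for name, n in SREG3.items() if n < 4}         # only ES/CS/SS/DS fit in 2 bits


def push_sreg(name: str) -> bytes:
    """One-byte PUSH Sreg: bit pattern 000 ss 110."""
    return bytes([0x06 | (SREG2[name] << 3)])


def mov_sreg_from_ax(name: str) -> bytes:
    """MOV Sreg, AX: opcode 8E, mod/rm with mod=11, reg=Sreg, r/m=AX (0)."""
    return bytes([0x8E, 0xC0 | (SREG3[name] << 3)])
```

So `push_sreg("DS")` gives 1E and `mov_sreg_from_ax("DS")` gives 8E D8, while `push_sreg("FS")` raises a KeyError because FS has no 2-bit encoding.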

Posted: Sun Nov 11, 2007 6:16 pm
by crazygray
so it's a lot more complicated than just memorizing the opcodes...

Posted: Sun Nov 11, 2007 8:15 pm
by Dkelly
Brendan wrote:Hi,
something as simple as inserting a few instructions into existing code would involve searching for all CALL, JMP and branch instructions and adjusting the target addresses,
Actually it doesn't... (for the most part), and whoever can tell me why, I'll believe you've hand-coded machine language :)

Dan K
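
One relevant fact, which may or may not be the answer being fished for here: near CALL/JMP and the conditional branches store a displacement relative to the next instruction, not an absolute address. A branch therefore only needs re-encoding when the inserted bytes land between it and its target. A sketch for the short jump (opcode EB, 8-bit displacement); `jmp_short` is a hypothetical helper:

```python
def jmp_short(at: int, target: int) -> bytes:
    """Encode JMP rel8 for a jump instruction located at address `at`."""
    rel = target - (at + 2)  # displacement counts from the end of this 2-byte jump
    if not -128 <= rel <= 127:
        raise ValueError("target out of range for a short jump")
    return bytes([0xEB, rel & 0xFF])


before = jmp_short(0x100, 0x110)  # original code
moved = jmp_short(0x105, 0x115)   # 5 bytes inserted ahead of both jump and target
split = jmp_short(0x100, 0x113)   # 3 bytes inserted between jump and target
```

`before` and `moved` come out as the same two bytes; only `split` changes.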

Posted: Sun Nov 11, 2007 8:39 pm
by Dkelly
JAAman wrote:not really... since there aren't numbers to represent the ASM -- instead you have partial numbers, with bitfields to represent registers and methods of addressing memory -- so you don't have to memorize numbers, but patterns, and unless you are really skilled at arithmetic and binary/hex conversion in your head, you will need to do it with a calculator in your hand (or running on the computer...)
Nice overview :)

For the most part though, when coding by hand, you just remember that xor ax,ax is 31 C0, and mov cx, (word) is B9 (word). The bit fields are rarely if ever thought of... in fact there are references that don't even mention them, they just translate every possible permutation into its hex counterpart.

It's all rather pointless though; unless you're prone to making statements like "My dad can beat up your dad", you're better off using an assembler, and there's really no reason not to... I mean, every PC that's shipped with Microsoft software has shipped with an assembler... and for the opcodes it doesn't support, you can drop back to inserting a few dbs as above.
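
Those two memorized encodings are exactly what the bitfields produce, which can be checked with a short Python sketch. It assumes 16-bit code, where B8+reg is followed by a two-byte immediate stored low byte first; the table and function names are invented for the example:

```python
REG16 = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}


def xor_r16_r16(dst: str, src: str) -> bytes:
    """XOR r/m16, r16: opcode 31, mod/rm with mod=11, reg=src, r/m=dst."""
    return bytes([0x31, 0xC0 | (REG16[src] << 3) | REG16[dst]])


def mov_r16_imm16(dst: str, value: int) -> bytes:
    """MOV r16, imm16: opcode B8+reg, then the value low byte first."""
    return bytes([0xB8 | REG16[dst]]) + value.to_bytes(2, "little")
```

`xor_r16_r16("ax", "ax")` gives 31 C0 and `mov_r16_imm16("cx", 0x1234)` gives B9 34 12, so a lookup table and the bitfield rules agree; the table is just the rules expanded ahead of time.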

Dan K