Learning GNU AT&T Syntax Assembler ?

Perica
Member
Posts: 454
Joined: Sat Nov 25, 2006 12:50 am

Learning GNU AT&T Syntax Assembler ?

Post by Perica »

..
Last edited by Perica on Sun Dec 03, 2006 9:21 pm, edited 1 time in total.
Tim

Re:Learning GNU AT&T Syntax Assembler ?

Post by Tim »

I learned by writing code using Intel syntax, assembling it using NASM, then disassembling it using objdump.
Perica

Re:Learning GNU AT&T Syntax Assembler ?

Post by Perica »

..
Last edited by Perica on Sun Dec 03, 2006 9:22 pm, edited 1 time in total.
Tim

Re:Learning GNU AT&T Syntax Assembler ?

Post by Tim »

There's a short guide on the DJGPP site:

http://www.delorie.com/djgpp/v2faq/faq17_1.html
Schol-R-LEA

Re:Learning GNU AT&T Syntax Assembler ?

Post by Schol-R-LEA »

While I am far more comfortable with the Intel syntax out of long habit, I have to say the AT&T syntax is actually simpler and more regular than the Intel syntax is. Also, if you ever have to write assembly code on anything other than an IA-32 or IA-64, you'll probably be using an AT&T type assembler.

You will probably want to read the manual for the full details, but here are a few key things to look for:

[*]AT&T-syntax code is case sensitive: "MOVL" is not the same as "movl".

[*]Number bases are indicated the same as in C: 1 in decimal, 01 in octal, 0x01 in hex.

[*]C-style escape codes are used for special characters: \n, \", \#, \\, etc.

[*]C-style comments ("/*", "*/") are used, as are shell-style line comments (beginning from '#' and going to the end of the line).

[*]Assembler directives begin with a period; for example, ".align 4" means to align on a doubleword boundary.

[*]This applies to space-allocating directives as well: thus ".word 0x1234" is the equivalent of "DW 1234h".

[*]String space is allocated using a special directive, ".ascii"; for example,

msg: .ascii "Hello, World!\n"

The .asciz directive has the same effect, except that it automatically appends a NUL character ('\0') to the end of the string.

[*]A period by itself indicates the address of the current location in the assembly code, equivalent to the '$' symbol in Intel-syntax assemblers.

[*]The .fill directive is roughly equivalent to using 'times db' in Intel syntax. Thus,

.fill 0x1fe - (. - START) , 1, 0

(where '1' is the size in bytes of each fill unit, '0' is the fill value, and START is a label marking the entry point of the code) is equal to

times 1FEh - ($-$$) db 0

The .skip and .space directives can be used in a similar manner.

[*]The .org directive can be used multiple times, allowing the location counter to be set to a specified location, as in:

.org 0x1fe + START

(where START is a label marking the entry point of the code). The location-assignment directive, ".=", can be used in the same manner.

[*]The equivalents to [BITS 16] and [BITS 32] directives are .code16 and .code32, respectively.

[*]The .arch directive allows you to select the target CPU; it is a Good Idea to set it, even if you are sure that the default is 'i386'.

[*]While as itself has no C-style preprocessor (it does have its own .macro directive), either cpp or m4 can be used for macro processing; gcc runs cpp automatically on source files whose names end in a capital .S.

[*]Label declarations always end in a colon.

[*]When a new identifier that doesn't end in a colon appears at the beginning of a line, it is assumed to be part of an equivalence statement, and must be followed by an equals sign and an assigned value. For example:

FOO = 0xF00

[*]An instruction can be ended either by a newline or by a semicolon; the latter is primarily seen in macros, to allow multiple instructions on one line.

[*]A backslash ('\') acts as a line continuation, just as in C; this also is mostly used in macros.

[*]Registers are always prefixed with a percent sign: %eax, %cs, %sp, etc.

[*] Move, load, store, etc. operations are always 'source, destination', which is opposite of most (but not all) Intel instructions; thus "movl %eax, %ebx" moves the value of %eax into %ebx. This is the part that seems to confuse people the most, but in fact, the AT&T syntax is more consistent on this point than the Intel syntax is.

[*]Operand sizes are always suffixed to instructions (with the exception of ljmp, lcall, and lret on the x86): movb for move byte, movw for move word, movl for move long, etc.

[*]Operands with no prefix are treated as direct-address operands; thus, "movl foo, %eax" moves the contents of memory location "foo" into %eax.

[*]Immediate operands are prefixed with a dollar sign ($) : "pushl $4" pushes 0x00000004 onto the stack. This applies to labels as well: "movl $foo, %eax" moves the value of the label foo (that is, the address of variable foo) into %eax.


[*]Indexed or indirect operands are used in the format:
segment:displacement(base, index, scale)
like so:
movl %eax, %ss:8(%ebp, %esi, 4)

is equivalent to

mov dword [ss:ebp + esi * 4 + 8], eax

that is, it moves the value of %eax to offset (%ebp + (%esi * 4) + 8) in segment %ss. The base and index must be registers, and the scale must be 1, 2, 4 or 8. Any of the five components of an indirect address may be omitted.

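[*]Jump and call instructions with a plain label operand are direct: the target is encoded relative to the current location. For an indirect jump or call, where the absolute target address is taken from a register or memory operand, the operand must be prefixed with an asterisk (*), as in "jmp *%eax". Far jumps, calls and returns must use the special 'ljmp', 'lcall' and 'lret' instructions.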
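
To pull several of the points above together, here is a small side-by-side sketch, with each AT&T line followed by a rough Intel (NASM-style) equivalent in the comment. The labels START, target, foo and msg are made up for illustration, and the fragment is only meant to show the syntax; it does not do anything useful if run.

```asm
# Sketch only: START, target, foo and msg are hypothetical labels.
        .arch   i386                    # select the target CPU explicitly
        .code32                         # NASM: [BITS 32]
        .text
START:
        movl    $4, %eax                # mov eax, 4          (immediate)
        movl    foo, %eax               # mov eax, [foo]      (contents of foo)
        movl    $foo, %eax              # mov eax, foo        (address of foo)
        movb    (%esi), %al             # mov al, [esi]       (base register only)
        movl    %eax, %ss:8(%ebp, %esi, 4)  # mov [ss:ebp + esi*4 + 8], eax
        pushl   $4                      # push dword 4
        jmp     target                  # jmp target          (direct, relative)
        jmp     *%eax                   # jmp eax             (indirect via register)
        ljmp    $0x08, $target          # jmp 0x08:target     (far jump)
target:
        ret

        .data
foo:    .long   0x1234                  # dd 1234h
msg:    .asciz  "Hello, World!\n"       # db "Hello, World!", 10, 0
FOO = 0xF00                             # FOO equ 0F00h
```

On an x86-64 build of GNU as, I would expect this to need the --32 option (for example "as --32 example.s -o example.o", where example.s is just an assumed file name); running objdump -d on the result, as Tim suggested, is a good way to check how each line was encoded.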

Sources:
DJGPP AT&T Assembly Tutorial
Linux Assembly HOWTO
GAS/AS End User Help Project
Using as
erebus-

Re:Learning GNU AT&T Syntax Assembler ?

Post by erebus- »

The best book I could find on my journey. It might prove useful:

Programming from the Ground Up,

pdf:
Byteserving link

html:
HTML link

[edit]Use short links![/edit]
Rajesh Mathew

Re:Learning GNU AT&T Syntax Assembler ?

Post by Rajesh Mathew »

Perica wrote:
Tim Robinson wrote:I learned by writing code using Intel syntax, assembling it using NASM, then disassembling it using objdump.
Yeah, I can do this too. But it would be time-consuming, and some things might not be easy to pick up by assembling and then disassembling code. Are there any documents or tutorials out there on learning AT&T syntax assembler?
Neo
Member
Posts: 842
Joined: Wed Oct 18, 2006 9:01 am

Re:Learning GNU AT&T Syntax Assembler ?

Post by Neo »

Schol-R-LEA wrote: ........but here are a few key things to look for:

[*]AT&T-syntax code is case sensitive: "MOVL" is not the same as "movl".

[*]Number bases are indicated the same as in C: 1 in decimal, 01 in octal, 0x01 in hex.

[*]C-style escape codes are used for special characters: \n, \", \#, \\, etc.

[*]C-style comments ("/*", "*/") are used, as are shell-style line comments (beginning from '#' and going to the end of the line).
..................
Are these guidelines for usage of 'as/gas' or are these specifications of the AT&T syntax?