What is the structure of a.out executable?

Posted: Fri Sep 25, 2020 9:35 am
by antoni
I want to implement some executable format because:

A) Flat binaries are horrible.
B) I want to write programs in C.

I chose a.out. The only reasonable source I found was this:

https://www.freebsd.org/cgi/man.cgi?query=a.out&apropos=0&sektion=0&manpath=NetBSD+1.4&format=html

That page describes the structures used in a.out in great detail and says that the file starts with the exec structure. Unfortunately, it does not say where the rest of the structures are located in the file. Moreover, the sizes read from the exec header do not add up to the total file size. The file is 80 bytes long. The text segment is 20 bytes, the symbol table is 12 bytes, and the text relocation table is 8 bytes. The exec structure itself takes 32 bytes, which accounts for only 72 bytes, so the file is 8 bytes longer than it should be. I obtained this file by assembling this code with NASM:

BITS 32

mov edi, msg
mov eax, 0x125716
call eax
ret

msg:
db 'TEST', 10, 0
with this command:

nasm -f aout aout_test.asm -o aout_test
Here is the hexdump:

0000000 0107 0064 0014 0000 0000 0000 0000 0000
0000010 000c 0000 0000 0000 0008 0000 0000 0000
0000020 0dbf 0000 b800 5716 0012 d0ff 54c3 5345
0000030 0a54 9000 0001 0000 0004 0400 0004 0000
0000040 0004 0000 000d 0000 0008 0000 736d 0067
0000050
The exec header of this file:

{a_midmag = 6553863, a_text = 20, a_data = 0, a_bss = 0, a_syms = 12, a_entry = 0, a_trsize = 8, a_drsize = 0}
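
For reference, the header dump above can be reproduced with a few lines of C. This is only a minimal sketch, assuming the little-endian layout of eight 32-bit fields seen in this particular file; the names aout_exec, rd32le and aout_parse_header are made up for illustration and do not come from any real header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define AOUT_HDR_SIZE 32u

/* The eight 32-bit fields of the exec header, in file order. */
struct aout_exec {
    uint32_t a_midmag;  /* magic number and machine ID */
    uint32_t a_text;    /* text segment size in bytes */
    uint32_t a_data;    /* data segment size in bytes */
    uint32_t a_bss;     /* bss size in bytes */
    uint32_t a_syms;    /* symbol table size in bytes */
    uint32_t a_entry;   /* entry point */
    uint32_t a_trsize;  /* text relocation table size in bytes */
    uint32_t a_drsize;  /* data relocation table size in bytes */
};

/* Read a little-endian 32-bit value regardless of host byte order. */
uint32_t rd32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the header found in the first 32 bytes of the file. */
struct aout_exec aout_parse_header(const uint8_t *file, size_t len)
{
    struct aout_exec h;
    assert(len >= AOUT_HDR_SIZE);
    h.a_midmag = rd32le(file + 0);
    h.a_text   = rd32le(file + 4);
    h.a_data   = rd32le(file + 8);
    h.a_bss    = rd32le(file + 12);
    h.a_syms   = rd32le(file + 16);
    h.a_entry  = rd32le(file + 20);
    h.a_trsize = rd32le(file + 24);
    h.a_drsize = rd32le(file + 28);
    return h;
}

/* First 32 bytes of aout_test, transcribed from the hexdump above.
 * hexdump prints little-endian 16-bit words, so "0107" is the
 * byte pair 07 01 in the file. */
const uint8_t aout_test_header[AOUT_HDR_SIZE] = {
    0x07, 0x01, 0x64, 0x00,   /* a_midmag = 0x00640107 */
    0x14, 0x00, 0x00, 0x00,   /* a_text   = 20 */
    0x00, 0x00, 0x00, 0x00,   /* a_data   = 0 */
    0x00, 0x00, 0x00, 0x00,   /* a_bss    = 0 */
    0x0c, 0x00, 0x00, 0x00,   /* a_syms   = 12 */
    0x00, 0x00, 0x00, 0x00,   /* a_entry  = 0 */
    0x08, 0x00, 0x00, 0x00,   /* a_trsize = 8 */
    0x00, 0x00, 0x00, 0x00    /* a_drsize = 0 */
};
```

Calling aout_parse_header(aout_test_header, sizeof aout_test_header) yields the same field values as the dump. Reading byte by byte avoids depending on struct padding or the byte order of the host.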

Re: What is the structure of a.out executable?

Posted: Fri Sep 25, 2020 9:55 am
by PeterX
I have the book "Linkers & Loaders". According to it, the a.out file header is:

a_midmag = 6553863 -> "magic" to recognize a.out
a_text = 20 -> sizeof code ("text") section
a_data = 0 -> size of data (Maybe you should specify "section .data" before the string def?)
a_bss = 0 -> size of bss
a_syms = 12 -> size of the symbol table
a_entry = 0 -> offset to start point of the code (I'm a bit puzzled that this is zero. Maybe you must specify it? Or maybe it is because "section .text" is missing? Or is it relative to text-start?)
a_trsize = 8 -> text relocation size
a_drsize = 0 -> data relocation size
[All values are 32bit.]
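The sections containing the actual "stuff" are simply concatenated, right after the file header. Note that bss takes no space in the file, and that the relocation info comes before the symbol table (the hexdump above confirms this: the 8 bytes of text relocation data sit at offset 0x34, before the 12-byte symbol entry at 0x3c):

code
data
text reloc info
data reloc info
symbol table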
EDIT: I overlooked your comment on the additional 8 bytes. Sorry. I don't know how they are accounted for.

EDIT2: I read the book and there is possibly one more header entry? Maybe string table size?

Greetings
Peter

Re: What is the structure of a.out executable?

Posted: Fri Sep 25, 2020 10:05 am
by bzt
antoni wrote:I chose a.out. The only reasonable source I found was this:
What about our wiki? It would be great if you could provide more details on that page as you progress with your development.

Cheers,
bzt

Re: What is the structure of a.out executable?

Posted: Fri Sep 25, 2020 10:10 am
by PeterX
bzt wrote:
antoni wrote:I chose a.out. The only reasonable source I found was this:
What about our wiki? It would be great if you could provide more details on that page as you progress with your development.

Cheers,
bzt
Great info! Unfortunately a.out doesn't show up in the list, so I (and probably others, too) didn't know that this page exists.

Greetings
Peter

Re: What is the structure of a.out executable?

Posted: Fri Sep 25, 2020 10:11 am
by reapersms
The excess is the string table that the symbol table references. From the man page:

"The string table consists of an unsigned long length followed by null-terminated symbol strings. The length represents the size of the entire table in bytes, so its minimum value (or the offset of the first string) is always 4 on 32-bit machines."

So, 4 bytes for the total size of the string table, length included (00000008), and 4 bytes for the null-terminated "msg".
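
The arithmetic can be checked with a short C sketch. It assumes the layout seen in this file (header, text, data, text relocations, data relocations, symbol table, string table, with text starting right after the 32-byte header); other a.out variants may align or place the text differently. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define AOUT_HDR_SIZE 32u

/* Sizes taken from the exec header (bss is omitted: it has no file bytes). */
struct aout_sizes {
    uint32_t text, data, syms, trsize, drsize;
};

/* File offset at which each part starts. */
struct aout_layout {
    uint32_t text_off, data_off, trel_off, drel_off, sym_off, str_off;
};

/* Each part begins where the previous one ends. */
struct aout_layout aout_layout_of(struct aout_sizes s)
{
    struct aout_layout l;
    l.text_off = AOUT_HDR_SIZE;
    l.data_off = l.text_off + s.text;
    l.trel_off = l.data_off + s.data;
    l.drel_off = l.trel_off + s.trsize;
    l.sym_off  = l.drel_off + s.drsize;
    l.str_off  = l.sym_off + s.syms;
    return l;
}

/* Total file size once the string table length word has been read
 * from offset str_off. The length counts itself, so it is at least 4. */
uint32_t aout_file_size(struct aout_layout l, uint32_t strtab_len)
{
    assert(strtab_len >= 4);
    return l.str_off + strtab_len;
}
```

With text = 20, data = 0, syms = 12, trsize = 8 and drsize = 0, the offsets come out as 32, 52, 52, 60, 60 and 72, and a string table length of 8 gives 72 + 8 = 80 bytes, which matches the file.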

Re: What is the structure of a.out executable?

Posted: Mon Sep 28, 2020 5:08 am
by antoni
According to the Wikipedia page on executable formats, a.out has an extension for 64-bit executables. Unfortunately, the only source about a.out in its bibliography is the page from the BSD manual I mentioned. I would like to be able to run 64-bit code. Do you know of any assembler or compiler that supports this format, or an open-source OS that implements it? NASM supports only 32-bit a.out.

Re: What is the structure of a.out executable?

Posted: Mon Sep 28, 2020 8:27 am
by bzt
antoni wrote:According to the Wikipedia page on executable formats, a.out has an extension for 64-bit executables. Unfortunately, the only source about a.out in its bibliography is the page from the BSD manual I mentioned. I would like to be able to run 64-bit code. Do you know of any assembler or compiler that supports this format, or an open-source OS that implements it? NASM supports only 32-bit a.out.
According to the gcc docs, gcc, GNU as and GNU ld support 64-bit a.out.

But as far as assemblers are concerned, you can always output in binary format and add the a.out header yourself. You'll need some tricky macros, but it's absolutely doable. For example MenuetOS, which has its own, very a.out-like header for executables, does this.

As for NASM, it looks to me like it would be rather easy to add 64-bit a.out; see outaout.c. Just replace "fwriteint32_t" with "fwriteint64_t" and use proper magic values. Or use "-fbin" and create the header and sections with macros.

Cheers,
bzt
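
The "add the header yourself" idea can also be done outside the assembler. The sketch below is in C rather than assembler macros and covers only the 32-bit header discussed in this thread, since the 64-bit field layout is not given here. It reuses the a_midmag value from the aout_test file; the function name aout_wrap_flat and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define AOUT_HDR_SIZE 32u
/* Value observed in the aout_test header earlier in this thread. */
#define AOUT_MIDMAG   0x00640107u

/* Store a 32-bit value in little-endian byte order. */
void wr32le(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Prefix a flat binary with a 32-byte exec header.
 * Everything is treated as text; data, bss, symbols and
 * relocations are left at zero. Returns the number of bytes
 * written to out, or 0 if out is too small. */
size_t aout_wrap_flat(uint8_t *out, size_t out_cap,
                      const uint8_t *code, uint32_t code_len,
                      uint32_t entry)
{
    size_t total = (size_t)AOUT_HDR_SIZE + code_len;

    assert(out != NULL && code != NULL);
    if (out_cap < total)
        return 0;

    memset(out, 0, AOUT_HDR_SIZE);      /* all size fields default to 0 */
    wr32le(out + 0, AOUT_MIDMAG);       /* a_midmag */
    wr32le(out + 4, code_len);          /* a_text   */
    wr32le(out + 20, entry);            /* a_entry  */
    memcpy(out + AOUT_HDR_SIZE, code, code_len);
    return total;
}
```

A flat binary has no relocation records, so the result is only loadable at the address it was assembled for.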

Re: What is the structure of a.out executable?

Posted: Mon Sep 28, 2020 4:05 pm
by Schol-R-LEA
Just curious, but is there a reason you want to use a.out specifically? There are plenty of other Executable Formats, and since a.out is one of the simplest I can see why it may be appealing, but are you certain that it supports everything you need? For C programming, ELF and PE are both more commonly used today.

I would strongly recommend the John Levine book, Linkers and Loaders, if you can afford it (US $58 new on Amazon, last I checked). The beta version of the book and the accompanying source code are free on Levine's web page, but the beta isn't complete and has some known errors. While it is almost 25 years old now, it is still accurate for most of the standards currently used (though not for the 64-bit extensions).

Re: What is the structure of a.out executable?

Posted: Tue Sep 29, 2020 8:26 am
by antoni
ELF is difficult to implement. Certainly much more difficult than a.out. I haven't read much about PE, but it's Microsoft's format, so even if it were easy to implement, I wouldn't have a way to generate executables (I don't have Windows).

To load an a.out executable, you just need to copy the text and data sections to memory and apply relocations.
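
The copying step could look roughly like the C sketch below. It assumes the simple layout from earlier in the thread (text and data stored back to back right after the 32-byte header) and also zeroes the bss, which the header sizes but the file does not store. Relocations are not applied here, and aout_load_image is a made-up name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define AOUT_HDR_SIZE 32u

/* Read a little-endian 32-bit value regardless of host byte order. */
uint32_t rd32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Copy text and data into mem and zero the bss behind them.
 * Returns the number of bytes of mem now in use, or 0 if the
 * file is truncated or mem is too small. */
size_t aout_load_image(const uint8_t *file, size_t file_len,
                       uint8_t *mem, size_t mem_cap)
{
    uint32_t text, data, bss;
    uint64_t in_file, in_mem;

    assert(file != NULL && mem != NULL);
    if (file_len < AOUT_HDR_SIZE)
        return 0;

    text = rd32le(file + 4);    /* a_text */
    data = rd32le(file + 8);    /* a_data */
    bss  = rd32le(file + 12);   /* a_bss  */

    in_file = (uint64_t)text + data;    /* bytes present in the file */
    in_mem  = in_file + bss;            /* bytes needed in memory */
    if (in_file > file_len - AOUT_HDR_SIZE || in_mem > mem_cap)
        return 0;

    memcpy(mem, file + AOUT_HDR_SIZE, (size_t)in_file);
    memset(mem + in_file, 0, bss);
    return (size_t)in_mem;
}
```

The sums are done in 64 bits so that a corrupt header with huge sizes cannot wrap around and slip past the bounds checks.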

Re: What is the structure of a.out executable?

Posted: Tue Sep 29, 2020 10:53 am
by bzt
antoni wrote:ELF is difficult to implement.
No, it's not.
antoni wrote:To load a.out executable, you just need to copy text and data sections to memory and apply relocations.
No different from ELF. See here: simple C code that copies the text and data segments to memory in fewer than 10 SLoC.

The difficulty comes in when you start to implement relocations, but that's no different in a.out. You have to traverse relocation records and patch memory for both.

Cheers,
bzt
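
As an illustration of "traverse relocation records and patch memory", here is a minimal C sketch run against the text segment and the single relocation record from the hexdump at the top of the thread. It assumes the i386 a.out record layout: a 32-bit r_address followed by a 32-bit word with r_symbolnum in the low 24 bits, then r_pcrel (1 bit), r_length (2 bits) and r_extern (1 bit). Only the simplest case is handled (non-external, non-PC-relative, 32-bit), and all function and array names are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

uint32_t rd32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

void wr32le(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Add load_base to every 32-bit field named by an 8-byte relocation
 * record. Returns the number of records applied, or -1 on a record
 * this sketch does not handle or one that points outside the segment. */
int aout_apply_relocs(uint8_t *seg, uint32_t seg_len,
                      const uint8_t *relocs, uint32_t relocs_len,
                      uint32_t load_base)
{
    int applied = 0;
    uint32_t off;

    assert(seg != NULL && relocs != NULL);
    for (off = 0; relocs_len - off >= 8; off += 8) {
        uint32_t addr   = rd32le(relocs + off);      /* r_address */
        uint32_t info   = rd32le(relocs + off + 4);
        unsigned pcrel  = (info >> 24) & 1u;         /* r_pcrel */
        unsigned length = (info >> 25) & 3u;         /* r_length: 2 means 4 bytes */
        unsigned ext    = (info >> 27) & 1u;         /* r_extern */

        if (pcrel || ext || length != 2)
            return -1;
        if (addr > seg_len || seg_len - addr < 4)
            return -1;
        wr32le(seg + addr, rd32le(seg + addr) + load_base);
        applied++;
    }
    return applied;
}

/* Text segment of aout_test (file offsets 0x20..0x33). */
uint8_t aout_test_text[20] = {
    0xbf, 0x0d, 0x00, 0x00, 0x00,   /* mov edi, msg (msg = offset 0x0d) */
    0xb8, 0x16, 0x57, 0x12, 0x00,   /* mov eax, 0x125716 */
    0xff, 0xd0,                     /* call eax */
    0xc3,                           /* ret */
    'T', 'E', 'S', 'T', 10, 0,      /* msg */
    0x90                            /* padding byte */
};

/* Text relocation table of aout_test (file offsets 0x34..0x3b). */
const uint8_t aout_test_trel[8] = {
    0x01, 0x00, 0x00, 0x00,         /* r_address = 1 */
    0x04, 0x00, 0x00, 0x04          /* symbolnum 4, pcrel 0, length 2, extern 0 */
};
```

Applying the table with a load base of 0x100000 turns the operand of "mov edi, msg" from 0x0000000d into 0x0010000d, while the absolute constant 0x125716 is left untouched because no record points at it.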