issues with COFF files

Posted: Wed Jul 25, 2012 11:57 am
by ishkabible
first off, I'm using the MinGW toolchain, which includes binutils and gcc for Windows.

so I'm extracting information from relocatable (ld -r) PE binaries using objcopy, objdump, and nm, in order to create my own relocatable format that will hopefully suit my much simpler needs better. one of the sections that ends up in the list of relocations is the .eh_frame section. from what I've read it's used for exception handling, something C doesn't have.

Can I ignore it if I'm just using C and not C++? will I need it if I decide to support C++? what exactly is it used for?

Re: can I ignore .eh_frame?

Posted: Wed Jul 25, 2012 2:55 pm
by Owen
Exception handling. Also, other things built on exception handling on some OSes (pthread_cancel springs to mind). If you're using C and no exceptions, you can discard it.

If you want exception handling, you'll need an appropriate support library, either libgcc or libunwind; and a C++ runtime library, either libsupc++ (from GCC, LGPL) or libcxxrt (from PathScale, X11)

Re: can I ignore .eh_frame?

Posted: Wed Jul 25, 2012 4:27 pm
by ishkabible
awesome, thanks! I'm just going to discard it for now. that will simplify things even more.

Re: can I ignore .eh_frame?

Posted: Wed Jul 25, 2012 7:31 pm
by ishkabible
so, as I've been working on this, I've got a working COFF reader and I no longer use objdump/objcopy to get the information. I'm just reading the information from the file now and it all seems to be working.

I have a question about the symbol table however. certain names occur in the symbol table twice with different values. here's the symbol table printed
(index number, name, value)
0 .file 13
1 extern.c 0
2 _q 0
3 _foo 0
4 0
5 .text 0
6 ¶ 0
7 .data 0
8 0
9 .bss 0
10 ♦ 0
11 .eh_frame 0
12 8 0
13 .file 24
14 foo.c 0
15 _main 20
16 .text 20
17 ∟ 0
18 .data 0
19 0
20 .bss 4
21 0
22 .eh_frame 56
23 8 0
24 _p 16
25 ___main 0
26 0

.text, .data, .bss, .file, and .eh_frame all occur twice in the symbol table with different values. why do they occur twice?

Re: can I ignore .eh_frame?

Posted: Wed Jul 25, 2012 8:54 pm
by Owen
Look at the type and flags. COFF is a convoluted format.

Re: can I ignore .eh_frame?

Posted: Wed Jul 25, 2012 9:51 pm
by ishkabible
ok so .eh_frame is exactly the same except in value and .data is exactly the same. here's the list with all information in the symbol tables. in case it wasn't clear, auxiliary information is directly below the owning symbol

the format:
index name value
data from file in hex(including auxiliary data so if a symbol has auxiliary data, it will show up in this field)

0 .file 13
2e66696c65 0 0 0 d 0 0 0feff 0 067 165787465726e2e63 0 0 0 0 0 0 0 0 0 0
1 extern.c 0
65787465726e2e63 0 0 0 0 0 0 0 0 0 0
2 _q 0
5f71 0 0 0 0 0 0 0 0 0 0 4 0 0 0 3 0
3 _foo 0
5f666f6f 0 0 0 0 0 0 0 0 1 020 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
5 .text 0
2e74657874 0 0 0 0 0 0 0 1 0 0 0 3 114 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0
6 ¶ 0
14 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0
7 .data 0
2e64617461 0 0 0 0 0 0 0 2 0 0 0 3 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
8 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
9 .bss 0
2e627373 0 0 0 0 0 0 0 0 4 0 0 0 3 1 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
10 ♦ 0
4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11 .eh_frame 0
0 0 0 0 e 0 0 0 0 0 0 0 3 0 0 0 3 138 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
12 8 0
38 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
13 .file 24
2e66696c65 0 0 018 0 0 0feff 0 067 1666f6f2e63 0 0 0 0 0 0 0 0 0 0 0 0 0
14 foo.c 0
666f6f2e63 0 0 0 0 0 0 0 0 0 0 0 0 0
15 _main 20
5f6d61696e 0 0 014 0 0 0 1 020 0 2 0
16 .text 20
2e74657874 0 0 014 0 0 0 1 0 0 0 3 11c 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0
17 ∟ 0
1c 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0
18 .data 0
2e64617461 0 0 0 0 0 0 0 2 0 0 0 3 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
19 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
20 .bss 4
2e627373 0 0 0 0 4 0 0 0 4 0 0 0 3 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
21 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
22 .eh_frame 56
0 0 0 0 e 0 0 038 0 0 0 3 0 0 0 3 138 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
23 8 0
38 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
24 _p 16
5f70 0 0 0 0 0 010 0 0 0 0 0 0 0 2 0
25 ___main 0
5f5f5f6d61696e 0 0 0 0 0 0 020 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
26 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

here are the two '.data' entries. they are identical, even in value

7 .data 0
2e64617461 0 0 0 0 0 0 0 2 0 0 0 3 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
18 .data 0
2e64617461 0 0 0 0 0 0 0 2 0 0 0 3 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

but .eh_frame differs *only* in value, which concerns me. granted...I'm not going to use this but it still seems strange to me
11 .eh_frame 0
0 0 0 0 e 0 0 0 0 0 0 0 3 0 0 0 3 138 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
22 .eh_frame 56
0 0 0 0 e 0 0 038 0 0 0 3 0 0 0 3 138 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0

it seems that the value of the second entry is the length of the section it's referencing (56 = 0x38, the length stored in the auxiliary record), so that's my guess as to what it means: the first one is the beginning, the second one is the end.
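For reference, the 18-byte records in the dump above can be decoded with something like the following. This is only a minimal sketch based on the symbol record layout in the Microsoft PE/COFF specification (8-byte name, 4-byte value, 2-byte section number, 2-byte type, 1-byte storage class, 1-byte auxiliary count) and on the layout of the auxiliary record that follows a section symbol. The names CoffSymbol, CoffSectionAux, decode_symbol and decode_section_aux are made up for illustration, and little-endian input is assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* One 18-byte symbol table record (hypothetical in-memory form). */
typedef struct {
    char     name[9];        /* short name, NUL-terminated; "" if in string table */
    uint32_t strtab_offset;  /* nonzero only when the name lives in the string table */
    uint32_t value;
    int16_t  section;        /* 0 = no section yet, >0 = 1-based section index */
    uint16_t type;
    uint8_t  storage_class;
    uint8_t  aux_count;      /* number of 18-byte auxiliary records that follow */
} CoffSymbol;

/* Leading fields of the auxiliary record that follows a section symbol. */
typedef struct {
    uint32_t length;         /* size of the section data */
    uint16_t reloc_count;
    uint16_t lineno_count;
} CoffSectionAux;

static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }

static uint32_t rd32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

CoffSymbol decode_symbol(const uint8_t rec[18]) {
    CoffSymbol s;
    assert(rec != NULL);
    memset(&s, 0, sizeof s);
    if (rd32(rec) == 0) {
        /* First four bytes zero: the next four are a string table offset. */
        s.strtab_offset = rd32(rec + 4);
    } else {
        memcpy(s.name, rec, 8);  /* name[8] is already NUL from the memset */
    }
    s.value         = rd32(rec + 8);
    s.section       = (int16_t)rd16(rec + 12);
    s.type          = rd16(rec + 14);
    s.storage_class = rec[16];
    s.aux_count     = rec[17];
    return s;
}

CoffSectionAux decode_section_aux(const uint8_t rec[18]) {
    CoffSectionAux a;
    assert(rec != NULL);
    a.length       = rd32(rec);
    a.reloc_count  = rd16(rec + 4);
    a.lineno_count = rd16(rec + 6);
    return a;
}
```

Fed the bytes shown for entry 15, this gives the name "_main", value 20, section 1, type 0x20, storage class 2 and no auxiliary records. Fed the bytes of entry 6, decode_section_aux gives a length of 0x14 with 3 relocations. That would also explain the odd one-character "names" in the first listing: entries such as 6, 10 and 17 are auxiliary records, and their first byte (0x14, 0x04, 0x1c) is being printed as if it were a name.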

Re: can I ignore .eh_frame?

Posted: Fri Jul 27, 2012 12:51 pm
by ishkabible
ok so I can't figure out why uninitialized external variables have a section number of zero (meaning they are undefined).

here's the source file

int p[4];

static int q = 10;

int foo() {
	q = p[0];
	return q;
}

'p' has a section number of 0, which means it's undefined in COFF, but as I understand it, it should be defined in the first source file. do you have to initialize a variable to define it, or am I missing a piece of information that qualifies this as a defined symbol?

Re: can I ignore .eh_frame?

Posted: Fri Jul 27, 2012 1:04 pm
by Owen
Owen wrote: Look at the type and flags. COFF is a convoluted format.

Re: issues with COFF files

Posted: Fri Jul 27, 2012 1:14 pm
by ishkabible
I have but it ends up being the same as truly undefined symbols.

take "___main" and "_p"
  • both have a section number of 0 (meaning undefined as far as I know)
  • ___main's type is 32 (DT_FCN) (function with no specified return type)
  • _p's type is 0 (T_NULL) (no specified type)
  • both have a storage class of 2 (C_EXT), meaning they are external
  • neither has any auxiliary information
the only thing different is the type but I have yet to read anywhere that symbols that have a type of T_NULL are defined despite their section number.

that's all the information that the symbol table gives, am I supposed to look elsewhere?
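As a side note on the type values quoted above (32 for ___main, 0 for _p): the type field packs a base type in the low four bits and a derived type (pointer, function, array) in the next two, so 32 = 0x20 is "function" layered over a null base type. A small sketch of that split, assuming the usual COFF encoding. The constant and function names here are invented for the example.

```c
#include <stdint.h>

/* Hypothetical names; values follow the usual COFF type encoding. */
enum {
    SYM_BTMASK         = 0x0F,  /* low four bits: base type */
    SYM_TMASK          = 0x30,  /* next two bits: first derived type */
    SYM_BTSHIFT        = 4,
    SYM_DTYPE_FUNCTION = 2      /* derived type code for "function" */
};

/* Base type code (0 means no type information). */
unsigned sym_base_type(uint16_t type) {
    return type & SYM_BTMASK;
}

/* Nonzero when the first derived type is "function". */
int sym_is_function(uint16_t type) {
    return (type & SYM_TMASK) == (SYM_DTYPE_FUNCTION << SYM_BTSHIFT);
}
```

With this, sym_is_function(32) is true and sym_is_function(0) is false, so the type field only separates functions from everything else here. It says nothing about whether a symbol is defined.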

the only flags I'm aware of don't seem to relate to this.
here are the flags I read about, from the text on this page: http://www.delorie.com/djgpp/doc/coff/filhdr.html
0x0001 F_RELFLG If set, there is no relocation information in this file. This is usually clear for objects and set for executables.
0x0002 F_EXEC If set, all unresolved symbols have been resolved and the file may be considered executable.
0x0004 F_LNNO If set, all line number information has been removed from the file (or was never added in the first place).
0x0008 F_LSYMS If set, all the local symbols have been removed from the file (or were never added in the first place).
0x0100 F_AR32WR Indicates that the file is 32-bit little endian
none of those seem to relate to this.

Re: issues with COFF files

Posted: Fri Jul 27, 2012 3:48 pm
by Owen
You are reading the DJGPP COFF specification, yet you are working with MinGW (i.e. Microsoft PE-COFF) files.

In particular, if you had read the Microsoft PE/COFF specification, you would understand that p in the above post is a COMMON symbol.

Re: issues with COFF files

Posted: Fri Jul 27, 2012 4:02 pm
by ishkabible
ok, the MS PE/COFF spec was a MUCH BETTER source of information, it cleared things up quite a bit. I found the following
The symbol record is not yet assigned a section. A value of zero indicates that a reference to an external symbol is defined elsewhere. A value of non-zero is a common symbol with a size that is specified by the value.
...
A value that Microsoft tools use for external symbols. The Value field indicates the size if the section number is IMAGE_SYM_UNDEFINED (0). If the section number is not zero, then the Value field specifies the offset within the section.
so I should choose where all common variables go? I could allocate a section just for common variables and link with them as if they belonged to that section?
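One way to act on the two quoted passages is sketched below: an external symbol with section number 0 is treated as undefined when its value is 0 and as common when its value is nonzero, and every common symbol is then given an offset in a private zero-filled area. This is only a sketch under assumptions. Sym, SymKind, classify and layout_commons are hypothetical names, and the alignment rule (largest power of two not exceeding the size, capped at 16) is a guess and not something the quoted text specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { STORAGE_EXTERNAL = 2 };  /* storage class 2 (C_EXT), as in the dump */

typedef enum { SYM_DEFINED, SYM_UNDEFINED, SYM_COMMON, SYM_OTHER } SymKind;

typedef struct {
    uint32_t value;          /* offset in section, or size for a common symbol */
    int16_t  section;        /* 0 = not yet assigned a section */
    uint8_t  storage_class;
    uint32_t common_offset;  /* filled in by layout_commons for commons */
} Sym;

SymKind classify(const Sym *s) {
    assert(s != NULL);
    if (s->section > 0)
        return SYM_DEFINED;
    if (s->section == 0 && s->storage_class == STORAGE_EXTERNAL)
        return s->value == 0 ? SYM_UNDEFINED : SYM_COMMON;
    return SYM_OTHER;        /* negative section numbers, other classes */
}

/* Assign each common symbol an offset; returns the total size needed. */
uint32_t layout_commons(Sym *syms, size_t n) {
    uint32_t offset = 0;
    for (size_t i = 0; i < n; i++) {
        if (classify(&syms[i]) != SYM_COMMON)
            continue;
        uint32_t size = syms[i].value;
        /* Assumed alignment rule, see the note above the code. */
        uint32_t align = size >= 16 ? 16 : size >= 8 ? 8 :
                         size >= 4  ? 4  : size >= 2 ? 2 : 1;
        offset = (offset + align - 1) & ~(align - 1);
        syms[i].common_offset = offset;
        offset += size;
    }
    return offset;
}
```

With the table above, _p (section 0, value 16, class 2) comes out as common with size 16, while ___main (section 0, value 0, class 2) stays undefined. If the same common symbol shows up in several input files, a linker normally keeps a single copy with the largest size, so sizes would need merging by name before this layout step.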