issues with COFF files
- ishkabible
- Member
- Posts: 37
- Joined: Wed Jan 05, 2011 7:35 pm
issues with COFF files
First off, I'm using the MinGW toolchain, which includes binutils and GCC for Windows.
I'm extracting information from relocatable (ld -r) PE binaries with objcopy, objdump, and nm to create my own relocatable format that will hopefully suit my much simpler needs better. One of the sections that ends up in the list of relocations is the .eh_frame section. From what I've read, it's used for exception handling, something C doesn't have.
Can I ignore it if I'm just using C and not C++? Will I need it if I decide to support C++? What exactly is it used for?
Last edited by ishkabible on Fri Jul 27, 2012 12:56 pm, edited 2 times in total.
- Owen
- Member
- Posts: 1700
- Joined: Fri Jun 13, 2008 3:21 pm
- Location: Cambridge, United Kingdom
Re: can I ignore .eh_frame?
Exception handling. Also, other things built on exception handling on some OSes (pthread_cancel springs to mind). If you're using C and no exceptions, you can discard it.
If you want exception handling, you'll need an appropriate support library, either libgcc or libunwind; and a C++ runtime library, either libsupc++ (from GCC, LGPL) or libcxxrt (from PathScale, X11)
- ishkabible
- Member
- Posts: 37
- Joined: Wed Jan 05, 2011 7:35 pm
Re: can I ignore .eh_frame?
awesome, thanks! I'm just going to discard it for now. that will simplify things even more.
- ishkabible
- Member
- Posts: 37
- Joined: Wed Jan 05, 2011 7:35 pm
Re: can I ignore .eh_frame?
so, as I've been working on this, I've got a working COFF reader and I no longer use objdump/objcopy to get the information. I'm just reading the information from the file now and it all seems to be working.
I have a question about the symbol table however. certain names occur in the symbol table twice with different values. here's the symbol table printed
(index number, name, value)
0 .file 13
1 extern.c 0
2 _q 0
3 _foo 0
4 0
5 .text 0
6 ¶ 0
7 .data 0
8 0
9 .bss 0
10 ♦ 0
11 .eh_frame 0
12 8 0
13 .file 24
14 foo.c 0
15 _main 20
16 .text 20
17 ∟ 0
18 .data 0
19 0
20 .bss 4
21 0
22 .eh_frame 56
23 8 0
24 _p 16
25 ___main 0
26 0
.text, .data, .bss, .file, and .eh_frame all occur twice in the symbol table with different values. why do they occur twice?
- Owen
- Member
- Posts: 1700
- Joined: Fri Jun 13, 2008 3:21 pm
- Location: Cambridge, United Kingdom
Re: can I ignore .eh_frame?
Look at the type and flags. COFF is a convoluted format.
- ishkabible
- Member
- Posts: 37
- Joined: Wed Jan 05, 2011 7:35 pm
Re: can I ignore .eh_frame?
ok so .eh_frame is exactly the same except in value and .data is exactly the same. here's the list with all information in the symbol tables. in case it wasn't clear, auxiliary information is directly below the owning symbol
the format:
index name value
data from file in hex(including auxiliary data so if a symbol has auxiliary data, it will show up in this field)
0 .file 13
2e66696c65 0 0 0 d 0 0 0feff 0 067 165787465726e2e63 0 0 0 0 0 0 0 0 0 0
1 extern.c 0
65787465726e2e63 0 0 0 0 0 0 0 0 0 0
2 _q 0
5f71 0 0 0 0 0 0 0 0 0 0 4 0 0 0 3 0
3 _foo 0
5f666f6f 0 0 0 0 0 0 0 0 1 020 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
5 .text 0
2e74657874 0 0 0 0 0 0 0 1 0 0 0 3 114 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0
6 ¶ 0
14 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0
7 .data 0
2e64617461 0 0 0 0 0 0 0 2 0 0 0 3 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
8 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
9 .bss 0
2e627373 0 0 0 0 0 0 0 0 4 0 0 0 3 1 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
10 ♦ 0
4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11 .eh_frame 0
0 0 0 0 e 0 0 0 0 0 0 0 3 0 0 0 3 138 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
12 8 0
38 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
13 .file 24
2e66696c65 0 0 018 0 0 0feff 0 067 1666f6f2e63 0 0 0 0 0 0 0 0 0 0 0 0 0
14 foo.c 0
666f6f2e63 0 0 0 0 0 0 0 0 0 0 0 0 0
15 _main 20
5f6d61696e 0 0 014 0 0 0 1 020 0 2 0
16 .text 20
2e74657874 0 0 014 0 0 0 1 0 0 0 3 11c 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0
17 ∟ 0
1c 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0
18 .data 0
2e64617461 0 0 0 0 0 0 0 2 0 0 0 3 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
19 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
20 .bss 4
2e627373 0 0 0 0 4 0 0 0 4 0 0 0 3 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
21 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
22 .eh_frame 56
0 0 0 0 e 0 0 038 0 0 0 3 0 0 0 3 138 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
23 8 0
38 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
24 _p 16
5f70 0 0 0 0 0 010 0 0 0 0 0 0 0 2 0
25 ___main 0
5f5f5f6d61696e 0 0 0 0 0 0 020 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
26 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
here are the two '.data' entries. they are identical, even in value
7 .data 0
2e64617461 0 0 0 0 0 0 0 2 0 0 0 3 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
18 .data 0
2e64617461 0 0 0 0 0 0 0 2 0 0 0 3 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
but .eh_frame differs *only* in value, which concerns me. granted...I'm not going to use this but it still seems strange to me
11 .eh_frame 0
0 0 0 0 e 0 0 0 0 0 0 0 3 0 0 0 3 138 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
22 .eh_frame 56
0 0 0 0 e 0 0 038 0 0 0 3 0 0 0 3 138 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
It seems that the value of the second entry is the length of the section it's referencing, so that's my guess as to what it means: the first one is the beginning and the second one is the end.
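To make the hex dumps above easier to read, here is a minimal C sketch of decoding one 18-byte symbol record. It assumes the PE/COFF layout (8-byte name, 4-byte value, 2-byte section number, 2-byte type, 1-byte storage class, 1-byte auxiliary-record count, all little-endian). The names coff_sym, decode_sym, rd16 and rd32 are made up for illustration and are not taken from the reader discussed in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* One decoded symbol record. Hypothetical struct for this sketch. */
struct coff_sym {
    char     name[9];       /* inline short name, NUL-terminated; empty if long */
    uint32_t strtab_off;    /* string-table offset when the name is long, else 0 */
    uint32_t value;
    int16_t  section;       /* 0 = no section assigned, 1.. = section index */
    uint16_t type;
    uint8_t  storage_class;
    uint8_t  num_aux;       /* count of 18-byte auxiliary records that follow */
};

/* Little-endian readers, so the sketch does not depend on host byte order. */
static uint32_t rd32(const uint8_t *p) {
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

static uint16_t rd16(const uint8_t *p) {
    return (uint16_t)(p[0] | p[1] << 8);
}

struct coff_sym decode_sym(const uint8_t *rec) {
    struct coff_sym s;
    assert(rec != NULL);
    memset(&s, 0, sizeof s);
    if (rd32(rec) == 0)
        s.strtab_off = rd32(rec + 4);  /* first four name bytes zero: long name */
    else
        memcpy(s.name, rec, 8);        /* short name, padded with NULs */
    s.value         = rd32(rec + 8);
    s.section       = (int16_t)rd16(rec + 12);
    s.type          = rd16(rec + 14);
    s.storage_class = rec[16];
    s.num_aux       = rec[17];
    return s;
}
```

Fed the record at index 16 (2e 74 65 78 74 00 00 00 14 00 00 00 01 00 00 00 03 01), this gives the name .text, value 20, section 1, storage class 3 and one auxiliary record. The .eh_frame records start with four zero bytes, so their name would come from the string table at offset 0xe instead of from the record itself.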
- ishkabible
- Member
- Posts: 37
- Joined: Wed Jan 05, 2011 7:35 pm
Re: can I ignore .eh_frame?
OK, so I can't figure out why uninitialized external variables have a section number of zero (meaning they are undefined).
here's the source file
int p[4];
static int q = 10;
int foo() {
    q = p[0];
    return q;
}
'p' has a section number of 0, which means it's undefined in COFF, but as I understand it, it should be defined in the first source file. Do you have to initialize a variable to define it, or am I missing a piece of information that qualifies this as a defined symbol?
- Owen
- Member
- Posts: 1700
- Joined: Fri Jun 13, 2008 3:21 pm
- Location: Cambridge, United Kingdom
Re: can I ignore .eh_frame?
Owen wrote: Look at the type and flags. COFF is a convoluted format.
- ishkabible
- Member
- Posts: 37
- Joined: Wed Jan 05, 2011 7:35 pm
Re: issues with COFF files
I have but it ends up being the same as truly undefined symbols.
Take "___main" and "_p":
- both have a section number of 0 (meaning undefined, as far as I know)
- ___main's type is 32 (DT_FCN, a function with no specified return type)
- _p's type is 0 (T_NULL, no specified type)
- both have a storage class of 2 (C_EXT), meaning they are external
- neither has any auxiliary information
That's all the information that the symbol table gives. Am I supposed to look elsewhere?
The only flags I'm aware of don't seem to relate to this.
Here are the flags I read about, from the text on this page: http://www.delorie.com/djgpp/doc/coff/filhdr.html
0x0001 F_RELFLG If set, there is no relocation information in this file. This is usually clear for objects and set for executables.
0x0002 F_EXEC If set, all unresolved symbols have been resolved and the file may be considered executable.
0x0004 F_LNNO If set, all line number information has been removed from the file (or was never added in the first place).
0x0008 F_LSYMS If set, all the local symbols have been removed from the file (or were never added in the first place).
0x0100 F_AR32WR Indicates that the file is 32-bit little endian.
None of those seem to relate to this.
- Owen
- Member
- Posts: 1700
- Joined: Fri Jun 13, 2008 3:21 pm
- Location: Cambridge, United Kingdom
Re: issues with COFF files
You are reading the DJGPP COFF specification, yet you are working with MinGW (i.e. Microsoft PE-COFF) files.
In particular, if you had read the Microsoft PE/COFF specification, you would understand that p in the above post is a COMMON symbol.
- ishkabible
- Member
- Posts: 37
- Joined: Wed Jan 05, 2011 7:35 pm
Re: issues with COFF files
OK, the MS PE/COFF specification was a MUCH BETTER source of information; it cleared things up quite a bit. I found the following:
"The symbol record is not yet assigned a section. A value of zero indicates that a reference to an external symbol is defined elsewhere. A value of non-zero is a common symbol with a size that is specified by the value."
...
"A value that Microsoft tools use for external symbols. The Value field indicates the size if the section number is IMAGE_SYM_UNDEFINED (0). If the section number is not zero, then the Value field specifies the offset within the section."
So I should choose where all common variables go? I could allocate a section just for common variables and link with them as if they belonged to that section?
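The rule quoted from the specification can be sketched in C roughly as follows. This is only an illustration under stated assumptions: sym_kind, classify and alloc_common are hypothetical names, and the alignment passed to alloc_common is the caller's choice, since the symbol record for a common symbol carries only its size.

```c
#include <assert.h>
#include <stdint.h>

/* IMAGE_SYM_CLASS_EXTERNAL has the value 2 in the PE/COFF specification. */
#define CLASS_EXTERNAL 2

/* Hypothetical classification used only in this sketch. */
enum sym_kind { SYM_DEFINED, SYM_UNDEFINED, SYM_COMMON, SYM_OTHER };

enum sym_kind classify(int16_t section, uint32_t value, uint8_t storage_class) {
    if (storage_class != CLASS_EXTERNAL)
        return SYM_OTHER;                 /* static, file, section symbols, ... */
    if (section > 0)
        return SYM_DEFINED;               /* value is an offset in that section */
    if (section == 0)                     /* IMAGE_SYM_UNDEFINED */
        return value == 0 ? SYM_UNDEFINED /* defined in some other file */
                          : SYM_COMMON;   /* value is the size in bytes */
    return SYM_OTHER;                     /* negative section numbers */
}

/* Reserve `size` bytes in a linker-owned block for one common symbol and
   return its offset. `align` must be a power of two (an assumption here). */
uint32_t alloc_common(uint32_t *block_size, uint32_t size, uint32_t align) {
    uint32_t off;
    assert(align != 0 && (align & (align - 1)) == 0);
    off = (*block_size + align - 1) & ~(align - 1);
    *block_size = off + size;
    return off;
}
```

With the records from the earlier dump, _p (section 0, value 16, class 2) classifies as common with a size of 16 bytes, and ___main (section 0, value 0, class 2) as undefined. A loader along these lines could give each common symbol an offset from alloc_common and then treat the resulting block like one more zero-initialized section. Note that several object files may declare the same common symbol, so entries would need to be merged by name before space is reserved.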