
Re: How to compile a flat position-independent binary with G

Posted: Tue Jul 19, 2016 5:59 am
by tsdnz
onlyonemac wrote: I'm a bit confused about the linkscript though. Would this linkscript reserve space in the final binary for the global offset table, or would the global offset table "run over" the end of the binary into unallocated memory?

Code:

OUTPUT_FORMAT("elf32-i386")
OUTPUT_ARCH(i386)
ENTRY(start)
SECTIONS
{
	. = 0x00000000;
	.text : { *(.text) }
	.rodata : { *(.rodata) }
	. = ALIGN(4096);
	.data : { *(.data) }
	. = ALIGN(4096);
	.bss : { *(.bss) }
	. = ALIGN(4096);
	_GLOBAL_OFFSET_TABLE_ = .;
	. = ALIGN(4096);
}
Hi, I just put mine at 0x5000 for both 32-bit and 64-bit. That was easy to set up in the secondary loader, and then for the kernel to set up the TSS; mine extends to 0x6038 to leave room for plenty of TSS entries. It looks like your script will put it at the end. The size will depend on the variable you attach to it, and on whether you define it as an initialized global variable (with no NOLOAD attached). I am no expert on linker scripts, though. I set mine up slightly differently: I would use .GDT ALIGN(0x1000) : { *(.GDT) }. You can test it in code by printing the address to the screen.

Re: How to compile a flat position-independent binary with G

Posted: Tue Jul 19, 2016 6:03 am
by linuxyne
onlyonemac wrote: Would this linkscript reserve space in the final binary for the global offset table, or would the global offset table "run over" the end of the binary into unallocated memory?
I do not think that the elf32-i386 format would truncate the binary such that the GOT is cut off. But see below for "binary".


I think the difference seen earlier could be a bug in linking when using OUTPUT_FORMAT("binary").

The binary created using that format has an empty GOT, i.e. the GOT sits exactly at the offset equal to the size of the file.
So, if the binary is 0x106c bytes long, the GOT is at offset 0x106c and is therefore necessarily empty.

The bug is in the way the R_386_GOTOFF relocation from the object file is handled.
Below is the relevant disassembly.

Code:

00000000 <.data>:
       0:       55                      push   %ebp
       1:       89 e5                   mov    %esp,%ebp
       3:       83 ec 10                sub    $0x10,%esp
       6:       e8 f5 0f 00 00          call   0x1000
       b:       05 61 10 00 00          add    $0x1061,%eax
      10:       c7 45 fc 00 00 00 00    movl   $0x0,-0x4(%ebp)
      17:       81 7d 08 be 00 00 00    cmpl   $0xbe,0x8(%ebp)
      1e:       0f 87 d3 00 00 00       ja     0xf7
      24:       8b 55 08                mov    0x8(%ebp),%edx
      27:       c1 e2 02                shl    $0x2,%edx
      2a:       8b 94 02 1c 01 00 00    mov    0x11c(%edx,%eax,1),%edx
      31:       01 d0                   add    %edx,%eax
      33:       ff e0                   jmp    *%eax
     . . .
     11a:       eb fe                   jmp    0x11a
     11c:       35 00 00 00 f7          xor    $0xf7000000,%eax

     . . .
     147:       00 41 00                add    %al,0x0(%ecx)
     14a:       00 00                   add    %al,(%eax)
Based on the way the jump table is laid out, and the address of the (empty) GOT, the line

Code:

      2a:       8b 94 02 1c 01 00 00    mov    0x11c(%edx,%eax,1),%edx
should actually be

Code:

      2a:       8b 94 02 b0 f0 ff ff    mov    -0xf50(%edx,%eax,1),%edx
0x11c is the base of the jump table (which can be considered a local symbol). It comes right after our infinite loop at 0x11a.
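
The numbers in the two versions of that instruction can be cross-checked with a small Python sketch. It assumes, as the rest of this post does, that the call at 0x6 is a get-PC thunk that leaves its return address (0xb) in eax, so that the following add yields the GOT address. The function names are made up for illustration.

```python
import struct

# Values read from the disassembly above.
CALL_RETURN_ADDR = 0x0B   # address of the instruction following `call 0x1000`
ADD_IMMEDIATE = 0x1061    # immediate operand of `add $0x1061,%eax`


def got_address(return_addr: int, add_imm: int) -> int:
    """GOT address left in eax (assumption: the thunk returns its return address)."""
    return (return_addr + add_imm) & 0xFFFFFFFF


def disp32_bytes(disp: int) -> str:
    """Encode a signed 32-bit displacement as little-endian hex bytes."""
    return struct.pack("<i", disp).hex(" ")


got = got_address(CALL_RETURN_ADDR, ADD_IMMEDIATE)
print(hex(got))                # 0x106c, the size of the file
print(disp32_bytes(0x11C))     # 1c 01 00 00, as emitted by the linker
print(disp32_bytes(-0xF50))    # b0 f0 ff ff, as expected
```

The last two lines match the displacement bytes in the emitted and the expected encodings of the instruction at 0x2a.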

About R_386_GOTOFF:
determine the distance from the GLOBAL_OFFSET_TABLE to the (local) "symbol"; store said distance in the dword at this location; create an entry in the GOT[]; change this reloc into a R_386_RELATIVE and point it at the GOT[] entry
The distance of the jump table from the GOT, as calculated in the working binary, is

Code:

= symbol_addr - got_addr
= 0x11c - 0x106c
= -0xf50
Instead, the linker seems to have used 0 as got_addr when resolving the R_386_GOTOFF relocation from the object file.

Code:

= symbol_addr - got_addr
= 0x11c - 0
= 0x11c
Thus, the resulting binary does compute the address of a non-existent GOT (eax contains the address of the GOT), but
for some relocations the linker used 0 as the GOT address instead of 0x106c. This is inconsistent and possibly a bug.
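
To make the consequence concrete, here is a rough Python sketch of the address that the load at 0x2a ends up reading, for both displacement values. It uses the i386 psABI formula for R_386_GOTOFF (S + A - GOT) and assumes that eax holds the runtime GOT address. The load base and the helper names are hypothetical, chosen only for illustration.

```python
FILE_SIZE = 0x106C    # size of the flat binary, from the post
GOT_ADDR = 0x106C     # link-time GOT address (end of the file)
JUMP_TABLE = 0x11C    # link-time address of the jump table


def gotoff(symbol_addr: int, got_addr: int, addend: int = 0) -> int:
    """R_386_GOTOFF as defined by the i386 psABI: S + A - GOT."""
    return symbol_addr + addend - got_addr


def table_entry_address(load_base: int, disp: int, index: int) -> int:
    """Address read by `mov disp(%edx,%eax,1),%edx` after `shl $0x2,%edx`."""
    eax = load_base + GOT_ADDR    # runtime GOT address (assumed to be in eax)
    edx = index * 4               # case index scaled to a dword offset
    return (eax + edx + disp) & 0xFFFFFFFF


good_disp = gotoff(JUMP_TABLE, GOT_ADDR)   # -0xf50
bad_disp = gotoff(JUMP_TABLE, 0)           # 0x11c, GOT treated as 0

load_base = 0x10000                        # hypothetical load address
good = table_entry_address(load_base, good_disp, 0)
bad = table_entry_address(load_base, bad_disp, 0)

print(hex(good - load_base))               # 0x11c: the jump table itself
print(hex(bad - load_base))                # 0x1188: past the end of the file
print(bad - load_base >= FILE_SIZE)        # True
```

With the displacement the linker emitted, the first table lookup lands 0x11c bytes past the end of the 0x106c-byte image, whatever the load address is.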

I am writing to the binutils group to get more clarity.

Edit0: objcopy seems to be the way to go, instead of setting 'binary' directly (or even using output format switches). See here.

Edit1: BFD Information Loss