Relocations and executable format

johnsa · Post by **johnsa** » Sun May 17, 2009 6:27 am

Hey,

Windows DLLs also have a standard entry point, so it is possible to run something once when the library is loaded. In fact it gets called under various conditions (when a process attaches, when the library loads, etc.), so you can use it for that sort of thing.

Back to my original question though: for some reason I don't get ANY sections, only segments with no relocation information in them. So, assuming I load these ELF64 exe files the way they've been produced, I wouldn't be able to do patches without manually scanning the code and looking for them, which is no different from a flat image. I must be missing something?

Combuster · Post by **Combuster** » Sun May 17, 2009 7:14 am

Are you telling ld to link everything to an executable file that keeps its relocations (ld -q, i.e. --emit-relocs), rather than to a relocatable object file (ld -i)?

johnsa · Post by **johnsa** » Mon May 18, 2009 2:42 am

I'm not using LD, just straight FASM with the format ELF64 executable directive.

johnsa · Post by **johnsa** » Mon May 18, 2009 6:05 am

The segments that FASM's ELF64 executable output generates are linked to work at one specific address only, so the executable isn't relocatable at all? As in, I could say:
format ELF64 executable at 0xffff0000
... and then use paging to make that the start address of user space for the exe image, but that's not relocatable in the sense that I need. I'm assuming no preferred load address.

bontanu · Post by **bontanu** » Mon May 18, 2009 11:12 am

Do not make it an executable directly from inside FASM, because that way you might lose the relocation information.

Instead make it an ELF64 OBJ and then either use LD or another linker to link it at your desired address, OR code your own loader/linker that can read and understand the relocations stored inside the ELF OBJ file at run time and apply them for your specific run-time load address.
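
To make the "code your own loader" option more concrete, here is a minimal sketch in Python of applying ELF64 relocation entries to one loaded section. The record layout (`r_offset`, `r_info`, `r_addend`) and the three type numbers come from the x86-64 ELF ABI; the function name `apply_rela`, the `sym_addr` lookup table and the idea of handling only these three types are assumptions made for illustration, not what FASM or any particular loader does.

```python
import struct

# Relocation type numbers from the System V x86-64 ABI.
R_X86_64_64 = 1     # 64-bit absolute:      S + A
R_X86_64_PC32 = 2   # 32-bit PC-relative:   S + A - P
R_X86_64_32 = 10    # 32-bit absolute:      S + A (must fit in 32 bits)

def apply_rela(image, rela, load_addr, sym_addr):
    """Patch `image` (a bytearray) using raw Elf64_Rela records.

    load_addr -- run-time address of image[0] (hypothetical loader input)
    sym_addr  -- dict: symbol index -> resolved run-time address (hypothetical)
    """
    # Each Elf64_Rela record is 24 bytes: r_offset, r_info, r_addend.
    for r_offset, r_info, r_addend in struct.iter_unpack("<QQq", rela):
        sym, rtype = r_info >> 32, r_info & 0xFFFFFFFF
        s = sym_addr[sym]            # S: address of the referenced symbol
        p = load_addr + r_offset     # P: address of the field being patched
        if rtype == R_X86_64_64:
            struct.pack_into("<Q", image, r_offset, (s + r_addend) & (2**64 - 1))
        elif rtype == R_X86_64_32:
            value = s + r_addend
            if not 0 <= value < 2**32:
                raise OverflowError("address does not fit in a 32-bit field")
            struct.pack_into("<I", image, r_offset, value)
        elif rtype == R_X86_64_PC32:
            value = s + r_addend - p
            if not -2**31 <= value < 2**31:
                raise OverflowError("target out of PC-relative range")
            struct.pack_into("<i", image, r_offset, value)
        else:
            raise NotImplementedError("relocation type %d" % rtype)
    return image
```

A real loader would also have to read the section headers and the symbol table to build `sym_addr`; that part is left out here.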

Alternatively, PE DLLs (and even plain PE executables) can have relocations stored inside the executable format, exactly for this purpose of loading at another address at run time.

Alternatively, search for an option in FASM that can add relocations to an ELF binary executable ("relocations ?").

Dex · Post by **Dex** » Mon May 18, 2009 11:47 am

Or you could do as I do: use a FASM COFF object and something like coff2rel to spit out a flat bin plus the relocation info.
See the program at the end of this post.

johnsa · Post by **johnsa** » Tue May 19, 2009 5:52 am

Dex, that's pretty cool, thanks!
For some reason no ELF exec I generate has relocation info in it. I tried a couple of different ways in FASM to get it to generate .rodata or a GOT or anything else, and I just don't get them.

I end up with an exe that could only ever work at the specified address, which isn't helpful at all. I'll try the coff2rel method too, but I really should be able to get an ELF exec with relocation info. I mean, what happens for everyone else when they can't load the exec at the default address?

bontanu · Post by **bontanu** » Tue May 19, 2009 9:37 am

johnsa wrote: ... but i really should be able to get an elf exec with relocation info... i mean what happens for everyone else when they can't load the exec to the default address?

In Windows and Linux this never happens, because the OS uses paging to give each application its own fixed address space. Hence applications are "tricked" and almost always start at the very same address. This removes the need for applications to contain relocations.

Only DLLs in Windows and .so files in Linux have this problem, because they might get loaded in a different order.

Because of this, Windows DLLs have relocations stored inside.

AFAIK .so files use a GOT and PLT and position-independent code to solve this problem, and because of this they might NOT have relocations stored inside.

However, COFF/PE or ELF OBJ files must always have relocations, because the linker needs them in order to generate the final executable.

Dex · Post by **Dex** » Tue May 19, 2009 11:13 am

Maybe this will help:

Tomasz Grysztar wrote: "format ELF executable" is not an object, is an executable - and it doesn't contains relocations.
Use pure "format ELF" to get relocatable object.

Combuster · Post by **Combuster** » Tue May 19, 2009 2:41 pm

"format ELF executable" is not an object, is an executable - and it doesn't contains relocations.
Use pure "format ELF" to get relocatable object.

Which is not completely true.

There are two "kinds" of ELF files - object files and executable files. Both can have relocation info, but that is not a necessity. By default, you tell the compiler to emit object files, and due to the nature of C, those files will usually contain relocations.

The difference is that object files cannot be run. They are not checked against dependencies, and have no runtime information. Executable files are complete in the sense that they are pretty much guaranteed to have all the information that is necessary. That means they contain, at minimum, the binary data and an entry point. Anything extra is not strictly necessary and has to be specifically enabled. That includes relocations.

The advantage of executable files is that they can run at the location they were initially linked for (the relocation info is thus only necessary if you want the application to run at some address other than the default).

johnsa · Post by **johnsa** » Wed May 20, 2009 1:25 am

I think it should definitely be an executable file and not an object that I use. Of course, the other thing to bear in mind is that a large program could have thousands of variables, which would make the necessary relocation info quite large on disk.
If you use paging this whole problem sort of falls away, but even so I'd like to be able to load an executable at any address I want and run it.
I'm going to have a quick look at the old Amiga executable format and see what it did. There was no paging on that machine as far as I remember, so it must be either PIC or a relocatable exe.

johnsa · Post by **johnsa** » Wed May 20, 2009 2:07 am

No luck with the Amiga hunk format... I see 3 options here:

#1 Build a custom exec file that has the relocation/fixup info in it. This isn't ideal, because there could be thousands or millions of fixups if they are stored in an easy-to-process format, i.e. for every instruction in the code that needs to be patched, store its address relative to the beginning of the code segment.

#2 Use ONLY PIC. Too restrictive in coding style, whether done via LEA or with a base address register.

#3 Create an OS image loader that is basically an assembler and can literally scan the code to determine WHICH opcodes need patching and do so. Possibly: load the image into memory, expand the reserved areas, and obtain the full in-memory image size; any address that falls in that range is considered a program address and is patched. This approach has the benefits of a small executable and an easy load, but the patching process is very complex.
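
As a rough illustration of the range test in option #3, the Python sketch below scans a flat image for 32-bit values that fall inside the image's linked address range. The function name `guess_fixups` and the byte-by-byte scan are assumptions for the example; a real implementation would have to decode instructions, as the post says.

```python
import struct

def guess_fixups(image, link_base, image_size=None):
    """Return offsets of dwords whose value lies inside the linked image range.

    This is only a heuristic: it cannot tell an address from a constant
    that happens to have the same value.
    """
    size = len(image) if image_size is None else image_size
    hits = []
    for off in range(len(image) - 3):
        (value,) = struct.unpack_from("<I", image, off)
        if link_base <= value < link_base + size:
            hits.append(off)
    return hits

# Image linked at 0x1000: "mov eax,[0x1004]" followed by a data dword 0x1005.
image = bytes([0xA1]) + struct.pack("<I", 0x1004) + struct.pack("<I", 0x1005)
print(guess_fixups(image, 0x1000))   # prints [1, 5]
```

Offset 1 is a genuine address operand, but offset 5 is reported as well even if the programmer meant 0x1005 as a plain number. That ambiguity is the main weakness of scanning compared with an explicit fixup list.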

In the end I need something that will work without paging being assumed present.

johnsa · Post by **johnsa** » Thu May 21, 2009 8:21 am

Dex, just having a look at your coff2rel app. It seems to do the trick. I'm assuming the dword values in the .REL file are the offsets of the dwords that need to have the program load address/base added to them?
So in the example, 0x1 and 0x6 are the offsets of the two dwords that must be patched by adding 0x1000 (if the program is loaded at 0x1000)?
I guess this REL file could become pretty huge for a large program. Also, is there a 64-bit version of coff2rel? Otherwise I'll use it as an example to build one myself.

Thanks
John

Dex · Post by **Dex** » Thu May 21, 2009 2:54 pm

Hi, here is an example of its use (but yes, you are right about how you use it):

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; parse header.                                                                    ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
        mov   esi,dword[TempRelocLoadAddrVar]    ; the temp load address of rel file
        mov   edx,[ModuleLoadAddress]                  ; the file load address
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Loop and add the new load address offset.                                        ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
.relloop:
        lodsd                    ; load an offset from the rel file
        cmp   eax,0xFFFFFFFF     ; end-of-list marker?
        jz    .donereloc
        mov   ebx,eax            ; no, so keep the offset in ebx
        mov   eax,[ds:edx+ebx]   ; get the dword to patch in the binary image
        add   eax,edx            ; add the load address to it (i.e. fix it up)
        mov   [ds:edx+ebx],eax   ; store it back
        jmp   .relloop           ; repeat until end of relocations
.donereloc:
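
For readers who want to experiment outside the OS, the loop above can be mirrored in Python. This is only a sketch: the 0xFFFFFFFF terminator and the "add the load address to the dword at each offset" step come from the assembly, while the function name `apply_rel` and the tiny test image (built around the 0x1 / 0x6 / 0x1000 example from the question) are made up for illustration.

```python
import struct

END_OF_RELOCS = 0xFFFFFFFF  # end-of-list marker tested by the assembly loop

def apply_rel(image, rel, load_addr):
    """Add load_addr to every dword of `image` listed in the .REL data."""
    for (off,) in struct.iter_unpack("<I", rel):
        if off == END_OF_RELOCS:
            break
        (value,) = struct.unpack_from("<I", image, off)
        # Wrap to 32 bits, like "add eax,edx" does.
        struct.pack_into("<I", image, off, (value + load_addr) & 0xFFFFFFFF)
    return image

# Flat image assembled for base 0: "mov eax,[0xA]" / "mov [0xA],eax" / dd 0
image = bytearray(b"\xA1" + struct.pack("<I", 0xA) +
                  b"\xA3" + struct.pack("<I", 0xA) + bytes(4))
rel = struct.pack("<3I", 0x1, 0x6, END_OF_RELOCS)
apply_rel(image, rel, 0x1000)
print(hex(struct.unpack_from("<I", image, 1)[0]))   # prints 0x100a
```

Each fixup costs four bytes in this format, so the size of the .REL data is simply four times the number of absolute references plus the terminator.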

As for big files: no, I do not find that it makes the file too big. For me it can make them much smaller, as I use a modified version of COFF2REL that cuts the RB, RW, RD etc. reservations off.
Hope this helps.

johnsa · Post by **johnsa** » Wed May 27, 2009 4:52 am

Hey,

So I've been working a bit more on the MS64 COFF version of this, and so far it's working well. I've run into a few questions though:

The coff2rel app only supports relocations of type 6.
All the relocations that FASM generates for me in 64-bit mode seem to be of type 2.

I couldn't really see any difference in the way they would be handled, so I look for type 2 or 6 and write them out as is.
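
One likely explanation for the two numbers: in 32-bit (i386) COFF objects, type 6 is IMAGE_REL_I386_DIR32, and in AMD64 COFF objects, type 2 is IMAGE_REL_AMD64_ADDR32. Both mean "32-bit absolute address of the target", so treating them the same way is reasonable as long as the type is interpreted per machine. The Python sketch below reads raw COFF relocation records (10 bytes each: VirtualAddress, SymbolTableIndex, Type) and picks out the absolute 32-bit ones; the function name `abs32_offsets` and the lookup table are illustrative assumptions, not part of coff2rel.

```python
import struct

IMAGE_FILE_MACHINE_I386 = 0x014C
IMAGE_FILE_MACHINE_AMD64 = 0x8664

# Type number of the "32-bit absolute VA" relocation for each machine.
ABS32_TYPE = {
    IMAGE_FILE_MACHINE_I386: 0x0006,    # IMAGE_REL_I386_DIR32
    IMAGE_FILE_MACHINE_AMD64: 0x0002,   # IMAGE_REL_AMD64_ADDR32
}

def abs32_offsets(machine, reloc_table):
    """Return the section offsets of all absolute 32-bit relocations."""
    wanted = ABS32_TYPE[machine]
    return [va
            for va, _sym_index, rtype in struct.iter_unpack("<IIH", reloc_table)
            if rtype == wanted]
```

Note that on AMD64 the value 6 means IMAGE_REL_AMD64_REL32_2, a PC-relative type, so accepting "2 or 6" regardless of machine could pick up fields that should not have the base added.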
The descriptions of the various relocation types in the PE/COFF spec don't really explain themselves well, e.g. (for 64-bit):

    public enum COFF_RELOCATION_TYPE
    {
        IMAGE_REL_AMD64_ABSOLUTE = 0x0000, // The relocation is ignored.
        IMAGE_REL_AMD64_ADDR64   = 0x0001, // The 64-bit VA of the relocation target.
        IMAGE_REL_AMD64_ADDR32   = 0x0002, // The 32-bit VA of the relocation target.
        IMAGE_REL_AMD64_ADDR32NB = 0x0003, // The 32-bit address without an image base (RVA).
        IMAGE_REL_AMD64_REL32    = 0x0004, // The 32-bit relative address from the byte following the relocation.
        IMAGE_REL_AMD64_REL32_1  = 0x0005, // The 32-bit address relative to byte distance 1 from the relocation.
        IMAGE_REL_AMD64_REL32_2  = 0x0006, // The 32-bit address relative to byte distance 2 from the relocation.
        IMAGE_REL_AMD64_REL32_3  = 0x0007, // The 32-bit address relative to byte distance 3 from the relocation.
        IMAGE_REL_AMD64_REL32_4  = 0x0008, // The 32-bit address relative to byte distance 4 from the relocation.
        IMAGE_REL_AMD64_REL32_5  = 0x0009, // The 32-bit address relative to byte distance 5 from the relocation.
        IMAGE_REL_AMD64_SECTION  = 0x000A, // The 16-bit section index of the section that contains the target. This is used to support debugging information.
        IMAGE_REL_AMD64_SECREL   = 0x000B, // The 32-bit offset of the target from the beginning of its section. This is used to support debugging information and static thread local storage.
        IMAGE_REL_AMD64_SECREL7  = 0x000C, // A 7-bit unsigned offset from the base of the section that contains the target.
        IMAGE_REL_AMD64_TOKEN    = 0x000D, // CLR tokens.
        IMAGE_REL_AMD64_SREL32   = 0x000E, // A 32-bit signed span-dependent value emitted into the object.
        IMAGE_REL_AMD64_PAIR     = 0x000F, // A pair that must immediately follow every span-dependent value.
        IMAGE_REL_AMD64_SSPAN32  = 0x0010  // A 32-bit signed span-dependent value that is applied at link time.
    }

What exactly is meant by a 32-bit address relative to byte distance 1/2/3/4/5 from the relocation? Does that mean that you'd have to subtract 1/2/3/4/5 to get the correct patching address? (It seems not.)
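
One common reading of the REL32_1 to REL32_5 wording is that the displacement is measured from a point N bytes past the end of the 4-byte field, which is where RIP points when N bytes of immediate data follow the displacement inside the instruction (for example `mov byte [rip+var],5` has one immediate byte after the disp32). Under that assumption the stored value can be sketched in Python as follows; `rel32_field` is a hypothetical helper name.

```python
def rel32_field(target, field_addr, n=0):
    """Value stored in a REL32 (n=0) or REL32_n (n=1..5) field.

    target     -- address the instruction refers to
    field_addr -- address of the 4-byte displacement field itself
    n          -- bytes of the instruction that follow the field
    """
    value = target - (field_addr + 4 + n)
    if not -2**31 <= value < 2**31:
        raise OverflowError("target out of rel32 range")
    return value & 0xFFFFFFFF   # as it appears in the 32-bit field

# Moving code and data together by the same amount leaves the field unchanged.
before = rel32_field(0x2000, 0x1002, n=1)
after = rel32_field(0x2000 + 0x5000, 0x1002 + 0x5000, n=1)
print(before == after)   # prints True
```

The last two lines hint at why a flat-image loader can skip these types: when the whole image moves as one block, PC-relative fields that point inside the image stay correct, so only the absolute types need the base added. The adjustment by N matters to a linker that computes the field in the first place.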

Also, none of the 64-bit code that I've tried so far outputs a 64-bit address; they're all 32-bit. So I'm assuming that variables and such within the image, and the image itself, are limited to 4 GB/32 bits, i.e. you couldn't have a 6 GB reserved area in the code/data. If you wanted something that size you'd have to do a mem-alloc from the OS and then use that pointer in a register?
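
The practical consequence for a loader is that a 32-bit absolute field can overflow once the load address is added. A minimal Python sketch of a guarded patch is shown below; `patch_abs32` is a hypothetical helper, and the 4 GB bound is the optimistic case, since many x86-64 instructions sign-extend their 32-bit displacement, which would lower the safe limit to 2 GB.

```python
import struct

def patch_abs32(image, off, load_addr):
    """Add load_addr to the 32-bit absolute field at `off`, refusing to wrap."""
    (value,) = struct.unpack_from("<I", image, off)
    fixed = value + load_addr
    if fixed >= 2**32:
        raise OverflowError("relocated address 0x%X needs more than 32 bits" % fixed)
    struct.pack_into("<I", image, off, fixed)
    return fixed
```

A loader that wants to place images above 4 GB would need either 64-bit absolute fields (the ADDR64 type in the list above) or RIP-relative code.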

Lastly, Dex, you mentioned that you trim the RB/RW/RD BSS-type stuff, which I wouldn't like to add now. However, the obj file output from FASM just replaces all ? and RB's with 0's, so how do you know that it was a reservation and not a normal data declaration?

I've put this into a little C#.NET console app, so once it handles the reservations and more relocation types I'll put it out here for anyone to use/modify etc.

OSDev.org
