Relocatable ELF Object Help

bloodline · Post by **bloodline** » Fri Jul 16, 2021 12:55 am

Hi all,

Now that my OS has FAT32 support, it is time to allow it to load programs from disk.

Since I don't use a virtual address space (yet...), the ELF executable format is rather unsuited to my needs, and after reading the ELF spec it seems the basic relocatable object (as output by a simple compilation) will be fine for what I need right now.

All I need is essentially a single compiled function with access to a single external symbol (to be resolved at load time).

My test object is very simple:

Code:

typedef struct {
    void (*Call)(void);
} exec_t;

/* Provided by the OS; to be resolved by the loader at load time. */
extern exec_t *sysbase;

void main(void) {
    sysbase->Call();
}

My ELF loader can see the sections, and I can get the section names from the e_shstrndx section... but I'm not sure how the relocation information is stored...

I want to load the main function into memory, fix up the sysbase symbol, and then call main.
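
On the question of how the relocation information is stored: for a 32-bit x86 object, a `.rel.*` section is an array of fixed-size `Elf32_Rel` entries. The sketch below assumes that target (the `.rel` rather than `.rela` section name suggests it). The struct and macros mirror the ELF specification; `reloc_info_t` and `decode_rel` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout from the ELF32 specification (normally provided by <elf.h>). */
typedef struct {
    uint32_t r_offset; /* byte offset of the field to patch, from the start
                          of the section the relocations apply to */
    uint32_t r_info;   /* symbol table index (high 24 bits) and
                          relocation type (low 8 bits) */
} Elf32_Rel;

#define ELF32_R_SYM(info)  ((info) >> 8)
#define ELF32_R_TYPE(info) ((uint8_t)(info))

/* The two i386 relocation types a simple object usually contains. */
#define R_386_32   1 /* absolute:    S + A     */
#define R_386_PC32 2 /* PC-relative: S + A - P */

/* Hypothetical unpacked form, for illustration only. */
typedef struct {
    uint32_t offset;
    uint32_t sym_index;
    uint8_t  type;
} reloc_info_t;

static reloc_info_t decode_rel(const Elf32_Rel *rel)
{
    assert(rel != NULL);
    reloc_info_t out = {
        .offset    = rel->r_offset,
        .sym_index = ELF32_R_SYM(rel->r_info),
        .type      = ELF32_R_TYPE(rel->r_info),
    };
    return out;
}
```

Here S is the resolved address of the symbol, A is the addend (with `Elf32_Rel` it is the 32-bit value already sitting at the patch location), and P is the address of the patch location itself. For a test object like the one above, the entry of interest is likely the one whose symbol index refers to `sysbase`.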

Edit: The sections are thus:

Code:

.text
.data
.bss
.text.startup
.rel.text.startup
.comment
.symtab
.strtab
.shstrtab

I'm guessing my main function will be found in section .text.startup, with the relocation information in .rel.text.startup.
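
A minimal sketch of locating a section such as `.rel.text.startup` by name, assuming a 32-bit ELF file already read into memory at a suitably aligned address. The struct layouts follow the ELF32 specification; `find_section` is a hypothetical helper, and it does no validation beyond a bounds check on `e_shstrndx`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layouts from the ELF32 specification (normally provided by <elf.h>). */
typedef struct {
    uint8_t  e_ident[16];
    uint16_t e_type, e_machine;
    uint32_t e_version, e_entry, e_phoff, e_shoff, e_flags;
    uint16_t e_ehsize, e_phentsize, e_phnum;
    uint16_t e_shentsize, e_shnum, e_shstrndx;
} Elf32_Ehdr;

typedef struct {
    uint32_t sh_name;   /* offset of the name in the section name table */
    uint32_t sh_type, sh_flags, sh_addr;
    uint32_t sh_offset; /* file offset of the section contents */
    uint32_t sh_size, sh_link, sh_info, sh_addralign, sh_entsize;
} Elf32_Shdr;

/* Return the header of the section called `name`, or NULL if absent. */
static const Elf32_Shdr *find_section(const uint8_t *image, const char *name)
{
    assert(image != NULL && name != NULL);
    const Elf32_Ehdr *eh = (const Elf32_Ehdr *)image;
    if (eh->e_shstrndx >= eh->e_shnum)
        return NULL; /* no usable section name table */

    const Elf32_Shdr *sh = (const Elf32_Shdr *)(image + eh->e_shoff);
    const char *names = (const char *)image + sh[eh->e_shstrndx].sh_offset;

    for (uint16_t i = 0; i < eh->e_shnum; i++) {
        if (strcmp(names + sh[i].sh_name, name) == 0)
            return &sh[i];
    }
    return NULL;
}
```

Matching on names is only a convenience. For a relocation section, `sh_link` holds the index of the associated symbol table and `sh_info` holds the index of the section the relocations apply to, so a loader can also pair `.rel.text.startup` with `.text.startup` without comparing strings.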

pvc · Post by **pvc** » Fri Jul 16, 2021 1:54 am

It's just my opinion, but when I started with OS programming, I found the PE format (the one from Windows) easier to relocate, to use dynamic linking with, and to understand in general. MS provides nice documentation for it as well, so you might try that instead. Search for something like 'portable executable file format' and you'll find plenty of info.

ELF is nice too, but it's kind of weird in that it is divided into segments and sections at the same time, which can cause some confusion. Some features work on sections while others work on segments. Also, in-memory data alignment vs. in-file alignment isn't exactly clear to me either. ELF provides more flexibility but, because of that, it can also be a little trickier, since there are more things to consider.

And nothing stops you from using both formats.

iansjack · Post by **iansjack** » Fri Jul 16, 2021 1:58 am

https://www.intezer.com/blog/malware-an ... locations/

bloodline · Post by **bloodline** » Fri Jul 16, 2021 3:01 am

pvc wrote:It's just my opinion, but when I started with OS programming, I found the PE format (the one from Windows) easier to relocate, to use dynamic linking with, and to understand in general. MS provides nice documentation for it as well, so you might try that instead. Search for something like 'portable executable file format' and you'll find plenty of info.

ELF is nice too, but it's kind of weird in that it is divided into segments and sections at the same time, which can cause some confusion. Some features work on sections while others work on segments. Also, in-memory data alignment vs. in-file alignment isn't exactly clear to me either. ELF provides more flexibility but, because of that, it can also be a little trickier, since there are more things to consider.

And nothing stops you from using both formats.

Yeah, I have had a look at the PE format, and it is probably better suited to my current task... But I will need ELF at some point, so now is as good a time as any to learn it. After all, the point of my hobby OS is to learn things I wouldn't normally.

bloodline · Post by **bloodline** » Fri Jul 16, 2021 3:03 am

iansjack wrote:https://www.intezer.com/blog/malware-analysis/executable-and-linkable-format-101-part-3-relocations/

OK, this looks pretty good! Cheers for the link, I'll have a proper read tonight.

klange · Post by **klange** » Sat Jul 17, 2021 6:18 am

I just rewrote my relocatable object loader that I use for kernel modules today, if you want a small reference. Notably that's Elf64, but the old one was Elf32.

bloodline · Post by **bloodline** » Sat Jul 17, 2021 4:18 pm

klange wrote:I just rewrote my relocatable object loader that I use for kernel modules today, if you want a small reference. Notably that's Elf64, but the old one was Elf32.

Many thanks for sharing this with me, I’ll have a read over it tomorrow. I’m finding the ELF format quite a struggle… I think that what I actually want from the format are “dynamic” executables.

zaval · Post by **zaval** » Sat Jul 17, 2021 4:33 pm

klange, so you use plain object files as loadable entities (system modules)? Could you explain why you don't use shared objects for this purpose? Also, I don't quite get it: say a driver consists of N source files (C files). That would generate N object files. How do you then combine them into a single object file? Is there some linker option for this?

Since I don't want to ditch the idea of supporting MIPS, I decided to overcome the lack of a PE-capable compiler for it by using ELF for this architecture, and for others in the same situation (meaning RISC-V and PPC; haha, I know, it's ridiculously many targets). (But if I could choose, I'd definitely go with PE; that's addressed to the OP.)

nexos · Post by **nexos** » Sat Jul 17, 2021 4:49 pm

zaval wrote:(But if I could choose, I'd definitely go with PE; that's addressed to the OP.)

PE is much better than ELF in some ways (mainly dynamic linking). Loading an ELF, however, is very simple, so they both have their advantages and disadvantages.
@bloodline - For one thing, if you build an ELF app as position independent in GCC, then you can relocate it just like a simple object file. Loading only object files will hold back the growth of your OS, so I recommend looking into PIC (position-independent code). Also, if you aren't making a Unix clone, you could try looking into PE or Mach-O. PE is probably the simplest of the bunch. I know nothing about Mach-O, so there is no help from me there.

TL;DR: Don't just look at ELF because it is the de facto standard; there are other good formats out there you should check out.

davmac314 · Post by **davmac314** » Sat Jul 17, 2021 6:17 pm

zaval wrote:... say a driver consists of N source files (C files). That would generate N object files. How do you then combine them into a single object file? Is there some linker option for this?

Yes, for GNU ld there is

Code:

-r

However:

bloodline wrote:and after reading the ELF spec it seems the basic relocatable object (as output by a simple compilation) will be fine for what I need right now.

You almost definitely don't want that. Use a dynamic executable or dynamic shared library instead. The output of simple compilation is relocatable, yes, but the relocations may include many inter-object fixups which aren't necessary if you just produce an executable/shared library.

You can use `--allow-shlib-undefined` at link time if you need to refer to a symbol that will be provided by the system at load time. Alternatively, `--warn-unresolved-symbols`. I haven't experimented with this myself so YMMV, but documentation suggests this should be possible.

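Compile with `-fpic` or `-fPIC` (and link with `-shared`) to produce a shared library. For an executable, you can use `-fpie` or `-fPIE` (and link with `-pie`) to produce a position-independent executable that needs at most a few load-time relocations (typically only for pointers stored in data), or use `-q`/`--emit-relocs` if you want the linker to keep the relocation information so that you can perform relocations at load time.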

nullplan · Post by **nullplan** » Sun Jul 18, 2021 12:00 am

davmac314 wrote:You almost definitely don't want that. Use a dynamic executable or dynamic shared library instead. The output of simple compilation is relocatable, yes, but the relocations may include many inter-object fixups which aren't necessary if you just produce an executable/shared library.

He did say he was making a kernel module, right? Kernel modules tend to call functions of the main kernel. So yes, inter-object fixups may be necessary.

Also, fixing up relocations isn't all that hard. Linux uses normal relocatable files for its kernel modules. You do it in two steps: First you walk through the symbol table and determine the value of each symbol (recording it in st_value of the symbol table), then you walk the relocation table and apply the relocations. (If you want to look it up, the function to do the first step is called simplify_symbols(), and the other one is called apply_relocations(), both in kernel/module.c. But apply_relocations() will call the arch-specific functions to actually do the calculations).
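
The two steps described above can be sketched as follows. This is a minimal illustration for 32-bit x86 objects with `Elf32_Rel` entries, not the Linux code: the function names merely echo the kernel's, and `export_t`, `section_addr` (the load address chosen for each section, indexed by section number) and `target_addr` are assumptions made for the example. It also assumes a little-endian host and handles only `R_386_32` and `R_386_PC32`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layouts and constants from the ELF32 / i386 psABI specifications. */
typedef struct {
    uint32_t st_name;   /* offset of the name in the string table */
    uint32_t st_value;  /* in an object file: offset within its section */
    uint32_t st_size;
    uint8_t  st_info;
    uint8_t  st_other;
    uint16_t st_shndx;  /* defining section, or SHN_UNDEF if external */
} Elf32_Sym;

typedef struct { uint32_t r_offset, r_info; } Elf32_Rel;

#define SHN_UNDEF       0
#define SHN_LORESERVE   0xff00
#define ELF32_R_SYM(i)  ((i) >> 8)
#define ELF32_R_TYPE(i) ((uint8_t)(i))
#define R_386_32        1
#define R_386_PC32      2

/* Hypothetical table of symbols the host OS exposes to loaded code. */
typedef struct { const char *name; uint32_t addr; } export_t;

/* Step 1: turn every st_value into a final address. */
static int resolve_symbols(Elf32_Sym *syms, size_t nsyms, const char *strtab,
                           const uint32_t *section_addr,
                           const export_t *exports, size_t nexports)
{
    assert(syms && strtab && section_addr);
    for (size_t i = 1; i < nsyms; i++) {      /* entry 0 is the null symbol */
        Elf32_Sym *s = &syms[i];
        if (s->st_shndx == SHN_UNDEF) {       /* external: ask the host */
            size_t j = 0;
            while (j < nexports &&
                   strcmp(strtab + s->st_name, exports[j].name) != 0)
                j++;
            if (j == nexports)
                return -1;                    /* unresolved symbol */
            s->st_value = exports[j].addr;
        } else if (s->st_shndx < SHN_LORESERVE) {
            s->st_value += section_addr[s->st_shndx]; /* base + offset */
        }                                     /* SHN_ABS etc. left unchanged */
    }
    return 0;
}

/* Step 2: patch one loaded section using its relocation table. */
static int apply_relocations(const Elf32_Rel *rels, size_t nrels,
                             const Elf32_Sym *syms,
                             uint8_t *target, uint32_t target_addr)
{
    assert(rels && syms && target);
    for (size_t i = 0; i < nrels; i++) {
        uint8_t *where = target + rels[i].r_offset;
        uint32_t S = syms[ELF32_R_SYM(rels[i].r_info)].st_value;
        uint32_t P = target_addr + rels[i].r_offset;
        uint32_t A, value;
        memcpy(&A, where, sizeof A);          /* REL: addend is stored in place */
        switch (ELF32_R_TYPE(rels[i].r_info)) {
        case R_386_32:   value = S + A;     break;
        case R_386_PC32: value = S + A - P; break;
        default:         return -1;           /* not handled in this sketch */
        }
        memcpy(where, &value, sizeof value);
    }
    return 0;
}
```

`target_addr` is kept separate from the `target` pointer only so the sketch can be exercised on a host with a different pointer width; on a 32-bit system without address translation the two would be the same value. Common symbols (`SHN_COMMON`) would need space allocated for them and are not handled here.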

klange · Post by **klange** » Sun Jul 18, 2021 4:21 am

davmac314 wrote:You almost definitely don't want that. Use a dynamic executable or dynamic shared library instead. The output of simple compilation is relocatable, yes, but the relocations may include many inter-object fixups which aren't necessary if you just produce an executable/shared library.

Shared libraries have relocations, too. The only difference tends to be where they are and how many there are, but "how many there are" does not affect the difficulty of the task of performing them.

davmac314 · Post by **davmac314** » Sun Jul 18, 2021 5:27 am

nullplan wrote:He did say he was making a kernel module, right?

No...

nullplan wrote:Also, fixing up relocations isn't all that hard. Linux uses normal relocatable files for its kernel modules. You do it in two steps: First you walk through the symbol table and determine the value of each symbol (recording it in st_value of the symbol table), then you walk the relocation table and apply the relocations

Sure. There will be a lot more relocations than if you had linked into a shared object or executable, of course. I guess it doesn't matter that much, though.

alexfru · Post by **alexfru** » Sun Jul 18, 2021 6:53 pm

klange wrote:I just rewrote my relocatable object loader that I use for kernel modules today, if you want a small reference. Notably that's Elf64, but the old one was Elf32.

So, it looks like you're loading object files, not executable ones?
This is similar to how I statically link ELF32 objects in Smaller C's linker.

Do I understand it correctly that a proper ELF executable that can be loaded anywhere (because of ASLR or lack of page translation) needs to have a whole bunch of extra sections for the dynamic loader, even where a simple object file would suffice?

klange · Post by **klange** » Sun Jul 18, 2021 7:19 pm

alexfru wrote:So, it looks like you're loading object files, not executable ones?

Yes, you'll find that my relocatable object loader loads relocatable objects.

There is an executable loader in the function below that, though. There's also a dynamic linker but that is in userspace...

alexfru wrote:Do I understand it correctly that a proper ELF executable that can be loaded anywhere (because of ASLR or lack of page translation) needs to have a whole bunch of extra sections for the dynamic loader, even where a simple object file would suffice?

Don't confuse position independence with dynamic executables: you can have one, the other, both, or neither. Time for some rambling:

Object files can represent any sort of code: executable, library; static, dynamic; position-independent or not... But regardless of what's in them, you absolutely need a linker to turn them into a final product - they are only an intermediary format. Object files consist only of sections, and lack the program headers found in executables and shared objects.
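
A loader can tell these cases apart from the ELF header alone. The sketch below assumes a 32-bit ELF header already in memory; the struct layout and the `ET_*` values come from the ELF specification, while `elf_kind` and `has_program_headers` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout from the ELF32 specification (normally provided by <elf.h>). */
typedef struct {
    uint8_t  e_ident[16];
    uint16_t e_type, e_machine;
    uint32_t e_version, e_entry, e_phoff, e_shoff, e_flags;
    uint16_t e_ehsize, e_phentsize, e_phnum;
    uint16_t e_shentsize, e_shnum, e_shstrndx;
} Elf32_Ehdr;

#define ET_REL  1 /* relocatable object (.o) */
#define ET_EXEC 2 /* fixed-address executable */
#define ET_DYN  3 /* shared object or position-independent executable */

/* Describe what kind of ELF file this header belongs to. */
static const char *elf_kind(const Elf32_Ehdr *eh)
{
    assert(eh != NULL);
    if (memcmp(eh->e_ident, "\x7f" "ELF", 4) != 0)
        return "not ELF";
    switch (eh->e_type) {
    case ET_REL:  return "relocatable object";
    case ET_EXEC: return "executable";
    case ET_DYN:  return "shared object or PIE";
    default:      return "other";
    }
}

/* Object files normally have no program headers, so a loader has to
 * work from the section table instead of segments. */
static int has_program_headers(const Elf32_Ehdr *eh)
{
    assert(eh != NULL);
    return eh->e_phnum != 0;
}
```

For an `ET_REL` file the loader has to place each allocatable section itself and then process the relocation sections; for `ET_EXEC` and `ET_DYN` it can follow the program headers instead.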

On most architectures, position-independent executables and libraries can actually be built statically and without any extra sections through the use of "thunks" and native IP-relative addressing schemes. On x86-32 these were notably slower at runtime than patched up absolute code, which was a very good reason to pick relocatable objects for device drivers. x86-64 and various other architectures don't suffer from the same speed impact, with x86-64 in particular introducing RIP-relative instructions.

The main benefit of position-independent code shows up even when it does involve relocations, which it will for a dynamic library that references other libraries, since relocations are what fill in the addresses of runtime-resolved symbols. The relocations are moved away from the code (usually into a global offset table), so the number of pages that need to be modified per process is minimized. This means you can "load once, run anywhere": a shared library can sit in physical memory once and be mapped into the address space of every process that references it, and each process only needs its own private copy of the small set of pages that hold relocations.
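
The idea of keeping fixups out of the code can be imitated in plain C. The sketch below is only an analogy, not a real global offset table: `got`, `module_main`, `loader_bind` and `sys_call_impl` are all hypothetical names. The point it illustrates is that the "code" never embeds the address of what it calls; the loader writes a single table slot instead.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*func_t)(void);

/* Stand-in for a function exported by the host system. */
static int calls;
static void sys_call_impl(void) { calls++; }

/* Stand-in for a global offset table: the only data the loader writes.
 * In a real system this lives in a small, per-process writable page. */
static func_t got[1];

/* Stand-in for position-independent code: it refers to its callee only
 * through a table slot, so the code itself never needs patching and
 * could be shared read-only between processes. */
static void module_main(void)
{
    assert(got[0] != NULL); /* slot must be bound before the first call */
    got[0]();
}

/* What the loader's relocation step reduces to: filling in table slots. */
static void loader_bind(func_t *table, size_t slot, func_t target)
{
    table[slot] = target;
}
```

Binding slot 0 with `loader_bind(got, 0, sys_call_impl)` and then calling `module_main()` reaches `sys_call_impl` without any change to `module_main` itself, which is the property that lets the code pages be shared.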

OSDev.org
