
small a.out or ELF loader implementation

Posted: Thu Sep 17, 2020 7:43 am
by antoni
Can anyone recommend a small, easy-to-integrate loader implementation for some simple executable format (like a.out)? It can also be ELF if it's easy to integrate with my own operating system. It's important that the executable format is supported by gcc (i.e. a.out or ELF), that the implementation is small (the code shouldn't take an hour to compile) and that it's easy to integrate (only a few hooks like malloc or a mutex should be required). It would be nice if it were 64-bit, but 32-bit is fine too. I can handle it.

Re: small a.out or ELF loader implementation

Posted: Thu Sep 17, 2020 11:31 am
by bzt
antoni wrote: Can anyone recommend small, easy to integrate implementation of some simple executable format (like a.out) loader? It can also be ELF if it's easy to integrate with my own operating system.
Well, yes and no.

If you're looking for the simplest implementation possible, then you're looking for the xv6 boot loader. This code basically just loads the segments out of the ELF file into their final positions:

Code:

  // Read 1st page off disk
  readseg((uchar*)elf, 4096, 0);

  // Is this an ELF executable?
  if(elf->magic != ELF_MAGIC)
    return;  // let bootasm.S handle error

  // Load each program segment (ignores ph flags).
  ph = (struct proghdr*)((uchar*)elf + elf->phoff);
  eph = ph + elf->phnum;
  for(; ph < eph; ph++){
    pa = (uchar*)ph->paddr;
    readseg(pa, ph->filesz, ph->off);
    if(ph->memsz > ph->filesz)
      stosb(pa + ph->filesz, 0, ph->memsz - ph->filesz);
  }

  // Call the entry point from the ELF header.
  // Does not return!
  entry = (void(*)(void))(elf->entry);
  entry();
But for an OS, this isn't going to be enough: you'll have to implement a shared library loader, interpret relocation records, add a dynamic symbol resolver, etc.
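To give an idea of what "interpreting relocation records" means in the simplest case, here is a rough sketch that applies only base-relative relocations (the kind that need no symbol lookup) to an image already sitting in memory. It is not taken from xv6 or from any particular loader. The names rela64_t, RELOC_TYPE, RELOC_X86_64_RELATIVE and apply_relative_relocs are made up for this example; the struct layout and the type value 8 follow the ELF64 / x86-64 psABI definitions of Elf64_Rela and R_X86_64_RELATIVE.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as Elf64_Rela in the ELF64 specification (hypothetical name). */
typedef struct {
    uint64_t r_offset;  /* location to patch, relative to the load base */
    uint64_t r_info;    /* symbol index (high 32 bits), type (low 32 bits) */
    int64_t  r_addend;  /* constant added to the computed value */
} rela64_t;

/* Relocation type lives in the low 32 bits of r_info on ELF64. */
#define RELOC_TYPE(info) ((uint32_t)((info) & 0xffffffffu))

/* Numeric value of R_X86_64_RELATIVE in the x86-64 psABI. */
enum { RELOC_X86_64_RELATIVE = 8 };

/*
 * Apply base-relative relocations to an image held in `image`.
 * `load_addr` is the virtual address the image is (or will be) mapped at.
 * Returns the number of entries applied, or -1 on an unsupported type or an
 * out-of-range offset. Entries patched before a failure stay patched, so a
 * real loader would work on a private copy.
 */
int apply_relative_relocs(uint8_t *image, size_t image_size,
                          uint64_t load_addr,
                          const rela64_t *rela, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (RELOC_TYPE(rela[i].r_info) != RELOC_X86_64_RELATIVE)
            return -1;  /* symbol-based types need a resolver, not shown */

        if (image_size < sizeof(uint64_t) ||
            rela[i].r_offset > image_size - sizeof(uint64_t))
            return -1;  /* the patch would fall outside the image */

        /* For this type the patched value is simply base + addend. */
        uint64_t value = load_addr + (uint64_t)rela[i].r_addend;
        memcpy(image + rela[i].r_offset, &value, sizeof value);
    }
    return (int)count;
}
```

Everything that refers to a symbol (function calls into shared libraries, global data) needs a symbol table lookup on top of this, which is where most of the real work is.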

Here is my implementation, but I wouldn't call it simple or easy to integrate.
elf_getfile() - returns the cache entry for an ELF file (loads it from disk if it's not already in the cache)
elf_unload() - removes an ELF file from the cache
elf_load() - gets an ELF file from the cache and maps its segments into the current address space. It also parses DT_NEEDED records and recursively calls itself to load the referenced shared libraries.
elf_rtlink() - interprets relocation records and run-time links the ELF segments in the current address space.
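The shared library part of the list above boils down to walking the dynamic section and picking out the "needed" entries. The following is only a sketch of that one step, not the code behind elf_load(). The names dyn64_t, DYN_TAG_NULL, DYN_TAG_NEEDED and collect_needed are invented for the example; the struct mirrors Elf64_Dyn, and the tag values 0 and 1 are those of DT_NULL and DT_NEEDED in the ELF specification. It assumes the dynamic array and its string table have already been located in memory.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as Elf64_Dyn (hypothetical name). */
typedef struct {
    int64_t  d_tag;  /* entry type */
    uint64_t d_val;  /* for "needed" entries: offset into the string table */
} dyn64_t;

/* Numeric values of DT_NULL and DT_NEEDED in the ELF specification. */
enum { DYN_TAG_NULL = 0, DYN_TAG_NEEDED = 1 };

/*
 * Collect the library names referenced by "needed" entries.
 * `dyn` must be terminated by a DYN_TAG_NULL entry.
 * Returns the number of names stored in `names`, or -1 if an offset is
 * out of range, a name is not NUL-terminated inside the string table,
 * or there are more names than `max_names`.
 */
int collect_needed(const dyn64_t *dyn,
                   const char *strtab, size_t strtab_size,
                   const char **names, size_t max_names)
{
    size_t n = 0;

    for (; dyn->d_tag != DYN_TAG_NULL; dyn++) {
        if (dyn->d_tag != DYN_TAG_NEEDED)
            continue;  /* other tags are irrelevant for this step */

        if (dyn->d_val >= strtab_size)
            return -1;

        const char *name = strtab + dyn->d_val;
        if (memchr(name, '\0', strtab_size - dyn->d_val) == NULL)
            return -1;  /* string runs off the end of the table */

        if (n == max_names)
            return -1;
        names[n++] = name;
    }
    return (int)n;
}
```

A loader would then look up each returned name, load that file the same way, and repeat for its own needed entries, keeping track of what is already loaded so that circular dependencies terminate.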

It does a lot of address space trickery, because the ELF segments are not loaded at their final positions right away. The reason for this is the POSIX standard, which states that if something goes wrong with the exec() system call (for example, a symbol cannot be resolved at the last step), then control must be returned to the original code segment. So the loader must preserve the old segment and only overwrite it with the new one when it can be absolutely sure that exec() won't fail.
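The "prepare first, overwrite last" idea can be shown with a toy model. This is not how the implementation above does it: a malloc'd buffer stands in for an address space, the only "load step" is a magic check plus a copy, and the names image_t, stage_image and exec_image are made up. The entry point is read from offset 24 of the ELF64 header, assuming a little-endian file on a little-endian host.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a process image (hypothetical type). */
typedef struct {
    uint8_t *mem;    /* stands in for the mapped segments */
    size_t   size;
    uint64_t entry;  /* entry point taken from the ELF header */
} image_t;

/* Build the new image off to the side. Never touches the running image. */
static int stage_image(const uint8_t *file, size_t len, image_t *staged)
{
    static const uint8_t magic[4] = {0x7f, 'E', 'L', 'F'};

    /* 64 bytes is the size of an ELF64 file header. */
    if (len < 64 || memcmp(file, magic, sizeof magic) != 0)
        return -1;

    uint8_t *mem = malloc(len);
    if (mem == NULL)
        return -1;
    memcpy(mem, file, len);

    staged->mem = mem;
    staged->size = len;
    /* e_entry sits at byte offset 24 in an ELF64 header. */
    memcpy(&staged->entry, file + 24, sizeof staged->entry);
    return 0;
}

/*
 * Replace `current` with the image built from `file`.
 * On failure returns -1 and leaves `current` exactly as it was, so the
 * caller can keep running the old code.
 */
int exec_image(image_t *current, const uint8_t *file, size_t len)
{
    image_t staged;

    if (stage_image(file, len, &staged) != 0)
        return -1;       /* nothing has been overwritten yet */

    /* Point of no return: every step that can fail is behind us. */
    free(current->mem);
    *current = staged;
    return 0;
}
```

In a real kernel the staging area would be a separate set of page tables or a temporary mapping, and the commit step would be a switch of those mappings, but the ordering is the same: everything that can fail happens before the old image is touched.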

Cheers,
bzt