Loading ELF with mmap()

Posted: Thu Jun 09, 2022 12:17 pm
by Barry
Hello,

Currently, my OS loads ELF files like this:

Code: Select all

mmap(ph.address, ph.memsz, ph.flags, MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
lseek(fd, ph.offset, SEEK_SET);
read(fd, ph.address, ph.filesz);
where ph is the current program header it's loading, and fd is the executable's file descriptor.

I would like to do something more like this though:

Code: Select all

mmap(ph.address, ph.memsz, ph.flags, MAP_PRIVATE, fd, ph.offset);
but I don't know how to deal with alignment issues.

mmap() will truncate addresses to page boundaries, and the program header's address might not be page-aligned (e.g. the data section of my test program requests 0x1e74).
The other issue is that I'd pass memsz to mmap() so that it allocates a large enough region, but then it might read past the segment's data in the backing file, since memsz can be larger than filesz in the program header.
How should I deal with this? Or should I just keep allocating anonymous memory and reading the file into it?
I'm not really bothered about remaining POSIX compatible, so if someone has a completely different solution that's neater, I'll happily accept/adapt it.

Thanks,
Barry

Re: Loading ELF with mmap()

Posted: Thu Jun 09, 2022 2:25 pm
by nullplan
You need multiple mappings. The first mapping you already have mostly correct, except that it should only cover p_filesz bytes. Then, if p_memsz is larger than p_filesz, you calculate the page break (the first page boundary at or after the end of the file-backed data) and set all of the memory from the end of the p_filesz bytes up to the page break to zero. And finally, if p_memsz extends past the page break, add an anonymous mapping starting at the page break that covers the remaining length. So taking it all together:

Code: Select all

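/* File-backed part: only filesz bytes come from the file. */
mmap(ph.address, ph.filesz, ph.flags, MAP_PRIVATE, fd, ph.offset);
if (ph.memsz > ph.filesz) {
  /* First page boundary at or after the end of the file-backed data. */
  pagebrk = (ph.address + ph.filesz + PAGE_SIZE - 1) & -PAGE_SIZE;
  /* Zero the rest of the last file-backed page (this page must be writable). */
  memset(ph.address + ph.filesz, 0, pagebrk - ph.address - ph.filesz);
  /* Whatever memsz still needs past the page break is anonymous, zero-filled memory. */
  if (ph.memsz > pagebrk - ph.address)
    mmap(pagebrk, ph.memsz - (pagebrk - ph.address), ph.flags, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}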
Of course, you need to add error handling.

Re: Loading ELF with mmap()

Posted: Thu Jun 09, 2022 2:46 pm
by Barry
Thanks nullplan.

That makes a lot of sense. Shame it can't be done with only a single region for each program header.

Thanks,
Barry

EDIT:
I also have to round ph.offset down to a page boundary in the mmap() call, otherwise the file contents end up at the wrong addresses (since mmap() rounds ph.address down as well).
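
To make the rounding concrete, here is a small sketch of the arithmetic, assuming 4 KiB pages. The struct file_map type and align_segment() are names invented for this example. It relies on the ELF convention that a loadable segment's address and file offset are congruent modulo its alignment, so both can be rounded down by the same amount. The length then has to grow by that amount so the mapping still reaches the end of the data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 0x1000u /* assumption: 4 KiB pages */

/* Hypothetical helper type: page-aligned arguments for one file-backed mmap(). */
struct file_map {
    uintptr_t addr;   /* ph.address rounded down to a page boundary */
    uint64_t  offset; /* ph.offset rounded down by the same amount */
    size_t    length; /* ph.filesz plus the bytes skipped by rounding down */
};

static struct file_map align_segment(uintptr_t vaddr, uint64_t offset, size_t filesz)
{
    uintptr_t skew = vaddr & (PAGE_SIZE - 1); /* how far vaddr is into its page */

    /* Address and offset must be misaligned by the same amount. */
    assert(skew == (offset & (PAGE_SIZE - 1)));

    struct file_map m = { vaddr - skew, offset - skew, filesz + skew };
    return m;
}
```

For the 0x1e74 address mentioned above, together with a made-up file offset of 0xe74 and a filesz of 0x100, this gives address 0x1000, offset 0x0 and length 0xf74.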