
Dynamic Linker design and the role of ld.so

Posted: Mon Feb 03, 2025 12:13 pm
by venos
I'm at the point where I can start to contemplate a dynamic linker, which I find quite exciting as it's the furthest I've ever gotten.

The design, however, seems to elude me slightly - there's a lot of conflicting information out there, and much of it is quite sparse.

AIUI there are two steps to dynamic linking, which makes sense: step 1 is loading the libraries, parsing them, and mapping them into the address space, and step 2 is populating the addresses in the relocation table/GOT. So far, so good.

This can be done statically by the kernel at load time (I think), which is quite possibly the simplest way to go about it, but (1) all the advice suggests moving step 2 to runtime rather than load time, but never elaborates why, and (2) existing systems make use of ld.so, and nowhere online do I see it elaborated which of the two steps ld.so is responsible for, what its interface is, or how (and, for that matter, why) to use it.

Ergo, some clarifying questions:

1.) why not do step 2 at load time? Why is conventional advice "install a helper stub that can go pull the information from the relocation table"? Realistically I could see it as a latency optimisation, but right now that's the only use I can conceive.
2.) what does ld.so do? Resolve symbols, or load libraries *and* resolve symbols? `man ld.so` suggests both, but information is contradictory elsewhere.
3.) what is the ld.so interface? This is documented nowhere that I've found thus far - if I want to use it, how? There is, perhaps, a nonzero chance this would be overkill for a hobby OS, but it must surely serve some purpose or the larger *NIXen wouldn't go down that path.
4.) see above, why ld.so over putting it in the kernel?

For now I'll probably just do everything at load time, until it doesn't work, but I would like to get a better understanding of why this isn't typically done in order to aid my design.

Thanks all,
Venos :3

Re: Dynamic Linker design and the role of ld.so

Posted: Mon Feb 03, 2025 12:37 pm
by iansjack
The kernel doesn’t know what dynamic libraries user programs will need until they are run. So it can’t load those libraries and carry out the relocations at load time (if by that you mean kernel boot time). If by load time you mean when the program is loaded, it’s better to carry out the relocations only when they are needed. This shortens program startup time, and many of the relocations may not be needed during a particular run of the program.

Re: Dynamic Linker design and the role of ld.so

Posted: Mon Feb 03, 2025 12:43 pm
by venos
Apologies, that was unclear on my part. Load time in this context == inside execve()

Re: Dynamic Linker design and the role of ld.so

Posted: Mon Feb 03, 2025 12:44 pm
by nullplan
venos wrote: Mon Feb 03, 2025 12:13 pm (1) all the advice suggests moving step 2 to runtime rather than load time, but never elaborates why
What you call step 2, most sources call "relocation". Moving it to run time instead of load time (called "lazy linking" or "late binding") has one main advantage: You only relocate the functions that are actually called. If there are relocations in rarely-used paths, you don't need to look up the corresponding symbols unnecessarily.

The drawback is that it is possible for the lazy lookup to fail, and then you have no recourse but to abort the program, after at least some of it has already run. Consider a word processor: You just spent an hour composing a letter, then you go to save it, and there is a symbol in the saving code that cannot be located, and now the program just crashes, your work scattering in the RAM.

For this reason, I counsel against lazy techniques, because most of them have similar outcomes on failure. With immediate linking, the program never starts running, and you get to debug the problem immediately.

Another drawback is that the code connecting the call site with the lazy relocation can be quite tricky to get right, and is a large amount of architecture (and in some cases sub-architecture) specific code. Code you don't need if you only support immediate binding.
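
To make the trade-off concrete, here is a toy model of eager versus lazy binding. It is only a sketch: a Python dict stands in for the GOT, plain callables stand in for library functions, and every name in it (`lookup`, `bind_now`, `bind_lazy`, `UnresolvedSymbol`) is hypothetical rather than taken from any real dynamic linker.

```python
# Toy model of eager vs. lazy binding. A real dynamic linker patches
# GOT/PLT slots in memory; here a dict plays the role of the GOT.

class UnresolvedSymbol(Exception):
    """Raised when no loaded library defines the requested symbol."""


def lookup(symbol, libraries):
    """Search the loaded libraries in order; return the first definition."""
    for lib in libraries:
        if symbol in lib:
            return lib[symbol]
    raise UnresolvedSymbol(symbol)


def bind_now(needed, libraries):
    """Eager binding: resolve every slot before the program starts."""
    return {name: lookup(name, libraries) for name in needed}


def bind_lazy(needed, libraries):
    """Lazy binding: each slot starts as a stub that resolves on first call."""
    got = {}

    def make_stub(name):
        def stub(*args):
            target = lookup(name, libraries)  # may fail long after startup
            got[name] = target                # patch the slot for later calls
            return target(*args)
        return stub

    for name in needed:
        got[name] = make_stub(name)
    return got


if __name__ == "__main__":
    libc = {"puts": lambda s: len(s)}
    needed = ["puts", "save_file"]            # save_file is defined nowhere

    got = bind_lazy(needed, [libc])           # succeeds: nothing resolved yet
    got["puts"]("hello")                      # resolves and patches the slot
    try:
        bind_now(needed, [libc])              # fails before "main" would run
    except UnresolvedSymbol as exc:
        print("eager binding refused to start:", exc)
```

With `bind_lazy`, the missing `save_file` only surfaces when `got["save_file"]()` is eventually called, which is the word processor scenario above. With `bind_now`, the same problem is reported before any program code runs.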
venos wrote: Mon Feb 03, 2025 12:13 pm (2) existing systems make use of ld.so, and nowhere online do I see it elaborated which of the two steps ld.so is responsible for, what its interface is, or how (and, for that matter, why) to use it.
What Linux and all of the BSDs (and probably others, but I haven't looked at those) do is this: If an ELF file is executed, the PT_LOAD segments are mapped as per usual, but if it also contains a PT_INTERP segment, then the contents of that segment name the interpreter, which is another ELF file. The kernel maps the interpreter into the same address space and jumps to its entry point instead of the main program's. In order to communicate this, the kernel sets a few entries in the auxiliary vector (the "aux headers"). AT_PHDR, AT_PHNUM, and AT_PHENT always describe the main program's program headers, so that the interpreter can read those. AT_ENTRY contains the main program's entry point, and AT_BASE contains the base address of the interpreter.
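
As an illustration of the PT_INTERP lookup the kernel performs, the following sketch scans the program headers of an ELF image held in memory. It assumes a little-endian ELF64 file and uses the field offsets from the ELF specification; the function name `find_interp` is made up for this example, and a kernel would of course do this in its own language with proper bounds checks.

```python
import struct

PT_INTERP = 3  # program header type whose contents name the interpreter


def find_interp(image):
    """Return the PT_INTERP path of a little-endian ELF64 image, or None."""
    if image[:4] != b"\x7fELF" or image[4] != 2 or image[5] != 1:
        raise ValueError("not a little-endian ELF64 image")

    # ELF64 header: e_phoff at 0x20, e_phentsize at 0x36, e_phnum at 0x38.
    (e_phoff,) = struct.unpack_from("<Q", image, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", image, 0x36)

    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        # Elf64_Phdr starts with p_type, p_flags, p_offset, p_vaddr,
        # p_paddr, p_filesz; the remaining fields are not needed here.
        p_type, _flags, p_offset, _vaddr, _paddr, p_filesz = struct.unpack_from(
            "<IIQQQQ", image, off
        )
        if p_type == PT_INTERP:
            raw = image[p_offset:p_offset + p_filesz]
            return raw.split(b"\0", 1)[0].decode()  # NUL-terminated path
    return None  # no interpreter requested: start the program directly
```

On a Linux system, something like `find_interp(open("/bin/sh", "rb").read())` should return the path of the system's dynamic linker for a dynamically linked shell, and `None` for a statically linked one.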

Most dynamic linkers also support being spawned as commands and mapping the program themselves. In that case they notice that AT_BASE isn't set, so they know to map the main program like it was a library.

The dynamic linker is usually responsible for both loading and relocation. Indeed, it typically also provides the functions from dlfcn.h, because it is a shared library. It just happens to also have an entry point.
venos wrote: Mon Feb 03, 2025 12:13 pm 1.) why not do step 2 at load time? Why is conventional advice "install a helper stub that can go pull the information from the relocation table"? Realistically I could see it as a latency optimisation, but right now that's the only use I can conceive.
That is basically it. It is for this reason that musl libc does not implement lazy binding.
venos wrote: Mon Feb 03, 2025 12:13 pm 2.) what does ld.so do? Resolve symbols, or load libraries *and* resolve symbols? `man ld.so` suggests both, but information is contradictory elsewhere.
No, the latter is right. Load libraries and resolve relocations. Also bear in mind that lazy binding is only possible for functions whose address is not taken directly, so some amount of symbol resolution always has to happen at load time.
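
For a feel of what "resolve relocations" means in practice, here is a rough sketch that applies a handful of x86-64 RELA relocation types to a loaded image. The relocation type numbers come from the x86-64 psABI; everything else is an assumption made for illustration: the image is a Python bytearray mapped at `base`, the symbol table is reduced to a list of names, and `resolve` is a hypothetical callback that returns a symbol's absolute address.

```python
import struct

# Relocation types from the x86-64 psABI.
R_X86_64_64 = 1         # S + A
R_X86_64_GLOB_DAT = 6   # S
R_X86_64_JUMP_SLOT = 7  # S
R_X86_64_RELATIVE = 8   # B + A


def apply_rela(memory, base, rela_table, symtab, resolve):
    """Apply RELA entries to `memory`, a bytearray loaded at address `base`.

    rela_table: iterable of (r_offset, r_info, r_addend) tuples
    symtab:     list of symbol names, indexed by symbol number
    resolve:    callable mapping a name to an absolute address
    """
    for r_offset, r_info, r_addend in rela_table:
        r_sym, r_type = r_info >> 32, r_info & 0xFFFFFFFF
        if r_type == R_X86_64_RELATIVE:
            value = base + r_addend           # no symbol lookup needed
        elif r_type in (R_X86_64_GLOB_DAT, R_X86_64_JUMP_SLOT):
            value = resolve(symtab[r_sym])
        elif r_type == R_X86_64_64:
            value = resolve(symtab[r_sym]) + r_addend
        else:
            raise NotImplementedError(f"relocation type {r_type}")
        # r_offset is relative to the load base for a shared object.
        struct.pack_into("<Q", memory, r_offset, value & 0xFFFFFFFFFFFFFFFF)
```

With immediate binding, the JUMP_SLOT entries are handled in the same pass as everything else, so a failing `resolve` aborts before the program starts.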
venos wrote: Mon Feb 03, 2025 12:13 pm 3.) what is the ld.so interface? This is documented nowhere that I've found thus far - if I want to use it, how? There is, perhaps, a nonzero chance this would be overkill for a hobby OS, but it must surely serve some purpose or the larger *NIXen wouldn't go down that path.
See above. The data is mostly just passed in the auxiliary vector (the aux headers).
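
A small sketch of the receiving side of that interface: decoding an auxiliary vector and pulling out the entries a dynamic linker cares about. It assumes a 64-bit little-endian layout of (type, value) pairs ending in AT_NULL, and uses the tag values from Linux's `<elf.h>`. The helper names `parse_auxv` and `describe_startup` are hypothetical, and the "invoked directly" check simply follows the AT_BASE heuristic described earlier in this thread; real linkers may use a different test.

```python
import struct

# Auxiliary vector tags, with the values used by Linux's <elf.h>.
AT_NULL, AT_PHDR, AT_PHENT, AT_PHNUM, AT_BASE, AT_ENTRY = 0, 3, 4, 5, 7, 9


def parse_auxv(raw):
    """Decode 64-bit (a_type, a_val) pairs up to the AT_NULL terminator."""
    aux = {}
    for a_type, a_val in struct.iter_unpack("<QQ", raw):
        if a_type == AT_NULL:
            break
        aux[a_type] = a_val
    return aux


def describe_startup(aux):
    """Summarise what the interpreter learns from the kernel."""
    own_base = aux.get(AT_BASE, 0)
    return {
        "program_phdr": aux[AT_PHDR],    # address of the main program's phdrs
        "program_phent": aux[AT_PHENT],  # size of one program header
        "program_phnum": aux[AT_PHNUM],  # number of program headers
        "program_entry": aux[AT_ENTRY],  # where to jump once linking is done
        "own_base": own_base,            # where the interpreter was mapped
        # Assumption: a missing or zero AT_BASE means "run as a command".
        "invoked_directly": own_base == 0,
    }
```

On 64-bit little-endian Linux, `parse_auxv(open("/proc/self/auxv", "rb").read())` is one way to look at the real thing from user space.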
venos wrote: Mon Feb 03, 2025 12:13 pm 4.) see above, why ld.so over putting it in the kernel?
You should only put code in the kernel that ought to run at elevated privilege. And once the initial file mapping is out of the way, the remaining code does not need to run at higher privilege. Sure, it needs to perform syscalls to map the libraries, but parsing the tables and writing the relocs is something it can just do at normal privilege.

This also reduces the risk of someone exploiting your kernel. If I manage to exploit your kernel, I have access to everything in your machine.

Re: Dynamic Linker design and the role of ld.so

Posted: Mon Feb 03, 2025 12:47 pm
by nullplan
Oh, and one more thing on why it should not be in the kernel: Finding the dependencies always involves a policy decision of what paths to search, and what files to accept. The kernel should not be performing policy decisions, but rather leave them up to user space. This makes it more flexible when the policies need updating.
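
To show the kind of policy this refers to, here is a deliberately simple library search sketch. The search order (a colon-separated override path first, then default directories), the default directory list, and the name `find_library` are all assumptions chosen for illustration, not the behaviour of any particular dynamic linker.

```python
import os
import posixpath

DEFAULT_DIRS = ["/lib", "/usr/lib"]  # hypothetical system defaults


def find_library(name, env_path="", default_dirs=DEFAULT_DIRS,
                 exists=os.path.isfile):
    """Return the first existing candidate path for `name`, or None.

    env_path: colon-separated override directories, searched first
    exists:   predicate used to probe candidates (swappable for testing)
    """
    if "/" in name:
        # Names containing a slash are treated as paths, not searched for.
        return name if exists(name) else None

    dirs = [d for d in env_path.split(":") if d] + list(default_dirs)
    for directory in dirs:
        candidate = posixpath.join(directory, name)
        if exists(candidate):
            return candidate
    return None
```

Every choice in there (whether an override variable is honoured at all, which directories are trusted, what happens for setuid programs) is policy, and living in user space means it can change without touching the kernel.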

Re: Dynamic Linker design and the role of ld.so

Posted: Sun Feb 16, 2025 7:55 am
by venos
And just like that, I can link to my libc and printf works! Thanks for the advice all, that worked a treat. ld.so works, libc works, dynamic linking works, and I can write software for userspace that looks like somewhat vaguely normal software :)

Re: Dynamic Linker design and the role of ld.so

Posted: Sun Feb 16, 2025 3:25 pm
by thewrongchristian
venos wrote: Sun Feb 16, 2025 7:55 am And just like that, I can link to my libc and printf works! Thanks for the advice all, that worked a treat. ld.so works, libc works, dynamic linking works, and I can write software for userspace that looks like somewhat vaguely normal software :)
Fab, well done!

Did you port an existing libc's ld.so, or did you implement your own?

Re: Dynamic Linker design and the role of ld.so

Posted: Wed Feb 26, 2025 8:22 am
by venos
Right now, mlibc, but I don't rule out rolling my own dynamic linker someday