I am presently designing a small operating system for small computers. Since small computers don't have a lot of RAM, the system has to optimize for memory usage. One of the ways it does so is by sharing code common to different processes by means of shared libraries.
As I understand it, the immutable parts of a library, the .text segments, seem mostly straightforward to share: it is only a question of adjusting addresses depending on the location where the library is loaded. Sharing the mutable parts is complicated by the fact that each process has to have its own copy of them. Otherwise, the processes that import the same library could read and write the same global variable, and locks would have to be implemented to prevent race conditions, which I would rather not do. But if the mutable segments are at different places depending on the current process, a mechanism is needed that lets each library know where its mutable segments are for the current process. With CPUs that have a Memory Management Unit (MMU), this is not a big problem: particular mapping schemes can give the code in each library the illusion that its global variables are always in the same place, no matter which process is active. But since the computers I target don't all have an MMU, I cannot rely on it to implement shared libraries.
So my question is: how can a shared library be implemented so that it can find the instances of its global variables that belong to the current process, on computers that don't have an MMU?
For example: without the help of an MMU, how can the global variable errno have a different value for each process that uses libc, if libc is a shared library?
A possibly important note: I plan to use a dialect of the Forth programming language that uses indirect threading. This means that the code is essentially a list of addresses of subroutines, the majority of which will belong to shared libraries. For this reason, library calls should ideally not incur too high a penalty. Also, shared libraries will use other shared libraries recursively.
Shared libraries without a MMU
Re: Shared libraries without a MMU
Short answer: ELF FDPIC!
Longer answer: Basic idea is that you enable sharing of non-writable sections of a file in the kernel. So you do have mmap(), but whenever the user wants PROT_WRITE without MAP_ANONYMOUS, you fail the request.
Next, for memory management purposes, each process mapping a shared lib maps the file's non-writable parts with mmap and allocates the writable parts. Since that means the writable section is no longer a fixed distance from the code section, you also need a platform ABI that designates one register as "pointer to data". And then everything is just loaded from there.
In such a system, the sharable parts of the library are loaded only once, and the non-sharable parts are loaded once for each process that uses them.
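To make the "once per system, once per process" split concrete, here is a rough sketch in C. All the names (lib_image, lib_attach, lib_detach) are made up for this illustration, and malloc() stands in for whatever allocator the kernel or loader would really use.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical loader bookkeeping for one library; names are illustrative. */
struct lib_image {
    const void *text;           /* read-only code: one copy for the whole system */
    const void *data_template;  /* initial contents of the writable part */
    size_t      data_size;      /* size of the writable part in bytes */
    unsigned    users;          /* number of processes currently attached */
};

/* Attach one process to an already loaded library. The text is reused as is;
   only a private copy of the writable data is allocated. The return value is
   what the process would load into the data-pointer register for this
   library, or NULL if memory ran out. */
static void *lib_attach(struct lib_image *lib)
{
    assert(lib != NULL && lib->data_size > 0);
    void *data = malloc(lib->data_size);
    if (data == NULL)
        return NULL;
    memcpy(data, lib->data_template, lib->data_size);
    lib->users++;
    return data;
}

/* Detach a process: only its private data block is released. */
static void lib_detach(struct lib_image *lib, void *data)
{
    assert(lib != NULL && lib->users > 0);
    free(data);
    lib->users--;
}
```

Two processes calling lib_attach() on the same lib_image get two different data blocks but see the same text pointer, so a write by one process never shows up in the other.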
One detail: Function descriptors. You can no longer call functions just by the address of their first instruction. Instead, you need that address, and the value to set the data pointer reg to. So function pointers are now pointers to data structures containing two words, where the first is the address of the function and the second is the data pointer, and a function call consists of loading the data pointer correctly before jumping.
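A function descriptor could look roughly like this in C. Portable C cannot name a CPU register, so a plain variable (data_pointer) stands in for the ABI's data-pointer register here; that variable, call_via_descriptor() and the bump() example are assumptions made for the sketch, not part of any real ABI. Saving and restoring the caller's own data pointer is left out to keep it short.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the register the ABI reserves as "pointer to data". */
static void *data_pointer = NULL;

/* A function pointer becomes a pointer to one of these two-word records. */
struct func_desc {
    void (*entry)(void);  /* first word: address of the function's code */
    void  *data;          /* second word: data pointer the function expects */
};

/* Calling through a descriptor: load the data pointer, then jump. */
static void call_via_descriptor(const struct func_desc *fd)
{
    assert(fd != NULL && fd->entry != NULL);
    data_pointer = fd->data;
    fd->entry();
}

/* Hypothetical library function: its only "global" lives in the data block
   that the data pointer designates. */
struct counter_data { int count; };

static void bump(void)
{
    struct counter_data *d = data_pointer;
    d->count++;
}
```

Two descriptors with the same entry but different data words then make the same code operate on two different sets of globals, which is exactly what two processes sharing one library need.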
Carpe diem!
Re: Shared libraries without a MMU
Thank you very much for your answer.
I don't see, however, how mmap() can be implemented without virtual memory and without duplicating a lot of pages in physical memory.
Also, I don't understand how, with this scheme, a library function can call other library functions that use global variables. I hope my understanding is not too naive.
I will illustrate my point. Let's say there are two libraries. The first one provides a function that increments a global variable; the second one provides a function that increments a distinct global variable and calls the function of the first library. A program then calls the function from the second library and exits.
Library 1:
Code:
#include "lib1.h"
int global_var1 = 0;
void increment_var1(void)
{
    global_var1++;
}
Library 2:
Code:
#include "lib1.h"
#include "lib2.h"
int global_var2 = 0;
void increment_vars(void)
{
    global_var2++;
    increment_var1();
}
Program:
Code:
#include "lib2.h"
int main(void)
{
    increment_vars();
    return 0;
}
When the program launches, it reserves space for global_var1 and global_var2. It then sets the register that should contain the pointer to .data to the address of the beginning of the reserved space and calls increment_vars(). By inspecting the register, increment_vars() can increment global_var2. But how can lib2 know which value the register should be set to before the call to increment_var1()? Leaving it untouched is not an option either, since global_var2 would then most likely be incremented a second time. There is obviously something I have not understood.
Re: Shared libraries without a MMU
If it is a shared or read-only mapping, then you can just share the pages. Say /lib/libc.so is mapped into memory for the first time: you find that address 0x12345000 would work out best, reserve that memory, and read the file there. Since there is no MMU, you cannot fault the file in. The next process that tries to map /lib/libc.so just gets 0x12345000 returned. With no MMU, that address means the same thing in both processes, and since the mapping is read-only, this is safe in both cases. If it were shared and writable, it would also be safe, since then the processes want shared memory semantics.
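As a rough sketch of that bookkeeping in C: a small table remembers which file is already in memory and hands out the same address again. The table, map_readonly() and load_file() are invented for this example, and load_file() only allocates zeroed memory where a real kernel would reserve RAM and read the file into it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_MAPPINGS 8

/* One entry per file that is currently mapped read-only (illustrative). */
struct ro_mapping {
    char     path[64];
    void    *addr;   /* where the single copy lives */
    unsigned refs;   /* how many mappings refer to it; 0 means slot is free */
};

static struct ro_mapping mappings[MAX_MAPPINGS];

/* Placeholder for "reserve memory and read the file there". */
static void *load_file(const char *path, size_t size)
{
    (void)path;
    return calloc(1, size);
}

/* Map a file read-only: the first caller loads it, later callers get the
   same address back. Returns NULL if the table is full or loading fails. */
static void *map_readonly(const char *path, size_t size)
{
    assert(path != NULL);
    struct ro_mapping *free_slot = NULL;
    for (int i = 0; i < MAX_MAPPINGS; i++) {
        if (mappings[i].refs > 0 && strcmp(mappings[i].path, path) == 0) {
            mappings[i].refs++;
            return mappings[i].addr;       /* already loaded: share it */
        }
        if (mappings[i].refs == 0 && free_slot == NULL)
            free_slot = &mappings[i];
    }
    if (free_slot == NULL || strlen(path) >= sizeof free_slot->path)
        return NULL;
    void *addr = load_file(path, size);
    if (addr == NULL)
        return NULL;
    strcpy(free_slot->path, path);
    free_slot->addr = addr;
    free_slot->refs = 1;
    return addr;
}
```

Unmapping would decrement refs and release the memory when it reaches zero; that part is omitted here.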
ldp wrote (Wed Sep 04, 2024 3:14 pm): "When the program launches, it reserves space for global_var1 and global_var2. It then sets the register that should contain the pointer to .data to the address of the beginning of the reserved space and calls increment_vars(). By inspecting the register, increment_vars() can increment global_var2. But how can lib2 know to which value the register should be assigned before the call to increment_var1()?"
When the program is initialized, the dynamic linker generates three memory blocks, one for the main module, one for lib1, and one for lib2. It generates function descriptors for everything in lib1 to use the data block for lib1, and everything in lib2 to use the data block for lib2. Calling a function then means stashing your current data pointer on the stack, loading the correct one from the function descriptor, and then jumping to the code address. On return, you just reload your data pointer from the stack.
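Here is the lib1/lib2 example modelled in plain C, as a sketch only. A variable (data_pointer) plays the role of the data-pointer register, and struct process, process_init() and process_main() are names invented for the illustration; in a real system the dynamic linker would fill in the descriptors and the compiler would emit the save and restore.

```c
#include <assert.h>
#include <stddef.h>

static void *data_pointer = NULL;  /* stand-in for the data-pointer register */

struct func_desc { void (*entry)(void); void *data; };

/* Call through a descriptor: stash the caller's data pointer, load the
   callee's, jump, and reload the caller's on return. */
static void call_desc(const struct func_desc *fd)
{
    assert(fd != NULL && fd->entry != NULL);
    void *saved = data_pointer;
    data_pointer = fd->data;
    fd->entry();
    data_pointer = saved;
}

/* lib1: its data block and its (shared) code. */
struct lib1_data { int global_var1; };

static void increment_var1(void)
{
    struct lib1_data *d = data_pointer;
    d->global_var1++;
}

/* lib2: its data block holds its own global plus the descriptor for the
   function it imports from lib1. */
struct lib2_data {
    int              global_var2;
    struct func_desc increment_var1_desc;
};

static void increment_vars(void)
{
    struct lib2_data *d = data_pointer;
    d->global_var2++;
    call_desc(&d->increment_var1_desc);  /* switches to lib1's data block */
}

/* Everything that exists once per process: one data block per module. */
struct process {
    struct lib1_data lib1;
    struct lib2_data lib2;
    struct func_desc increment_vars_desc;  /* main module's import */
};

/* What the dynamic linker would set up at program start. */
static void process_init(struct process *p)
{
    p->lib1.global_var1 = 0;
    p->lib2.global_var2 = 0;
    p->lib2.increment_var1_desc = (struct func_desc){ increment_var1, &p->lib1 };
    p->increment_vars_desc      = (struct func_desc){ increment_vars, &p->lib2 };
}

/* Body of the program's main(): a single call into lib2. */
static void process_main(struct process *p)
{
    call_desc(&p->increment_vars_desc);
}
```

So lib2 never has to "know" lib1's data pointer: it only knows where, in its own data block, the descriptor for increment_var1() sits, and that slot holds a different value in each process.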
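Depending on the tooling, this can be optimized further. For example, you can put the data pointer spill and reload into the PLT stub, so that if the function turns out to be local, the call can simply be direct. Let's imagine we used PowerPC64 with r2 as the data pointer, and function descriptors. For the call to increment_vars(), the compiler would generate
Code:
    bl increment_vars
    nop
In the PowerPC64 ELF ABIs, the nop after the bl is the slot the linker patches into the reload of r2 when the call has to go through a stub. The PLT stub itself, which spills the data pointer and loads the new one from the function descriptor, would look like this:
Code:
    std   r2, 24(r1)                    # spill the caller's data pointer to the stack
    ld    r12, increment_vars@got(r2)   # r12 = address of the function descriptor
    ld    r2, 8(r12)                    # r2 = callee's data pointer (second word)
    ld    r12, 0(r12)                   # r12 = code address (first word)
    mtctr r12
    bctr                                # jump to the function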
Oh, and accessing another module's global data goes through the GOT, as usual.
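A small C sketch of what that means here, assuming the GOT lives in each module's per-process data block so that it is reached through the data pointer like everything else. The struct layout and the helper names (lib2_relocate, lib2_peek_var1) are invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>

static void *data_pointer = NULL;  /* stand-in for the data-pointer register */

struct lib1_data { int global_var1; };

/* lib2's per-process data block, with one GOT slot for a variable that
   belongs to lib1. */
struct lib2_data {
    int *got_global_var1;  /* GOT slot: address of this process's global_var1 */
    int  global_var2;
};

/* What the dynamic linker would do for one process (hypothetical helper):
   point the GOT slot at that process's copy of lib1's variable. */
static void lib2_relocate(struct lib2_data *lib2, struct lib1_data *lib1)
{
    lib2->got_global_var1 = &lib1->global_var1;
    lib2->global_var2 = 0;
}

/* Code in lib2 reading lib1's variable. It runs with the data pointer set
   to lib2's data block and never needs lib1's data pointer. */
static int lib2_peek_var1(void)
{
    struct lib2_data *d = data_pointer;
    assert(d != NULL && d->got_global_var1 != NULL);
    return *d->got_global_var1;
}
```

Because the GOT slot is per process, the same lib2 code reads a different global_var1 depending on which process's data block the data pointer designates.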
Carpe diem!