kashimoto wrote: So like the kernel has its own page directory and the user-space app also has its own (inheriting the kernel's page-dir entries, etc).
I load the binary, read the headers, figure out the segments, etc., all in kernel memory.
This entirely depends on your kernel design, but when a system call or interrupt occurs you don't need to switch back into the kernel's page directory. Since the user program's page directory inherits your kernel's page tables (I assume the kernel doesn't use the entire address space), the kernel is capable of operating perfectly while still using the program's address space. It just runs in the kernel tables that are linked in, but the user's tables are still there.
On a Unix-like OS, the exec() family of calls is used to load a new program, and it gets loaded in the same address space (same page directory, although many of the mappings may be changed). When the currently running program issues the exec() system call, the kernel takes over, still executing in that program's context and with access to that program's memory. From there, all you need to do is read the new executable, parse it for sections, then place them in user-space memory. The only difference between user memory and kernel memory is the protection bits on the pages, and possibly that the kernel tables are global. When the program is loaded, just return from the exec() call to its entry point.
On other operating systems, you may have a spawn() system call or similar that will run an executable as a new process. For that call, the kernel will create a new address space (with the kernel tables mapped in), switch to it, then load the executable sections there, again putting them in user memory.
For actually loading the executable you have a couple of options:
- Read the content of the sections into the correct memory location during the exec()
This means that the program will take up physical memory immediately and is completely ready to run, although it may allocate more memory at runtime.
The advantage of this method is that you won't run out of physical memory while the program is running (which would stop it from continuing); that error will surface at load time instead.
- Use your kernel's virtual memory structures to describe the file mappings in memory, letting it get loaded as it runs by page-fault handling (lazy loading)
This has several advantages, among them:
- Loading is quicker as you don't need to actually read the file, which is potentially quite large
- Fits seamlessly with other memory mappings, such as for the heap and stack areas
- Makes implementing shared libraries easier
- Physical memory will only be used for parts of the file that run, as any parts that don't run (e.g. parts that are only used on other architectures/operating systems) don't get loaded
The only downside is that you may encounter out-of-memory errors while faulting pages in at runtime, so you need a way to handle them. Alternatively, you could reserve the required number of pages at load time, ensuring that error never arises.
This method requires more functionality from your kernel (or services, in the case of a micro-kernel), so is harder to implement.
If you don't yet have much of your virtual memory manager or virtual file system implemented, consider the first option as a placeholder until you're equipped for the second.
In an older kernel I wrote, I used separate page directories for every thread, but threads of the same process shared virtual memory mappings. So if one thread tried to access an area of memory, whether it was the heap or part of the executable, it would cause a page fault; the kernel would check the memory mappings, see that those pages were already in memory (in use by the sibling threads), and map them into that thread's page directory. The nice thing about doing it this way was that each thread could have a stack at the same virtual address but with different contents, which made creating new threads much easier. I also used it to put kernel stacks outside of global memory (still supervisor-only), and to support TLS easily. The design fell short when I wanted to share data that lived on a thread's stack, as that involved copying it to the heap. But having the page fault handler take care of it was a lot easier than manually linking the pages in the exec()/clone()/fork() calls, so I'd definitely recommend the lazy loading approach if your kernel can support it.