executable loader and process create confusion


Post by szhou42 »

I am currently trying to load an ELF executable from hard disk and run it as a process, but I need to clear up some confusion before I start coding.

Q1
Should the loader run in kernel space or user space?
In other words, does the loader run in the process's address space or the kernel's address space?

Q2
If the loader runs in kernel space, suppose I have a function in my kernel called void create_process(char * filename) { }
By saying "load the executable file into memory", what we actually do is first create a 4 GB virtual address space for the new process, and then load the file into this address space, right? But since the loader is in kernel space, how does it load the file into a totally different address space?

The solution I came up with is this:
0 Say the executable is 1 MB in size (256 pages of 4 KB).

1 The loader allocates a 1 MB chunk of memory with kmalloc, reads the file into it, then maps these 256 pages into the new process's address space. Suppose an executable is loaded at virtual address 0x08040000 by default; then the 256 pages starting at 0x08040000 point to the same frames that hold the file contents.

2 The loader allocates a 4 KB chunk of memory to use as the new process's stack, then maps this page into the new process's address space.
Suppose the user stack is located at virtual address 0xc0000000 - 0x1000 by default; then the one page starting at 0xc0000000 - 0x1000 points to the same frame that was allocated for the user stack in kernel space.

3 kfree the memory chunk

4 kfree the user stack

5 Free the 256 pages in the kernel's page table (the new process's page table still references these pages)

6 Free the one 4 KB stack page in the kernel's page table

7 Do a context switch to user space, with eip pointing to the new process's code segment and esp pointing to the user stack
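
The arithmetic in the steps above can be sketched in C as follows. This is only an illustration, assuming 4 KB pages and the classic two-level 32-bit x86 paging layout (10-bit directory index, 10-bit table index, 12-bit offset). The helper names pages_needed, pd_index, pt_index and page_addr are made up for this sketch and are not taken from any kernel.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Number of 4 KB pages needed to hold `bytes` bytes, rounded up. */
static inline uint32_t pages_needed(uint32_t bytes)
{
    return bytes / PAGE_SIZE + (bytes % PAGE_SIZE != 0);
}

/* Page directory index: the top 10 bits of the virtual address. */
static inline uint32_t pd_index(uint32_t vaddr)
{
    return vaddr >> 22;
}

/* Page table index: the middle 10 bits of the virtual address. */
static inline uint32_t pt_index(uint32_t vaddr)
{
    return (vaddr >> 12) & 0x3FFu;
}

/* Virtual address of the n-th page of a page-aligned region. */
static inline uint32_t page_addr(uint32_t base, uint32_t n)
{
    assert(base % PAGE_SIZE == 0); /* regions are assumed page-aligned */
    return base + n * PAGE_SIZE;
}
```

With these helpers, pages_needed(1 << 20) gives the 256 pages from step 0, and the stack page at 0xc0000000 - 0x1000 falls in directory entry 767, table entry 1023, that is, the last page below the 3 GB boundary.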

Is there any problem with my solution? Are there any other good ideas, or what's the typical way of implementing this?


Thanks in advance!!!!!!
Last edited by szhou42 on Thu Aug 11, 2016 4:56 pm, edited 1 time in total.

Post by ~ »

szhou42 wrote: Q1
Should the loader run in kernel space or user space? In other words, does the loader run in the process's address space or the kernel's address space?

In a separate sub-process of the kernel space. It can have access to pages, memory and kernel services. If it fails, it won't bring the whole kernel down; it can be restarted, or recover by itself and discard the bogus context.

szhou42 wrote: Q2
If the loader runs in kernel space, suppose I have a function in my kernel called void create_process(char * filename) { }
By saying "load the executable file into memory", what we actually do is first create a 4 GB virtual address space for the new process, and then load the file into this address space, right?

Make a list of the ELF header and section fields exactly as the format defines them, and then explain how you are using each of them.

Then make a schematic of how you intend to lay the executable out in memory. At this point, assume you are already looking at the executable image through paging, so leave paging out of the picture for now; what matters here is determining, in exact technical terms, where each part of the image goes in memory.
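
As a starting point for that list, here is a hedged sketch of the ELF32 file header and a basic sanity check. The struct layout and constant values follow the ELF specification; the function name elf32_header_ok and the decision to accept only 32-bit little-endian x86 executables are assumptions made for this example. It also assumes a little-endian host, since the header is copied byte for byte.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* ELF32 file header, fields in the order the ELF specification gives them. */
typedef struct {
    uint8_t  e_ident[16];  /* magic, class, data encoding, version, padding */
    uint16_t e_type;       /* object file type (ET_EXEC for executables) */
    uint16_t e_machine;    /* target architecture */
    uint32_t e_version;
    uint32_t e_entry;      /* virtual address of the entry point */
    uint32_t e_phoff;      /* file offset of the program header table */
    uint32_t e_shoff;      /* file offset of the section header table */
    uint32_t e_flags;
    uint16_t e_ehsize;
    uint16_t e_phentsize;  /* size of one program header entry */
    uint16_t e_phnum;      /* number of program header entries */
    uint16_t e_shentsize;
    uint16_t e_shnum;
    uint16_t e_shstrndx;
} Elf32_Ehdr;

enum { ELFCLASS32 = 1, ELFDATA2LSB = 1, ET_EXEC = 2, EM_386 = 3 };

/* Returns 1 if buf starts with a plausible header for a 32-bit
 * little-endian x86 executable whose program header table lies
 * inside the file, 0 otherwise. (Hypothetical helper.) */
static inline int elf32_header_ok(const void *buf, size_t len)
{
    Elf32_Ehdr h;
    assert(buf != NULL);
    if (len < sizeof h)
        return 0;
    memcpy(&h, buf, sizeof h);
    if (memcmp(h.e_ident, "\x7f" "ELF", 4) != 0)
        return 0;
    if (h.e_ident[4] != ELFCLASS32 || h.e_ident[5] != ELFDATA2LSB)
        return 0;
    if (h.e_type != ET_EXEC || h.e_machine != EM_386)
        return 0;
    if (h.e_phentsize != 32)            /* size of an ELF32 program header */
        return 0;
    if (h.e_phoff > len ||
        (size_t)h.e_phnum * h.e_phentsize > len - h.e_phoff)
        return 0;                       /* table would run past end of file */
    return 1;
}
```

For loading, the fields that matter most are e_entry (where execution starts) and e_phoff, e_phentsize and e_phnum (where the list of loadable segments is found). The section headers are mainly of interest to linkers and debuggers.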

Post by szhou42 »

Hi ~
I'll take your first suggestion and make the loader a separate kernel process later; at this moment I just want to make things work first.

I haven't considered the details of the ELF format and the executable's layout in memory yet. I am just trying to figure out the general process by which a kernel-space ELF loader can load a file from hard disk into a new process's address space.
Can you give some suggestions or ideas on my thoughts about this?
Thanks!

Post by heat »

First off, executable loaders usually run in kernel-mode on monolithic kernels, as it's faster than making a syscall to the kernel, then some IPC calls to the loader process.

You basically want to make your loader do the following:

1 - Detect if the file exists; if not, set errno (or similar) and return.
2 - If it does exist, get the file size and allocate a buffer of that size. Check for a NULL pointer.
3 - Read the file into the buffer. If there were I/O errors, set errno and return.
4 - Parse the executable format. Start allocating the memory the process needs, for example 0x8003000 for code (so R-X) and 0x8004000 for .rodata (R--).
5 - Allocate a stack for the program.
6 - Allocate all the meta-data for the process.
7 - Create a user-space thread with the esp and eip (like create_thread(esp, eip, is_user);).
8 - Associate the user-space thread with the new process.

Note that I left out the address space questions since I don't know if you're trying to make an exec(2)-like interface or more of a CreateProcess() interface. Also, somewhere in the middle you might want to handle more complicated matters such as dynamic linking.
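
Step 4 above can be sketched roughly as follows. This is an illustration only: it models the target process's memory as a plain host buffer covering one window of virtual addresses (the vm_window type is hypothetical), whereas a real kernel would allocate frames and map them with the permissions taken from p_flags. The Elf32_Phdr layout and PT_LOAD value follow the ELF specification; load_segments is a made-up name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* ELF32 program header, fields in specification order. */
typedef struct {
    uint32_t p_type;    /* segment type; PT_LOAD means "map this" */
    uint32_t p_offset;  /* where the segment's bytes start in the file */
    uint32_t p_vaddr;   /* virtual address the segment wants */
    uint32_t p_paddr;
    uint32_t p_filesz;  /* bytes present in the file */
    uint32_t p_memsz;   /* bytes occupied in memory (>= p_filesz) */
    uint32_t p_flags;   /* read/write/execute permission bits */
    uint32_t p_align;
} Elf32_Phdr;

enum { PT_LOAD = 1 };

/* Hypothetical stand-in for the new process's memory: virtual
 * addresses [base, base + size) backed by the host buffer `mem`. */
typedef struct {
    uint32_t base;
    uint32_t size;
    uint8_t *mem;
} vm_window;

/* Copies every PT_LOAD segment into the window and zero-fills the
 * tail where p_memsz exceeds p_filesz (the .bss part).
 * Returns 0 on success, -1 if a segment is malformed or out of range. */
static inline int load_segments(const uint8_t *file, size_t file_len,
                                const Elf32_Phdr *ph, size_t phnum,
                                vm_window *vm)
{
    assert(file && ph && vm && vm->mem);
    for (size_t i = 0; i < phnum; i++) {
        const Elf32_Phdr *p = &ph[i];
        if (p->p_type != PT_LOAD)
            continue;                           /* not a loadable segment */
        if (p->p_filesz > p->p_memsz)
            return -1;
        if (p->p_offset > file_len || p->p_filesz > file_len - p->p_offset)
            return -1;                          /* bytes not in the file */
        if (p->p_vaddr < vm->base)
            return -1;
        uint32_t off = p->p_vaddr - vm->base;
        if (off > vm->size || p->p_memsz > vm->size - off)
            return -1;                          /* does not fit the window */
        memcpy(vm->mem + off, file + p->p_offset, p->p_filesz);
        memset(vm->mem + off + p->p_filesz, 0, p->p_memsz - p->p_filesz);
    }
    return 0;
}
```

The bounds checks matter more in a kernel than the copy itself: every offset and size in the file is untrusted input, so each one is validated before it is used.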
If some of you people keep insisting on having backwards compatibility with the stone age, we'll have stone tools forever.
My Hobby OS: https://github.com/heatd/Onyx

Post by szhou42 »

Hi heat
I am writing a CreateProcess() interface. I now have a better understanding of the problem and a clearer solution to it; can you comment on it?

void create_process(char *filename) {
  /* 1. Create an address space for the new process and map the kernel code and heap into it. */
  /* 2. Create a process control block and insert it into the scheduler queue. */
  /*    At this point the process's starting address points to a kernel function, load_program(). */
  /* 3. The scheduler sees the newly created process in the queue and switches to it */
  /*    (still in kernel mode after the switch, but now in the process's address space). */
  /* 4. load_program() starts executing: it reads the program from hard disk */
  /*    and places it in the new address space. */
  /* 5. Once loading is done, switch context again and enter user space, */
  /*    with EIP pointing to the entry point of the loaded executable. */
}
How does this sound?
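
The flow described above can be modelled in plain C to check the ordering of the steps. This is a user-space toy under heavy assumptions: there is no real address space or mode switch, the run queue is a simple linked list, and load_program only fills in placeholder register values (0x08040000 and 0xc0000000 are taken from earlier in this thread, not read from an ELF file). All type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { PROC_NEW, PROC_USER, PROC_FAILED } proc_state;

typedef struct process {
    char            path[64];
    proc_state      state;
    uint32_t        user_eip, user_esp;
    void          (*kernel_entry)(struct process *); /* first code run in the new context */
    struct process *next;
} process;

static process *run_queue_head, *run_queue_tail;

/* Append a process to the tail of the run queue. */
static void enqueue(process *p)
{
    p->next = NULL;
    if (run_queue_tail)
        run_queue_tail->next = p;
    else
        run_queue_head = p;
    run_queue_tail = p;
}

/* Placeholder loader: runs "inside" the new process, still in kernel mode.
 * A real one would read the file and map its segments here. */
static void load_program(process *p)
{
    if (p->path[0] == '\0') {
        p->state = PROC_FAILED;
        return;
    }
    p->user_eip = 0x08040000u;  /* assumed entry point */
    p->user_esp = 0xC0000000u;  /* assumed top of the user stack */
    p->state = PROC_USER;       /* stands in for the jump to user mode */
}

/* Sets up the control block and queues it; loading happens later. */
static void create_process(process *p, const char *filename)
{
    assert(p && filename);
    memset(p, 0, sizeof *p);
    strncpy(p->path, filename, sizeof p->path - 1);
    p->state = PROC_NEW;
    p->kernel_entry = load_program;
    enqueue(p);
}

/* Pops the next process and runs its kernel entry point once. */
static process *schedule_once(void)
{
    process *p = run_queue_head;
    if (!p)
        return NULL;
    run_queue_head = p->next;
    if (!run_queue_head)
        run_queue_tail = NULL;
    p->kernel_entry(p);
    return p;
}
```

One thing the model makes visible: create_process returns before anything has been loaded, so a load failure has to be reported through the process state (or some other asynchronous channel) rather than through the return value of create_process.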

Post by alexfru »

There's a certain benefit to running the loader in the user space with user privileges. If you screw it up (or if the executable is intentionally malformed), the process crashes, the kernel lives on. Also, this doesn't affect interrupts or scheduling as the loader will get preempted as any other program and accounting may be a bit more accurate if process creation is billed to the process itself and not hidden somewhere in the kernel.

This still has the problem of loading the loader, though (and it may be useful to have a replaceable loader (at least while still actively developing the system) and not just something hard-coded in the kernel). But it's the same problem with any asynchronous I/O, which you will want to implement. You will have to have I/O with memory in a given address space.

If you use x86 I/O ports only, then your in/out instructions can only write/read memory of the current address space. So, either these should happen in the right address space from the beginning or you have to copy to/from the right address space after/before the I/O.

If you use DMA, which works with physical memory and is not by itself subject to virtual to physical address translation, you might avoid switching the address space, but you need to know the physical addresses underlying the address space.
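
To illustrate the last point, here is a hedged sketch of finding the physical address behind a virtual one by walking a two-level page table in software. It is a simulation: the directory holds ordinary C pointers to page tables instead of physical addresses, and only the present bit is modelled. The 10/10/12 address split matches classic 32-bit x86 paging; the type and function names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_PRESENT 0x1u
#define ENTRIES     1024

typedef struct { uint32_t entry[ENTRIES]; } page_table;

/* Simulation only: real directory entries hold physical addresses
 * and flag bits, not C pointers. */
typedef struct { page_table *table[ENTRIES]; } page_directory;

/* Maps one page-aligned virtual address to one page-aligned frame. */
static inline void map_page(page_directory *pd, page_table *pt,
                            uint32_t vaddr, uint32_t frame)
{
    assert((vaddr & 0xFFFu) == 0 && (frame & 0xFFFu) == 0);
    pd->table[vaddr >> 22] = pt;
    pt->entry[(vaddr >> 12) & 0x3FFu] = frame | PTE_PRESENT;
}

/* Walks the tables for vaddr. On success stores the physical address
 * in *paddr and returns 0; returns -1 if the page is not mapped. */
static inline int virt_to_phys(const page_directory *pd, uint32_t vaddr,
                               uint32_t *paddr)
{
    assert(pd && paddr);
    const page_table *pt = pd->table[vaddr >> 22];
    if (!pt)
        return -1;
    uint32_t pte = pt->entry[(vaddr >> 12) & 0x3FFu];
    if (!(pte & PTE_PRESENT))
        return -1;
    *paddr = (pte & 0xFFFFF000u) | (vaddr & 0xFFFu);
    return 0;
}
```

A DMA transfer into a buffer that spans several pages would need one such lookup per page, because pages that are contiguous in the virtual address space are not necessarily contiguous in physical memory.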

Just some thoughts.