forking a kernel process

Caleb1994 · Post by **Caleb1994** » Sat Jan 26, 2013 10:11 am

Hey guys,

I ran into a problem in the initial stages of multitasking, when first transferring control to a user process. Here's the flow of things real quick:

Initial "process" (pre-multitasking)
Multitasking init, spawns a new task and keeps initial task as the idle task (pid=0).
The new task (pid=1) is used to initialize the last of the kernel level things (loading drivers, and things of that sort).
The new task is forked (pid=2), and the initial user process is executed (ATM it is just a basic shell, but that's irrelevant).
The other task (pid=1) is killed, therefore only the user process and the idle task are alive.

Now, my problem comes in when forking the first multitasking task. This task is a kernel task, and is therefore using a kernel stack. This kernel stack is located in shared memory (kernel heap). When forking the task, I cannot simply copy the contents of that stack, since kernel stacks are not all located at the same address. The EBP values are all screwy on the new kernel stack. I did this to rectify it:


// The EBP values on the stack need to be fixed for the new stack! This risks clobbering good data, but it's the best I can do at the moment...
for(u32 i = 0; i < (child->kernel_stack_size/4); ++i){
     if( ((u32*)parent->kernel_stack)[i] >= ((u32)parent->kernel_stack) && ((u32*)parent->kernel_stack)[i] < ((u32)parent->kernel_stack +parent->kernel_stack_size) )
          ((u32*)child->kernel_stack)[i] += -((u32)parent->kernel_stack) + ((u32)child->kernel_stack);
}

task_t::kernel_stack is the base of the stack space, not the ending (as the pointer would be set to).

This works, but I'm afraid I might accidentally clobber good data (this would have to be a coincidence of data lining up in that area, but it is possible).
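To make that worry concrete, here is a minimal user-space sketch of the same range-check fix-up. It is not the kernel code from this post: the helper names `relocate_stack_words` and `integer_was_clobbered` are hypothetical, and it uses `uintptr_t` words instead of `u32` so it compiles on any host. It shows that a plain data word whose value merely falls inside the parent's stack range is rewritten exactly like a genuine saved frame pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper mirroring the fix-up loop above: every word in the
 * child's copy whose value looks like an address inside the parent's stack
 * is shifted so that it points into the child's stack instead.
 * Returns the number of words that were rewritten. */
size_t relocate_stack_words(const uintptr_t *parent, uintptr_t *child,
                            size_t words)
{
    assert(parent != child);  /* the two stacks must be distinct buffers */

    uintptr_t lo = (uintptr_t)parent;
    uintptr_t hi = lo + words * sizeof(uintptr_t);
    size_t rewritten = 0;

    for (size_t i = 0; i < words; ++i) {
        if (parent[i] >= lo && parent[i] < hi) {
            child[i] = parent[i] - lo + (uintptr_t)child;
            ++rewritten;
        }
    }
    return rewritten;
}

/* Builds a tiny fake stack holding one real pointer and one integer that
 * only looks like a pointer, then reports whether the integer was altered. */
int integer_was_clobbered(void)
{
    enum { WORDS = 8 };
    uintptr_t parent[WORDS] = {0};
    uintptr_t child[WORDS];

    parent[2] = (uintptr_t)&parent[5];  /* genuine saved frame pointer */
    parent[3] = (uintptr_t)parent + 4;  /* plain data that happens to be in range */

    memcpy(child, parent, sizeof parent);
    relocate_stack_words(parent, child, WORDS);

    return child[3] != parent[3];       /* nonzero if the data word changed */
}
```

The loop has no way to tell the two words apart, which is why the replies below suggest starting new kernel tasks on a fresh stack instead of copying and patching an old one.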

My Question: Is this a large flaw in my design or is this predicament normal?

Thanks for all your help; you guys are great.

-Caleb

thepowersgang · Post by **thepowersgang** » Sat Jan 26, 2013 10:49 am

This is a design flaw, but one that most will end up making (I did for quite a while).
Instead of forking in the kernel, it's just as easy to create a new process starting from a blank slate (and pass it a function to call, similar to how pthread_create works in the POSIX userland).

Brendan · Post by **Brendan** » Sat Jan 26, 2013 10:50 am

Hi,

Caleb1994 wrote:My Question: Is this a large flaw in my design or is this predicament normal?

Normally you'd have "kernel threads" and not "kernel processes".

For "kernel threads", you'd just create a new (empty) kernel stack for the new thread and make the new thread jump to wherever the thread is meant to start executing.
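A rough sketch of what that could look like, written as host-compilable C rather than real kernel code. Everything here is an assumption for illustration: the names `kthread_t`, `kthread_create`, `kthread_exit_stub`, and `example_entry`, the stack size, and above all the frame layout, which presumes a 32-bit x86 style context-switch routine that pops EDI, ESI, EBX, EBP and then executes `ret`. A different switch routine would need a different initial frame.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define KSTACK_WORDS 1024  /* assumed stack size, in machine words */

typedef void (*kthread_entry_t)(void *arg);

typedef struct kthread {
    uintptr_t *stack_base;  /* lowest address of the stack allocation */
    uintptr_t *saved_sp;    /* stack pointer the context switch would load */
} kthread_t;

/* Where entry() would "return" to. A real kernel would terminate the
 * thread here; this host-side stand-in just aborts. */
void kthread_exit_stub(void) { abort(); }

/* Trivial example entry point, used only to have something to point at. */
void example_entry(void *arg) { (void)arg; }

/* Allocates a fresh stack and seeds it so that the assumed switch routine
 * (pop EDI, ESI, EBX, EBP; ret) lands in entry(arg). Nothing is copied
 * from the creator's stack, so no pointers need fixing up. */
int kthread_create(kthread_t *t, kthread_entry_t entry, void *arg)
{
    assert(t != NULL && entry != NULL);

    t->stack_base = malloc(KSTACK_WORDS * sizeof(uintptr_t));
    if (t->stack_base == NULL)
        return -1;

    uintptr_t *sp = t->stack_base + KSTACK_WORDS;  /* stack grows downwards */

    *--sp = (uintptr_t)arg;                /* argument seen by entry() */
    *--sp = (uintptr_t)kthread_exit_stub;  /* return address seen by entry() */
    *--sp = (uintptr_t)entry;              /* consumed by the switch's ret */
    *--sp = 0;                             /* EBP: zero ends frame-pointer chains */
    *--sp = 0;                             /* EBX */
    *--sp = 0;                             /* ESI */
    *--sp = 0;                             /* EDI */

    t->saved_sp = sp;
    return 0;
}
```

The scheduler would then switch to `saved_sp` like it does for any other suspended thread; the first switch simply "returns" into the entry function.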

Cheers,

Brendan

Caleb1994 · Post by **Caleb1994** » Sat Jan 26, 2013 6:16 pm

thepowersgang wrote:This is a design flaw, but one that most will end up making (I did for quite a while)
Instead of forking in kernel, it's just as easy to create a new process starting from a blank slate (and pass it a function to call, similar to how pthread_create works in the POSIX userland).

That does sound much easier. So then, the fork function should be a purely user-land function, therefore opening up user-land type assumptions (e.g. the kernel stack).

Brendan wrote: Normally you'd have "kernel threads" and not "kernel processes".

For "kernel threads", you'd just create a new (empty) kernel stack for the new thread and make the new thread jump to wherever the thread is meant to start executing.

Here, do you use the words "thread" and "process" loosely? To my understanding, you mean that there is no difference between kernel threads and normal processes except that they run in kernel mode, use a special kernel stack, and use the kernel page directory. I only ask because before kernel development, my idea was that threads were operated separately from processes (yet within a process), so a thread was one path of execution for a specific process. Otherwise, do you mean creating a thread of execution from a kernel process? Sorry, this was always a fuzzy area for me.

Love4Boobies · Post by **Love4Boobies** » Sat Jan 26, 2013 10:11 pm

What he's saying is that fork doesn't make any sense in the context of a kernel. A process is an independent instance of a program running on top of a kernel. It may have one or more threads, which execute concurrently. They are the actual execution streams. Whether they are treated as part of processes by the scheduler is implementation-defined. If a kernel wants to do things concurrently, it may also implement threads.

POSIX defines an environment under which applications can run. It describes the behavior they should expect. It doesn't describe kernels, their designs, or how they work behind the scenes. (A sane implementation will likely be constrained by certain aspects of this environment, but that's a different story.)

It's always a good idea to know what something is before rushing to implement it. Otherwise, not only is it virtually impossible to implement, but the decision of whether it should be implemented or not becomes arbitrary. You can find detailed explanations of what programs, processes, and threads are in these books.

Brendan · Post by **Brendan** » Sat Jan 26, 2013 11:34 pm

Hi,

Caleb1994 wrote:
Brendan wrote:Normally you'd have "kernel threads" and not "kernel processes".

For "kernel threads", you'd just create a new (empty) kernel stack for the new thread and make the new thread jump to wherever the thread is meant to start executing.
Here, do you use the words "thread" and "process" loosely? To my understanding, you mean that there is no difference between kernel threads and normal processes except that they run in kernel mode, use a special kernel stack, and use the kernel page directory. I only ask because before kernel development, my idea was that threads were operated separately from processes (yet within a process), so a thread was one path of execution for a specific process. Otherwise, do you mean creating a thread of execution from a kernel process? Sorry, this was always a fuzzy area for me.

When you fork a process you create a clone of all of the resources the process is using (virtual memory, file handles, etc) so that the clone is almost exactly the same as the original. This works because processes run on top of abstractions (e.g. it's relatively easy to clone virtual pages because they're virtual).
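As a small user-space illustration of that cloning (a sketch for a POSIX host, not kernel code; the function name `parent_value_after_fork` is made up for this example): after `fork()`, the child changes its own copy of a variable and the parent's copy stays as it was.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks, lets the child overwrite its private copy of `value`, and returns
 * the parent's copy afterwards. The child's final value is passed back
 * through its exit status and stored in *child_value.
 * Returns -1 if fork() or waitpid() fails. */
int parent_value_after_fork(int *child_value)
{
    assert(child_value != NULL);

    int value = 1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        value = 42;    /* modifies the child's copy only */
        _exit(value);  /* report the child's value as its exit status */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;

    *child_value = WEXITSTATUS(status);
    return value;      /* the parent's copy was never touched by the child */
}
```

Both processes see `value` at the same virtual address, yet the two copies are independent, which is only possible because the memory behind that address is an abstraction the kernel can duplicate.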

If you think of the kernel as a process, then forking a kernel would mean creating a clone of all the resources it's using. The kernel's resources are bare hardware (and not abstractions), and include things like physical memory, CPUs, interrupt controller chips, timers, IO ports, PCI configuration space, etc. Software can't make hardware magically appear, therefore you can't clone the kernel's resources, therefore you can't fork a kernel.

When you spawn a new thread, the new thread shares the resources of whatever spawned it (and doesn't get a new/separate copy of those resources). This means that a kernel can spawn a new "kernel thread", and there won't be any problem - the new kernel thread shares the resources/hardware with other kernel threads and doesn't have a separate copy of the resources/hardware.

Basically, if your tasks are running as part of the kernel and not using abstractions (e.g. virtual memory) then you have to use kernel threads because you can't fork the kernel.

Alternatively, if your tasks are not part of the kernel (e.g. running in their own virtual address space like a normal process would) then you can fork them because they aren't part of the kernel; even if they are running at CPL=0. In this case (processes running at CPL=0 in their own virtual address spaces) the solution to your problem would be to have the task's stack in the task's virtual address space (and not in kernel space).

However, "fork()" is an expensive thing (because of the hassle/overhead of cloning the original process's resources) and it is typically followed by "exec()", which is another expensive thing (because of the hassle/overhead of destroying/freeing all of the resources). It doesn't take much thought to realise that (even though it's standard for POSIX systems) the "fork() then exec()" idea is extremely stupid and involves a lot of pointless hassle/overhead for no sane reason. It would be a very good idea to provide a "spawn_process()" feature as an alternative, that can be used to create a new process (without cloning resources and then destroying the cloned resources).

Cheers,

Brendan

Owen · Post by **Owen** » Sun Jan 27, 2013 12:22 am

Brendan wrote:It would be a very good idea to provide a "spawn_process()" feature as an alternative, that can be used to create a new process (without cloning resources and then destroying the cloned resources).

You mean posix_spawn?
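For reference, a minimal sketch of how `posix_spawn` is used from user space on a POSIX host. The wrapper name `spawn_and_wait` and the choice of running the command through `/bin/sh -c` are assumptions made for this example only.

```c
#include <assert.h>
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Runs `/bin/sh -c <command>` as a new process without the caller invoking
 * fork() itself, waits for it to finish, and returns its exit status.
 * Returns -1 if spawning or waiting fails. */
int spawn_and_wait(const char *command)
{
    assert(command != NULL);

    char *const argv[] = { "sh", "-c", (char *)command, NULL };
    pid_t pid;

    /* NULL file actions and NULL attributes request the default behaviour. */
    if (posix_spawn(&pid, "/bin/sh", NULL, NULL, argv, environ) != 0)
        return -1;

    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;

    return WEXITSTATUS(status);
}
```

For example, `spawn_and_wait("exit 7")` returns 7. Note that with NULL file actions the child still inherits any open file descriptors that are not marked close-on-exec.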

Brendan · Post by **Brendan** » Sun Jan 27, 2013 1:22 am

Hi,

Owen wrote:
Brendan wrote:It would be a very good idea to provide a "spawn_process()" feature as an alternative, that can be used to create a new process (without cloning resources and then destroying the cloned resources).
You mean posix_spawn?

That's one of many possible implementations of the idea.

To be honest, I'd prefer something cleaner (without "posix_spawn_file_actions_t" and "posix_spawnattr_t") that guarantees that no information (e.g. including file handles) can leak from parent to child regardless of how it's used.

The main point is that "fork()" is not the only way to solve the original poster's problem (you'd be able to "spawn_process()" without copying any stack).

Cheers,

Brendan

OSDev.org
