Implementing FORK

Whatever5k

Implementing FORK

Post by Whatever5k »

Well, I've reached the stage of implementing the *real* *nice* system calls - the first one is the good, old FORK. It's neat since it doesn't require a file system ;)
fork() just copies the virtual address space and creates a totally identical copy of the parent process. OK, so this is what I have in mind:

Code: Select all

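pid_t old;        /* PID of the parent, recorded at fork time */
pid_t child_pid;  /* PID of the newly created child */

/* Runs after the fork: the parent gets the child's PID, the child gets 0. */
pid_t fork_done(void)
{
   if (current->pid == old)
       return child_pid;
   else
       return 0;
}

pid_t do_fork(void)
{
     process_t proc;   /* template describing the new process */
     addr_t dir;       /* page directory of the copied address space */

     old = current->pid;
     dir = (addr_t) copy_addr_space();

     proc.func = (int) fork_done;
     proc.cr3 = dir;
     proc.cs = USER_CS;
     ...
     child_pid = create_process(&proc, PROC_USER);
     if (child_pid == -1)
          return -1;
     return child_pid;
}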
Well, the problem is: When the function fork_done() is called, I do an IRET. But that doesn't work since create_process() creates a totally new stack for the process instead of copying the old one. So my question is: how can I copy the old stack? I may have ESP, but that doesn't help me, I need the *whole* stack :-)
Any idea?
Tim

Re:Implementing FORK

Post by Tim »

You don't just need to copy the stack, but all the memory from the old process. The brute-force way of doing this would be to allocate one page for each page the parent process had, and copy them.

The faster way is to mark every page in the parent process copy-on-write and make them read-only. Then, when either process has a page fault on such a page, you allocate a new physical page and copy the old page into there. This way pages are copied only if necessary.
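
A minimal sketch of the copy-on-write scheme described above, written as a user-space simulation rather than real kernel code. The names `cow_share`, `cow_fault`, `alloc_frame`, the tiny `frames` array and the choice of bit 9 (one of the OS-available bits in a 32-bit x86 page table entry) as the copy-on-write marker are all assumptions made for illustration, not anything taken from a particular kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE   4096u
#define NFRAMES     8u        /* tiny simulated physical memory */
#define PTE_PRESENT 0x001u
#define PTE_RW      0x002u
#define PTE_COW     0x200u    /* assumption: OS-available bit 9 marks copy-on-write */
#define PTE_FRAME(pte) ((pte) >> 12)

static uint8_t  frames[NFRAMES][PAGE_SIZE];  /* simulated physical frames */
static unsigned refcount[NFRAMES];           /* number of mappings per frame */

/* Hypothetical frame allocator: returns a free frame index, or -1. */
static int alloc_frame(void)
{
    for (unsigned i = 0; i < NFRAMES; i++) {
        if (refcount[i] == 0) {
            refcount[i] = 1;
            return (int)i;
        }
    }
    return -1;
}

/* Fork time: share every present page; writable pages become read-only + COW. */
static void cow_share(uint32_t *parent, uint32_t *child, unsigned n)
{
    for (unsigned i = 0; i < n; i++) {
        if (!(parent[i] & PTE_PRESENT)) {
            child[i] = 0;
            continue;
        }
        assert(PTE_FRAME(parent[i]) < NFRAMES);
        if (parent[i] & PTE_RW)
            parent[i] = (parent[i] & ~PTE_RW) | PTE_COW;
        child[i] = parent[i];
        refcount[PTE_FRAME(parent[i])]++;
    }
}

/* Write fault on entry i: returns 0 if handled, -1 if it is not a COW
 * fault or no frame is available. */
static int cow_fault(uint32_t *pt, unsigned i)
{
    if (!(pt[i] & PTE_PRESENT) || !(pt[i] & PTE_COW))
        return -1;

    unsigned old   = PTE_FRAME(pt[i]);
    uint32_t flags = (pt[i] & 0xFFFu & ~PTE_COW) | PTE_RW;

    if (refcount[old] == 1) {          /* last user: just make it writable again */
        pt[i] = ((uint32_t)old << 12) | flags;
        return 0;
    }

    int fresh = alloc_frame();
    if (fresh < 0)
        return -1;
    memcpy(frames[fresh], frames[old], PAGE_SIZE);  /* the deferred copy */
    refcount[old]--;
    pt[i] = ((uint32_t)fresh << 12) | flags;
    return 0;
}
```

In a real kernel the fault handler would also have to invalidate the stale TLB entry for the faulting address, and the page tables would be the two-level structures the CPU walks rather than flat arrays.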
Whatever5k

Re:Implementing FORK

Post by Whatever5k »

Well, how do I know which pages the parent process used?
Tim

Re:Implementing FORK

Post by Tim »

Because the memory manager is keeping track of them. If not in some list, then in the page tables.
Whatever5k

Re:Implementing FORK

Post by Whatever5k »

The page tables only contain the mapping and indicate whether a page is present or not. How would I be able to find out which memory the process allocates/needs? The mapping doesn't necessarily stand for the memory which a process needs.

edit:
Wouldn't it be sufficient to have a process structure similar to:

Code: Select all

typedef struct {
   pid_t pid;
   int level;
   int state;
   int prior;
   ...
   void *kernel_stack_base;
   void *user_stack_base;
} process_t;
So when forking a process, I could retrieve the stack base from this information and just copy it, creating the new context on top of this stack:

Code: Select all

copy_stack(new_stack, current->user_stack_base, LENGTH);
context = new_stack + LENGTH;
context->eip = ...;
context->cs = .... = ...;
What about this?
Tim

Re:Implementing FORK

Post by Tim »

No, I'm saying that the stack is just another type of memory that the process has. You should be able to treat all memory the same way.

In other words, make an exact copy of all of the original process's address space.
Whatever5k

Re:Implementing FORK

Post by Whatever5k »

Tim Robinson wrote: In other words, make an exact copy of all of the original process's address space.
Well, that's what I am doing. I'm copying the virtual address space.
Pype.Clicker

Re:Implementing FORK

Post by Pype.Clicker »

Then your stack is using the same addresses in the new process as in the old one, but in another address space. Map the physical memory page that holds the top of the stack in the new process and do whatever you wish with it.

It mostly depends on how you program your task switching, but if you have

process A:kernel stack:in fork():
pid_t parent=0x1234;

---> cloning
(created):kernel stack:in fork():
pid_t parent=0x1234;

then when process A executes "if (getPid() == parent)" the test is true, and when process B (the newly created clone) executes the same code, it is false ...


As for the page tables, they can also include some additional bits. For instance, the USER/SUPERVISOR bit can be used to determine whether a page is kernel-related (and therefore needs no cloning) or user-related (and therefore a new copy must be created for the new process, preferably using the copy-on-write policy).

You could also assume that read-only user pages do not need a physical copy and that the same physical page can be used for all the process instances (interesting when you come to shared libraries).

What is done in Clicker is that I have a pointer associated with every page table to a structure that 'knows' how to handle page cloning, page faults etc. for the objects in that table. There is also an AVL tree that handles regions of similar pages (more compact than a table, but slower to access ...)
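
The bit-based policy described in this post could be sketched roughly as follows. The flag values are the standard present, read/write and user/supervisor bits of a 32-bit x86 page table entry; the enum, `clone_policy` and `count_cow` are hypothetical names invented for this example and are not Clicker's actual structures.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_PRESENT 0x001u
#define PTE_RW      0x002u
#define PTE_USER    0x004u   /* set: user page, clear: supervisor page */

/* What fork should do with one page table entry (hypothetical names). */
enum clone_action {
    CLONE_SKIP,         /* not present: nothing to clone */
    CLONE_KEEP_KERNEL,  /* supervisor page: same mapping in every process */
    CLONE_SHARE_RO,     /* read-only user page: share the physical page */
    CLONE_COW           /* writable user page: private copy, made lazily */
};

static enum clone_action clone_policy(uint32_t pte)
{
    if (!(pte & PTE_PRESENT))
        return CLONE_SKIP;
    if (!(pte & PTE_USER))
        return CLONE_KEEP_KERNEL;
    if (!(pte & PTE_RW))
        return CLONE_SHARE_RO;
    return CLONE_COW;
}

/* Count the entries of one table that will eventually need a private copy. */
static unsigned count_cow(const uint32_t *pt, unsigned n)
{
    assert(pt != 0);
    unsigned count = 0;
    for (unsigned i = 0; i < n; i++)
        if (clone_policy(pt[i]) == CLONE_COW)
            count++;
    return count;
}
```

Treating every read-only user page as shareable is the assumption suggested in the post; a kernel that already uses read-only mappings for its own copy-on-write pages would need an extra marker bit to tell the two cases apart.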
Whatever5k

Re:Implementing FORK

Post by Whatever5k »

OK, that's what I am going to do:

A process has a kernel stack AND a user stack. When creating a kernel thread, kernel_stack = user_stack. When the scheduler picks a process, it always returns the user_stack. So when FORKing a process, I will first of all copy the whole virtual address space of the parent. Next, I will copy the parent's *user* stack (that is, the stack actually in use) and set the user_stack pointer of the new child process to the stack just copied. The kernel_stack will *not* be copied; instead I will set up a separate kernel_stack for the child (is that OK?).
I think it should work, because when the parent gets the CPU, the test (current == old) will be TRUE (so it returns the PID of the child), and when the child has the CPU it will be FALSE (so it returns 0).
What do you think? Anything against this approach?
Tim

Re:Implementing FORK

Post by Tim »

Sounds good.

1. Create the child process
2. Copy the parent process's address space into the child's
3. Create a new thread with the same registers as the calling thread.

This doesn't solve the question of what should happen if there are multiple threads in the parent process -- the traditional fork() semantics don't take this into consideration. AFAIK Posix doesn't let a process fork if it has more than one thread.
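
Step 3 can be sketched like this, assuming the kernel saves the caller's user-mode registers on system call entry and returns the system call result in EAX (a common x86 convention, but an assumption here). `struct regs`, `struct thread` and `fork_thread` are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Saved user-mode register state of a thread (hypothetical layout). */
struct regs {
    uint32_t eax, ebx, ecx, edx, esi, edi, ebp, esp, eip, eflags;
};

struct thread {
    int         pid;
    struct regs regs;   /* state captured when the thread entered the kernel */
};

/* Give the child the same registers as the caller. Because the address
 * space was copied, the same ESP and EIP are valid in the child too.
 * Only the return value differs: child PID in the parent, 0 in the child. */
static void fork_thread(struct thread *parent, struct thread *child, int child_pid)
{
    assert(parent != 0 && child != 0);
    child->pid      = child_pid;
    child->regs     = parent->regs;
    child->regs.eax = 0;
    parent->regs.eax = (uint32_t)child_pid;
}
```

With this approach both processes resume at the instruction after the system call, so no separate fork_done() entry point and no manual stack copy are needed: the user stack comes along with the rest of the address space.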
Whatever5k

Re:Implementing FORK

Post by Whatever5k »

Tim Robinson wrote: AFAIK Posix doesn't let a process fork if it has more than one thread.
Really? Sounds a bit strange, I doubt it's true - but never mind ;)
However, thanks for your help :)
Tim

Re:Implementing FORK

Post by Tim »

The problem with forking a process with multiple threads is what each of the other threads should do. It's not a problem for the calling thread, because it's blocked inside a call to fork() at the time. The other threads are in an unknown state. If you wanted to reproduce these threads in the child process, you can't say for sure what state they should appear in within the child process.
Whatever5k

Re:Implementing FORK

Post by Whatever5k »

Yes, that is right and sounds logical - but I cannot believe a process with multiple threads is not allowed to fork.
Well, I tested it on a Linux system, and it worked: a process with two threads was allowed to FORK, there was no error.

However, I'm really not sure whether this is the POSIX standard. When I get to the point of implementing multiple threads, I will read up on it properly ;)
Whatever5k

Re:Implementing FORK

Post by Whatever5k »

Ok, I had a look into the fork() manual page and here's what it says:
POSIX Threads
In applications that use the POSIX threads API rather than the Solaris threads API (applications linked with -lpthread, whether or not linked with -lthread), a call to fork() is like a call to fork1(), which replicates only the calling thread. There is no call that forks a child with all threads and LWPs duplicated in the child.
Alright, so it's clear: a fork() on a process with multiple threads only duplicates the current thread.
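
This can be checked from user space on a POSIX system. The sketch below is only an experiment, with a made-up function name `worker_survives_fork`: a worker thread keeps incrementing a counter, the process forks, and the child checks whether its copy of the counter is still changing.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static atomic_ulong ticks;  /* incremented continuously by the worker thread */
static atomic_int   stop;   /* set by the parent to end the worker */

static void *worker(void *arg)
{
    (void)arg;
    while (!atomic_load(&stop))
        atomic_fetch_add(&ticks, 1);
    return NULL;
}

/* Returns 1 if the worker is still ticking in the forked child,
 * 0 if it is not, and -1 on error. */
static int worker_survives_fork(void)
{
    pthread_t tid;
    atomic_store(&stop, 0);
    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;
    while (atomic_load(&ticks) == 0)
        ;   /* wait until the worker is definitely running */

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: sample the counter twice, 50 ms apart. */
        struct timespec ts = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000 };
        unsigned long before = atomic_load(&ticks);
        nanosleep(&ts, NULL);
        unsigned long after = atomic_load(&ticks);
        _exit(after != before);
    }
    assert(pid != 0);   /* only the parent (or the error path) gets here */

    int status = 0, result = -1;
    if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        result = WEXITSTATUS(status);

    atomic_store(&stop, 1);
    pthread_join(tid, NULL);
    return result;
}
```

On a system that follows the behaviour quoted above, the function returns 0: the child has a copy of the counter but no thread left to increment it. Depending on the toolchain, the program may need to be compiled with `-pthread`.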