Hi,
Caleb1994 wrote:As I am beginning to plan out my multitasking scheme, I was just thinking about how when fork is called, the two processes share the same address space, and therefore share global variables, and upon fork the file descriptor table is simply copied, and all file system node reference counting is increment respectfully.
No - for "fork()" the two processes don't share an address space; the child is given a separate copy of the parent's address space. Typically this is done with "copy on write" paging tricks, where all pages are marked as "read only" in both processes, and a write to one of these pages causes a page fault where the page fault handler creates a copy of the page and marks the copy as "read/write".
Of course if a newly forked process calls "fork()" then you can have 3 processes all sharing the same page (and if the third process calls "fork()" you can have 4 processes sharing a page, etc). Basically you end up with an unlimited number of processes sharing the same page. This means you need reference counting - if the number of processes sharing a page is 1 then the page fault handler can make it "read/write" again without creating a copy of the page; and if the number of processes sharing the page is 2 or more then you have to allocate a new page, copy the data, make the copy "read/write" and decrement the count for the original page.
Also, if a process frees the page then the number of processes sharing the page can be decremented without creating a copy of it; and if a process terminates (or calls "exec()") you get to decrement the number of processes sharing the page for every page. Also, don't forget that any of these pages might be on swap space or part of a memory mapped file. You can see how this can be expensive (extra overhead all over the place) and very complicated.
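To make the reference counting rules concrete, here is a minimal user-space sketch in C. It is not kernel code and not taken from any particular OS: "struct frame", "struct pte" and the "cow_*" functions are hypothetical names invented for illustration, malloc() stands in for a physical page allocator, and swap and memory mapped files are ignored entirely.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 4096

/* Hypothetical physical frame plus the number of address spaces mapping it. */
struct frame {
    unsigned refcount;
    unsigned char data[PAGE_SIZE];
};

/* Hypothetical page table entry: which frame, and whether writes are allowed. */
struct pte {
    struct frame *frame;
    int writable;
};

/* Allocate a zeroed frame owned by exactly one mapping (NULL if out of memory). */
static struct frame *frame_alloc(void)
{
    struct frame *f = calloc(1, sizeof *f);
    if (f != NULL)
        f->refcount = 1;
    return f;
}

/* fork(): the child maps the same frame and both mappings become read-only. */
static void cow_share(struct pte *parent, struct pte *child)
{
    parent->writable = 0;
    child->frame = parent->frame;
    child->writable = 0;
    parent->frame->refcount++;
}

/* Write fault on a read-only COW mapping. Returns 0 on success, -1 if out of memory. */
static int cow_write_fault(struct pte *pte)
{
    struct frame *old = pte->frame;
    assert(old != NULL && old->refcount >= 1);

    if (old->refcount == 1) {
        /* Last user of the frame: just make it writable again, no copy. */
        pte->writable = 1;
        return 0;
    }

    /* Two or more users: give this mapping its own private copy. */
    struct frame *copy = frame_alloc();
    if (copy == NULL)
        return -1;
    memcpy(copy->data, old->data, PAGE_SIZE);
    old->refcount--;
    pte->frame = copy;
    pte->writable = 1;
    return 0;
}

/* Page freed, or process exit/exec(): drop one reference, free on the last one. */
static void cow_unmap(struct pte *pte)
{
    if (--pte->frame->refcount == 0)
        free(pte->frame);
    pte->frame = NULL;
    pte->writable = 0;
}
```

A real kernel would also need locking around the count, TLB invalidation when permissions change, and some way to tell COW pages apart from pages that are genuinely read-only; none of that is shown here.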
Spawning a thread is simple - the new thread shares the same address space, and you don't need to clone the address space.
Spawning a process is simple too - the new process has an entirely new address space, and you don't need to clone the address space.
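If you want to see the difference between "shares the address space" and "gets a copy of the address space" on an existing POSIX system, a small test like the following should do. It is only a sketch: the helper names "counter_after_thread()" and "counter_after_fork()" are made up for this example, and on some systems you need to build with "-pthread".

```c
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 0;   /* a global variable, as in the original question */

static void *thread_body(void *arg)
{
    (void)arg;
    counter = 42;          /* same address space: the creator sees this write */
    return NULL;
}

/* Returns the creator's view of counter after a thread wrote to it (-1 on error). */
static int counter_after_thread(void)
{
    pthread_t t;

    counter = 0;
    if (pthread_create(&t, NULL, thread_body, NULL) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;
    return counter;
}

/* Returns the parent's view of counter after a forked child wrote to it (-1 on error). */
static int counter_after_fork(void)
{
    int status;
    pid_t pid;

    counter = 0;
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        counter = 42;      /* only the child's private copy changes */
        _exit(0);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return counter;
}
```

The thread's write is visible to its creator, while the forked child's write is not visible to the parent, which is the behaviour described above.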
Caleb1994 wrote:If the process to fork a task and to spawn a task thread is similar enough, would it not be possible to have a function which spawns a new thread, simply by calling fork, then setting a few flags or fields in the task structure to indicate it is a child thread of another process?
You can have a "meta thing". For example, the kernel could have a low level function to create a task, that accepts a bunch of flags that tell it to either create a new address space, clone an existing address space or use an existing address space; and tell it to either clone file handles or not; and tell it to either clone signal handling or not; etc. This is how Linux does it (its "clone()" system call takes flags for these choices).
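A rough sketch of what such a "meta thing" could look like in C is below. Everything in it is hypothetical (the "TASK_*" flags, "struct task", "create_task()" and the three wrappers are invented for this example and are not the Linux interface), signal handling is left out, and "space_clone()" is only a stand-in for a real copy on write clone.

```c
#include <stdlib.h>

#define MAX_FDS 4

/* Hypothetical creation flags; no address space flag means "create a new one". */
enum {
    TASK_SHARE_SPACE = 1 << 0,   /* use the parent's address space (thread) */
    TASK_CLONE_SPACE = 1 << 1,   /* copy the parent's address space (fork)  */
    TASK_CLONE_FILES = 1 << 2    /* copy the parent's file handles          */
};

struct address_space {
    unsigned users;              /* number of tasks running in this space */
};

struct task {
    struct address_space *space;
    int fds[MAX_FDS];            /* -1 marks an unused slot */
};

static struct address_space *space_new(void)
{
    struct address_space *s = calloc(1, sizeof *s);
    if (s != NULL)
        s->users = 1;
    return s;
}

/* Stand-in only: a real kernel would set up copy on write mappings here. */
static struct address_space *space_clone(const struct address_space *src)
{
    (void)src;
    return space_new();
}

/* The one low level entry point; the flags pick between share, clone and new. */
static struct task *create_task(const struct task *parent, unsigned flags)
{
    struct task *t = calloc(1, sizeof *t);
    if (t == NULL)
        return NULL;

    for (int i = 0; i < MAX_FDS; i++)
        t->fds[i] = (flags & TASK_CLONE_FILES) ? parent->fds[i] : -1;

    if (flags & TASK_SHARE_SPACE) {
        t->space = parent->space;
        t->space->users++;
    } else if (flags & TASK_CLONE_SPACE) {
        t->space = space_clone(parent->space);
    } else {
        t->space = space_new();
    }

    if (t->space == NULL) {      /* allocation failed: undo and report it */
        free(t);
        return NULL;
    }
    return t;
}

/* Thin wrappers. A real kernel would normally share (not copy) a thread's files. */
static struct task *spawn_thread(const struct task *p)
{
    return create_task(p, TASK_SHARE_SPACE | TASK_CLONE_FILES);
}

static struct task *spawn_process(const struct task *p)
{
    return create_task(p, 0);
}

static struct task *fork_task(const struct task *p)
{
    return create_task(p, TASK_CLONE_SPACE | TASK_CLONE_FILES);
}
```

With this shape, "fork()", thread creation and process creation are all just different flag combinations passed to the same function, and only the "TASK_CLONE_SPACE" path drags in the copy on write machinery.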
You can also decide that "fork()" is a stupid pain in the neck, and avoid massive amounts of complexity and overhead by refusing to support it (and only supporting spawning threads and spawning processes). This is how I do it (but I don't care about POSIX compatibility).
Cheers,
Brendan