difference between fork() and threads...
Hello everybody!
I want to ask something which annoys me.
What is the difference between fork() and a thread? Or what similarities and relationships do they have?
fork() is built into the OS, is invoked via int 0x80, and works at the same level as kernel threads.
And what about execve()? It should also use fork() for creating a new process, or am I misunderstanding something?
Can anybody help me?
Thanks
Re: difference between fork() and threads...
Hi,
The method for creating a new process is fork() followed by execve(). Take a look here. Also, fork() is a well-defined function, whereas a thread is a programming concept.
IIRC, Linux has no notion of threads, just processes (please correct me if I'm wrong or out of date - *ducksandruns*). So, with fork() you spawn a child process that is identical to the parent except for its process ID. You then use the return value of fork() (0 in the child, the child's PID in the parent) to determine whether you are the parent or the child process. The child process then uses execve() to actually load the new binary and jump to its entry point.
Cheers,
Adam
ps: please don't use colour in your posts - not everyone uses the same theme.
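For readers following along, the fork()-then-execve() pattern described above might look roughly like this in C. This is only a minimal sketch: run_program is a made-up helper name, and it assumes a POSIX system where the caller supplies a valid path and argv.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run the program at `path` in a child process and
 * return its exit status, or -1 if something went wrong. */
int run_program(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: fork() returned 0. Replace this process image. */
        execve(path, argv, environ);
        /* Only reached if execve() failed. */
        perror("execve");
        _exit(127);
    }
    /* Parent: fork() returned the child's PID. Wait for it to finish. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called with "/bin/sh" and an argv of { "sh", "-c", "exit 7", NULL }, the helper should return 7.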
Re: difference between fork() and threads...
AJ wrote: IIRC, Linux has no notion of threads, just processes (please correct me if I'm wrong or out of date - *ducksandruns*).
Consider yourself told. NPTL was integrated with the kernel for the 2.6 release.
Every good solution is obvious once you've found it.
Re: difference between fork() and threads...
I think fork() duplicates the whole process address space except for a few items (see the man page). It sounds more practical to just duplicate the page tables (and retain resource handles), set up the child-specific items, and continue execution (by returning from the fork function); when the child later changes memory, those changes are handled with copy-on-write.
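The visible effect of that copying can be checked from user space. A small sketch, assuming a POSIX system (fork_write_demo is a made-up name): the child writes to a variable, and the parent's copy stays untouched. Whether the kernel copies pages eagerly or on write is not observable here, only the resulting isolation is.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns the parent's view of `value` after the child
 * has modified its own copy; *child_saw receives the value the child saw. */
int fork_write_demo(int *child_saw)
{
    int value = 1;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        value = 42;      /* modifies the child's private copy only */
        _exit(value);    /* report what the child sees via its exit status */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    *child_saw = WEXITSTATUS(status);
    return value;        /* the parent's copy was never touched */
}
```

The parent should get 1 back while the child reports 42.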
Re: difference between fork() and threads...
Hi,
skandalOS wrote: What is the difference between fork() and a thread?
For "fork()" the entire virtual address space is cloned (and then typically discarded soon after when a variation of "exec()" is called). This is typically done using "copy on write" - e.g. everything in the virtual address spaces is marked as "read only", and any write causes a page fault where a new copy of the page is allocated/created and changed to "read/write". Various other resources are also (temporarily, until "exec()"?) shared, including things like environment variables, file handles, signal handling, etc. It's relatively expensive.
When a thread is created, the same address space (and other resources) are used "as is". This should be faster as the OS/kernel doesn't need to setup cloned versions of the resources.
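To make the "same address space" point concrete, here is a minimal sketch using POSIX threads (the function names are made up, and pthreads is just one possible thread API). A write made by the new thread is visible to the creating thread once it has joined:

```c
#include <pthread.h>
#include <stddef.h>

/* Thread body: writes through the pointer it was handed. */
static void *writer(void *arg)
{
    int *value = arg;
    *value = 42;   /* same address space, so this hits the creator's variable */
    return NULL;
}

/* Hypothetical demo: returns the creator's view of `value` after the thread ran. */
int value_after_thread_write(void)
{
    int value = 1;
    pthread_t tid;
    if (pthread_create(&tid, NULL, writer, &value) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)   /* join also orders the write before the read */
        return -1;
    return value;
}
```

Here the caller should see 42, because nothing was cloned. Depending on the toolchain, this may need to be built with -pthread.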
Note: Some OS's also have some sort of "spawnProcess()", which works like "fork()" and "exec()" combined. The benefit of this is that a new virtual address space is created (and the old address space is not cloned and then discarded) and other resources (file handles, etc) don't need to be shared; which is simpler and faster. Some OS's only have "spawnProcess()" and don't support "fork()" at all; which is a lot easier to implement (no need for the OS/kernel to support things like address space cloning, file handles that are shared by multiple processes, etc).
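On POSIX systems the closest thing to such a "spawnProcess()" is probably posix_spawn(). A minimal sketch, assuming a POSIX system; spawn_and_wait is a made-up wrapper name, and how posix_spawn() is implemented underneath varies between systems:

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical wrapper: start the program at `path` in a new process and
 * return its exit status, or -1 on failure. */
int spawn_and_wait(const char *path, char *const argv[])
{
    pid_t pid;
    /* posix_spawn() returns an error number directly instead of setting errno. */
    int err = posix_spawn(&pid, path, NULL, NULL, argv, environ);
    if (err != 0)
        return -1;
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

From the caller's point of view there is no intermediate "copy of myself" step at all.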
AJ wrote: IIRC, Linux has no notion of threads, just processes (please correct me if I'm wrong or out of date - *ducksandruns*).
If I understand it correctly; internally Linux has a "meta-fork()" where the caller tells it what to do with various resources. For example, the "fork()" function would call "meta-fork()" and tell it to clone the parent process' virtual address space, while "spawnThread()" would call "meta-fork()" and tell it to re-use the existing address space. Basically Linux doesn't support threads, but does support processes that "share the same everything" (and therefore behave identically to threads).
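On Linux, the user-visible form of that "meta-fork()" is the clone() call. The following Linux-only sketch uses the glibc clone() wrapper; value_after_clone and the 64 KiB stack size are assumptions made for illustration. The same function behaves fork-like or thread-like depending only on whether CLONE_VM is passed:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (64 * 1024)   /* arbitrary size chosen for this sketch */

/* Runs in the new task: writes through the pointer it was handed. */
static int child_fn(void *arg)
{
    *(int *)arg = 42;
    return 0;
}

/* Hypothetical demo: pass 0 for a fork-like child (private copy of memory)
 * or CLONE_VM for a thread-like child (shared address space). Returns the
 * caller's view of `value` after the child has exited. */
int value_after_clone(int extra_flags)
{
    int value = 1;
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;
    /* The stack grows downward on most architectures, so pass its top. */
    pid_t pid = clone(child_fn, stack + STACK_SIZE, extra_flags | SIGCHLD, &value);
    if (pid < 0) {
        free(stack);
        return -1;
    }
    int rc = waitpid(pid, NULL, 0);
    free(stack);
    return rc < 0 ? -1 : value;
}
```

With 0 the caller should still see 1; with CLONE_VM it should see 42. Real thread libraries pass many more flags (shared file table, signal handlers, thread group and so on), which this sketch leaves out.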
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
- xenos
Re: difference between fork() and threads...
Brendan wrote: Note: Some OS's also have some sort of "spawnProcess()", which works like "fork()" and "exec()" combined. The benefit of this is that a new virtual address space is created (and the old address space is not cloned and then discarded) and other resources (file handles, etc) don't need to be shared; which is simpler and faster. Some OS's only have "spawnProcess()" and don't support "fork()" at all; which is a lot easier to implement (no need for the OS/kernel to support things like address space cloning, file handles that are shared by multiple processes, etc).
Windows, for example, has API functions like CreateProcess, CreateThread and so on, which create a new process from an executable file or a new thread within the same process. Actually this is what I implemented in my kernel, since it appears more logical to me and, as you said, requires no expensive address space cloning and discarding. I wonder why fork / exec has survived such a long time in Unix / POSIX operating systems. I read that the original reason was somehow related to pipes and filters, but I can hardly imagine that they are harder to implement with something like CreateProcess.
- Owen
Re: difference between fork() and threads...
Fork is useful because it lets you do some setup in the context of the child process before handing over control. It provides a lot of flexibility that CreateProcess doesn't for doing things like massaging file descriptors.
There's no reason, as I see it, not to support both. For simple tasks, CreateProcess can be more efficient; for complex ones, fork() can be useful.
For file descriptors: the child of a fork inherits all of its parent's descriptors. However, file descriptors can be marked with FD_CLOEXEC, which closes them when exec is invoked.
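As an illustration of that kind of descriptor massaging, here is a sketch that captures a command's standard output through a pipe. It assumes a POSIX system with /bin/sh and that the caller's own stdout is open; capture_stdout is a made-up name. Both pipe ends get the close-on-exec flag (FD_CLOEXEC, set with fcntl), and the child uses dup2() between fork() and exec to install the write end as its stdout:

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `sh -c cmd` and copy up to cap-1 bytes of its
 * stdout into buf (NUL-terminated). Returns bytes read, or -1 on failure. */
ssize_t capture_stdout(const char *cmd, char *buf, size_t cap)
{
    int fds[2];
    if (cap == 0 || pipe(fds) < 0)
        return -1;

    /* Mark both pipe ends close-on-exec so the new program never sees them. */
    if (fcntl(fds[0], F_SETFD, FD_CLOEXEC) < 0 ||
        fcntl(fds[1], F_SETFD, FD_CLOEXEC) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: the duplicate made by dup2() does not carry FD_CLOEXEC,
         * so stdout survives the exec while the original pipe ends do not. */
        if (dup2(fds[1], STDOUT_FILENO) < 0)
            _exit(127);
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);
    }

    close(fds[1]);   /* parent keeps only the read end */
    size_t total = 0;
    while (total < cap - 1) {
        ssize_t n = read(fds[0], buf + total, cap - 1 - total);
        if (n <= 0)
            break;
        total += (size_t)n;
    }
    buf[total] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

The setup code in the child is ordinary C running in the child's own context, which is the flexibility being described.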
- Combuster
Re: difference between fork() and threads...
Owen wrote: There's no reason, as I see it, not to support both. For simple tasks, CreateProcess can be more efficient; for complex ones, fork() can be useful.
I decided to kick out the limiting abstractions altogether and went for a system consisting of CreateAddressSpace/CreateThread/TransferPage as the relevant system calls. (Which are powerful enough to implement any variation of the fork/createprocess calls)
- gravaera
Re: difference between fork() and threads...
Same here, spawnProcess(), spawnThread().
17:56 < sortie> Paging is called paging because you need to draw it on pages in your notebook to succeed at it.
Re: difference between fork() and threads...
Owen wrote: There's no reason, as I see it, not to support both. For simple tasks, CreateProcess can be more efficient; for complex ones, fork() can be useful.
Combuster wrote: I decided to kick out the limiting abstractions altogether and went for a system consisting of CreateAddressSpace/CreateThread/TransferPage as the relevant system calls. (Which are powerful enough to implement any variation of the fork/createprocess calls)
Isn't CreateProcess exactly what CreateAddressSpace does? These calls tend to differ between microkernels and monolithic kernels. A monolithic kernel wants to store more bureaucracy, such as the child-parent relationship, environment variables, access rights and so on. With microkernels much of this is moved to user space, for example into the process manager in QNX, so CreateProcess in the kernel becomes much simpler. Also, since the process manager is the only process that actually creates or forks a new process, fork cannot work the same way (as a kernel API).
fork is one of those Unix institutions, and I have not fully understood it yet. Win32 has survived well without a fork API, and I have no plans to introduce one in my kernel since I cannot find a use case for it. CreateProcess does well for me.
- Combuster
Re: difference between fork() and threads...
Textbook CreateProcess() does three things: create a new address space, load a program from disk into that address space, create a thread in the new address space. fork() follows the same steps except for copying itself rather than loading a new program at the call.
CreateAddressSpace, as per the name, only performs the first step. The calling program can then load a program as a copy of itself, shared from itself, an entirely different program, and can also set up debugging facilities and patch the result before an actual thread is created and started as the last step. In fact, CreateAddressSpace is little more than a security feature: a new program may very well be loaded to share the caller's address space.
The concept is also independent from mono/micro considerations: CPU management is in the end always done by the kernel, memory management is not necessarily specified, and the code to load an actual program may be part of a dedicated system call. Remote memory modifications are also not limited to either concept (though microkernel designs have a bigger tendency to require it)
- gravaera
Re: difference between fork() and threads...
Fork is yet another "gift" bestowed on us by *nix and, like all the other gifts from *nix, it made sense at the time it was invented. And when it obviously stopped being practical, it was stubbornly clung to. Concurrent servers no longer need fork. There are threads and IPC now. Welcome to 2011.
Meanwhile, in Australia...
17:56 < sortie> Paging is called paging because you need to draw it on pages in your notebook to succeed at it.
Re: difference between fork() and threads...
There is also vfork(2) (specified in older POSIX editions but removed in POSIX.1-2008), which doesn't copy the parent's virtual memory; instead, the parent is blocked while the child is using its resources. If the child process exists only to call exec*(2), plain fork(2) isn't the best choice.
- xenos
Re: difference between fork() and threads...
Owen wrote: Fork is useful because it lets you do some setup in the context of the child process before handing over control. It provides a lot of flexibility that CreateProcess doesn't for doing things like massaging file descriptors.
I see the benefits. Well, my idea is that CreateProcess may create a new process either in an active or inactive state, and a handle / PID is returned to the caller. When the new process is created in an inactive state, the caller can thus do things like granting resources to the new process and finally switch its state to active. I guess this would provide a similar functionality.
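POSIX has something in a similar spirit, though not an inactive-process state: posix_spawn() accepts a list of "file actions" that are applied to the new process before its program starts. A rough sketch, assuming a POSIX system with /bin/sh and an open stdout in the caller; spawn_capture is a made-up name:

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run `sh -c cmd` via posix_spawn() and copy up to
 * cap-1 bytes of its stdout into buf. Returns bytes read, or -1 on failure. */
ssize_t spawn_capture(const char *cmd, char *buf, size_t cap)
{
    int fds[2];
    if (cap == 0 || pipe(fds) < 0)
        return -1;

    /* Describe the setup up front; it is applied in the child, in order,
     * before the new program begins executing. */
    posix_spawn_file_actions_t fa;
    if (posix_spawn_file_actions_init(&fa) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    posix_spawn_file_actions_adddup2(&fa, fds[1], STDOUT_FILENO);
    posix_spawn_file_actions_addclose(&fa, fds[0]);
    posix_spawn_file_actions_addclose(&fa, fds[1]);

    char *const argv[] = { "sh", "-c", (char *)cmd, NULL };
    pid_t pid;
    int err = posix_spawn(&pid, "/bin/sh", &fa, NULL, argv, environ);
    posix_spawn_file_actions_destroy(&fa);
    close(fds[1]);   /* parent keeps only the read end */
    if (err != 0) {
        close(fds[0]);
        return -1;
    }

    size_t total = 0;
    while (total < cap - 1) {
        ssize_t n = read(fds[0], buf + total, cap - 1 - total);
        if (n <= 0)
            break;
        total += (size_t)n;
    }
    buf[total] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

The trade-off is that only the setup steps the API anticipates can be expressed, whereas code between fork() and exec can do anything.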
Re: difference between fork() and threads...
What would be the problem with creating a new process with a flag (start or waiting)? You could then do all the things you would do between a fork() and an exec(). This is the way I'm doing it.