Copy-on-write and nested forks
Posted: Wed Jan 03, 2018 6:41 pm
Hi,
I am in the middle of re-implementing my page directory clone function and I had a thought about copy-on-write.
I would like to add this feature to my memory management for my processes as currently, I'm simply cloning
all page frames from parent to child process address space.
So I thought, unless I want to explicitly clone the data, I might as well set the entries to read-only as per
the description of the copy-on-write mechanism.
Then it struck me...
What if a fork calls more forks which in turn call more forks and at the end, all those clones want to do some writes?
From what I understand, when a page fault occurs and the copy-on-write mechanism is in place,
the faulting process should make a copy of the failed page frame into its own address space.
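
To make the mechanism described above concrete, here is a minimal user-space sketch in C. It is a toy model, not kernel code: `phys`, `pte_t`, `address_space_t`, `as_init`, `cow_fork`, `cow_write` and `as_read` are all hypothetical names made up for illustration, "physical memory" is just an array, and the "page fault" is an ordinary `if` on a flag that stands in for the R/W bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define PAGE_SIZE  16   /* tiny "pages" keep the toy model readable */
#define MAX_FRAMES 8
#define NUM_PAGES  1    /* one heap page per process */

/* Hypothetical stand-in for physical memory plus a bump allocator. */
static unsigned char phys[MAX_FRAMES][PAGE_SIZE];
static int next_free_frame = 0;

typedef struct {
    int  frame;     /* index into phys[] */
    bool writable;  /* models the R/W bit of a page table entry */
} pte_t;

typedef struct {
    pte_t pages[NUM_PAGES];
} address_space_t;

static int alloc_frame(void) {
    assert(next_free_frame < MAX_FRAMES);
    return next_free_frame++;
}

/* Give a fresh process private, writable, zeroed pages. */
void as_init(address_space_t *as) {
    for (int i = 0; i < NUM_PAGES; i++) {
        as->pages[i].frame = alloc_frame();
        as->pages[i].writable = true;
        memset(phys[as->pages[i].frame], 0, PAGE_SIZE);
    }
}

/* fork: share every frame and mark it read-only on BOTH sides. */
void cow_fork(address_space_t *parent, address_space_t *child) {
    for (int i = 0; i < NUM_PAGES; i++) {
        parent->pages[i].writable = false;
        child->pages[i] = parent->pages[i];
    }
}

/* Write one byte. A write to a read-only page takes the "page fault" path:
 * copy the frame, remap the copy as writable, then retry the write.
 * Returns true if a copy was made. */
bool cow_write(address_space_t *as, int page, int offset, unsigned char value) {
    pte_t *pte = &as->pages[page];
    bool faulted = false;
    if (!pte->writable) {
        int copy = alloc_frame();
        memcpy(phys[copy], phys[pte->frame], PAGE_SIZE);
        pte->frame = copy;
        pte->writable = true;
        faulted = true;
    }
    phys[pte->frame][offset] = value;
    return faulted;
}

unsigned char as_read(const address_space_t *as, int page, int offset) {
    return phys[as->pages[page].frame][offset];
}
```

With one parent and one child, a write by the child after `cow_fork` copies the frame, and the parent keeps seeing its old data through the original frame.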
This all makes perfect sense when I'm dealing with 2 processes, a single parent and a forked clone, but
makes much less sense with nested forking and clones of clones.
Say for example:
process [A] called fork - giving birth to process [B]
process [B] called fork 2 times - giving birth to processes [C], [D]
Now...
[D] tries to write to a memory location, say malloc is invoked, which tries to
allocate some memory from the process's heap (which is currently 1 page large and set as read-only)
Page fault occurs...
[D] clones the page into its own address space and returns back to where it was
[C] attempts to write... page fault... clone... return.
[B] attempts to write... page fault... clone... return.
Finally, [A] tries to write to the same memory location
Its R/W bit is still read-only from the original fork()
What does it do?
[A] is the original owner of the directory and the associated frames in memory,
surely it should not make copies of its own data (that would leave the original data hanging somewhere in physical memory).
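
The worry can be reproduced with a small user-space toy model in C. This is only a sketch under stated assumptions: `proc_t`, `mappers`, `proc_fork`, `proc_write_naive` and `run_scenario` are hypothetical names, each process has exactly one page, and the `mappers` array exists purely to observe how many processes still point at each frame. The fault handler is deliberately naive, meaning every faulting process copies, the original included.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_FRAMES 16

/* Observation only (hypothetical bookkeeping): how many processes
 * currently map each frame. */
static int mappers[MAX_FRAMES];
static int frames_used = 0;

typedef struct {
    int  frame;     /* the single heap page of this process */
    bool writable;  /* models the R/W bit */
} proc_t;

static int new_frame(void) {
    assert(frames_used < MAX_FRAMES);
    mappers[frames_used] = 1;
    return frames_used++;
}

void proc_init(proc_t *p) {
    p->frame = new_frame();
    p->writable = true;
}

/* fork: the child shares the parent's frame, both become read-only. */
void proc_fork(proc_t *parent, proc_t *child) {
    parent->writable = false;
    *child = *parent;
    mappers[parent->frame]++;
}

/* Naive fault handler: ALWAYS copy on a write to a read-only page,
 * no matter who is writing. */
void proc_write_naive(proc_t *p) {
    if (!p->writable) {
        mappers[p->frame]--;      /* stop pointing at the shared frame */
        p->frame = new_frame();   /* pretend the contents were copied  */
        p->writable = true;
    }
}

/* A forks B, B forks C and D, then D, C, B and A all write.
 * Returns how many processes still map A's original frame. */
int run_scenario(void) {
    proc_t a, b, c, d;
    proc_init(&a);
    int original = a.frame;
    proc_fork(&a, &b);
    proc_fork(&b, &c);
    proc_fork(&b, &d);
    proc_write_naive(&d);
    proc_write_naive(&c);
    proc_write_naive(&b);
    proc_write_naive(&a);
    return mappers[original];
}
```

In this model `run_scenario()` returns 0: after all four processes have written, nobody maps the original frame any more, yet nothing has freed it. That is exactly the frame left hanging in physical memory.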
Should an original fork mark who is the rightful owner of the directory and who is the borrower of mappings?
I just can't think of a different approach.
Then again, there is the dilemma of what to do if the parent dies before its children.
Should it forcefully kill the children (the least appealing option), or should it check whether any of its children
are still clones (have not called execv yet, so they share the underlying physical frames) and designate one of them to be
the owner? (It can't really free the physical memory, since any one of the children might want to write, page fault and clone it,
and that brings us back to the hanging/dangling frames without a process: the child clones stuff and returns, leaving memory without a parent process.)
dborowiec10