
Copy-on-write and nested forks

Posted: Wed Jan 03, 2018 6:41 pm
by dborowiec10
Hi,

I am in the middle of re-implementing my page directory clone function and I had a thought about copy-on-write.
I would like to add this feature to my memory management for my processes as currently, I'm simply cloning
all page frames from parent to child process address space.
So I thought, unless I want to explicitly clone the data, I might as well set the entries to read-only as per the
description of the copy-on-write mechanism.

Then it struck me...
What if a fork calls more forks which in turn call more forks and at the end, all those clones want to do some writes?
From what I understand, when a page fault occurs and the copy-on-write mechanism is in place,
the faulting process should make a copy of the failed page frame into its own address space.
This all makes perfect sense when I'm dealing with 2 processes, a single parent and a forked clone, but
makes much less sense with nested forking and clones of clones.

Say for example:
process [A] called fork - giving birth to process [B]
process [B] called fork 2 times - giving birth to processes [C], [D]

Now...
[D] tries to write to a memory location, say malloc is invoked which tries to
allocate some memory from the process's heap (which is currently 1 page large and set as read-only)
Page fault occurs...
[D] clones the page into its own address space and returns back to where it was
[C] attempts to write... page fault... clone... return.
[B] attempts to write... page fault... clone... return.

Finally, [A] tries to write to the same memory location
Its r/w bit is still read-only from the original fork()
What does it do?
[A] is the original owner of the directory and the associated frames in memory,
surely it should not make copies of its own data (that would leave the original data hanging somewhere in physical memory).
Should an original fork mark who is the rightful owner of the directory and who is the borrower of mappings?
I just can't think of a different approach.

Then again, there is a dilemma of what to do if parent dies before children.
Should it forcefully kill children (least appealing option), should it check if any of its children
are clones (not called execv yet - share underlying physical frames) and designate one of them to be
the owner? (It can't really free the physical memory since any one of the children might wanna write, page fault and clone it
and that brings us back to the hanging/dangling frames without a process - child clones stuff and returns leaving memory without a parent process)

dborowiec10

Re: Copy-on-write and nested forks

Posted: Thu Jan 04, 2018 12:06 pm
by dborowiec10
It took me a while but I think I found a working solution.
It will require me to change my physical allocator from a simple bitmap to a bit bigger
array of structs. Each struct contains an address and a reference count.
On shared physical pages, ref-count will be > 1 and free pages will have ref-count of 0.
Process will be able to determine whether a frame is shared or not and thus act accordingly.
Not saying this is the only way to handle this but it works for me and hopefully will help others.
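
A minimal user-space sketch of the idea in C, using the [A]..[D] scenario from the first post. Everything here is an assumption made for illustration, not my actual kernel code: the names (struct frame, struct pte, cow_share, cow_fault), the 32-bit address plus 8-bit count layout, and the phys array standing in for physical memory.

```c
#include <stdint.h>
#include <string.h>

#define NFRAMES   16
#define PAGE_SIZE 4096

/* Hypothetical per-frame record: a free frame has refs == 0. */
struct frame { uint32_t addr; uint8_t refs; };

static struct frame frames[NFRAMES];
static uint8_t phys[NFRAMES][PAGE_SIZE];   /* stand-in for physical memory */

/* Hypothetical page-table entry: frame index plus a writable flag. */
struct pte { int frame; int writable; };

/* Find a free frame and hand it out with one reference. */
static int frame_alloc(void) {
    for (int i = 0; i < NFRAMES; i++) {
        if (frames[i].refs == 0) {
            frames[i].addr = (uint32_t)i * PAGE_SIZE;
            frames[i].refs = 1;
            return i;
        }
    }
    return -1;                              /* out of memory */
}

/* Drop one reference; the frame is free again once refs reaches 0. */
static void frame_release(int f) {
    if (frames[f].refs > 0)
        frames[f].refs--;
}

/* fork(): the child shares the frame and both mappings become read-only. */
static struct pte cow_share(struct pte *parent) {
    frames[parent->frame].refs++;
    parent->writable = 0;
    struct pte child = { parent->frame, 0 };
    return child;
}

/* Write fault on a read-only COW mapping. Returns 0 on success. */
static int cow_fault(struct pte *e) {
    if (frames[e->frame].refs == 1) {
        e->writable = 1;                    /* sole user: no copy needed */
        return 0;
    }
    int copy = frame_alloc();
    if (copy < 0)
        return -1;
    memcpy(phys[copy], phys[e->frame], PAGE_SIZE);
    frame_release(e->frame);                /* stop sharing the old frame */
    e->frame = copy;
    e->writable = 1;
    return 0;
}
```

With this, nobody has to be the "rightful owner". After [D], [C] and [B] have each taken their copy, the count on the original frame is back to 1, so when [A] faults it simply gets its write bit back. A parent that exits early just releases its references, and a frame is only reused once the last sharer has let go of it.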

FYI: It is rather bulky at 40 bits per page of physical memory, and it grows linearly as the amount
of physical memory increases. I think it is possible to further reduce it to 8 bits per page
(or however many bits you want per reference count value) and store only that rather than
a struct per page (more of a byte map rather than bitmap).
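
A rough sketch of that byte-map variant, again with made-up names (refcount, frame_ref, frame_unref, table_bytes) and an assumed 4 KiB page size. The stored address can go because it is just the array index times the page size. One thing to watch with 8 bits: the count saturates at 255 sharers, so the sketch refuses to go past that instead of wrapping around.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE  4096u
#define MAX_FRAMES 1024u   /* assumption: sized from the memory map at boot */

/* One byte per physical frame; 0 means the frame is free. */
static uint8_t refcount[MAX_FRAMES];

/* The frame address is implied by the index, so it is not stored. */
static inline size_t frame_index(uintptr_t paddr) { return paddr / PAGE_SIZE; }
static inline uintptr_t frame_addr(size_t idx) { return (uintptr_t)idx * PAGE_SIZE; }

/* Take a reference. Returns -1 if the 8-bit counter would overflow. */
static int frame_ref(uintptr_t paddr) {
    uint8_t *rc = &refcount[frame_index(paddr)];
    if (*rc == UINT8_MAX)
        return -1;
    (*rc)++;
    return 0;
}

/* Drop a reference. Returns the remaining count (0 means free). */
static int frame_unref(uintptr_t paddr) {
    uint8_t *rc = &refcount[frame_index(paddr)];
    if (*rc > 0)
        (*rc)--;
    return *rc;
}

/* Size of the tracking table for a given amount of RAM. */
static uint64_t table_bytes(uint64_t mem_bytes, uint64_t bits_per_frame) {
    return (mem_bytes / PAGE_SIZE) * bits_per_frame / 8;
}
```

For 4 GiB of RAM with 4 KiB pages, table_bytes gives 5 MiB at 40 bits per frame and 1 MiB at 8 bits per frame.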

It doesn't sound complex and is definitely less advanced than the Linux buddy allocator, but it
works for me as I have a conceptual limit of a max alloc size of a single page.

dborowiec10

Re: Copy-on-write and nested forks

Posted: Thu Jan 04, 2018 1:12 pm
by Korona
Yeah, you pretty much have to resort to some sort of ref counting, either on the page itself or on the mapping (depending on the granularity that you use to resolve copy-on-write).

For the record: Linux uses 64 bytes per physical page (in its struct page). If you can do it with only 5 bytes, that is pretty good.