
phy page copying - disabling paging vs temporarily mapping

Posted: Sat Oct 22, 2016 8:16 pm
by stdcall
Hi.
I'm now writing my PDT clone function, and I need to do several page-frame copies to unmapped physical memory.
I saw in the JamesM tutorial that he presents an assembly physical-frame-copy function which disables paging, copies a frame, and re-enables paging.
Is this the preferred approach? Alternatively, I could just temporarily map the target physical page into kernel space and unmap it as soon as I'm done.

I suspect that JamesM's method might have problems on other cores, since paging would be disabled across all CPUs. Is that right?

Re: phy page copying - disabling paging vs temporarily mapping

Posted: Sat Oct 22, 2016 8:39 pm
by Brendan
Hi,
stdcall wrote:I suspect that JamesM's method might have problems on other cores, since paging would be disabled across all CPUs. Is that right?
No, each logical CPU is independent; and (with hyper-threading) you can even have paging enabled in one logical CPU and paging disabled in another logical CPU in the same core.
stdcall wrote:Is this the preferred approach? Alternatively, I could just temporarily map the target physical page into kernel space and unmap it as soon as I'm done.
It's not the preferred approach. The problem is that if you temporarily disable paging you wipe out all TLB entries for everything on that logical CPU, and when you enable paging again you get a large number of TLB misses (that could have been avoided). Mostly, it's a performance disaster.
stdcall wrote:I'm now writing my PDT clone function, and I need to do several page-frame copies to unmapped physical memory.
Are you sure you need to do this?

For example, assuming "plain 32-bit paging": so that I can add/remove page tables for kernel-space (where every page directory needs to be changed at the same time), I map all page directories into a big "array of page directories" in kernel-space. To create a new process I map a new physical page into that big array, then copy page directory entries into it, then create the process' initial thread (which uses that same physical page as its page directory).


Cheers,

Brendan

Re: phy page copying - disabling paging vs temporarily mapping

Posted: Sat Oct 22, 2016 8:59 pm
by stdcall
When forking a process, the currently loaded page directory is the parent process's page directory.
After creating and mapping a new page directory, I need to actually copy the individual page tables (code, data, stack).
Brendan wrote:Are you sure you need to do this?
I don't understand your statement above. Is there another way?
In your description you only mentioned kernel pages, which need to be linked, not copied. Right?

Re: phy page copying - disabling paging vs temporarily mapping

Posted: Sun Oct 23, 2016 1:20 am
by Brendan
Hi,
stdcall wrote:When forking a process, the currently loaded page directory is the parent process's page directory.
After creating and mapping a new page directory, I need to actually copy the individual page tables (code, data, stack).
Ah - I thought a "PDT clone function" meant a function to clone page directory (tables) and not a function to clone page tables.
stdcall wrote:
Are you sure you need to do this?
I don't understand your statement above. Is there another way?
Sadly, "fork()" is an extremely inefficient nightmare of stupidity (clone everything, just so you can delete it all almost immediately when someone does the inevitable "exec()") and should be banned. To make "fork()" barely tolerable you have to use "copy on write".

If you are using "copy on write", then you have a choice:
  • Simpler/slower: Clone all page tables when you "fork()", and only clone pages when they're written to (when you get a page fault caused by writing to a "read-only" page that still needs to be cloned). For "plain 32-bit paging" with 3 GiB of user-space; this involves allocating up to 3 MiB of RAM and copying up to 3 MiB of data.
  • Complex/faster: Only clone the page directory when you "fork()"; and only clone page tables if/when you must (when you get a page fault caused by writing to a "read-only" page where both the page table and the page itself still need to be cloned). Because "fork()" is typically followed by "exec()" soon after, this typically avoids the need to clone most page tables (and therefore typically avoids a lot of overhead).
In either case (regardless of when you're cloning page table/s), for cloning a page table I'd:
  • allocate a physical page and map it somewhere temporarily (preferably in a "CPU local" area of kernel space to avoid "multi-CPU TLB shootdown" and avoid the need for any "temporary area shared by multiple CPUs is currently in use" locking)
  • copy the old page table's entries into the new page table (possibly while changing pages to "read only" in their page table entries)
Note that this "temporary mapping" will involve TLB invalidation and cause a TLB miss; however, because this would happen immediately after you've mapped the page, the info needed by the CPU (e.g. page table entry, etc) will still be in the CPU's cache, therefore the TLB miss will be very fast (not a normal/slow "fetch everything all the way from RAM" TLB miss).

However, note that you have to do a whole lot more to make "copy on write" work; including:
  • Some kind of reference counting scheme, so that when you fork an already forked virtual address space you can do "number of copies++;" to keep track of how many processes are still using each "copy on write" page
  • Something to find the last remaining process and "un-share" pages (and page tables?) when "number of copies" drops to 1.
  • Solving/avoiding a bunch of race conditions (e.g. to keep things sane/synchronised when a second thread running on a different CPU writes to the same "copy on write" area that you're in the middle of working on).

Cheers,

Brendan

Re: phy page copying - disabling paging vs temporarily mapping

Posted: Wed Oct 26, 2016 3:41 pm
by Schol-R-LEA
While I agree that fork-exec semantics was and remains a terrible idea, I would say that it is more nuanced than it might seem at first; in the original Unix, fork had a specific purpose (creating multiple processes of the same program, being primarily meant for setting up new terminal sessions). The fault lies with Dennis Ritchie's 'brillant' brainwave of using fork-exec semantics for launching new programs from the shell, rather than incorporating process creation into the exec program loading system call.

AFAICT, the reasoning was that a) they needed fork for things like TTY creation anyway, b) it meant that the child process could inherit the environment of the parent process without passing any kind of links to it, c) they thought that separating process creation from program loading was a reasonable separation of concerns (it was, but not at the expense of an additional 'spawn' system call which would combine the two actions, IMAO), d) they thought having two different system calls for process creation would add memory costs and complicate the scheduler, since it would be getting new processes from two sources instead of one, and e) it was never going to be seen by anyone outside of Bell Labs so it didn't matter, anyway.

I am guessing that they were also thinking that it would make it possible to overlay programs manually, too, for when one program was invoking another as its final action (sort of like TCO, I guess, as a way to reduce virtual memory overhead), but that's just speculation.

What made sense in 1969 for a hand-coded assembly language OS meant for running a single video game made a lot less sense in 1976 for a timesharing system running on several different types of mainframes, less still in 1984 for a workstation, even less in 1991 for a grad student's OS coursework project turned Unix-workalike server OS, and none at all in 2016 for a hobby OS that isn't even designed to be POSIX compatible. That we are now stuck with this, at least as far as POSIX compatible programs go, is further proof that even systems meant for one-time use need to be designed sensibly, because in the world of software, the only thing that is eternal is a one-off program.