Hi, all
Sorry if my questions appear too naive. ;D I am just ramping up on OS dev after years of BIOS development.
I came across the following paragraph while reading the IA-32 manual, Vol. 3, Chapter 3:
"Memory management software has the option of using one page directory for all programs and tasks, one page directory for each task, or some combination of the two."
I have also heard others say that sharing a page directory can remove the overhead of hardware task switching and TLB flushing. But I doubt that an OS adopting this policy could ever protect one process's memory space from being trespassed on by another process, now that the latter can see the former's pages too.
Can anyone help? Thanks in advance.
Best regards,
Cody
Question on Page Tables
- Pype.Clicker
- Member
- Posts: 5964
- Joined: Wed Oct 18, 2006 2:31 am
- Location: In a galaxy, far, far away
- Contact:
Re:Question on Page Tables
Well, you're simply free to "manually edit" the current page directory rather than switching to a new directory. That makes little sense when there are plenty of entries to be rewritten, but it _can_ make sense if many of your running programs share most of their code (through libraries) and data (shared stuff).
In that case, there will be no security penalty, since after a switch all you get is an address space with stuff that is either yours (just brought in) or stuff you cannot modify (copy-on-write data, read-only code, kernel stuff, etc.).
btw, it's amusing to see Intel writing "some combination of the two" when the two are actually incompatible. I guess what they mean is that "you may have groups of programs having the same directory, but different page tables, and that directory is distinct from the directory of another group of programs".
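To make the "manually edit" idea a bit more concrete, here is a rough user-space sketch in C. It only models the directory as an array; the `struct task` and `struct private_pde` layouts, the 3 GiB user/kernel split and the function name are assumptions made up for the illustration, not anything from the Intel manual.

```c
#include <stddef.h>
#include <stdint.h>

#define PDE_COUNT      1024u  /* entries in a 32-bit, non-PAE page directory */
#define USER_PDE_COUNT  768u  /* assumption: user space is the low 3 GiB */

/* Hypothetical record of one directory slot that is private to a task. */
struct private_pde {
    uint16_t index;  /* slot in the page directory */
    uint32_t entry;  /* page-table base address plus flag bits */
};

/* Hypothetical task descriptor: only the slots that differ between tasks. */
struct task {
    const struct private_pde *private_pdes;
    size_t                    private_count;
};

/*
 * Switch address spaces by rewriting the live directory in place.
 * Slots that no task lists (shared libraries, kernel) are never touched.
 * Returns the number of slots written. On real hardware each rewritten
 * slot would also need its stale TLB entries invalidated.
 */
size_t switch_by_editing(uint32_t *page_dir,
                         const struct task *prev,
                         const struct task *next)
{
    size_t written = 0;

    /* Unmap what belonged only to the outgoing task. */
    for (size_t i = 0; i < prev->private_count; i++) {
        uint16_t slot = prev->private_pdes[i].index;
        if (slot < USER_PDE_COUNT) {
            page_dir[slot] = 0;  /* present bit clear */
            written++;
        }
    }

    /* Map the incoming task's private page tables. */
    for (size_t i = 0; i < next->private_count; i++) {
        uint16_t slot = next->private_pdes[i].index;
        if (slot < USER_PDE_COUNT) {
            page_dir[slot] = next->private_pdes[i].entry;
            written++;
        }
    }

    return written;
}
```

The cost grows with the number of private slots, which is why this only pays off when most of the address space is shared.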
Re:Question on Page Tables
that's what the global bit is for...
most of the time, there are significant portions of the address space that are shared -- the global bit (introduced with the P6 family, iirc) allows the TLB entries for these to stay in place when the page directory is switched
Pype.Clicker wrote: btw, it's amusing to see intel writing "some combination of the two" when the two are actually incompatible. I guess what they mean is that "you may have groups of programs having the same directory, but different page tables, which is distinct from the directory of another group of programs".
your description is somewhat confusing, but this is exactly what most OSs do -- the 'groups' which share directories are usually called 'threads' -- if your OS keeps threads in the same directory (most do -- but not all), but isolates processes in separate directories, then you are using "some combination of the two"
Cody wrote: But I doubt how an os adopting this policy could ever protect one process's memory space from being trespassing by another process now that the latter could see the former's pages too.
pype.clicker's answer is the general way, but there is another (it isn't used much -- it gets very complicated): use segmentation to isolate the processes within a single page directory (most people use 'flat mode' and ignore segmentation)
Re:Question on Page Tables
Cody wrote: I have also heard some others saying that sharing page directory can remove the overhead of hardware switching & TLB flushing. But I doubt how an os adopting this policy could ever protect one process's memory space from being trespassing by another process now that the latter could see the former's pages too.
There is another way, which I think is used by Microsoft Research's 'Singularity' operating system. Software-isolated processes, I think they are called -- the idea is that all the code is not actually run, but is interpreted by the OS, allowing the OS to define custom protection exactly as it wants (as well as bypassing hardware switching overheads).
You can see the website for more details:
http://research.microsoft.com/os/singularity/
- Colonel Kernel
- Member
- Posts: 1437
- Joined: Tue Oct 17, 2006 6:06 pm
- Location: Vancouver, BC, Canada
- Contact:
Re:Question on Page Tables
0Scoder wrote: the idea is that all the code is not actually run, but is interpretted by the OS
Nope. There is no interpreting going on whatsoever -- that would be terribly slow. Instead, safe languages are used, and the output of the compiler is an intermediate language that can be verified by the OS before (and after, in later phases of the project) that IL is translated into native machine code. This verification occurs once for each application, at installation time.
Re:Question on Page Tables
JAAman wrote: most of the time, there are significant portions of the address space that are shared -- the global bit (introduced with the P6 family, iirc) allows the TLB entries for these to stay in place when the page directory is switched
Yes, that will make it stay in the TLB. But what puzzles me most is what happens when I load into CR3 a page directory that has changed the mapping of a page marked as global. Intel's manuals list two ways to invalidate global entries: 1) clear the PGE flag and invalidate the TLB; 2) execute INVLPG. But they don't mention whether the page will be automatically invalidated in the case I propose. Should the OS task-switching code do it explicitly or not?
JAAman wrote: your description is somewhat confusing, but this is exactly what most OSs do -- the 'groups' which share directories are usually called 'threads' -- if your OS keeps threads in the same directory (most do -- but not all), but isolates processes in separate directories, then you are using "some combination of the two"
Yes, for threads that makes sense. But the wording in the manual seems to suggest the directory can be kept across task switches. I am sort of confused over the concepts of "task", "thread" and "process". It seems to me that "task" is roughly equivalent to "process". On the other hand, "thread" means switching execution paths, and in the processor realm "task switching" (either hardware or software) is the only way to do that, so "threads" can be treated as "tasks" in a broad sense. Am I wrong?
So, as a summary: different processes (forked or copy-on-write ones excluded) should have different page directory mappings, yet they all contain some entries in common, which are for shared resources (code or data, such as interrupt handlers and kernel data). Is that right?
Thanks for your reply!
Best regards,
Cody
Re:Question on Page Tables
Cody wrote: But what puzzles me most is what will happen when I loaded into CR3 a page directory that has changed the mapping of the page marked as global?
when the page is marked as global in the TLB (clearing the bit in the tables will not affect it if it is currently in the TLB), it will not be updated on a CR3 write; you must invlpg the page to change it (invlpg itself dates back to the 486, but global pages are where you really need it)
when the CPU needs to access memory, it first looks in the TLB; if there is no entry, it walks the page tables and caches the translation in the TLB -- when you reload CR3, all TLB entries are marked as empty, so they will be reloaded as needed -- unless an entry is marked global -- then it keeps its previous value, and the CPU won't even notice the changed table (all invlpg does is mark that one TLB entry as invalid)
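A toy model of that behaviour, in C, may help. It is only a user-space simulation under simplifying assumptions (a tiny fully-associative TLB, no replacement policy, CR4.PGE assumed set); the type and function names are invented for the sketch and do not correspond to any real kernel API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_SLOTS 8u  /* toy size; real TLBs are larger and set-associative */

struct tlb_entry {
    bool     valid;
    bool     global;  /* copy of the G bit (bit 8 of the PTE) at fill time */
    uint32_t vpn;     /* virtual page number: linear address >> 12 */
    uint32_t pfn;     /* physical frame number cached for that page */
};

struct tlb {
    struct tlb_entry slot[TLB_SLOTS];
};

/* Cache a translation, as the CPU does after a page-table walk. */
bool tlb_fill(struct tlb *t, uint32_t vpn, uint32_t pfn, bool global)
{
    for (size_t i = 0; i < TLB_SLOTS; i++) {
        if (!t->slot[i].valid) {
            t->slot[i] = (struct tlb_entry){ true, global, vpn, pfn };
            return true;
        }
    }
    return false;  /* toy model: no eviction when full */
}

/* Returns true on a hit and stores the cached frame number in *pfn. */
bool tlb_lookup(const struct tlb *t, uint32_t vpn, uint32_t *pfn)
{
    for (size_t i = 0; i < TLB_SLOTS; i++) {
        if (t->slot[i].valid && t->slot[i].vpn == vpn) {
            *pfn = t->slot[i].pfn;
            return true;
        }
    }
    return false;
}

/* Model of a write to CR3: every non-global entry is dropped. */
void tlb_reload_cr3(struct tlb *t)
{
    for (size_t i = 0; i < TLB_SLOTS; i++) {
        if (!t->slot[i].global) {
            t->slot[i].valid = false;
        }
    }
}

/* Model of INVLPG: drops the entry for one page, global or not. */
void tlb_invlpg(struct tlb *t, uint32_t vpn)
{
    for (size_t i = 0; i < TLB_SLOTS; i++) {
        if (t->slot[i].valid && t->slot[i].vpn == vpn) {
            t->slot[i].valid = false;
        }
    }
}
```

In this model a global entry keeps answering lookups with its old frame number after `tlb_reload_cr3()`, and only `tlb_invlpg()` for that page makes the next access walk the (changed) tables again.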
Cody wrote: Yes, for threads that makes sense. But the sayings in the manual seems to suggest it be kept during task switches. I am sort of confused over the concept of "Task" and "Threads", "Process". It seems to me that "Task" is somewhat equivalent to "Process". But on the other side, "Thread" means execution path switching and in the processor realm, "Tasking Switching"(either hardware or software) is the only way so "Threads" can be treated as "Task" in a broad sense. Am I wrong?
this confusion is normal -- everyone has a different definition of process and thread
some OSs treat threads and processes exactly the same, without any difference, but most treat threads as processes which share an address space -- they go through the same task switch, but are handled differently (no CR3 load if the switch is to another thread in the same process). so yes, there is a task switch, and sometimes the address space stays the same through it, and other times (when switching to a different process) it does change, making this a true hybrid solution
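As a minimal sketch of that hybrid in C: the structures, the `switch_to()` name and the counter are hypothetical, and `load_cr3()` is a user-space stand-in that just records the value where a real kernel would execute `mov cr3`.

```c
#include <stdint.h>

/* Hypothetical address-space descriptor, one per process. */
struct address_space {
    uint32_t page_dir_phys;  /* physical address of the page directory */
};

/* Hypothetical thread descriptor; threads of one process share `space`. */
struct thread {
    struct address_space *space;
};

unsigned cr3_loads;    /* how many times the stand-in was called */
uint32_t current_cr3;  /* last value "loaded" into CR3 */

/* Stand-in for the privileged `mov cr3, reg` instruction. */
void load_cr3(uint32_t page_dir_phys)
{
    current_cr3 = page_dir_phys;
    cr3_loads++;
}

/* Switch from prev to next, reloading CR3 only across processes. */
void switch_to(const struct thread *prev, const struct thread *next)
{
    if (prev->space != next->space) {
        load_cr3(next->space->page_dir_phys);
    }
    /* Saving and restoring registers and stacks would follow here. */
}
```

Switching between two threads of the same process leaves `cr3_loads` unchanged, so the TLB contents survive; only a switch to a thread in another address space pays for the reload.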
Cody wrote: So as a summary, different processes (these forked or copy-on-write excluded) shall have different page directory mappings yet they all contain some entries that are common among all which are for shared resources (code or data, such as interrupt call, system kernel data). Is that right?
yes, i think that is correct (ignoring, for now, threads -- which some treat the same as processes anyway)
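One common way to get those shared entries, sketched here in C, is to copy the kernel's slots into every new page directory when a process is created. The 3 GiB split (`KERNEL_FIRST_PDE`) and the function name are assumptions for the example; other layouts work the same way with different constants.

```c
#include <stdint.h>
#include <string.h>

#define PDE_COUNT        1024u  /* entries in a 32-bit, non-PAE page directory */
#define KERNEL_FIRST_PDE  768u  /* assumption: kernel mapped at 3 GiB and up */

/*
 * Prepare a fresh page directory for a new process: the user part starts
 * empty, the kernel part points at the same page tables as every other
 * directory, so kernel mappings are shared rather than duplicated.
 */
void init_process_directory(uint32_t *new_dir, const uint32_t *kernel_dir)
{
    /* Private user-space slots: nothing mapped yet. */
    memset(new_dir, 0, KERNEL_FIRST_PDE * sizeof new_dir[0]);

    /* Shared kernel slots: copy the entries, not the tables behind them. */
    memcpy(new_dir + KERNEL_FIRST_PDE,
           kernel_dir + KERNEL_FIRST_PDE,
           (PDE_COUNT - KERNEL_FIRST_PDE) * sizeof new_dir[0]);
}
```

Because only the directory entries are copied, a later change inside one of the kernel page tables is visible through every process's directory at once.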
i hope ive been able to help you
Re:Question on Page Tables
Hi, JAAman
JAAman wrote: i hope ive been able to help you
Yes, you have been helping a lot. It's really good to discuss with you. I have begun to read the Linux kernel's source code, trying to figure out how it uses various features of Intel's processors.
JAAman wrote: some OSs treat both threads and processes exactly the same -- without any difference, but most keep threads as processes which share address space -- they are handled by the task-switch, but they are handled differently (no CR3 load if switch is a new thread in the same process), so yes, there is a task-switch, and the address space stays the same through it, and other times (when switching to a different process) it does change, making this a true hybred solution
I checked Linux's source file "sched.c" and the key function "context_switch()". After a rough reading, I believe Linux, just as you mentioned, treats threads and processes in exactly the same way. And every task switch (Linux uses what is termed a 'soft switch') includes refreshing CR3 and updating the TSS's ESP0.
I remember that when I programmed on Sun's SPARC II there seemed to be no notion of "thread"; everything there was process based. But later, when I switched to Red Hat, they introduced a library such as "thread.o", so if you want to use functions such as "CreateThread()" you have to link with this extension explicitly.
So it appears to me that Linux treats the two without bias. There are differences between them, but you just won't notice them at the level of the kernel's switching mechanism.
Thanks again for your detailed explanation. It's really great to be able to get help from guys like you!!!! ;D
Best regards,
Cody