Confused about Paging...
-
- Member
- Posts: 2566
- Joined: Sun Jan 14, 2007 9:15 pm
- Libera.chat IRC: miselin
- Location: Sydney, Australia (I come from a land down under!)
- Contact:
I'm really confused about this whole paging concept. All that I understand, after hours of reading, is that it allows me to have multiple address spaces within the physical memory. I've looked at memory allocators, tried to understand them and failed.
How exactly am I meant to allocate pages (which I think is a good start), and then map the page directories and tables to the right places? And what order do I do things in when I have to create a new process in a new address space?
- Kevin McGuire
- Member
- Posts: 843
- Joined: Tue Nov 09, 2004 12:00 am
- Location: United States
- Contact:
Have you looked at this tutorial?
http://www.osdever.net/tutorials/pdf/memory1.pdf
"I know that pages are 4 KB blocks of memory. I know that a page directory is full of page tables which are full of mapping information. I don't know much else."
Be careful here. A page directory contains page directory entries (PDEs). Each PDE holds information about one page table. Likewise, a page table contains page table entries (PTEs), each of which holds information about one page. It's important to note that a page table is not the same thing as a PDE. When I first started with paging, it took me a couple of months to realize this, which caused a lot of confusion.
Let's see how a processor performs the translation of an address in this situation:
"mov [0x010B71A8], al" (in Intel syntax)
Let's translate the linear address 0x010B71A8
Code:
0000 0001 0000 1011 0111 0001 1010 1000
   0    1    0    B    7    1    A    8
From left to right, we have bits 31 through 0.
Bits 22 through 31, which are 0000 0001 00 (decimal 4), select the entry (read: page table) from the PD (Page Directory). So the processor uses entry 4 of the PD, which is the fifth entry when counting from zero. These upper 10 bits are just an index into the PD to select a PT (Page Table).
Bits 12 through 21, which are 00 1011 0111 (hex 0xB7), are an index into the PT to select the page we're looking up.
And then the remaining bits (bits 0 through 11, here 0x1A8) select the byte within the page. They are the offset that is added to the page's physical base address.
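The split described above can be checked with a few shifts and masks. This is a minimal sketch in C, assuming 4 KB pages with neither PSE nor PAE; the names `split_linear` and `struct linear_parts` are made up for illustration and do not come from any manual or kernel.

```c
#include <assert.h>   /* only needed if you add checks like the ones in the note below */
#include <stdint.h>

/* The three fields of a 32-bit linear address with 4 KB pages. */
struct linear_parts {
    uint32_t pd_index;  /* bits 31..22: entry number in the page directory */
    uint32_t pt_index;  /* bits 21..12: entry number in the selected page table */
    uint32_t offset;    /* bits 11..0:  byte offset inside the 4 KB page */
};

/* Hypothetical helper: cut a linear address into its three fields. */
struct linear_parts split_linear(uint32_t linear)
{
    struct linear_parts p;
    p.pd_index = linear >> 22;            /* keep the upper 10 bits */
    p.pt_index = (linear >> 12) & 0x3FFu; /* keep the middle 10 bits */
    p.offset   = linear & 0xFFFu;         /* keep the lower 12 bits */
    return p;
}
```

For the address in this post, `split_linear(0x010B71A8)` gives a page directory index of 4, a page table index of 0xB7 and an offset of 0x1A8, which matches the bit groups worked out by hand.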
PAE is left out of the example to simplify the explanation of linear address translation.
Let's clarify all this with an image from the Intel manual (Architecture Software Developer’s Manual Volume 3A: System Programming Guide, Part 1); see the attachment.
(Anyone, correct me if I'm wrong)
Attachment: linear-address-translation.png (39.09 KiB)
Bughunter is right. Before you read the whole story below, I suggest you get the Intel manuals at hand. The images in these manuals are self-explanatory.
By the way, I hope you have a full understanding of pointers!
A page directory is just a big array of pointers to page tables, and a page table is just a big array of pointers to pages.
A page directory is of course a page itself, and so are page tables. That is why these tables need to lie on a page boundary. There are some other reasons too, but they are not important right now.
But keep in mind that the entries of page directories and page tables are pointers to physical addresses. A page directory entry must contain the valid physical address of a page table by the time the processor uses it. The same goes for page tables: when the processor uses a page table entry for a lookup, that entry must contain the valid physical address of a page.
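One way to see why the tables and pages must sit on a page boundary: the low 12 bits of an aligned physical address are always zero, so the entry can reuse them for flags. Below is a minimal sketch in C, assuming plain 32-bit paging with 4 KB pages. The helper names (`make_entry`, `entry_address`, `entry_present`) are invented for illustration. The flag positions (present = bit 0, read/write = bit 1, user/supervisor = bit 2) are taken from the Intel manuals, not from this thread.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRY_PRESENT 0x001u       /* bit 0: entry may be used for translation */
#define ENTRY_WRITE   0x002u       /* bit 1: read/write */
#define ENTRY_USER    0x004u       /* bit 2: user/supervisor */
#define ENTRY_ADDR    0xFFFFF000u  /* bits 31..12: physical address of table or page */
#define ENTRY_FLAGS   0x00000FFFu  /* bits 11..0: flag bits */

/* Hypothetical helper: build a PDE or PTE from a physical address and flags. */
uint32_t make_entry(uint32_t phys, uint32_t flags)
{
    assert((phys & ENTRY_FLAGS) == 0);   /* address must be 4 KB aligned */
    assert((flags & ENTRY_ADDR) == 0);   /* flags must fit in the low 12 bits */
    return phys | flags;
}

/* Physical address stored in an entry (flags stripped). */
uint32_t entry_address(uint32_t entry)
{
    return entry & ENTRY_ADDR;
}

/* Non-zero if the present bit is set. */
int entry_present(uint32_t entry)
{
    return (entry & ENTRY_PRESENT) != 0;
}
```

With these helpers, `make_entry(0x00123000, ENTRY_PRESENT | ENTRY_WRITE)` yields 0x00123003, and `entry_address` recovers 0x00123000 from it. The same layout is used here for both directory and table entries, which holds for the basic 4 KB model discussed in this thread.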
The structure of paging can be confusing indeed, so applying a bit of logic comes in handy. Think it through when you can!
A page without any extension enabled is 4K in size: 2^12 = 4096. That means that 12 bits are reserved for the offset within the page. These 12 bits are the lower part of a virtual address. That leaves us with 20 bits.
The page table index is specified by bits 21..12, exactly 10 bits. 2^10 = 1024, which is the number of entries in a page table. The other bits, 31..22, are used as the page directory index, and a page directory also contains 1024 entries. That is why a virtual address is split in three:
[31..22][21..12][11..0] => [PDE INDEX][PTE INDEX][OFFSET IN PAGE]
So with paging enabled, the processor uses the page directory and page table structures as one big index, like the index in a book. Make sure all the pages you need are marked present by setting the present bit, make sure the processor can find the page directory and its page tables, and you are ready to go. The theory of paging is pretty easy; the real complications are found in memory management itself.
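To make the "one big index" idea concrete, here is a small simulation of the two-level lookup in C. It is only a sketch under stated assumptions: physical memory is faked with a tiny array of frames (`phys_mem`), and `frame_at`, `map_page` and `translate` are hypothetical names. It does not touch CR3 or any real hardware; it only mimics the steps the processor takes.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRIES 1024u        /* entries per page directory or page table */
#define NFRAMES 8u           /* size of the fake physical memory, in 4 KB frames */
#define PRESENT 0x1u         /* bit 0 of a PDE or PTE */
#define ADDR    0xFFFFF000u  /* bits 31..12 of a PDE or PTE */

/* Fake physical memory: each 4 KB frame is viewed as 1024 32-bit words. */
uint32_t phys_mem[NFRAMES][ENTRIES];

/* Access the frame that starts at a page-aligned physical address. */
uint32_t *frame_at(uint32_t phys)
{
    assert((phys & 0xFFFu) == 0 && (phys >> 12) < NFRAMES);
    return phys_mem[phys >> 12];
}

/* Map one 4 KB page. pt_phys is the frame to use as a page table
 * if the directory entry for this address is not present yet. */
void map_page(uint32_t pd_phys, uint32_t pt_phys, uint32_t linear, uint32_t page_phys)
{
    uint32_t *pd = frame_at(pd_phys);
    if (!(pd[linear >> 22] & PRESENT))
        pd[linear >> 22] = pt_phys | PRESENT;
    uint32_t *pt = frame_at(pd[linear >> 22] & ADDR);
    pt[(linear >> 12) & 0x3FFu] = page_phys | PRESENT;
}

/* Walk directory and table the way the processor would.
 * Returns 1 and stores the physical address on success,
 * or 0 where real hardware would raise a page fault. */
int translate(uint32_t pd_phys, uint32_t linear, uint32_t *phys_out)
{
    uint32_t pde = frame_at(pd_phys)[linear >> 22];
    if (!(pde & PRESENT))
        return 0;
    uint32_t pte = frame_at(pde & ADDR)[(linear >> 12) & 0x3FFu];
    if (!(pte & PRESENT))
        return 0;
    *phys_out = (pte & ADDR) | (linear & 0xFFFu);
    return 1;
}
```

With the directory in frame 0x1000, a page table in frame 0x2000 and a page in frame 0x5000, mapping 0x010B71A8 and then translating it gives 0x51A8: the frame address from the PTE plus the unchanged 12-bit offset. Any address that was never mapped makes `translate` return 0, which is the situation where a page fault would occur.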
The reason to use paging should be obvious. Back in the old days, for example, when computers didn't have much RAM, paging was introduced as a memory management solution. Programs grew bigger and so did the data they used. It all just didn't fit in memory anymore.
Paging comes with one big advantage: you can let the process think it has all the memory in the world. In the background, the operating system can swap pages in and out while the program runs. When a page is needed but isn't available in memory, a page fault is generated. The operating system can then fetch the page that is needed or allocate more memory, depending on the reason for the page fault.
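To decide what the reason for a page fault is, the handler can look at the error code the processor pushes. The sketch below is in C and only decodes that code. The bit meanings (bit 0 = page was present, bit 1 = access was a write, bit 2 = access came from user mode) are from the Intel manuals rather than from this thread, and `struct fault_info` and `decode_fault` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded view of the low three bits of a page-fault error code. */
struct fault_info {
    int page_present;  /* 0: page not present, 1: protection violation */
    int was_write;     /* 0: the access was a read, 1: it was a write */
    int from_user;     /* 0: supervisor-mode access, 1: user-mode access */
};

/* Hypothetical helper: split the error code into its three basic flags. */
struct fault_info decode_fault(uint32_t error_code)
{
    struct fault_info f;
    f.page_present = (error_code & 0x1u) != 0;
    f.was_write    = (error_code & 0x2u) != 0;
    f.from_user    = (error_code & 0x4u) != 0;
    return f;
}
```

A fault with `page_present == 0` is the case described above, where the operating system may fetch the page or allocate memory. A fault with `page_present == 1` means the page was there but the access broke its protection bits. The faulting linear address itself is found in CR2, which this sketch does not read.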
hailstorm wrote: So with paging enabled, the processor uses the pagedir and pagetable structures as one big index, like an index in a book for example.
Only if you leave the PSE (Page Size Extensions) bit of CR4 cleared. If you set the PSE bit, you can have 4-MB pages: a PD (Page Directory) entry whose PS (page size) bit is set points directly to a 4-MB page (and carries the flags, of course) instead of to a page table. The PD still has 1024 entries.
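With a 4-MB page there is no page table step, so the address splits into two fields instead of three. This is a rough sketch in C, assuming plain PSE without PAE or PSE-36; `translate_4mb` is a hypothetical name, and the PS flag position (bit 7 of the PDE) is taken from the Intel manuals.

```c
#include <assert.h>
#include <stdint.h>

#define PDE_PRESENT 0x001u       /* bit 0: entry is present */
#define PDE_PS      0x080u       /* bit 7: entry maps a 4 MB page directly */
#define BIG_BASE    0xFFC00000u  /* bits 31..22: physical base of the 4 MB page */
#define BIG_OFFSET  0x003FFFFFu  /* bits 21..0: offset inside the 4 MB page */

/* Hypothetical helper: translate through a PDE that maps a 4 MB page.
 * The caller picks the PDE with the usual index, linear >> 22. */
uint32_t translate_4mb(uint32_t pde, uint32_t linear)
{
    assert(pde & PDE_PRESENT);  /* otherwise the access would page fault */
    assert(pde & PDE_PS);       /* otherwise the PDE points to a page table */
    return (pde & BIG_BASE) | (linear & BIG_OFFSET);
}
```

For example, if entry 4 of the PD maps a 4-MB page at physical address 0x00800000, then the linear address 0x010B71A8 translates to 0x008B71A8: the directory index is still 4, but the remaining 22 bits (0x0B71A8) are all offset.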
hailstorm wrote: The reason to use paging should be obvious. Back in the old days, for example, when computers didn't have much RAM, paging was introduced as a memory management solution. Programs grew bigger and so did the data that programs used. It just didn't fit anymore.
The main reason for an OSdever to use paging should be to give each process (read: user application) a separate virtual address space, thus protecting each process's virtual address space from being corrupted by another process.
Of course you're right, but for simplicity I left the page extension details out of my story, because the extensions work differently from the basic model. By the way, if you have an 80386 at hand, these extensions are not available.
Historically, the reason for implementing an MMU that supports paging lies in the fact that memory was scarce. But since PDEs and PTEs contain some management bits, like the read/write bit and the user/supervisor bit, you can indeed protect each process's virtual address space.
One note: I strongly agree with you that an OS developer should implement page protection...
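As a rough illustration of how those management bits protect memory, the C sketch below checks whether a user-mode access would be allowed. It assumes both entries are already present and covers only user-mode accesses. Supervisor-mode rules, including the CR0.WP setting, are deliberately left out. `user_access_allowed` is a made-up name, and the bit positions come from the Intel manuals.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRY_WRITE 0x2u  /* bit 1: read/write (1 = writable) */
#define ENTRY_USER  0x4u  /* bit 2: user/supervisor (1 = user may access) */

/* Hypothetical helper: would a user-mode access pass the protection check?
 * A right has to be granted in BOTH the PDE and the PTE to take effect. */
int user_access_allowed(uint32_t pde, uint32_t pte, int is_write)
{
    uint32_t combined = pde & pte;  /* keep only bits set at both levels */
    if (!(combined & ENTRY_USER))
        return 0;                   /* supervisor-only page */
    if (is_write && !(combined & ENTRY_WRITE))
        return 0;                   /* read-only page */
    return 1;
}
```

So a kernel can keep its own pages out of reach of applications by leaving the user bit clear, and can make a page read-only for user code by clearing the read/write bit in either the PDE or the PTE.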
hailstorm wrote: Of course you're right, but for simplicity I left the page extension details out of my story, because the extensions work differently from the basic model. By the way, if you have an 80386 at hand, these extensions are not available.
Yeah, I kind of thought you did that on purpose, but I thought it would be worth mentioning for completeness (e.g. maybe it helps someone searching the forum about paging).
bughunter wrote: The main reason for an OSdever to use paging should be to give each process (read: user application) a separate virtual address space, thus protecting each process's virtual address space from being corrupted by another process.
Another reason: you don't have to do messy relocation work on flat binaries (not so sure about ELF... I'm writing a relocation module for ELF anyway).
pcmattman wrote: Another reason: you don't have to do messy relocation work on flat binaries (not so sure about ELF... I'm writing a relocation module for ELF anyway).
Why do you think it wouldn't be possible with ELF?
(I'm not saying it is or isn't, I don't know, I'm just asking you.)
With ELF you have the program header (and section headers, and more) to contend with, so you have to do relocation anyway. With a flat binary, relocation is much harder, because you'd have to search for opcodes and then handle them appropriately. Giving each process an address space starting at 0 helps.