Confused about Paging...

pcmattman
Member
Posts: 2566
Joined: Sun Jan 14, 2007 9:15 pm
Libera.chat IRC: miselin
Location: Sydney, Australia (I come from a land down under!)

Post by pcmattman »

I'm really confused about this whole paging concept. All that I understand, after hours of reading, is that it allows me to have multiple address spaces within the physical memory. I've looked at memory allocators, tried to understand them and failed.

How exactly am I meant to allocate pages (which I think is a good start), and then map the page directories and tables to the right places? And what order do I do things in when I have to create a new process in a new address space?
Kevin McGuire
Member
Posts: 843
Joined: Tue Nov 09, 2004 12:00 am
Location: United States

Post by Kevin McGuire »

What do you already know about paging?

Post by pcmattman »

Not a lot.

I know that pages are 4 KB blocks of memory. I know that a page directory is full of page tables which are full of mapping information. I don't know much else.
frank
Member
Posts: 729
Joined: Sat Dec 30, 2006 2:31 pm
Location: East Coast, USA

Post by frank »

Have you looked at this tutorial?
http://www.osdever.net/tutorials/pdf/memory1.pdf
deadmutex
Member
Posts: 85
Joined: Wed Sep 28, 2005 11:00 pm

Post by deadmutex »

pcmattman wrote: I know that pages are 4 KB blocks of memory. I know that a page directory is full of page tables which are full of mapping information. I don't know much else.
Be careful here. A page directory contains page directory entries (PDEs). Each PDE contains information about a page table. Likewise, a page table contains page table entries (PTEs) that contain information about pages. It's important to note that a page table is not the same thing as a PDE. When I first started with paging, it took me a couple of months to realize this, which caused a lot of confusion.

Post by pcmattman »

@frank: thanks heaps for that link, it was pretty much just what I was looking for. I don't know how I missed it earlier :?

@deadmutex: thanks for the tip 8)
Bughunter
Member
Posts: 94
Joined: Mon Dec 18, 2006 5:49 am
Location: Netherlands

Post by Bughunter »

Let's see how a processor performs the translation of an address in this situation:

"mov [0x010B71A8], al" (in Intel syntax)

Let's translate the linear address 0x010B71A8

0000 0001    0000 1011    0111 0001    1010 1000
   0    1       0    B       7    1       A    8
From left to right, we have bits 31 through 0.

Bits 31 through 22, which are 0000 0001 00 (decimal 4), select the entry (read: page table) in the PD (Page Directory). So the processor uses entry 4 of the PD, which is the fifth PT (Page Table) because indexing starts at 0. These top 10 bits are just an index into the PD to select a PT.

Bits 21 through 12, which are 00 1011 0111 (0x0B7, decimal 183), are an index into the PT to select the page we're looking up.

The remaining bits (bits 11 through 0, here 0x1A8) are the offset of the byte within that page. Added to the page's physical base address, they give the final physical address.
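The three bit fields from this walkthrough can be pulled out with shifts and masks. The following is a minimal sketch for 32-bit paging with 4 KB pages and no PAE; the helper names `pd_index`, `pt_index` and `page_offset` are made up for illustration.

```c
#include <stdint.h>

/* Split a 32-bit linear address as non-PAE paging with 4 KB pages does.
 * Helper names are illustrative, not taken from any existing kernel. */

/* Bits 31..22: index of the entry in the page directory. */
static inline uint32_t pd_index(uint32_t linear)
{
    return linear >> 22;
}

/* Bits 21..12: index of the entry in the selected page table. */
static inline uint32_t pt_index(uint32_t linear)
{
    return (linear >> 12) & 0x3FFu;
}

/* Bits 11..0: byte offset inside the selected page. */
static inline uint32_t page_offset(uint32_t linear)
{
    return linear & 0xFFFu;
}
```

For the address 0x010B71A8 used above, these return 4, 0xB7 and 0x1A8 respectively.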

PAE is left out of the example to keep the explanation of linear address translation simple.

Let's clarify all this with an image from the Intel Manual (Architecture Software Developer’s Manual Volume 3A: System Programming Guide, Part 1), see the attachment.

(Anyone, correct me if I'm wrong)
Attachment: linear-address-translation.png
hailstorm
Member
Posts: 110
Joined: Wed Nov 02, 2005 12:00 am
Location: The Netherlands

Post by hailstorm »

Bughunter is right. Before you read the whole 'story' below, I suggest you get the Intel manuals at hand. The images in those manuals are self-explanatory.
By the way, I hope you have a full understanding of pointers!

A page directory or page table is just a big array of pointers to pages. The low 12 bits of each entry, which a page-aligned pointer does not need, hold flags.
A page directory is of course a page itself, and so are page tables. That is why these tables need to lie on a page boundary. There are some other reasons too, but they are not important right now.
But keep in mind that the entries of page directories and page tables are physical addresses, not virtual ones. A page directory entry must contain the valid physical address of a page table whenever the processor uses it for a lookup. The same goes for page tables: when the processor uses a page table entry for a lookup, that entry must contain the valid physical address of a page.
The structure of paging can indeed be confusing, so applying a bit of logic comes in handy. Think it through when you can! :)
A page without any extension enabled is 4K in size: 2^12 = 4096. That means 12 bits are reserved for the offset within the page. These 12 bits are the lowest part of a virtual address, which leaves 20 bits.
The page table index is specified by bits 21..12, exactly 10 bits. 2^10 = 1024, the number of entries in a page table. The remaining bits, 31..22, are used as the page directory index; the page directory also contains 1024 entries. That is why a virtual address is split in three:
[31..22][21..12][11..0] => [PDE INDEX][PTE INDEX][OFFSET IN PAGE]

So with paging enabled, the processor uses the page directory and page table structures as one big index, like the index in a book. Make sure all the pages you need are marked present by setting the present bit, make sure the processor can find the page directory (through CR3) and its page tables, and you are ready to go. The theory of paging is pretty easy; the real complications are found in memory management itself.
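To make the "one big index" idea concrete, here is a small user-space sketch that walks a page directory and a page table the way the processor would. It is only a simulation under stated assumptions: `sim_phys` is a made-up stand-in for physical memory (16 frames), and `frame_at`, `map_page` and `translate` are hypothetical helpers, not part of any real kernel. All physical addresses passed in are assumed to be page-aligned and inside the simulated range.

```c
#include <stdint.h>

#define PAGE_PRESENT 0x001u       /* bit 0 of a PDE or PTE */
#define FRAME_MASK   0xFFFFF000u  /* upper 20 bits: physical frame address */

/* Assumption: a tiny simulated physical memory of 16 frames,
 * each viewed as 1024 32-bit entries (4 KB). Zero-initialised. */
#define SIM_FRAMES 16u
static uint32_t sim_phys[SIM_FRAMES][1024];

/* View the simulated frame at a page-aligned physical address as a table. */
static uint32_t *frame_at(uint32_t phys)
{
    return sim_phys[phys >> 12];
}

/* Map one 4 KB page. pt_phys names the frame holding the page table.
 * Note: this overwrites whatever PDE was there before. */
static void map_page(uint32_t pd_phys, uint32_t pt_phys,
                     uint32_t virt, uint32_t page_phys)
{
    frame_at(pd_phys)[virt >> 22] = pt_phys | PAGE_PRESENT;
    frame_at(pt_phys)[(virt >> 12) & 0x3FFu] = page_phys | PAGE_PRESENT;
}

/* Walk directory -> table -> page. Returns 0 and stores the physical
 * address on success, or -1 where real hardware would raise a page fault. */
static int translate(uint32_t pd_phys, uint32_t virt, uint32_t *phys_out)
{
    uint32_t pde = frame_at(pd_phys)[virt >> 22];
    if (!(pde & PAGE_PRESENT))
        return -1;

    uint32_t pte = frame_at(pde & FRAME_MASK)[(virt >> 12) & 0x3FFu];
    if (!(pte & PAGE_PRESENT))
        return -1;

    *phys_out = (pte & FRAME_MASK) | (virt & 0xFFFu);
    return 0;
}
```

With the page directory in frame 0x1000, a page table in frame 0x2000 and a data page at 0x5000, mapping 0x010B71A8 and translating it gives 0x000051A8. Any address whose PDE or PTE lacks the present bit returns -1.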

The reason to use paging should be obvious. Back in the old days, when computers didn't have much RAM, paging was introduced as a memory management solution. Programs grew bigger and so did the data they used. It just didn't fit in memory anymore.

Paging comes with one big advantage: you can let the process think it has all the memory in the world. In the background, the operating system can swap pages in and out while the program runs. When a page is needed but isn't present in memory, a page fault is generated. The operating system can then fetch the needed page or allocate more memory, depending on the reason for the page fault.
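How the operating system finds "the reason for the page fault": on x86 the CPU pushes an error code for the fault and puts the faulting linear address in CR2. The sketch below decodes the three low bits of that error code. The `pf_action` type and the policy in `classify_fault` are purely hypothetical simplifications; a real handler would also check whether the faulting address belongs to the process at all.

```c
#include <stdint.h>

/* Low bits of the page-fault error code pushed by the CPU. */
#define PF_PRESENT 0x1u  /* 0: page was not present, 1: protection violation */
#define PF_WRITE   0x2u  /* 0: fault on a read,      1: fault on a write */
#define PF_USER    0x4u  /* 0: CPU was in supervisor mode, 1: in user mode */

/* Hypothetical outcomes for illustration only. */
typedef enum {
    PF_ACTION_LOAD_PAGE,  /* bring the page in (swap in or allocate) */
    PF_ACTION_KILL        /* access violated page protection */
} pf_action;

/* Simplified policy: a not-present fault is treated as "fetch the page",
 * a fault on a present page as a protection error. */
static pf_action classify_fault(uint32_t error_code)
{
    if (!(error_code & PF_PRESENT))
        return PF_ACTION_LOAD_PAGE;
    return PF_ACTION_KILL;
}
```

The `PF_WRITE` and `PF_USER` bits are not used by this simplified policy, but a fuller handler would look at them, for example to tell a user-mode write to a read-only page from a kernel bug.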

Post by Bughunter »

hailstorm wrote: So with paging enabled, the processor uses the pagedir and pagetable structures as one big index, like an index in a book for example.
Only if you leave the PSE (Page Size Extensions) bit of CR4 cleared. With PSE enabled, any PD (Page Directory) entry that has its page size (PS) bit set maps a 4-MB page directly. If you use 4-MB pages throughout, the 1024 entries in your PD point to pages (and carry the flags, of course) instead of to page tables.
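For a 4-MB page the linear address splits in two instead of three: the top 10 bits still index the PD, and the remaining 22 bits are the offset inside the page. A minimal sketch, assuming 32-bit paging with CR4.PSE set and ignoring the PSE-36 extension; the helper names `pde_is_large` and `translate_4m` are made up for illustration.

```c
#include <stdint.h>

#define PDE_PRESENT   0x001u       /* bit 0: entry is valid */
#define PDE_PAGE_SIZE 0x080u       /* bit 7 (PS): entry maps a 4-MB page */
#define PDE_4M_MASK   0xFFC00000u  /* bits 31..22: base of the 4-MB page */

/* Returns 1 if this PDE maps a 4-MB page directly, 0 otherwise.
 * Only meaningful when the PSE bit in CR4 is set. */
static int pde_is_large(uint32_t pde)
{
    return (pde & PDE_PRESENT) && (pde & PDE_PAGE_SIZE);
}

/* Physical address for a linear address inside a 4-MB page:
 * page base from the PDE plus the low 22 bits of the linear address. */
static uint32_t translate_4m(uint32_t pde, uint32_t linear)
{
    return (pde & PDE_4M_MASK) | (linear & 0x003FFFFFu);
}
```

With a PDE of 0x00800081 (4-MB page at physical 0x00800000, present, PS set), the linear address 0x010B71A8 translates to 0x008B71A8.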
hailstorm wrote: The reason to use paging should be obvious. Back in the old days for example, when computers didn't have much RAM, paging as a memory management solution was introduced. Programs grew bigger and so did the data that programs used. It just didn't fit anymore.
The main reason for paging to an OSdever should be to give each process (read: user application) a separate virtual memory addressing space, thus protecting each process' virtual memory address space from being corrupted by another process.

Post by hailstorm »

Of course you're right, but for simplicity I left the page size extension details out of my story, because the extensions work differently from the basic model. By the way, if you have an 80386 at hand, these extensions are not available.

Historically, the reason for implementing an MMU that supports paging was that memory was scarce. But since PDEs and PTEs contain management bits such as the read/write bit and the user/supervisor bit, you can indeed protect each process's virtual address space.
One note: I strongly agree with you that an OS developer should implement page protection...
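The management bits mentioned here sit in the low 12 bits of every PDE and PTE. Below is a rough sketch of building an entry and checking whether a user-mode access would pass. It looks at a single entry only, whereas the processor combines the bits of the PDE and the PTE; `make_entry` and `user_access_ok` are hypothetical names.

```c
#include <stdint.h>
#include <stdbool.h>

#define PTE_PRESENT  0x001u  /* bit 0: P   - page is present */
#define PTE_WRITABLE 0x002u  /* bit 1: R/W - writes allowed */
#define PTE_USER     0x004u  /* bit 2: U/S - user-mode access allowed */

/* Build an entry from a page-aligned physical address and flag bits. */
static uint32_t make_entry(uint32_t phys, uint32_t flags)
{
    return (phys & 0xFFFFF000u) | (flags & 0xFFFu);
}

/* Would a user-mode access through this one entry be allowed?
 * Simplification: real hardware also checks the matching PDE. */
static bool user_access_ok(uint32_t entry, bool is_write)
{
    if (!(entry & PTE_PRESENT))
        return false;
    if (!(entry & PTE_USER))
        return false;
    if (is_write && !(entry & PTE_WRITABLE))
        return false;
    return true;
}
```

Kernel pages would typically be mapped without `PTE_USER`, so user code in the same address space cannot touch them, and read-only data without `PTE_WRITABLE`.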

Post by Bughunter »

hailstorm wrote: Of course you're right, but for simplicity I left the page size extension details out of my story, because the extensions work differently from the basic model. By the way, if you have an 80386 at hand, these extensions are not available.
Yeah, I kind of thought you did that on purpose, but I thought it would be worth noting for completeness (e.g. maybe it helps someone searching the forum about paging) :D

Post by pcmattman »

bughunter wrote:The main reason for paging to an OSdever should be to give each process (read: user application) a separate virtual memory addressing space, thus protecting each process' virtual memory address space from being corrupted by another process.
Another reason: you don't have to do messy relocation work on flat binaries (not so sure about ELF... I'm writing a relocation module for ELF anyway).

Post by Bughunter »

pcmattman wrote:
bughunter wrote:The main reason for paging to an OSdever should be to give each process (read: user application) a separate virtual memory addressing space, thus protecting each process' virtual memory address space from being corrupted by another process.
Another reason: you don't have to do messy relocation work on flat binaries (not so sure about ELF... I'm writing a relocation module for ELF anyway).
Why do you think it wouldn't be possible with ELF?

(I'm not saying it is or isn't; I don't know, just asking ;))

Post by pcmattman »

With ELF you have the program headers (and section headers, and more) to contend with; you have to do relocation anyway. With a flat binary, relocation is much harder because you'd have to search for opcodes and then handle them appropriately. Giving each process an address space starting at 0 helps.

Post by Bughunter »

Oh yes indeed, now I get it :)