slacker wrote:
does the start of a page have an offset of 0? if so, once a page is full, the next page will have an offset of 0 and won't this mess up a large program?
I think I understand what you're asking, but I may be wrong so I apologise in advance if I am.
In protected mode with paging the processor goes through a couple of stages of address translation. I'll go from scratch to make things easier to follow.
Logical address (Sometimes called virtual address):
This is the address that is specified in your code. It always consists of two parts, the selector and the offset. In many memory operations the selector is implied, not specified, but there is always one associated with the operation.
The processor uses the selector to look up the relevant segment descriptor in the GDT (this is a simplification of what actually happens, but it will do for this explanation). The descriptor contains quite a bit of information, but the part relevant to address translation is the segment base address.
The segment base address is added to the offset to produce the linear address. In a flat memory model all segments have a base of 0, so effectively logical address = linear address.
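A rough Python sketch of the segment step, just to make the arithmetic concrete. The function names and sample values are made up for illustration. The selector layout (index in bits 3-15, table bit in bit 2, privilege level in bits 0-1) is the standard x86 one, which the simplified description above skips over.

```python
def decode_selector(selector):
    """Split a 16-bit segment selector into its three fields."""
    index = selector >> 3          # bits 3-15: which descriptor in the table
    table = (selector >> 2) & 1    # bit 2: 0 = GDT, 1 = LDT
    rpl = selector & 0b11          # bits 0-1: requested privilege level
    return index, table, rpl


def to_linear(segment_base, offset):
    """Add the segment base to the offset, wrapping at 32 bits."""
    return (segment_base + offset) & 0xFFFFFFFF


# Hypothetical values, chosen only for illustration.
index, table, rpl = decode_selector(0x08)
flat = to_linear(0x00000000, 0x00401000)    # flat model: base is 0
based = to_linear(0x00010000, 0x00000020)   # segment with a non-zero base
print(index, table, rpl, hex(flat), hex(based))
```

With a base of 0 the linear address comes out identical to the offset, which is all the flat memory model really means.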
Linear address
This is what you're left with after passing through the segment related part of address translation. If you aren't using paging then Linear address = Physical address.
If you are using paging (Which is what you asked) then the following applies.
The x86 architecture uses 3 structures to implement paging when using 4096 byte pages. The first is the page directory, the second the page table, and the third is the page table entry.
The page directory contains 1024 entries, each pointing to a page table. Each page table contains 1024 page table entries, and each page table entry describes one 4096 byte page of memory.
The first step in translation is that the linear address gets split up.
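Bits 22->31 are used as an index into the page directory, which provides the processor with a pointer to the start of the relevant page table.
Bits 12->21 are used as an index into the page table, which provides the processor with the relevant page table entry.
Bits 0->11 are used as the offset into the page.
So yes, the offset does go back to 0 at the start of every page, but at the same moment the page table index in bits 12->21 goes up by one, so the next page is a different page and nothing gets mixed up.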
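Here is a small Python sketch of that split, run on two neighbouring addresses either side of a page boundary. The function name and the two sample addresses are made up for illustration.

```python
def split_linear(linear):
    """Split a 32-bit linear address into its three paging fields."""
    dir_index = (linear >> 22) & 0x3FF    # bits 22-31: page directory index
    table_index = (linear >> 12) & 0x3FF  # bits 12-21: page table index
    offset = linear & 0xFFF               # bits 0-11: offset into the page
    return dir_index, table_index, offset


# Hypothetical addresses: the last byte of one page and the first of the next.
last_byte = split_linear(0x00403FFF)
first_byte = split_linear(0x00404000)
print(last_byte)    # offset is 0xFFF (4095), table index is 3
print(first_byte)   # offset is back to 0, table index is now 4
```

The same thing happens one level up: when the page table index runs past 1023 it wraps to 0 and the page directory index goes up by one.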
So the processor uses bits 12->31 of the linear address to find the page table entry for the 4096 byte page that contains that address (a bit of a mouthful, hope it makes sense).
Bits 12->31 of each page table entry contain the base address for that page in memory. The offset into the page (Which was determined from the linear address earlier) is added to this base to give the final physical address.
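To tie the lookup together, here is a minimal Python simulation of the walk. It is only a sketch under some assumptions: physical memory is modelled as a dict from physical address to 32-bit entry, the table addresses are invented, and a `LookupError` stands in for the page fault a real processor would raise. It also checks the present bit (bit 0 of each entry), which I haven't covered above, and relies on each entry being 4 bytes wide.

```python
PRESENT = 0x1  # bit 0 of a directory or table entry: the entry is valid


def translate(linear, cr3, memory):
    """Walk the two-level tables; memory maps physical address -> 32-bit entry."""
    dir_index = (linear >> 22) & 0x3FF
    table_index = (linear >> 12) & 0x3FF
    offset = linear & 0xFFF

    # Each entry is 4 bytes, so entry N lives at table start + N * 4.
    pde = memory.get((cr3 & 0xFFFFF000) + dir_index * 4, 0)
    if not pde & PRESENT:
        raise LookupError("page directory entry not present")

    pte = memory.get((pde & 0xFFFFF000) + table_index * 4, 0)
    if not pte & PRESENT:
        raise LookupError("page table entry not present")

    # Bits 12-31 of the entry are the page base; the low bits are flags.
    return (pte & 0xFFFFF000) + offset


# Hypothetical layout: directory at 0x100000, one page table at 0x101000.
cr3 = 0x00100000
memory = {
    0x00100000 + 1 * 4: 0x00101000 | PRESENT,  # directory entry 1 -> table
    0x00101000 + 3 * 4: 0x00200000 | PRESENT,  # table entry 3 -> page
}
print(hex(translate(0x00403ABC, cr3, memory)))
```

Only one page is mapped in this layout, so any linear address outside 0x00403000-0x00403FFF ends in the `LookupError`.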
Physical address
This is the one that appears on the address lines when the processor accesses the physical RAM of the machine.
***
Ok, example time (Protected mode, with segments, with paging).
Consider trying to address the byte at offset 0xf0001234 (this is roughly 3.75 GB into the address space) using selector 0x10. This is our logical address.
The processor goes to the GDT, looks 0x10 bytes into it (which, at 8 bytes per descriptor, is the 3rd GDT entry) and reads the base address from the segment descriptor it finds there. For our purposes let's say the base address of this segment is 0x3000.
This leaves us with a linear address of 0xf0004234 (0xf0001234 + 0x3000).
Binary of this is (Which I've already split up):
1111000000 0000000100 001000110100
The processor gets the address for the start of the page directory from the CR3 register.
1111000000 (960) represents the page table we need from the page directory.
0000000100 (4) represents the page table entry we need from the page table.
So the processor happily wanders off and gets us the page base address from entry 4 of page table 960 (both counted from 0). For our purposes let's say this page base address is 0x1000.
001000110100 (564 or 0x234) represents the offset we require into this page of memory.
So the final physical address is 0x1234 (Worked out nice ;D).
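If you want to check the arithmetic yourself, this short Python script replays the example. The segment base and page base are the made-up values from above, not anything read from real tables.

```python
selector, offset = 0x10, 0xF0001234
segment_base = 0x3000   # assumed in the example above
page_base = 0x1000      # assumed in the example above

gdt_index = selector >> 3                        # 8 bytes per descriptor
linear = (segment_base + offset) & 0xFFFFFFFF    # logical -> linear
dir_index = (linear >> 22) & 0x3FF               # bits 22-31
table_index = (linear >> 12) & 0x3FF             # bits 12-21
page_offset = linear & 0xFFF                     # bits 0-11
physical = page_base + page_offset               # linear -> physical

print(f"GDT index      : {gdt_index}")
print(f"linear address : {linear:#010x}")
print(f"split          : {dir_index:010b} {table_index:010b} {page_offset:012b}")
print(f"indexes        : {dir_index} {table_index} {page_offset:#x}")
print(f"physical       : {physical:#x}")
```

The GDT index comes out as 2 because descriptors are counted from 0, which makes it the third entry.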
So here you see one of the beauties of paging. Our original logical address was roughly 3.75 GB into the address space. Most machines don't have anything like this amount of physical RAM, but through paging we have mapped this logical address onto one that is less than 8 KB from the start of physical RAM.
***
Big post, hope it helps. Certainly helped me get things clear just by explaining it. If there are any errors hopefully one of the gurus will correct 'em.