Andrew Tanenbaum's book (2nd edition)

Posted: Tue Jan 13, 2009 5:25 pm
by sweetgum
I'm having a lot of trouble understanding page 326, the part about 0x00403004. I don't understand how PT1 = 1, PT2 = 3 and Offset = 4. Could someone please explain this to me?

This is as close as I could get to "understanding" the situation:
from the number 10000000011000000000100, 10000000 means 1 (decimal), 01100000 means 3 (decimal) and 0000100 means 4 (decimal),

although I can't understand why the 1 would be backwards.

Re: Andrew Tanenbaum's book (2nd edition)

Posted: Tue Jan 13, 2009 6:56 pm
by clange
Hi

Just checked my version of the book (only a quick glance). It seems like he is just explaining standard paging. Try reading the Intel manuals or find some paging tutorials; then everything should make much more sense. Paging is one of those things that you have to "wrap your brain around", and once you get it, it seems simple and easy. Seeing more than one example should help you a lot.

Good luck

clange

Re: Andrew Tanenbaum's book (2nd edition)

Posted: Tue Jan 13, 2009 10:26 pm
by Firestryke31
Here's how I understand paging: an address contains several parts, all packed into a 32-bit or 64-bit integer depending on the architecture. For simplicity (and because I've never worked on 64-bit paging) I'll explain based on 32-bit numbers.

A single page contains 4096 bytes, addressable as 0x0000 to 0x0FFF. This means that the lowest 12 bits (bits 0 to 11) can be seen as an offset into the page.
A single page table can hold up to 1024 page entries, indexed 0x0000 to 0x03FF. This means that the next 10 bits (bits 12 to 21) are an index into a page table.
Finally, a page directory can hold up to 1024 page tables, also indexed 0x0000 to 0x03FF. This means that the highest 10 bits (bits 22 to 31) are an index into your page directory.

In other words, the following will give you the page table, page, and offset into the page from an address (written in C/C++):

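offset = addr & 0x0FFF;          /* bits 0-11: byte within the page */
page = (addr >> 12) & 0x03FF;    /* bits 12-21: index into the page table */
table = (addr >> 22) & 0x03FF;   /* bits 22-31: index into the directory (12 + 10 = 22) */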
This should give you the numbers you have.
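
As a check against the book's example, here is a small sketch that splits a 32-bit address using the standard 10/10/12 layout. The struct and function names are made up for illustration and do not come from the book; the one thing to watch is that the top field starts at bit 22, since 12 + 10 = 22.

```c
#include <stdint.h>

/* Hypothetical helper type: the three fields of a 32-bit virtual address
   under a 10/10/12 split (PT1, PT2 and Offset in the book's notation). */
struct vaddr32_parts {
    uint32_t pt1;    /* bits 31..22: index into the top-level table */
    uint32_t pt2;    /* bits 21..12: index into the second-level table */
    uint32_t offset; /* bits 11..0: byte offset within the 4096-byte page */
};

/* Split addr into its three fields with shifts and masks. */
static struct vaddr32_parts split_vaddr32(uint32_t addr)
{
    struct vaddr32_parts p;
    p.pt1    = (addr >> 22) & 0x3FF;
    p.pt2    = (addr >> 12) & 0x3FF;
    p.offset = addr & 0xFFF;
    return p;
}
```

Written out as a full 32 bits, 0x00403004 is 0000000001 0000000011 000000000100, so split_vaddr32(0x00403004) gives pt1 = 1, pt2 = 3 and offset = 4. The leading zeros matter: dropping them and splitting the remaining bits from the left is what makes the fields look wrong.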

It also helps to think of each part as an index into an array. Conceptually (real entries hold physical addresses plus flag bits rather than C pointers), to get the byte at an address 'addr':
table = directory[(addr >> 22) & 0x03FF];
page = table[(addr >> 12) & 0x03FF];
byte = page[addr & 0x0FFF];

So an address is really 3 different (tightly packed) indexes. This info is very useful in, say, a page fault handler, because with a couple of shifts and masks you have all of the info you need to either tell the user what went wrong or correct the mistake.
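
To make the "three array indexes" picture concrete, here is a toy model of a two-level walk. It is only a sketch: the toy_* types and toy_lookup are invented for this example, and they use plain C pointers where real hardware entries hold physical frame addresses and flag bits. A NULL at either level stands in for the case where the hardware would raise a page fault.

```c
#include <stddef.h>
#include <stdint.h>

#define ENTRIES   1024   /* entries per directory or table (10 index bits) */
#define PAGE_SIZE 4096   /* bytes per page (12 offset bits) */

/* Toy model only: pointers instead of real entries with frame numbers and flags. */
struct toy_table     { uint8_t *pages[ENTRIES]; };
struct toy_directory { struct toy_table *tables[ENTRIES]; };

/* Returns a pointer to the byte at addr, or NULL if either level is unmapped. */
static uint8_t *toy_lookup(const struct toy_directory *dir, uint32_t addr)
{
    const struct toy_table *table = dir->tables[(addr >> 22) & 0x3FF];
    if (table == NULL)
        return NULL;                 /* no page table for this 4 MiB region */

    uint8_t *page = table->pages[(addr >> 12) & 0x3FF];
    if (page == NULL)
        return NULL;                 /* no page mapped at this table index */

    return &page[addr & 0xFFF];      /* byte within the page */
}
```

With a table installed at directory index 1 and a page at table index 3, toy_lookup(dir, 0x00403004) returns the address of byte 4 of that page.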

Re: Andrew Tanenbaum's book (2nd edition)

Posted: Tue Jan 13, 2009 10:48 pm
by JohnnyTheDon
For x86_64, there are 2 additional levels of mapping, and some other differences.

Pages are still 4096 bytes (4 KiB), and the offset takes up the 12 least significant bits. However, there are now 512 entries in a page table, because each entry is 8 bytes instead of 4. This means that nine bits are used as an index into the page table. Page directories point to 512 page tables. Page Directory Pointer Tables point to 512 directories. And finally, the PML4 points to 512 Page Directory Pointer Tables.

The C code for finding these indexes would be:

pml4Index = (addr>>39)&0x1FF;
pdpIndex = (addr>>30)&0x1FF;
pdIndex = (addr>>21)&0x1FF;
ptIndex = (addr>>12)&0x1FF;
pageIndex = addr&0xFFF;
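
The same shifts and masks can be wrapped in a helper, shown here as a rough sketch. The struct and function names are hypothetical, and the layout assumed is the 9/9/9/9/12 split described above.

```c
#include <stdint.h>

/* Hypothetical helper type: the five fields of a 48-bit x86_64 virtual address. */
struct vaddr64_parts {
    unsigned pml4;   /* bits 47..39: index into the PML4 */
    unsigned pdp;    /* bits 38..30: index into the Page Directory Pointer Table */
    unsigned pd;     /* bits 29..21: index into the page directory */
    unsigned pt;     /* bits 20..12: index into the page table */
    unsigned offset; /* bits 11..0: byte offset within the 4096-byte page */
};

/* Split addr into its table indexes and page offset. */
static struct vaddr64_parts split_vaddr64(uint64_t addr)
{
    struct vaddr64_parts p;
    p.pml4   = (unsigned)((addr >> 39) & 0x1FF);
    p.pdp    = (unsigned)((addr >> 30) & 0x1FF);
    p.pd     = (unsigned)((addr >> 21) & 0x1FF);
    p.pt     = (unsigned)((addr >> 12) & 0x1FF);
    p.offset = (unsigned)(addr & 0xFFF);
    return p;
}
```

Each index is 9 bits wide, so the shift amounts step by 9 starting from the 12-bit offset: 12, 21, 30, 39.
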
64-bit systems also use canonical addressing. This means there are two 128 TB regions of usable virtual address space: 0x0000000000000000 to 0x00007FFFFFFFFFFF and 0xFFFF800000000000 to 0xFFFFFFFFFFFFFFFF. Any address strictly between 0x00007FFFFFFFFFFF and 0xFFFF800000000000 is invalid (non-canonical). Basically, 64-bit addresses are sign-extended from 48 bits to 64 bits, so bits 48 to 63 must all equal bit 47.
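
The canonical-address rule is easy to test in code. The sketch below assumes 48-bit virtual addresses with 4-level paging, as described in this post; the function names are made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if bits 63..47 are all equal, i.e. addr is the sign extension
   of its low 48 bits. Assumes 48-bit virtual addresses. */
static bool is_canonical48(uint64_t addr)
{
    uint64_t top = addr >> 47;            /* the 17 highest bits */
    return top == 0 || top == 0x1FFFF;    /* all zeros or all ones */
}

/* Build the canonical form of a 48-bit value by copying bit 47 into bits 48..63. */
static uint64_t sign_extend48(uint64_t addr)
{
    uint64_t low = addr & 0x0000FFFFFFFFFFFFull;
    if (low & (1ull << 47))
        low |= 0xFFFF000000000000ull;
    return low;
}
```

For instance, 0x0000800000000000 has bit 47 set but bits 48 to 63 clear, so it is not canonical; sign-extending it gives 0xFFFF800000000000, the first address of the upper region.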