Clarification on some aspects of paging.
Hello,
I have just finished reading the entire section on paging in the Intel manuals
(Many times T_T) and would like some verification that this is correct.
For my paging mechanism on initiation I am going to do something like:
Map all the user memory as RW, U, P
Map some global memory for my libraries etc (Does the "Global" flag do this or should I just make it U, RW, P)
Map the kernel memory as R,S,P
Load the PDBR into CR3
Enable paging
Now a few questions arise before implementation: bits 12-31 in the PDEs and PTEs hold
the address field. Is this virtual or physical? I don't get where the phys2virt
conversion goes on or how to map physical addresses to virtual ones.
Also which "permissions" take priority over what? eg(If a PDE is set as RW, U can a PTE in
the PDE be set as R, S?)
When in ring0 does the paging mechanism even do permission checks? Or can the kernel
access anything regardless of what flags are set?
Now can someone please explain to me when a VMM might become useful?
The only thing I can consider it for is finding a physical address for a page and
adding it to the PDE, then invalidating the PTE with whatever flags. Maybe as a sub
function of a "MapPage(phys,virt,flags)" kind of thing.
That's all I can think of for now. But hey, at least I'm clearer on paging than I was
a few months ago.
Thanks,
Nelson
Re:Clarification on some aspects of paging.
Nelson wrote: Map all the user memory as RW, U, P
You can map the code as R only, which helps prevent corrupting code etc. Also, you can mark the data NX (which is an AMD64 / EM64T feature).
Nelson wrote: Map some global memory for my libraries etc (Does the "Global" flag do this or should I just make it U, RW, P)
Map it in every address space at the same location and map it like normal memory, but with the global flag. The global flag tells the TLB to treat these entries as always valid, so they are not flushed when CR3 is reloaded.
Nelson wrote: Map the kernel memory as R,S,P
That's impractical. Map the code rsp, map the data rwsp.
Nelson wrote: Now a few questions arise before implementation: Bits 12-31 in the PDEs and PTEs show the address field now is this virtual or physical?
This is physical. You'd need a page table to translate them...
Nelson wrote: I don't get where the phys2virt conversion goes on or how to map physical addresses to virtual ones.
Say, you have virtual address 0x12345678 in non-PAE mode. Then, 0x123 >> 2 = 0x048 is the index into the PDT (CR3, physical address). It looks up which page is referenced there (physical address) and looks in it for item 0x345 & 0x3FF = 0x345. That item references a physical page. In that physical page it references the address 0x678, which is the byte you wanted.
You have to self-map the page tables and directories, that is, make them point to themselves. The easiest way is at the top of memory. Place the PDT at 0xFFFFF000. It doubles as PT for itself, as well as page for itself. Just by creating that one entry you can use it to add to the page tables.
Nelson wrote: Also which "permissions" take priority over what? eg(If a PDE is set as RW, U can a PTE in the PDE be set as R, S?)
Restrictions take priority over permissions. U / R + S / RW = S / R.
It doesn't matter at which level these permissions are set, just that they are somewhere in the tree.
Nelson wrote: When in ring0 does the paging mechanism even do permission checks? Or can the kernel access anything regardless of what flags are set?
The kernel can not access readonly paged memory. With a given bit it can do this for userlevel memory. There are more, iirc, but too many to list. See the one document you've been reading already.
Nelson wrote: Now can someone please explain to me when a VMM might become useful? The only thing I can consider it for is finding a physical address for a page and adding it to the PDE then invalidating the PTE with whatever flags. Maybe as a sub function of a "MapPage(phys,virt,flags)" kind of thing.
Keeping track of free pages, mapping and unmapping pages, keeping track of pages shared between processes (COW, copy on write), allowing pages to be swapped out...
Good luck on getting that in one function (and if you succeed, it won't be good).
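To illustrate the "restrictions take priority" rule (U / R + S / RW = S / R), here is a minimal C sketch for 4-KByte pages in non-PAE mode. The bit positions are the standard P, R/W and U/S bits; the macro and function names are made up for this example.

```c
#include <stdint.h>

#define PG_PRESENT 0x001u  /* bit 0: P   (present) */
#define PG_WRITE   0x002u  /* bit 1: R/W (1 = writable) */
#define PG_USER    0x004u  /* bit 2: U/S (1 = user accessible) */

/* A permission is effective only if it is granted at BOTH levels, so the
 * result is the bitwise AND of the PDE and PTE flag bits. */
uint32_t effective_flags(uint32_t pde, uint32_t pte)
{
    return pde & pte & (PG_PRESENT | PG_WRITE | PG_USER);
}
```

So a PDE set as RW, U can hold a PTE set as R, S; that page simply ends up read-only and supervisor-only.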
Re:Clarification on some aspects of paging.
Nelson wrote: I don't get where the phys2virt conversion goes on or how to map physical addresses to virtual ones.
Candy wrote: Say, you have virtual address 0x12345678 in non-PAE mode. Then, 0x123 >> 2 = 0x048 is the index into the PDT (CR3, physical address). It looks up which page is referenced there (physical address) and looks in it for item 0x345 & 0x3FF = 0x345. That item references a physical page. In that physical page it references the address 0x678, which is the byte you wanted.
And in simpler terms: VirtualAddress = (PDE index * 1024 + PTE index) * 4096
So PageDirectory[1]->PageTable[1] = (1 * 1024 + 1) * 4096 = 4,198,400 bytes
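The same arithmetic as a small C sketch, for non-PAE 4-KByte pages. The function names are invented for illustration; only the shifts and masks come from the discussion above.

```c
#include <stdint.h>

/* Top 10 bits of the linear address: index into the page directory. */
uint32_t pd_index(uint32_t virt)    { return virt >> 22; }

/* Middle 10 bits: index into the page table selected by the PDE. */
uint32_t pt_index(uint32_t virt)    { return (virt >> 12) & 0x3FFu; }

/* Low 12 bits: byte offset inside the 4-KByte page. */
uint32_t page_offset(uint32_t virt) { return virt & 0xFFFu; }

/* The inverse: rebuild the linear address from the two indices and offset. */
uint32_t virt_from_indices(uint32_t pd, uint32_t pt, uint32_t off)
{
    return (pd * 1024u + pt) * 4096u + off;
}
```

For 0x12345678 this gives directory index 0x48, table index 0x345 and offset 0x678, matching the walk-through above.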
Nelson wrote: When in ring0 does the paging mechanism even do permission checks? Or can the kernel access anything regardless of what flags are set?
Candy wrote: The kernel can not access readonly paged memory. With a given bit it can do this for userlevel memory. There are more, iirc, but too many to list. See the one document you've been reading already.
IIRC, you've got that backwards. Ring 0 code ignores the user/supervisor and readonly/writable flags unless you enable the 'Kernel level paging protection' feature, which you need to find with CPUID and enable by setting a bit in one of the control registers. Present is checked, but I don't know about NX.
Re:Clarification on some aspects of paging.
Candy wrote: Say, you have virtual address 0x12345678 in non-PAE mode.
Then, 0x123 >> 2 = 0x048 is the index into the PDT (CR3, physical address).
It looks up which page is referenced there (physical address) and looks in it for item 0x345 & 0x3FF = 0x345.
That item references a physical page.
In that physical page it references the address 0x678, which is the byte you wanted.
You have to self-map the page tables and directories, that is, make them point to themselves.
The easiest way is at the top of memory. Place the PDT at 0xFFFFF000.
It doubles as PT for itself, and as well as page for itself.
Just by creating that one entry can you use it to add to the page tables.
So then would it be more practical implementation-wise to map entire ranges of memory instead of each
one individually? For example, have something like "MapPageRange(Start, End, Phys, Virt, Flags)" so that you split up the
address into the fields needed, not necessarily adding entries in the order created but according to which address it is.
Hmm, that's very easy to understand! Thanks.
Candy wrote: Keeping track of free pages, mapping and unmapping pages, keeping track of pages shared between processes
(COW, copy on write), allowing pages to be swapped out...
Good luck on getting that in one function (and if you succeed, it won't be good).
Ah, I didn't plan to put that all in one function; maybe I was thinking out loud. But anyway, as said above,
I may have a more practical implementation.
AR wrote: IIRC, you've got that backwards, Ring 0 code ignores the user/supervisor
and readonly/writable unless you enable the 'Kernel level paging protection'
feature which you need to find with CPUID and enable by setting a bit in one of the control registers,
present is checked but I don't know about NX.
Would this be practical to enable if detected as a supported feature? I think I can see this protecting
the kernel from the user and the user from the kernel as well, which might not be such a bad thing.
Thanks for your help Candy and AR!
Re:Clarification on some aspects of paging.
AR wrote: IIRC, you've got that backwards, Ring 0 code ignores the user/supervisor and readonly/writable unless you enable the 'Kernel level paging protection' feature which you need to find with CPUID and enable by setting a bit in one of the control registers, present is checked but I don't know about NX.
Nelson wrote: Would this be practical to enable if detected as a supported feature? I think I can see this protecting the kernel from the user and the user from the kernel as well which might not be such a bad thing
Ring 3 code does use the protection flags (i.e. user code cannot ever access a page that is set to "supervisor", nor can it ever write to a page marked "readonly", but in Ring 0 [the kernel and the kernel only] those have no bearing at all). The protection feature I mentioned only forces the CPU to honour Readonly/Writable in kernel mode (again, I don't know about NX though).
Re:Clarification on some aspects of paging.
NX/XD/PE/Feature_With_A_Lot_Of_Names is checked when the CPU loads a page into the instruction TLB: if NX is enabled and set, it will refuse to load the page into the TLB, causing a #PF (AMD#2, pg173 (rev. 3.07)). All other checks (except P) are made after the entry is in the TLB; P and NX/XD/PE are the only permissions that cause the page to fail to load into the TLB.
Re:Clarification on some aspects of paging.
JAAman wrote: NX/XD/PE/Feature_With_A_Lot_Of_Names, is checked when the CPU loads a page into the instruction TLB: if NX is enabled, and set, it will refuse to load the page into the TLB (all other checks (except P) are made after it enters the TLB -- P & NX/XD/PE are the only permisions that cause the page to fail to load into the TLB), causing a #PF (AMD#2, pg173 (rev. 3.07))
that would be more like:
Code: Select all
if (type == code && NX) || (!p) then
    cause_pf();
else
    ....
end if;
The NX bit doesn't prevent the TLB from loading data pages. If AMD says otherwise, it's a pretty pointless feature if it also applies to the data TLB.
On the R/W + U/S + rw_in_s bit: if all is usually writable for the kernel, and with that bit set only the user-level pages marked read-only are protected, what's the entire point of marking something R/S? It'd be a read-only setting that nothing would ever enforce. My guess is thus that that is an error, so that R/S for kernel level would always be enforced.
Didn't test it though.
Re:Clarification on some aspects of paging.
no, the CPU has 2 different types of TLBs:
-instruction TLBs- and
-data TLBs-
edit: Intel vol. 3 pg 10-1 & 2
if you want to read/write data to a page that you're currently executing, the page table must be reread into the data TLB to use it
and a data page table must be reloaded into an instruction TLB to be executed
so it's more like
Code: Select all
if (((destinationTLB == codeTLB) && NX) || (!P)) then
    #PF
else
    destinationTLB = LoadPageTable(address)
endif
all other permissions are calculated on the stored values already in the appropriate TLB
however I have heard that some CPUs (not sure which ones) DO store (!P) entries in TLBs (and thus require INVLPG on a (!P) page!)
Re:Clarification on some aspects of paging.
Candy wrote: On the R/W + U/S + rw_in_s bit, if all is usually writable for the kernel and given that bit the userlevel bit is marked readonly, what's the entire point of marking something R/S ? It'd be a readonly thing that nothing would ever enforce. My guess is thus that that is an error, so that R/S for kernel-level would always be enforced.
I didn't quite follow that, but the point of clearing Writable and User (R/S) is precisely none; both flags are ignored in Ring 0. You need to enable the Write Protect flag I was referring to; only when that is on does the writable flag matter in Ring 0.
IA-32 Intel Architecture Software Developer's Manual Volume 3: System Programming Guide, pg 2-13:
Write Protect (bit 16 of CR0). Inhibits supervisor-level procedures from writing into user-level read-only pages when set; allows supervisor-level procedures to write into user-level read-only pages when clear. This flag facilitates implementation of the copy-on-write method of creating a new process (forking) used by operating systems such as UNIX*.
IA-32 Intel Architecture Software Developer's Manual Volume 3: System Programming Guide, pg 4-31:
The page-level protection mechanism recognizes two page types:
- Read-only access (R/W flag is 0).
- Read/write access (R/W flag is 1).
When the processor is in supervisor mode and the WP flag in register CR0 is clear (its state following reset initialization), all pages are both readable and writable (write-protection is ignored). When the processor is in user mode, it can write only to user-mode pages that are read/write accessible. User-mode pages which are read/write or read-only are readable; supervisor-mode pages are neither readable nor writable from user mode. A page-fault exception is generated on any attempt to violate the protection rules.
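The quoted rules can be condensed into one predicate. This is only a sketch: the function and parameter names are invented, the flags passed in are assumed to be the already-combined PDE/PTE values, and it assumes CR0.WP applies to any read-only page regardless of U/S, as current Intel documentation describes. NX and the present bit are left out.

```c
#include <stdbool.h>

/* Returns true if the access passes the page-level protection check.
 *   user_mode     : CPL 3 access (false = supervisor, rings 0-2)
 *   is_write      : write access (false = read)
 *   page_user     : effective U/S flag (1 = user page)
 *   page_writable : effective R/W flag (1 = read/write)
 *   cr0_wp        : state of the WP flag, bit 16 of CR0 */
bool access_allowed(bool user_mode, bool is_write,
                    bool page_user, bool page_writable, bool cr0_wp)
{
    if (user_mode) {
        if (!page_user)
            return false;                 /* supervisor pages are off limits */
        return !is_write || page_writable;
    }
    if (!is_write)
        return true;                      /* supervisor reads always pass */
    return page_writable || !cr0_wp;      /* R/W only matters when WP is set */
}
```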
Re:Clarification on some aspects of paging.
Thanks for all the help guys. Now comes the time of implementation. I better get my reading glasses on and a nice tall one since this is looking to be a long night.
Re:Clarification on some aspects of paging.
Okay I get how the virtual addresses are split up to find indexes in the PDE and PTEs, now when mapping do I put the full (physical)address wanted to be mapped in both the PDE and the PTE?
Re:Clarification on some aspects of paging.
Nelson wrote: Okay I get how the virtual addresses are split up to find indexes in the PDE and PTEs, now when mapping do I put the full (physical)address wanted to be mapped in both the PDE and the PTE?
You put the physical page addresses in the PT, you put the PT physical addresses in the PD and you put the PD physical address in CR3. Then, enable paging.
Oh, and do self-map your paging structures. You'll want access to them too (and this is where the fun starts). If you map it at 0xFFFFF000 and thus set the 0x3FFth entry to itself, then to map a page you can just map the PT (if necessary) and then map the page itself.
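With the directory self-mapped in entry 0x3FF, the virtual addresses of the paging entries follow directly from the index arithmetic. A rough C sketch, with macro and function names invented for the example:

```c
#include <stdint.h>

#define SELF_MAP_SLOT 1023u                                /* PD entry 0x3FF */
#define PT_WINDOW     (SELF_MAP_SLOT << 22)                /* 0xFFC00000 */
#define PD_VADDR      (PT_WINDOW | (SELF_MAP_SLOT << 12))  /* 0xFFFFF000 */

/* Virtual address of the PDE that covers 'virt'. */
uint32_t pde_vaddr(uint32_t virt) { return PD_VADDR + (virt >> 22) * 4u; }

/* Virtual address of the PTE that maps 'virt'. It is only safe to touch
 * when the covering PDE is present. */
uint32_t pte_vaddr(uint32_t virt) { return PT_WINDOW + (virt >> 12) * 4u; }
```

Note that the "PTE" of 0xFFFFF000 and the last PDE are the same 4 bytes, which is what "doubles as PT for itself" means in practice.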
Re:Clarification on some aspects of paging.
From Intel Manual Volume 3: System Programming
Section 3.6.4
(Page-table entries for 4-KByte pages.) Specifies the physical address of the
first byte of a 4-KByte page. The bits in this field are interpreted as the 20 most-significant
bits of the physical address, which forces pages to be aligned on
4-KByte boundaries.
(Page-directory entries for 4-KByte page tables.) Specifies the physical
address of the first byte of a page table. The bits in this field are interpreted as
the 20 most-significant bits of the physical address, which forces page tables to
be aligned on 4-KByte boundaries.
Let me summarize what I know:
Linear address translation splits up the address and finds the index to PDEs and PTEs. Okay.
PHYSICAL addresses of the page table go into the PDE and PHYSICAL addresses of the page itself go into the PTE.
What I am confused about is, say you have
Code: Select all
int MapMemory(unsigned long phys, unsigned long virt, unsigned long flags);
How would I get the physical address to "point" to the virtual addresses upon translation?
Thanks,
Nelson
Re:Clarification on some aspects of paging.
Nelson wrote: Linear Address translation splits up the address and finds the index to PDEs and PTEs. Okay.
PHYSICAL addresses of the Page table go into the PDE and PHYSICAL addresses of the Page itself go into the PTE
What I am confused about is say you have
Code: Select all
int MapMemory(unsigned long phys, unsigned long virt, unsigned long flags);
How would I get the Physical address to "point" to the virtual addresses upon translation?
Thanks,
Nelson
Code: Select all
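int MapMemory(unsigned long phys, unsigned long virt, unsigned long flags) {
    // With the page directory self-mapped in its last entry (0x3FF), the
    // directory is visible at 0xFFFFF000 and all page tables appear as one
    // flat array of PTEs starting at 0xFFC00000.
    unsigned int *pd = (unsigned int *)0xFFFFF000;
    unsigned int *pt = (unsigned int *)0xFFC00000;
    unsigned long pd_offs = virt >> 22;  // index into the page directory
    unsigned long pt_offs = virt >> 12;  // index into the flat PTE array

    // if there's no page table present
    if ((pd[pd_offs] & 1) == 0) {
        // map one: the "PTE" for this window address is the PDE itself
        MapMemory(GetFreePage(), 0xFFC00000 + (pd_offs << 12), some_flags);
    }
    // if there is a page present
    if ((pt[pt_offs] & 1) == 1) {
        // unmap it and remap this one, or something else.
        AddFreePage(UnMapMemory(virt));
    }
    // store the page frame address plus flags (flags must include present)
    pt[pt_offs] = (phys & 0xFFFFF000) | flags;
    return 0;
}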
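To see how the physical address ends up "pointing" at the virtual one, it may help to run the same idea on a host machine. The sketch below simulates a tiny physical memory as an array and walks it the way the MMU would; every name in it (phys_mem, sim_map, sim_translate, SIM_FAULT) is made up for this illustration, and it has no self-mapping, no TLB and no permission checks.

```c
#include <stdint.h>
#include <string.h>

#define PG_PRESENT 0x1u
#define PG_WRITE   0x2u
#define FRAMES     16u            /* simulated RAM for paging structures */
#define SIM_FAULT  0xFFFFFFFFu    /* stand-in for a page fault */

static uint32_t phys_mem[FRAMES][1024]; /* each frame = 1024 32-bit entries */
static uint32_t next_free = 1;          /* frame 0 holds the page directory */

/* Access a simulated frame by its physical address. */
static uint32_t *frame(uint32_t phys) { return phys_mem[phys >> 12]; }

/* Make 'virt' translate to 'phys'; cr3 is the directory's physical address. */
int sim_map(uint32_t cr3, uint32_t phys, uint32_t virt, uint32_t flags)
{
    uint32_t *pd = frame(cr3);
    uint32_t pdi = virt >> 22, pti = (virt >> 12) & 0x3FFu;

    if ((pd[pdi] & PG_PRESENT) == 0) {      /* no page table yet */
        if (next_free >= FRAMES)
            return -1;                      /* out of simulated frames */
        uint32_t pt_phys = next_free++ << 12;
        memset(frame(pt_phys), 0, 4096);    /* a new table starts empty */
        pd[pdi] = pt_phys | PG_PRESENT | PG_WRITE;
    }
    uint32_t *pt = frame(pd[pdi] & 0xFFFFF000u);
    pt[pti] = (phys & 0xFFFFF000u) | flags | PG_PRESENT;
    return 0;
}

/* Walk the tables like the MMU does: PDE, then PTE, then add the offset. */
uint32_t sim_translate(uint32_t cr3, uint32_t virt)
{
    uint32_t pde = frame(cr3)[virt >> 22];
    if (!(pde & PG_PRESENT))
        return SIM_FAULT;
    uint32_t pte = frame(pde & 0xFFFFF000u)[(virt >> 12) & 0x3FFu];
    if (!(pte & PG_PRESENT))
        return SIM_FAULT;
    return (pte & 0xFFFFF000u) | (virt & 0xFFFu);
}
```

After sim_map(0, 0x00ABC000, 0x12345000, PG_WRITE), translating 0x12345678 yields 0x00ABC678: nothing is stored "at" the virtual address, the virtual address only selects which entries are read.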
Re:Clarification on some aspects of paging.
Okay that clears it up.
Maybe once I get at least basic paging working, I can implement things like self-mapping the page tables and page directories.
Is there a difference between a page being global and simply mapping it into every process's address space? I know you set a bit in a control register and put a flag in the PTE (I think, I'll have to check my notes) but other than that I fail to see any difference.
Time to get coding. Thanks Candy.
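On the global question: as far as the Intel manuals describe it, a global page is still mapped into every address space in the usual way. The G flag (bit 8 of the PTE), together with the PGE flag (bit 7 of CR4), only keeps its TLB entry from being flushed when CR3 is reloaded. A small sketch of the bits involved; the helper names are invented, and actually writing CR4 needs ring 0 code that is not shown here.

```c
#include <stdint.h>

#define PG_PRESENT 0x001u  /* bit 0: P */
#define PG_WRITE   0x002u  /* bit 1: R/W */
#define PG_USER    0x004u  /* bit 2: U/S */
#define PG_GLOBAL  0x100u  /* bit 8: G, honoured only when CR4.PGE is set */
#define CR4_PGE    0x080u  /* bit 7 of CR4: page global enable */

/* Build a PTE from a page-aligned physical address and flag bits. */
uint32_t make_pte(uint32_t phys, uint32_t flags)
{
    return (phys & 0xFFFFF000u) | (flags & 0xFFFu);
}

/* Value to write back to CR4 to turn global pages on. */
uint32_t cr4_with_pge(uint32_t cr4) { return cr4 | CR4_PGE; }
```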