kernel address space

zebanilinux
Posts: 1
Joined: Thu Jul 17, 2008 5:13 pm

Post by zebanilinux »

Hi all,

Does the kernel access physical memory directly?

Say we have this C code:

void *address = (void *)0xabcd1234;

If we use this address in kernel mode, does it refer directly to that physical address?
bewing
Member
Posts: 1401
Joined: Wed Feb 07, 2007 1:45 pm
Location: Eugene, OR, US

Re: kernel address space

Post by bewing »

My kernel does, but most kernels do not. It depends on the kernel.
Most kernels run in virtual memory. I personally think that is a mistake.
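
To make "running in virtual memory" concrete for the address in the question: a minimal sketch, assuming classic 32-bit x86 paging with 4 KB pages and no PAE. With paging on, the CPU does not use 0xabcd1234 as a physical address; it splits it into three fields and looks the first two up in the page tables. The struct and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a 32-bit linear address under non-PAE paging with 4 KB pages.
 * Names are illustrative, not taken from any particular kernel. */
struct linear_parts {
    uint32_t dir_index;    /* bits 31..22: index into the page directory */
    uint32_t table_index;  /* bits 21..12: index into the page table */
    uint32_t offset;       /* bits 11..0: byte offset inside the page */
};

static struct linear_parts split_linear(uint32_t linear)
{
    struct linear_parts p;
    p.dir_index   = linear >> 22;
    p.table_index = (linear >> 12) & 0x3FFu;
    p.offset      = linear & 0xFFFu;
    /* Putting the three fields back together must give the original address. */
    assert(((p.dir_index << 22) | (p.table_index << 12) | p.offset) == linear);
    return p;
}
```

For 0xabcd1234 this gives directory index 0x2AF, table index 0x0D1 and offset 0x234. Only the offset is guaranteed to carry over unchanged into the physical address; the frame it lands in depends on what the page tables say.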
Adek336
Member
Posts: 129
Joined: Thu May 12, 2005 11:00 pm
Location: Kabaty, Warszawa

Re: kernel address space

Post by Adek336 »

Why do you think so?
01000101
Member
Posts: 1599
Joined: Fri Jun 22, 2007 12:47 pm

Re: kernel address space

Post by 01000101 »

I too use physical memory access.
I will withhold my comments on virtual memory.
bewing
Member
Posts: 1401
Joined: Wed Feb 07, 2007 1:45 pm
Location: Eugene, OR, US

Re: kernel address space

Post by bewing »

Adek336 wrote: Why do you think so?
One of the biggest reasons why the modern CPUs run as fast as they do is their caches. When you mess up a cache, the machine slows down a lot. The smaller the cache, the less it takes to mess it up. The smallest cache you have is the TLB. When you switch from usermode to kernelmode to handle a syscall, if the kernel uses virtual memory, you will toast the entire TLB (or at least a significant fraction) in the process of handling the syscall. And it is completely unnecessary -- it is not the tiniest bit difficult to write a kernel that knows how to live inside the restrictions of physical memory. If you turn vmem off immediately "on receipt" of a syscall, then all the usermode cached vmem stuff never gets dumped out of the TLB.

Unfortunately, you don't have a choice on a 64-bit system.
proxy
Member
Posts: 108
Joined: Wed Jan 19, 2005 12:00 am

Re: kernel address space

Post by proxy »

bewing wrote: One of the biggest reasons why the modern CPUs run as fast as they do is their caches. When you mess up a cache, the machine slows down a lot. The smaller the cache, the less it takes to mess it up. The smallest cache you have is the TLB. When you switch from usermode to kernelmode to handle a syscall, if the kernel uses virtual memory, you will toast the entire TLB (or at least a significant fraction) in the process of handling the syscall. And it is completely unnecessary -- it is not the tiniest bit difficult to write a kernel that knows how to live inside the restrictions of physical memory. If you turn vmem off immediately "on receipt" of a syscall, then all the usermode cached vmem stuff never gets dumped out of the TLB.
This is why the Global bit exists. Basically the entries marked global will not get flushed on CR3 write. This makes the User->Sys->User transition a lot less expensive.

proxy
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: kernel address space

Post by Brendan »

Hi,
bewing wrote: The smallest cache you have is the TLB.
It's small, but with 4 KB pages a typical CPU with 8192 TLB entries covers 32 MB of linear address space with those entries, which is much more than a "little" 2 MB L2 data cache covers... ;)
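
The arithmetic behind that comparison can be checked with a short sketch. The 8192-entry and 2 MB figures are the ones given in this post; the function names are hypothetical, and the model assumes every TLB entry maps exactly one page of the given size.

```c
#include <assert.h>
#include <stdint.h>

/* Address space covered by a TLB when every entry maps one page. */
static uint64_t tlb_reach_bytes(uint64_t entries, uint64_t page_size)
{
    return entries * page_size;
}

/* Number of pages (and so TLB entries) needed to map `bytes` of memory,
 * rounding up to whole pages. */
static uint64_t pages_to_cover(uint64_t bytes, uint64_t page_size)
{
    assert(page_size > 0);  /* guard the division below */
    return (bytes + page_size - 1) / page_size;
}
```

With these, tlb_reach_bytes(8192, 4096) is 32 MB, while pages_to_cover(2 MB, 4096) is only 512 entries, so under these assumptions the TLB reaches sixteen times further than the L2 cache holds.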
bewing wrote: When you switch from usermode to kernelmode to handle a syscall, if the kernel uses virtual memory, you will toast the entire TLB (or at least a significant fraction) in the process of handling the syscall.
No. If you switch from user-mode to kernel-mode then the kernel's code might use several TLB entries (but those TLB entries may have already been present in the TLB). If the kernel accesses many MB of data (which IMHO is extremely unlikely with a sane kernel) it would end up getting rid of all the "least recently used" user-mode TLB entries; or, if the kernel does a task switch the user-mode TLB entries will be flushed.

However, if the kernel disables paging all TLB entries will be flushed (including entries for "global" pages), regardless of whether or not a task switch is done. This would cause far worse performance problems than leaving paging enabled - a simple/fast kernel API function would cause a huge number of TLB misses to occur after it returns to user-mode.
bewing wrote: And it is completely unnecessary -- it is not the tiniest bit difficult to write a kernel that knows how to live inside the restrictions of physical memory.
For a toy kernel like Linux, I agree. When you start doing NUMA optimizations (or trying to do fault tolerance for faulty RAM) you can't assume that any specific area in the physical address space will be suitable, and something common like "the kernel's code starts at 0x00100000 in the physical address space" becomes far too restrictive.
bewing wrote: If you turn vmem off immediately "on receipt" of a syscall, then all the usermode cached vmem stuff never gets dumped out of the TLB.
Hehe - from Intel's manual, section 10.9. Invalidating the Translation Lookaside Buffers (TLBs):
"The following operations invalidate all TLB entries, irrespective of the setting of the G flag:
* Asserting or de-asserting the FLUSH# pin.
* (Pentium 4, Intel Xeon, and P6 family processors only.) Writing to an MTRR (with a WRMSR instruction).
* Writing to control register CR0 to modify the PG or PE flag.
* (Pentium 4, Intel Xeon, and P6 family processors only.) Writing to control register CR4 to modify the PSE, PGE or PAE flag.
"

If you turn vmem off immediately "on receipt" of a syscall (which involves writing to control register CR0 to modify the PG flag), then you'll be completely flushing all TLB entries for every syscall.
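
A rough model of that cost, as a sketch only: it encodes just the CR0 rule from the list quoted above (a write that changes PG or PE invalidates every TLB entry) and counts how often that happens over a sequence of CR0 values. PE is bit 0 and PG is bit 31 of CR0; the function names are hypothetical, and nothing here touches a real control register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CR0_PE (1u << 0)   /* protection enable */
#define CR0_PG (1u << 31)  /* paging enable */

/* True if replacing old_cr0 with new_cr0 changes PG or PE, which per the
 * quoted list invalidates all TLB entries, global or not. */
static bool cr0_write_flushes_tlb(uint32_t old_cr0, uint32_t new_cr0)
{
    return ((old_cr0 ^ new_cr0) & (CR0_PG | CR0_PE)) != 0;
}

/* Counts full TLB flushes over a sequence of CR0 values, where values[0]
 * is the starting state and each later element is one CR0 write. */
static unsigned count_full_flushes(const uint32_t *values, size_t n)
{
    assert(values != NULL && n > 0);
    unsigned flushes = 0;
    for (size_t i = 1; i < n; i++) {
        if (cr0_write_flushes_tlb(values[i - 1], values[i]))
            flushes++;
    }
    return flushes;
}
```

Feeding it the sequence "paging on, paging off, paging on" for one syscall gives two full flushes: one on entry and one on the way back to user mode.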
proxy wrote: This is why the Global bit exists. Basically the entries marked global will not get flushed on CR3 write. This makes the User->Sys->User transition a lot less expensive.
Um, no - the global bit makes address space switches less expensive (e.g. process switches), not privilege level switches (CPL=3 -> CPL=0 -> CPL=3).
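
For reference, a sketch of what "marking an entry global" looks like in a 32-bit non-PAE page table entry. The G flag is bit 8 of the entry and only takes effect when CR4.PGE (bit 7 of CR4) is set. The helper names are invented for illustration, and the second function is just a truth table for the CR3-reload case discussed here, not a model of a real TLB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_PRESENT  (1u << 0)
#define PTE_WRITABLE (1u << 1)
#define PTE_USER     (1u << 2)
#define PTE_GLOBAL   (1u << 8)  /* G flag */
#define CR4_PGE      (1u << 7)  /* page global enable */

/* Builds a supervisor-only, writable, global PTE for a 4 KB frame. */
static uint32_t make_kernel_pte(uint32_t frame_phys)
{
    assert((frame_phys & 0xFFFu) == 0);  /* frame must be 4 KB aligned */
    return frame_phys | PTE_PRESENT | PTE_WRITABLE | PTE_GLOBAL;
}

/* Would a TLB entry loaded from this PTE survive a write to CR3?
 * Only if the entry is global and global pages are enabled in CR4. */
static bool survives_cr3_reload(uint32_t pte, uint32_t cr4)
{
    return (cr4 & CR4_PGE) != 0 && (pte & PTE_GLOBAL) != 0;
}
```

A kernel mapping built with make_kernel_pte(0x00100000) comes out as 0x00100103 and survives an address space switch when PGE is on, whereas a user mapping without the G flag does not. Nothing in this changes on a plain CPL=3 to CPL=0 transition, since that does not write CR3 at all.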


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
bewing
Member
Posts: 1401
Joined: Wed Feb 07, 2007 1:45 pm
Location: Eugene, OR, US

Re: kernel address space

Post by bewing »

Brendan wrote: Hehe - from Intel's manual, section 10.9. Invalidating the Translation Lookaside Buffers (TLBs):
Huh. Well, that's incredibly stupid of Intel. I hadn't noticed that detail. *sigh*
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: kernel address space

Post by Brendan »

Hi,
bewing wrote:
Brendan wrote: Hehe - from Intel's manual, section 10.9. Invalidating the Translation Lookaside Buffers (TLBs):
Huh. Well, that's incredibly stupid of Intel. I hadn't noticed that detail. *sigh*
Sorry - I hope this doesn't ruin your kernel's design...


Cheers,

Brendan