Hi all,
Does the kernel access physical memory directly?
Let's say in C we have the code:
void *address = (void *)0xabcd1234;
If we use this address in kernel mode, does it access that physical address directly?
kernel address space
Re: kernel address space
My kernel does, but most kernels do not. It depends on the kernel.
Most kernels run in virtual memory. I personally think that is a mistake.
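To make the difference concrete: with 32-bit x86 paging enabled (4 KB pages, no PAE, both assumed here), a pointer value such as 0xabcd1234 is treated as a linear address and is split into a page-directory index, a page-table index, and a page offset before any physical address is produced. A minimal sketch of that split follows; the helper names are made up for illustration and are not from any particular kernel.

```c
#include <stdint.h>

/* Assumption: 32-bit x86 paging, 4 KB pages, no PAE. */

/* Bits 31..22 select the page-directory entry. */
static inline uint32_t pd_index(uint32_t linear)
{
    return linear >> 22;
}

/* Bits 21..12 select the page-table entry. */
static inline uint32_t pt_index(uint32_t linear)
{
    return (linear >> 12) & 0x3FFu;
}

/* Bits 11..0 are the byte offset inside the 4 KB page. */
static inline uint32_t page_offset(uint32_t linear)
{
    return linear & 0xFFFu;
}
```

For 0xabcd1234 this gives directory index 0x2AF, table index 0x0D1, and offset 0x234. Only when paging is off, or when the page tables happen to identity-map that page, does the pointer value equal the physical address.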
Re: kernel address space
Why do you think so?
Re: kernel address space
I too use physical memory access.
I will withhold my comments on virtual memory.
Website: https://joscor.com
Re: kernel address space
Adek336 wrote: Why do you think so?
One of the biggest reasons why the modern CPUs run as fast as they do is their caches. When you mess up a cache, the machine slows down a lot. The smaller the cache, the less it takes to mess it up. The smallest cache you have is the TLB. When you switch from usermode to kernelmode to handle a syscall, if the kernel uses virtual memory, you will toast the entire TLB (or at least a significant fraction) in the process of handling the syscall. And it is completely unnecessary -- it is not the tiniest bit difficult to write a kernel that knows how to live inside the restrictions of physical memory. If you turn vmem off immediately "on receipt" of a syscall, then all the usermode cached vmem stuff never gets dumped out of the TLB.
Unfortunately, you don't have a choice on a 64bit system.
Re: kernel address space
bewing wrote: One of the biggest reasons why the modern CPUs run as fast as they do is their caches. When you mess up a cache, the machine slows down a lot. The smaller the cache, the less it takes to mess it up. The smallest cache you have is the TLB. When you switch from usermode to kernelmode to handle a syscall, if the kernel uses virtual memory, you will toast the entire TLB (or at least a significant fraction) in the process of handling the syscall. And it is completely unnecessary -- it is not the tiniest bit difficult to write a kernel that knows how to live inside the restrictions of physical memory. If you turn vmem off immediately "on receipt" of a syscall, then all the usermode cached vmem stuff never gets dumped out of the TLB.
This is why the Global bit exists. Basically the entries marked global will not get flushed on CR3 write. This makes the User->Sys->User transition a lot less expensive.
proxy
Re: kernel address space
Hi,
bewing wrote: The smallest cache you have is the TLB.
It's small, but (for 4 KB pages) for a typical CPU with 8192 TLB entries those TLB entries cover 32 MB of linear address space, which is much larger than a "little" 2 MB L2 data cache will cover...
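The arithmetic behind that figure is just entries times page size. A small sketch, using the 8192-entry and 4 KB numbers from the post (the function name is hypothetical):

```c
#include <stdint.h>

/* Linear address range a fully populated TLB can cover, in bytes.
 * Assumes every entry maps a page of the same size. */
static inline uint64_t tlb_reach_bytes(uint64_t entries, uint64_t page_size)
{
    return entries * page_size;
}
```

With 8192 entries and 4096-byte pages the result is 33554432 bytes, i.e. 32 MB, which is 16 times the 2 MB L2 cache used for comparison.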
bewing wrote: When you switch from usermode to kernelmode to handle a syscall, if the kernel uses virtual memory, you will toast the entire TLB (or at least a significant fraction) in the process of handling the syscall.
No. If you switch from user-mode to kernel-mode then the kernel's code might use several TLB entries (but those TLB entries may have already been present in the TLB). If the kernel accesses many MB of data (which IMHO is extremely unlikely with a sane kernel) it would end up getting rid of all the "least recently used" user-mode TLB entries; or, if the kernel does a task switch the user-mode TLB entries will be flushed.
However, if the kernel disables paging all TLB entries will be flushed (including entries for "global" pages), regardless of whether or not a task switch is done. This would cause far worse performance problems than leaving paging enabled - a simple/fast kernel API function would cause a huge number of TLB misses to occur after it returns to user-mode.
bewing wrote: And it is completely unnecessary -- it is not the tiniest bit difficult to write a kernel that knows how to live inside the restrictions of physical memory.
For a toy kernel like Linux, I agree. When you start doing NUMA optimizations (or trying to do fault tolerance for faulty RAM) you can't assume that any specific area in the physical address space will be suitable, and something common like "the kernel's code starts at 0x00100000 in the physical address space" becomes far too restrictive.
bewing wrote: If you turn vmem off immediately "on receipt" of a syscall, then all the usermode cached vmem stuff never gets dumped out of the TLB.
Hehe - from Intel's manual, section 10.9. Invalidating the Translation Lookaside Buffers (TLBs):
"The following operations invalidate all TLB entries, irrespective of the setting of the G flag:
* Asserting or de-asserting the FLUSH# pin.
* (Pentium 4, Intel Xeon, and P6 family processors only.) Writing to an MTRR (with a WRMSR instruction).
* Writing to control register CR0 to modify the PG or PE flag.
* (Pentium 4, Intel Xeon, and P6 family processors only.) Writing to control register CR4 to modify the PSE, PGE or PAE flag."
If you turn vmem off immediately "on receipt" of a syscall (which involves writing to control register CR0 to modify the PG flag), then you'll be completely flushing all TLB entries for every syscall.
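The rules quoted from the manual can be written down as a check on which control-register bits a write changes. This is only a user-space sketch of the quoted list, not kernel code: the bit positions are the architectural ones (CR0.PE bit 0, CR0.PG bit 31, CR4.PSE bit 4, CR4.PAE bit 5, CR4.PGE bit 7), while the function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_PE  (1u << 0)   /* protection enable */
#define CR0_PG  (1u << 31)  /* paging enable */
#define CR4_PSE (1u << 4)   /* page size extensions */
#define CR4_PAE (1u << 5)   /* physical address extension */
#define CR4_PGE (1u << 7)   /* global pages enable */

/* True if a CR0 write changes PG or PE, which invalidates every
 * TLB entry, global ones included. */
static inline bool cr0_write_flushes_all(uint32_t old_cr0, uint32_t new_cr0)
{
    return ((old_cr0 ^ new_cr0) & (CR0_PG | CR0_PE)) != 0;
}

/* True if a CR4 write changes PSE, PGE or PAE (per the quote, on
 * Pentium 4, Intel Xeon and P6 family processors). */
static inline bool cr4_write_flushes_all(uint32_t old_cr4, uint32_t new_cr4)
{
    return ((old_cr4 ^ new_cr4) & (CR4_PSE | CR4_PGE | CR4_PAE)) != 0;
}
```

Clearing PG on syscall entry and setting it again on exit changes CR0.PG twice, so under these rules each syscall would pay for two complete TLB invalidations.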
proxy wrote: This is why the Global bit exists. Basically the entries marked global will not get flushed on CR3 write. This makes the User->Sys->User transition a lot less expensive.
Um, no - the global bit makes address space switches less expensive (e.g. process switches), not privilege level switches (CPL=3 -> CPL=0 -> CPL=3).
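A toy model may help separate the two cases. It treats the TLB as an array of entries with a valid flag and a global flag: a CR3 write (address space switch) drops only the non-global entries, while toggling CR0.PG drops everything. The struct and function names are hypothetical, and real hardware is of course not a software array.

```c
#include <stdbool.h>
#include <stddef.h>

/* One simulated TLB entry; 'global' mirrors the G flag of the page. */
struct tlb_entry {
    bool valid;
    bool global;
};

/* CR3 write: non-global entries are invalidated, global ones survive. */
static inline void flush_on_cr3_write(struct tlb_entry *tlb, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!tlb[i].global)
            tlb[i].valid = false;
    }
}

/* CR0.PG or CR0.PE change: every entry is invalidated, G flag or not. */
static inline void flush_on_paging_toggle(struct tlb_entry *tlb, size_t n)
{
    for (size_t i = 0; i < n; i++)
        tlb[i].valid = false;
}

/* Number of entries that are still usable. */
static inline size_t count_valid(const struct tlb_entry *tlb, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (tlb[i].valid)
            count++;
    }
    return count;
}
```

A plain privilege level switch touches neither CR3 nor CR0, so in this model it invalidates nothing at all; the global flag only matters once CR3 is reloaded.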
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Re: kernel address space
Brendan wrote: Hehe - from Intel's manual, section 10.9. Invalidating the Translation Lookaside Buffers (TLBs):
Huh. Well, that's incredibly stupid of Intel. I hadn't noticed that detail. *sigh*
Re: kernel address space
Hi,
Brendan wrote: Hehe - from Intel's manual, section 10.9. Invalidating the Translation Lookaside Buffers (TLBs):
bewing wrote: Huh. Well, that's incredibly stupid of Intel. I hadn't noticed that detail. *sigh*
Sorry - I hope this doesn't ruin your kernel's design...
Cheers,
Brendan