Long Mode
Hey all,
I was wondering if anyone working on a 64-bit kernel had tried to see if there were any "backdoors" (as is usual with Intel/AMD x86) to get into Long Mode and then not use paging?
TBH I can't see a good reason why paging is the only option available in 64-bit, as opposed to offering flat as well (as it was with 32-bit pmode).
Obviously almost all OSes will use paging, but surely there are some good reasons NOT to use it?
John
Re: Long Mode
johnsa wrote: I can't see a good reason why paging is the only option available in 64bit as opposed to offering flat as well (as it was with 32bit pmode).
No doubt it simplifies processor design, and allows certain optimizations.
JAL
Re: Long Mode
Speaking of which.. has anyone done any performance tests on flat vs paged memory to see how much difference it makes when the CPU has to look everything up in the page tables?
Perhaps for some different operations.. like copying/filling a block of memory which is sub-page-size (like 1 KB), and doing large ops like a 1 MB copy from array 1 to array 2.. etc
I remember in the 486 days paging was much slower, around 15-20%.. curious if this is still the case or if the CPUs have evolved and optimized away the performance issues of paging in 32-bit pmode.
Re: Long Mode
johnsa wrote: Speaking of which.. has anyone done any performance tests on flat vs paged memory to see how much difference is made by having the CPU have to look everything up in the page tables?
Perhaps for some different operations.. like copying/filling memory for a block which is sub-page-size (like 1kb) and doing large ops like 1meg copy from array 1 to array 2.. etc
I remember in the 486 days paging was much slower, around 15-20%.. curious if this is still the case or if the CPU's have evolved and optimized away the performance issues of paging in 32bit pmode.
Don't forget about the TLB and the other paging caches that exist in all modern CPUs. It is a simple arithmetic task to show how often you will actually walk your page tables and how often you will hit in the TLB. The largest load you can do on x86 is 16 bytes long. The page size is 4K, and I am not even talking about 2 MB or 1 GB pages. So how many page walks will you do for one page copy?
When you copy large blocks of memory (say 1 MB, although 1 MB is not large enough because it fits in L2/L3 completely) you will be completely limited by cache/memory bandwidth. The Core 2 Duo L2 can't handle more than 16 bytes per cycle, with a latency of 18 clocks. The buffers inside the core are not large enough to handle it without stalling the whole pipeline. Barcelona is even worse.
It looks like you are too lazy to understand it, prefer no-paging mode instead, and are looking for an excuse.
Stanislav
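To put numbers on the arithmetic above, here is a minimal sketch in C. It assumes page-aligned buffers and the worst case of one page walk per page touched; the function names are made up for illustration and do not come from any kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Number of loads of `load_bytes` each needed to read `bytes` of memory. */
static size_t loads_needed(size_t bytes, size_t load_bytes)
{
    assert(load_bytes > 0);
    return (bytes + load_bytes - 1) / load_bytes;
}

/* Number of distinct pages touched by a page-aligned buffer of `bytes`.
 * Assumption: at worst, each page touched costs one TLB miss (one walk). */
static size_t pages_touched(size_t bytes, size_t page_bytes)
{
    assert(page_bytes > 0);
    return (bytes + page_bytes - 1) / page_bytes;
}
```

With 16-byte loads and 4K pages, `loads_needed(4096, 16)` is 256, so at worst one walk is spread over 256 loads per page. A 1 MB buffer touches 256 pages of 4K, or a single 2 MB page.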
Re: Long Mode
I do understand it and have implemented it before. But I don't necessarily see a lot of benefit to paging except in terms of building an OS for end-users, where you want user processes separated and the ability to fault on page access etc.. IE it's all about it being a security mechanism rather than a critical feature. Yes, for PAE it's essential if you want more than 4 GB, but that is no longer the issue it was when PAE originated pre-64-bit.. if you want more than 4 GB.. go 64... imho. Mapping each process into a virtual 0-4 GB space isn't a necessity for me, nor is protecting memory from being overwritten. If I was building an end-user OS it would be, and I'd agree with paging 100%.
I'm just curious if anyone had at least tried to see if there was a back door mode in long mode (ala unreal mode etc) to switch off paging.
Re: Long Mode
johnsa wrote: it's all about it being a security mechanism rather than a critical feature.
That's not the only reason. Paging allows you to run multiple programs that expect to be in the same address space without moving around a ton of memory. It also allows for disk mapping and virtual memory. Don't dismiss it because it's inconvenient sometimes.
Re: Long Mode
I don't really see its inconvenience (then again I'm writing a general-purpose OS). If you're going to do multitasking, it's actually convenient to use paging.
Working On:Bootloader, RWFS Image Program
Leviathan: http://leviathanv.googlecode.com
Kernel:Working on Design Doc
Re: Long Mode
I personally think virtual memory is a lousy idea.. (imho) ... at least when implemented ala Windows style. Machines on other architectures managed multitasking and dealt with relocations with no trouble and without a paging mechanism.. Amiga/68k eg?
In any event I guess we could debate pros and cons for hours.. My only two real questions were: firstly, whether anyone had measured the performance difference (if any).. IE: use rdtsc, switch to pmode non-paged, do some memory copies/writes of various sizes.. and then do the same in paged mode and see if there is any difference at all. And secondly, whether anyone working on a 64-bit kernel had tried looking for some backdoors to get long mode up without paging.. I wouldn't put it past AMD/Intel to have a way to do it that just isn't documented.
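The timing half of that experiment could look roughly like the sketch below. It is only a sketch under assumptions: it uses the GCC/Clang `__rdtsc()` intrinsic on x86 (falling back to `clock()` elsewhere), the names `read_ticks` and `time_copy` are invented here, and the actual switch between paged and non-paged pmode has to happen in the kernel around it. It does not answer the question by itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
/* Read the CPU time-stamp counter (GCC/Clang intrinsic). */
static uint64_t read_ticks(void) { return __rdtsc(); }
#else
/* Fallback for non-x86 builds: coarse, but keeps the sketch compilable. */
static uint64_t read_ticks(void) { return (uint64_t)clock(); }
#endif

/* Copy `len` bytes `reps` times and return the smallest tick count seen.
 * Taking the minimum filters out interrupts and cold-cache first runs. */
static uint64_t time_copy(void *dst, const void *src, size_t len, int reps)
{
    uint64_t best = UINT64_MAX;
    assert(reps > 0);
    for (int i = 0; i < reps; i++) {
        uint64_t start = read_ticks();
        memcpy(dst, src, len);
        /* Compiler barrier so the copy is not optimized away. */
        __asm__ volatile("" : : "r"(dst) : "memory");
        uint64_t elapsed = read_ticks() - start;
        if (elapsed < best)
            best = elapsed;
    }
    return best;
}
```

Run it once with paging off and once with paging on, for sizes such as 1 KB and 1 MB, and compare the minimum tick counts. Note that rdtsc is not a serializing instruction, so very short copies will be noisy.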
Re: Long Mode
I've developed a few 64-bit kernels.
The only real speed gain I've seen is when using optimized memcpy's with 64-bit registers instead of 32-bit, but even then, I usually use SSE for those memory functions anyway if possible, so it's hard to judge.
Getting an accurate comparison between paging and non-paging kernels would be near-impossible. Comparing 64-bit to 32-bit is difficult as well, since I'd imagine it would run much differently in emulators than on real hardware, as they usually associate speed with code size.
I know of no tricks to get into Long Mode without enabling paging, as the core concept of Long Mode relies on the PML4 to extend the addressable area.
btw, I just thought of something that *may* be an arguable speed increase on 64-bit systems: the use of scratch registers for MEMORY/INTEGER function arguments instead of the stack.
Website: https://joscor.com
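Since paging can't be avoided, the closest thing to "flat" in Long Mode is an identity map built from large pages. Below is a minimal sketch that fills in tables mapping the first 1 GB onto itself with 2 MB pages. The struct and function names are hypothetical, and the physical addresses of the tables are passed in by the caller because how you obtain them depends on your boot environment.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_PRESENT  (1ull << 0)
#define PTE_WRITABLE (1ull << 1)
#define PTE_HUGE     (1ull << 7)   /* PS bit: a PD entry maps a 2 MB page */
#define ENTRIES      512

/* Three page-aligned tables: enough to identity-map 1 GB with 2 MB pages. */
struct flat_tables {
    _Alignas(4096) uint64_t pml4[ENTRIES];
    _Alignas(4096) uint64_t pdpt[ENTRIES];
    _Alignas(4096) uint64_t pd[ENTRIES];
};

/* Fill the tables so that virtual address == physical address for 0..1 GB.
 * pdpt_phys / pd_phys are the physical addresses of t->pdpt and t->pd. */
static void build_identity_map(struct flat_tables *t,
                               uint64_t pdpt_phys, uint64_t pd_phys)
{
    assert((pdpt_phys & 0xFFF) == 0 && (pd_phys & 0xFFF) == 0);
    for (int i = 0; i < ENTRIES; i++) {
        t->pml4[i] = 0;
        t->pdpt[i] = 0;
        /* Entry i maps the 2 MB page starting at physical i * 2 MB. */
        t->pd[i] = ((uint64_t)i << 21) | PTE_PRESENT | PTE_WRITABLE | PTE_HUGE;
    }
    t->pml4[0] = pdpt_phys | PTE_PRESENT | PTE_WRITABLE;
    t->pdpt[0] = pd_phys | PTE_PRESENT | PTE_WRITABLE;
}
```

After this, the physical address of `pml4` goes into CR3 before paging is enabled. With 512 entries covering 1 GB, TLB pressure is far lower than with 4K pages.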
Re: Long Mode
johnsa wrote: I personally think virtual memory is a lousy idea..
Uh... why?
It lets a program map in any file it needs, including its code, and share it with other programs that are using the same data. What's not to love?
Re: Long Mode
It might put a slight damper on performance, but it makes memory management much easier.
Re: Long Mode
Hi,
stlw wrote: When you copy large blocks of memory (say 1Mb, although 1Mb is not large enough because it fits in L2/L3 completely) you will be completely limited by cache/memory bandwidth.
For long mode paging, you can copy or move PML4 entries (or PDP entries, or page directory entries, or page table entries) instead of copying/moving individual bytes. For "best case" this works out to about 30 billion times faster than memory bandwidth.
If you think paging is slow, then you're not using it right...
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
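As a rough illustration of moving entries instead of bytes, the sketch below shows how much address space a single 8-byte entry stands for at each level, and what "moving" a region amounts to. The function names are invented for this example, and the TLB invalidation a real kernel needs afterwards is only noted in a comment.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of virtual address space covered by one table entry.
 * level 1 = page table, 2 = page directory, 3 = PDP, 4 = PML4. */
static uint64_t bytes_per_entry(int level)
{
    assert(level >= 1 && level <= 4);
    return 1ull << (12 + 9 * (level - 1));
}

/* "Move" everything mapped by one entry by moving the entry itself. */
static void move_entry(uint64_t *table, unsigned from, unsigned to)
{
    assert(from < 512 && to < 512);
    table[to] = table[from];
    table[from] = 0;   /* mark the old slot not present */
    /* A real kernel must also flush stale TLB entries here
     * (invlpg per page, or a CR3 reload). */
}
```

One PML4 entry covers 512 GB, a PDP entry 1 GB, a page directory entry 2 MB and a page table entry 4K, so a single 8-byte write can relocate any of those amounts.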
Re: Long Mode
Surely if you copy the PML4 or table entries you'd kind of be defeating the objective of having a copy.. as you'd end up with two different ranges of addresses mapped to the same physical data? You'd seldom copy something if you weren't going to do anything destructive to it.
Re: Long Mode
Hi,
johnsa wrote: Surely if you copy the PML or table entries you'd kind of be defeating the objective of having a copy.. as you'd land up with two different ranges of addresses mapped to the same physical data? You'd seldom copy something if you weren't going to do anything destructive to it.
See "copy-on-write". Now consider how you'd implement "fork()" for potentially huge (64-bit) processes without an insane amount of overhead.
Cheers,
Brendan
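For anyone unfamiliar with copy-on-write, here is a minimal sketch of the page-table side of it. The assumptions are loud ones: bit 9 is used as a software "COW" marker (bits 9-11 of an entry are ignored by the CPU and left to the OS, but which one you pick is your choice), the names are invented, and allocating the new frame, copying the page contents, reference counting and TLB flushes are all left out.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_PRESENT  (1ull << 0)
#define PTE_WRITABLE (1ull << 1)
#define PTE_COW      (1ull << 9)  /* OS-available bit; its use as COW is an assumption */

/* fork() time: share every page between parent and child. Writable pages
 * become read-only and are tagged COW, so the first write will fault. */
static void cow_share(uint64_t *parent, uint64_t *child, int n)
{
    for (int i = 0; i < n; i++) {
        if ((parent[i] & PTE_PRESENT) && (parent[i] & PTE_WRITABLE))
            parent[i] = (parent[i] & ~PTE_WRITABLE) | PTE_COW;
        child[i] = parent[i];
    }
}

/* Write-fault time: point the faulting entry at a private frame.
 * Returns 0 if the page was not COW (a genuine protection fault).
 * Only the low 12 flag bits are carried over in this sketch. */
static int cow_resolve(uint64_t *pte, uint64_t new_frame_phys)
{
    if (!(*pte & PTE_COW))
        return 0;
    uint64_t flags = (*pte & 0xFFF & ~PTE_COW) | PTE_WRITABLE;
    *pte = new_frame_phys | flags;
    return 1;
}
```

The point is that `cow_share` touches only table entries, so fork() costs nothing per byte of process memory up front; real copying happens one page at a time, and only for pages that are actually written.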
Re: Long Mode
I heard that you can get to long mode by changing a bit in an MSR.
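The MSR in question is EFER (address 0xC0000080), and its LME bit only enables long mode. As documented, the CPU makes it active (and sets the read-only LMA bit) once paging is also turned on with PAE enabled. A simplified model of that condition is sketched below; the function name is made up, and it ignores details such as CR0.PE and the fault you get when setting CR0.PG with LME set but PAE clear.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MSR_EFER  0xC0000080u    /* Extended Feature Enable Register */
#define EFER_LME  (1ull << 8)    /* Long Mode Enable: set by software */
#define EFER_LMA  (1ull << 10)   /* Long Mode Active: set by the CPU */
#define CR0_PG    (1ull << 31)   /* paging enable */
#define CR4_PAE   (1ull << 5)    /* physical address extension */

/* Simplified model: long mode is active only when it has been enabled
 * in EFER *and* paging is on with PAE. LME alone is not enough. */
static bool long_mode_active(uint64_t efer, uint64_t cr0, uint64_t cr4)
{
    return (efer & EFER_LME) && (cr0 & CR0_PG) && (cr4 & CR4_PAE);
}
```

So `long_mode_active(EFER_LME, 0, CR4_PAE)` is false: flipping the MSR bit without setting CR0.PG leaves the CPU in legacy protected mode.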