Combuster wrote: Basically, your assumption only holds for simple embedded devices
It is not my assumption, it is my goal. It is not an OS for desktops.
General protection in 64bit mode (Intel)
Re: General protection in 64bit mode (Intel)
Casm wrote: bluemoon wrote: It should be at least a little while before anybody needs more than the 64 GB they can get with physical address extension in 32-bit mode.
32bit is not a solution either for me; I will lose additional registers, including SSE, and I need those.
Re: General protection in 64bit mode (Intel)
It is worth considering that the raw access speed of individual bytes in memory matters less to overall performance.
Without paging, a system will most likely need many memory copies everywhere, so the overall workload increases.
Re: General protection in 64bit mode (Intel)
Quote: 32bit is not a solution either for me
Are we allowed to know what solution fits for you? Did you reckon the advantages of paging?
Besides, what trouble are you having setting up paging?
Programming is not about using a language to solve a problem, it's about using logic to find a solution !
Re: General protection in 64bit mode (Intel)
Chandra wrote: Quote: 32bit is not a solution either for me
Are we allowed to know what solution fits for you? Did you reckon the advantages of paging?
Besides, what trouble are you having setting up paging?
No troubles yet. I just wanted to do segment limit checking in 64-bit mode (because it is faster), but as I see it, that is not possible.
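For readers who have not used it, the segment limit check under discussion can be modelled roughly as below. This is only a Python sketch of the rule for an expand-up data segment in 32-bit protected mode; the names `check_segment_access` and `GeneralProtectionFault` are made up for illustration and say nothing about how the hardware implements it.

```python
class GeneralProtectionFault(Exception):
    """Stands in for the CPU's #GP exception (hypothetical name)."""


def check_segment_access(offset: int, size: int, limit: int) -> int:
    """Model the limit check for an expand-up data segment.

    `limit` is the highest valid offset in the segment, so the last byte
    touched (offset + size - 1) must not exceed it.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    last_byte = offset + size - 1
    if offset < 0 or last_byte > limit:
        raise GeneralProtectionFault(
            f"access {offset:#x}..{last_byte:#x} exceeds limit {limit:#x}"
        )
    return offset
```

A 4-byte access at offset 0xFFC in a segment with limit 0xFFF passes; the same access at 0xFFD faults, because its last byte would be at 0x1000.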
And I want to note that the removal of this feature had a big impact on some applications. For example, VMware was not able to run in 64-bit mode until virtualization instructions were released in later processor revisions:
http://www.pagetable.com/?p=25
Regards
Re: General protection in 64bit mode (Intel)
@ nulik: Dear Sir, if you look into the mirror, do you - by any chance - look somewhat alike rdos, perhaps with a sock pulled over his hand?
I just ask because you are the second person insisting on using segmentation instead of paging for protection, and using quite similar arguments (pointing to years-old papers).
Segmentation "being faster"? Assumptions and presumptions prove nothing. Blaming the CPU designers for taking away your toys solves nothing. It is as it is, cope, or design your own CPU, and toolchain, and language, capable of coping with segmentation.
Oh wait, you want to code your OS in ASM only, too, do you?
As for the VMware article, did you catch the last paragraph where the author shakes his head in wonder why VMware didn't use paging for protection in the first place...?
Every good solution is obvious once you've found it.
Re: General protection in 64bit mode (Intel)
Solar wrote: @ nulik: Dear Sir, if you look into the mirror, do you - by any chance - look somewhat alike rdos, perhaps with a sock pulled over his hand?
I just ask because you are the second person insisting on using segmentation instead of paging for protection, and using quite similar arguments (pointing to years-old papers).
But I've used paging for 23 years. I use both paging and segmentation.
Solar wrote: Segmentation "being faster"? Assumptions and presumptions prove nothing. Blaming the CPU designers for taking away your toys solves nothing. It is as it is, cope, or design your own CPU, and toolchain, and language, capable of coping with segmentation.
Segmentation is primarily a debugging tool. And it avoids putting device drivers in their own address spaces. As soon as the debugging phase is over, much of the segment-based protection can be disabled. I think I posted code on just how to do this previously for ACPI.
Solar wrote: Oh wait, you want to code your OS in ASM only, too, do you?
I will most likely write some major pieces of C code in the kernel in the near future. Not that I will ever provide the scheduler or memory manager in C, but some fairly complex device drivers would quite likely be in C or C++.
Re: General protection in 64bit mode (Intel)
Uh... I understood your memory model to be non-flat?
Every good solution is obvious once you've found it.
Re: General protection in 64bit mode (Intel)
Solar wrote: Uh... I understood your memory model to be non-flat?
The kernel device-driver model uses one code selector and one data selector (DGROUP) per device driver for protection. All pointers to data are 48-bit. Today all applications are flat (slightly modified PE format), but segmented applications are still supported. Paging is used for virtual/physical memory allocation, and for running applications in separate address spaces.
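To make the 48-bit pointer concrete: on x86 a far pointer of this kind is a 16-bit selector plus a 32-bit offset, stored in memory with the offset first. The sketch below packs and unpacks one in Python. The helper names and the selector value used in the usage note are hypothetical and not taken from the kernel described above.

```python
import struct


def pack_far_pointer(selector: int, offset: int) -> bytes:
    """Encode a 16:32 far pointer as 6 little-endian bytes (offset first)."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("selector must fit in 16 bits")
    if not 0 <= offset <= 0xFFFFFFFF:
        raise ValueError("offset must fit in 32 bits")
    # "<IH" = little-endian 32-bit offset followed by 16-bit selector, no padding.
    return struct.pack("<IH", offset, selector)


def unpack_far_pointer(raw: bytes) -> tuple[int, int]:
    """Decode 6 bytes back into (selector, offset)."""
    offset, selector = struct.unpack("<IH", raw)
    return selector, offset
```

For instance, `pack_far_pointer(0x0028, 0x00401000)` yields six bytes, and `unpack_far_pointer` on those bytes gives back `(0x28, 0x401000)`.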
- DavidCooper
Re: General protection in 64bit mode (Intel)
I don't use paging at all at the moment, but I certainly do plan to add it at some stage, so that when the OS runs out of usable memory space due to fragmentation it will be able to switch to using paging. That will slow it down a little (I don't know how much), but it will enable more things to be packed into the extra space made available, and it is well worth compromising on raw speed at that point. However, it seems to me that 3GB of usable space is a massive amount to play with, so if your OS and the apps you're using are guaranteed safe and stable and you don't need any protection features, why use paging before you're actually forced to? If the speed gain is significant, it's clearly worth having. Exactly the same would apply to running in 64-bit mode if the facility to run in that mode without paging had been provided.
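As a rough illustration of why paging helps once memory is fragmented, here is a toy model in Python. It assumes 4 KiB pages and uses a flat dictionary in place of real page tables. The frame numbers are invented, and `translate` is a hypothetical helper, not part of any OS mentioned here.

```python
PAGE_SIZE = 4096  # 4 KiB pages, the smallest x86 page size


def translate(page_table: dict[int, int], virtual_address: int) -> int:
    """Translate a virtual address via a flat {virtual page: physical frame} map."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    if page not in page_table:
        raise LookupError(f"page fault at {virtual_address:#x}")
    return page_table[page] * PAGE_SIZE + offset


# Three free frames scattered through physical memory (made-up numbers)...
free_frames = [0x120, 0x07, 0x3A5]
# ...mapped to three consecutive virtual pages starting at page 0x400.
page_table = {0x400 + i: frame for i, frame in enumerate(free_frames)}
```

The program sees one contiguous 12 KiB region starting at 0x400000, although the frames behind it are nowhere near each other physically.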
Help the people of Laos by liking - https://www.facebook.com/TheSBInitiative/?ref=py_c
MSB-OS: http://www.magicschoolbook.com/computing/os-project - direct machine code programming
Re: General protection in 64bit mode (Intel)
rdos wrote: Even if it were 30 years old it would still show the obvious that a system without paging is faster than a system with paging. I don't understand how anybody can argue otherwise.
I can argue it like this:
The key here is the word 'system'. Faster can be measured lots of ways. The perceived speed of modern operating systems benefits from the OS features that can be built on paging like mmap-ing files, demand loading executable images, copy-on-write, forking, passing data by paging, remapping DMA buffers, etc. The speed-up provided by these features outweighs the occasional slow-ups caused by having to do page translations.
There may be a penalty at the single-instruction level (and be sure this is mitigated by the TLBs working in parallel with the rest of the core, and by other tricks), but overall the operating systems of the primary OS vendors, running on chips supplied by Intel and AMD, are faster with paging.
If your OS really doesn't need paging and the OS features that can be built on it then I guess none of this argument applies. But for most of us it certainly does apply.
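One of the paging-based features listed above, mmap-ing files, can be tried from user space. The sketch below uses Python's standard `mmap` module on a temporary file; `first_bytes_via_mmap` and `demo` are hypothetical helper names. On a typical OS the mapped pages are only brought in when touched, although how much work that saves depends on the system and the workload.

```python
import mmap
import os
import tempfile


def first_bytes_via_mmap(path: str, count: int) -> bytes:
    """Return the first `count` bytes of a file through a read-only mapping."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the file must not be empty.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return mapped[:count]


def demo() -> bytes:
    """Write a scratch file, map it, and read a few bytes from the mapping."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"paging demo " * 1000)
        path = tmp.name
    try:
        return first_bytes_via_mmap(path, 11)
    finally:
        os.remove(path)
```

Only the slice that is actually requested is copied out of the mapping; the rest of the file is never read into a Python buffer.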
If a trainstation is where trains stop, what is a workstation ?
Re: General protection in 64bit mode (Intel)
gerryg400 wrote: I can argue it like this:
The key here is the word 'system'. Faster can be measured lots of ways. The perceived speed of modern operating systems benefits from the OS features that can be built on paging like mmap-ing files, demand loading executable images, copy-on-write, forking, passing data by paging, remapping DMA buffers, etc. The speed-up provided by these features outweighs the occasional slow-ups caused by having to do page translations.
It is not occasional; it should be happening a lot, given the size of the TLB.
From Intel's 64-ia-32 optimization manual (page 2-19):
Quote: DTLB can perform three linear to physical address translations every cycle, two for load address and one for a store address. If the address is missing in the DTLB, the processor looks for it in the STLB, which holds data and instruction address translations. The penalty of a DTLB miss that hits the STLB is seven cycles.
Here you lose 7 CPU cycles.
Page 8-26:
Quote: The next largest set of memory access delays are associated with the TLBs when linear-to-physical address translation is mapped with a finite number of entries in the TLBs. A miss in the first level TLBs results in a very small penalty that can usually be hidden by the OOO execution and compiler's scheduling. A miss in the shared TLB results in the Page Walker being invoked and this penalty can be noticeable in the execution.
Now here you will lose a lot more. To read a cache line the processor may take about 200 clock cycles if it reads memory contiguously and the memory latency is hidden.
So, it is 207 cycles against 1 cycle to use physical addresses directly without any paging.
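For context on what the page walker has to do: with 4-level paging and 4 KiB pages on x86-64, a 48-bit virtual address is split into four 9-bit table indices and a 12-bit page offset, so a full walk reads up to four table entries. The sketch below shows just that split in Python. The function name is made up, and it ignores large pages and the caching of intermediate entries that real CPUs do.

```python
def split_virtual_address(vaddr: int) -> dict[str, int]:
    """Split a 48-bit x86-64 virtual address into 4-level paging indices."""
    return {
        "pml4": (vaddr >> 39) & 0x1FF,  # bits 47..39
        "pdpt": (vaddr >> 30) & 0x1FF,  # bits 38..30
        "pd": (vaddr >> 21) & 0x1FF,    # bits 29..21
        "pt": (vaddr >> 12) & 0x1FF,    # bits 20..12
        "offset": vaddr & 0xFFF,        # bits 11..0
    }
```

Each index selects one entry in the table found through the previous level, which is where the extra memory reads on a TLB miss come from.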
For example, to open the web page of this thread the server probably incurred tens of page walks. There are memory copies from disk to memory, then from the user app (web server) to the kernel, then inside the kernel to the device buffer, and from the buffer the data goes to the NIC and then to your PC over the network. Multiply this by the thousands of users who open this website. If someone calculated how many clock cycles are wasted supporting virtual memory that is no longer needed because memory is very cheap, he would probably shoot himself in the head.
Re: General protection in 64bit mode (Intel)
Ahh! And I forgot that a page walk may evict your variables from the cache, so it is probably much worse: another 40 cycles may be wasted bringing your lost variables back into the L1 cache from the L2 cache (if they are still there).
So, it is 207 cycles against 1 cycle to use physical addresses directly without any paging.
Note, I am not even giving you the worst scenario and we are talking about 247 cycles here, against only 1 cycle.
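Whether those figures matter in practice depends on how often each case occurs. The sketch below is a simple expected-cost model in Python for a two-level TLB. The defaults of 1, 7 and 247 cycles are the numbers quoted in this thread; the hit rates are inputs you would have to measure, and the function name is hypothetical. It also ignores any overlap with out-of-order execution, so treat it as an upper-bound style estimate rather than a prediction.

```python
def average_access_cycles(
    dtlb_hit_rate: float,
    stlb_hit_rate: float,
    hit_cycles: float = 1.0,
    stlb_penalty: float = 7.0,
    walk_penalty: float = 247.0,
) -> float:
    """Expected cycles per access for a two-level TLB.

    stlb_hit_rate is the fraction of DTLB misses that hit the STLB;
    the remaining misses pay the full page-walk penalty.
    """
    for rate in (dtlb_hit_rate, stlb_hit_rate):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("hit rates must be between 0 and 1")
    dtlb_miss = 1.0 - dtlb_hit_rate
    stlb_miss = 1.0 - stlb_hit_rate
    return hit_cycles + dtlb_miss * (
        stlb_hit_rate * stlb_penalty + stlb_miss * walk_penalty
    )
```

With made-up hit rates of 99% in the DTLB and 90% in the STLB, the model gives about 1.31 cycles per access; with every access missing both levels it gives 248.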
Re: General protection in 64bit mode (Intel)
That's not the only way to measure performance.
You have to take into account the performance increases due to paging, the security improvements due to paging, and the performance losses due to lack of paging.
Microbenchmarks out of context are useless.
Re: General protection in 64bit mode (Intel)
Did you read the entire paragraph?
Intel's 64-ia-32 optimization manual actually wrote: An DTLB0 miss and STLB hit causes a penalty of 7 cycles. Software only pays this penalty if the DTLB0 is used in some dispatch cases. The delays associated with a miss to the STLB and PMH are largely non-blocking.
But that is beside the point.
Quote: For example, to open the web page of this thread the server probably incurred tens of page walks. There are memory copies from disk to memory, then from the user app (web server) to the kernel, then inside the kernel to the device buffer, and from the buffer the data goes to the NIC and then to your PC over the network. Multiply this by the thousands of users who open this website. If someone calculated how many clock cycles are wasted supporting virtual memory that is no longer needed because memory is very cheap, he would probably shoot himself in the head.
That's an argument in favour of using paging. Pageable memory allows the OS to reduce the number of copies and speeds up disk access because DMA'ed data can be mapped to exactly where it's needed by the OS or the application. All that copying that you describe would only need to happen if paging weren't supported.
You are describing exactly what must happen on a processor that doesn't support paging. A well written full-size OS on a processor that supports paging needn't do any of those copies. And it will be faster because it uses paging.
A small, embedded or less-than-fully-featured OS may get better performance by disabling paging but a desktop or server OS must have paging.
If a trainstation is where trains stop, what is a workstation ?