Qbyte wrote:RAM hasn't just become faster and less power hungry, capacities have significantly increased as well. Modern systems now have 16 or 32 GB, so RAM isn't exactly scarce anymore and will continue to become less so.
This claim is predicated on the assumption that the demand for memory in applications has saturated. This is probably true - for the most part - for things such as word processors, simple spreadsheets, or smaller casual games; but it is far from true for more elaborate spreadsheets, database engines, AAA 3D video games, A/V editors, modeling and simulation, or even web browsers (individual tabs rarely have a lot of memory demand, despite the best efforts of things like Node.js to waste as much as they can, but if you have more than a few tabs open and no ad-blocker or script manager running, filling 16GB of just web pages is entirely possible - especially given the latest builds of Firefox, Chrome, and Edge).
The so-called "Parkinson's Law of Data" ('Programs expand to fill the available memory') might be a joke, but it is kidding on the square; it, and the corresponding epigram regarding speed, Wirth's Law ('software gets slower more rapidly than hardware becomes faster'), are both straightforward applications of the Jevons Paradox ('increased availability of a resource leads to greater use of the resource, and less efficient use of it'). It is not impossible for demand of a resource to saturate, but it is pretty uncommon.
In any case, the rise of NVMe memory will probably lead to a more hierarchical memory structure, rather than less, especially since it is likely to lead to an abandonment of the current file-oriented models entirely in favor of persistent operating systems, meaning that the OS will have to be able to work with different kinds of memory fluently and transparently in a fine-grained manner. This will probably lead to a hybrid approach that differs from both paging and segmentation in significant ways.
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
Schol-R-LEA wrote:
In any case, the rise of NVMe memory will probably lead to a more hierarchical memory structure, rather than less, especially since it is likely to lead to an abandonment of the current file-oriented models entirely in favor of persistent operating systems...
I don't think it will be as groundbreaking as people think. The file-oriented model had its start on systems that made heavy use of non-volatile memory: RAM was magnetic core, and machines leaned heavily on swap. The original architecture of Unix was that you swapped a process's entire address space into RAM when its timeslice came around, swapped the entire thing back out when the next process got scheduled, and only one process was in RAM at a time. Fork() was implemented by swapping the process out as if a new process were being scheduled, but instead of swapping some other process in, the kernel added a new entry to the process table pointing at the copy in swap, then left the original process in memory and returned to the process.
But despite being developed on a system with no volatile RAM outside of processor registers, Unix used a file-oriented model. I don't think much will change this time around.
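To make the control flow being described concrete, here is a toy user-space model of that swap-based fork(); the core_image, swap_slot, and ptable names are made up, and a real kernel would of course be doing disk I/O rather than memcpy.

```c
/* Toy model of the early-Unix "fork by swapping" scheme described above.
 * core_image stands in for the single in-core image, swap_slot for a swap
 * area; a real kernel would be doing disk I/O, not memcpy. */
#include <stdio.h>
#include <string.h>

#define IMAGE_SIZE 4096
#define MAX_PROCS  8

struct proc {
    int  pid;
    int  in_core;                 /* 1 if this process owns the core image */
    char swap_slot[IMAGE_SIZE];   /* copy of the address space out on swap  */
};

static char core_image[IMAGE_SIZE];   /* only one process is in RAM at a time */
static struct proc ptable[MAX_PROCS];
static int nprocs = 0, next_pid = 1;

/* fork(): write the current image out to swap, point a new process-table
 * entry at that copy, and leave the original process running in core. */
static int toy_fork(void)
{
    if (nprocs >= MAX_PROCS)
        return -1;
    struct proc *child = &ptable[nprocs++];
    child->pid = next_pid++;
    child->in_core = 0;
    memcpy(child->swap_slot, core_image, IMAGE_SIZE);  /* "swap out" a copy */
    return child->pid;                                 /* parent stays put  */
}

int main(void)
{
    struct proc *parent = &ptable[nprocs++];
    parent->pid = next_pid++;
    parent->in_core = 1;

    strcpy(core_image, "parent's address space");
    int child_pid = toy_fork();
    printf("forked child %d; its swap copy holds: \"%s\"\n",
           child_pid, ptable[1].swap_slot);
    return 0;
}
```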
zaval wrote:What you are talking about, QByte, is just huge pages. And they are not flexible, and only have advantages when used sparingly, where they are really needed. Using only them is nothing but a waste of system memory.
Segmentation is in fact less wasteful of system memory than paging. It doesn't require the overhead of large page tables for each process, which reduce the amount of physical memory available to user-space.
In my experience, people that say things like "segmentation consumes less memory" and "segmentation is faster" are people that have no idea how to use paging properly to avoid consuming a massive amount of RAM, and have no idea how expensive fragmentation/compaction of the physical address space becomes in practice.
Paging costs a very small amount of RAM to save a very large amount of RAM (via tricks like allocate on demand, copy on write, memory mapped files, etc.) - the "less than 1%" used for page tables is insignificant next to what those tricks save. Segmentation can't do these tricks effectively (and can't even handle basic swap space in a sane way) and therefore costs far more RAM than paging ever will.
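As one concrete example of those tricks, copy-on-write after fork() can be observed from ordinary user space; a minimal sketch, assuming a POSIX system, and not tied to any particular kernel's internals:

```c
/* Copy-on-write after fork(): parent and child share the same physical
 * pages; a page is copied only when one side writes to it. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    size_t len = 64UL << 20;               /* a 64 MiB buffer */
    char *buf = malloc(len);
    if (!buf)
        return 1;
    memset(buf, 'A', len);                 /* parent populates every page */

    pid_t pid = fork();                    /* no 64 MiB copy happens here */
    if (pid == 0) {
        buf[0] = 'B';                      /* child dirties ONE page; only
                                              that page gets duplicated   */
        printf("child sees '%c'; the other pages stay shared\n", buf[0]);
        _exit(0);
    }
    waitpid(pid, NULL, 0);
    printf("parent still sees '%c'; its pages were never copied\n", buf[0]);
    free(buf);
    return 0;
}
```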
Good operating systems try to use all RAM - anything that isn't used by applications gets used to improve the speed of other things (e.g. file system caches to speed up file IO, DNS caches to speed up networking, etc). In this case (almost all RAM used almost all the time) compaction becomes "shift almost everything in RAM almost every time anything is allocated".
For a rough estimate, if you gathered the top designers/researchers in the world and threw billions of $$$ at them (every year for ten years) to create the best CPU that supports segmentation and the best possible OS that uses segmentation; then the resulting OS (for real world usage, not meaningless carefully selected micro-benchmarks) will probably be about 100 times slower than Windows or Linux simply because of segmentation alone.
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
zaval wrote:What you are talking about, QByte, is just huge pages. And they are not flexible, and only have advantages when used sparingly, where they are really needed. Using only them is nothing but a waste of system memory.
Segmentation is in fact less wasteful of system memory than paging. It doesn't require the overhead of large page tables for each process, which reduce the amount of physical memory available to user-space. Additionally, wasted memory from internal fragmentation is completely avoided; each process can be allocated precisely the amount of memory it needs down to a single byte.
But as you yourself have said, RAM is plentiful. Therefore, we can afford the memory overhead imposed by internal fragmentation and page tables.
External fragmentation doesn't waste system memory; it merely means that you may occasionally need to compact some of the existing segments to form a contiguous region that can satisfy a new allocation request.
And this can slow a program down quite a bit. Furthermore, no segment can be larger than physical memory (whereas a program on a paged system can mmap a file much larger than RAM).
This is all beside the point anyway, which is that since we have such large amounts of physical memory now, these types of considerations are not as important as performance, and segmentation can provide better performance for the reasons I've stated earlier.
Segmentation actually has quite a few impediments to its performance:
- Every memory access now incurs an addition for the base + offset, whereas with paging, if you get a TLB hit, you just read the physical page frame number from the TLB and tack the offset onto the end.
- If a good portion of physical memory is in fact being used, you will lose significant time to compaction.
- If you do in fact hit swap, you can only swap on segment granularity.
- Even if you have plenty of RAM free, you can only do copy-on-write on segment granularity. These last two aren't an issue if you use segments on the order of what your page size would be, but then you suddenly have lots of segment table entries, so that advantage of segmentation goes away. If you use larger segments, then you copy a lot of data you don't need to when a program modifies one byte in a copy-on-write segment, whereas on a paged system you would only copy one page.
- If the system is under memory pressure, you can end up swapping out a whole segment, possibly several megabytes in size, when the allocation that triggered the swap may only have been for a few tens of kilobytes. An application that allocates a huge sparse array in one segment and only touches small portions of it can end up causing the system to thrash on swap, or even starve the system for memory completely: every time the application accesses that segment, the whole segment is swapped in, even though most of it is empty, and any time any other application is scheduled, that whole segment has to be swapped back out to make room. This *will* kill performance. Meanwhile, with a paged system that overcommits memory and allocates pages on use, the portions of that sparse array that don't get populated don't get any physical memory assigned to them, don't evict pages that other programs are actually using when the application accesses the array, and don't get uselessly juggled back and forth between swap and RAM, even if the size of the array actually exceeds the combined amount of available swap and RAM.
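To make the sparse-array comparison concrete, here is a minimal user-space sketch of what an overcommitting, demand-paged system gives you for free (assuming a 64-bit Linux-style system; MAP_NORESERVE is only a hint and the exact behaviour depends on the overcommit policy):

```c
/* A "huge sparse array": 8 GiB of address space, of which only a few pages
 * are ever touched. Under demand paging the untouched pages never receive
 * physical memory or swap; packed into one big segment, the whole thing
 * would have to be resident (or swapped in and out wholesale). */
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    size_t len = 8UL << 30;            /* 8 GiB, more than many machines' RAM */
    double *a = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (a == MAP_FAILED) { perror("mmap"); return 1; }

    size_t n = len / sizeof(double);
    a[0]     = 1.0;                    /* three scattered touches ->        */
    a[n / 2] = 2.0;                    /* roughly three resident pages      */
    a[n - 1] = 3.0;

    printf("sum of the touched elements: %g\n", a[0] + a[n / 2] + a[n - 1]);
    munmap(a, len);
    return 0;
}
```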
My philosophy on segmentation is that there's something to be said for a non-flat address space (though most people will disagree with me on that), but whatever memory looks like to programs, base/bounds and limit/offset segmentation are obsolete. If you implement a non-flat address space in a modern CPU, it *must* be based on paging on the back end.
linguofreak wrote:Segmentation actually has quite a few impediments to its performance: Every memory access now incurs an addition for the base + offset, whereas with paging, if you get a TLB hit, you just read the physical page frame number from the TLB and tack the offset onto the end.
I would imagine the addition to be negligible here, as the operation is essentially the same as reading an operand from a register and adding a constant to it, which happens all in one cycle. The TLB in a paging MMU, however, is implemented as content-addressable memory and therefore takes up a lot more die space and consumes more power than the hardware required for a simple read plus addition, and it is still highly susceptible to TLB misses, which segmentation entirely avoids - but I digress.
If the system is under memory pressure, you can end up swapping out a whole segment, possibly several megabytes in size, when the allocation that triggered the swap may only have been for a few tens of kilobytes. An application that allocates a huge sparse array in one segment and only touches small portions of it can end up causing the system to thrash on swap, or even starve the system for memory completely: every time the application accesses that segment, the whole segment is swapped in, even though most of it is empty, and any time any other application is scheduled, that whole segment has to be swapped back out to make room.
These points would only be true of a very naive implementation. Nothing mandates that segments need always be swapped in and out in full and there are numerous simple strategies which avoid that. Of course, segmentation fares worse than paging in this regard, but the situation isn't as fatal as you're implying.
Schol-R-LEA wrote:In any case, the rise of NVMe memory will probably lead to a more hierarchical memory structure, rather than less, especially since it is likely to lead to an abandonment of the current file-oriented models entirely in favor of persistent operating systems, meaning that the OS will have to be able to work with different kinds of memory fluently and transparently in a fine-grained manner. This will probably lead to a hybrid approach that differs from both paging and segmentation in significant ways.
Brendan wrote:For a rough estimate, if you gathered the top designers/researchers in the world and threw billions of $$$ at them (every year for ten years) to create the best CPU that supports segmentation and the best possible OS that uses segmentation; then the resulting OS (for real world usage, not meaningless carefully selected micro-benchmarks) will probably be about 100 times slower than Windows or Linux simply because of segmentation alone.
I would conjecture that a project of that scale would most likely result in the development of high-speed, high-density, non-volatile RAM, the so-called "universal memory", and the integration of said memory onto the CPU die as per the Berkeley iRAM project. This would have rather profound implications for system design, one of which would be calling into question the usefulness of many aspects of paging, since secondary storage would be eliminated and now everything would reside persistently in a "single-level store" as envisioned by the designers of Multics, who were well-known advocates of segmentation. With circa 2 TB of on-chip non-volatile RAM, segmentation would be well positioned to undergo a renaissance.
If the system is under memory pressure, you can end up swapping out a whole segment, possibly several megabytes in size, when the allocation that triggered the swap may only have been for a few tens of kilobytes. An application that allocates a huge sparse array in one segment and only touches small portions of it can end up causing the system to thrash on swap, or even starve the system for memory completely: every time the application accesses that segment, the whole segment is swapped in, even though most of it is empty, and any time any other application is scheduled, that whole segment has to be swapped back out to make room.
These points would only be true of a very naive implementation. Nothing mandates that segments need always be swapped in and out in full and there are numerous simple strategies which avoid that. Of course, segmentation fares worse than paging in this regard, but the situation isn't as fatal as you're implying.
If you're able to split a segment into pieces (and swap in/out pieces) it's easier/cheaper/faster to add attributes to those pieces (pages) and not bother with all the additional overhead of segments.
Qbyte wrote:
Schol-R-LEA wrote:In any case, the rise of NVMe memory will probably lead to a more hierarchical memory structure, rather than less, especially since it is likely to lead to an abandonment of the current file-oriented models entirely in favor of persistent operating systems, meaning that the OS will have to be able to work with different kinds of memory fluently and transparently in a fine-grained manner. This will probably lead to a hybrid approach that differs from both paging and segmentation in significant ways.
Brendan wrote:For a rough estimate, if you gathered the top designers/researchers in the world and threw billions of $$$ at them (every year for ten years) to create the best CPU that supports segmentation and the best possible OS that uses segmentation; then the resulting OS (for real world usage, not meaningless carefully selected micro-benchmarks) will probably be about 100 times slower than Windows or Linux simply because of segmentation alone.
I would conjecture that a project of that scale would most likely result in the development of high-speed, high-density, non-volatile RAM, the so-called "universal memory", and the integration of said memory onto the CPU die as per the Berkeley iRAM project. This would have rather profound implications for system design, one of which would be calling into question the usefulness of many aspects of paging, since secondary storage would be eliminated and now everything would reside persistently in a "single-level store" as envisioned by the designers of Multics, who were well-known advocates of segmentation. With circa 2 TB of on-chip non-volatile RAM, segmentation would be well positioned to undergo a renaissance.
Let's design an OS based on persistent objects.
First, if you have 1000 text files there's no point duplicating the code for your text editor 1000 times, so at the lowest levels you'd split the object into "code" and "data" (in the same way that objects in C++ are split into "class" and "data", where the code is not duplicated for every object). Second, because hardware and software are always dodgy (buggy and/or incomplete, and needing to be updated), and because people like to be able to completely replace software (e.g. switch from Linux to Windows, or switch from MS Office to Libre Office, or ...) you'd want to make sure the data is in some sort of standardised format. Third, to allow the data to be backed up (in case hardware fails) or transferred to different computers you'd need a little meta-data for the data (e.g. a name, some attributes, etc). Essentially, for practical reasons (so that it's not a fragile and inflexible joke), "persistent objects" ends up being implemented as files, with the code stored as executable files and the data stored as data files (with many file formats).
Fourth, when modifying data users need/want the ability to either discard all their changes or commit their changes. Fifth, there may be multiple users modifying the same file at the same time. Sixth, if/when software crashes you want to be able to revert the data back to the "last known good" version and don't want the data permanently corrupted/lost. Essentially, you end up with a "create copy, modify the copy, then commit changes" pattern; which is identical to "load file, modify file, save file".
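That "create copy, modify the copy, then commit changes" pattern is exactly what careful programs already do with plain files; a minimal POSIX sketch, with hypothetical file names (a production version would also fsync before the rename):

```c
/* "Create copy, modify the copy, then commit": write the new version to a
 * temporary file and atomically rename it over the old one, so a crash
 * leaves either the old data or the new data, never a corrupted mix. */
#include <stdio.h>

static int commit_document(const char *path, const char *new_contents)
{
    char tmp[4096];
    snprintf(tmp, sizeof tmp, "%s.tmp", path);

    FILE *f = fopen(tmp, "w");             /* the private, modifiable copy */
    if (!f)
        return -1;
    if (fputs(new_contents, f) == EOF || fclose(f) == EOF) {
        remove(tmp);
        return -1;
    }
    return rename(tmp, path);              /* the commit (atomic on POSIX) */
}

int main(void)
{
    if (commit_document("report.txt", "edited text\n") != 0) {
        perror("commit_document");
        return 1;
    }
    puts("committed; readers saw either the old version or the new one");
    return 0;
}
```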
So...
If all RAM is non-volatile, the first thing we need to do is partition that RAM so that some is used for "temporary stuff" and the remainder is used for file system/s.
Note that we can cheat in some cases; primarily, we can use the equivalent of "memory mapped files" (where pages from the file system are mapped into objects so that "RAM as RAM" isn't used) with "copy on write" tricks, so that if a memory mapped file is modified we aren't modifying the file itself. We could also borrow "currently free file system RAM" and use it for "temporary stuff", if there's a way for the file system to ask for it back if/when it's needed, so that the amount of "RAM as RAM" is likely to be larger than the minimum that was reserved when partitioning the RAM.
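A minimal user-space illustration of that "memory mapped file plus copy on write" cheat, assuming a POSIX system (the file name is made up):

```c
/* Map a file MAP_PRIVATE: reads come straight from the file system's pages,
 * but as soon as we write, only the touched page is copied for this process
 * and the file itself is never modified. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("settings.dat", O_RDONLY);     /* hypothetical file */
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) { close(fd); return 1; }

    char *p = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE, fd, 0);          /* private = copy on write */
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    p[0] = '#';   /* one page is copied for us; the file on disk is untouched */
    printf("first byte is now '%c' in memory; the file itself is unchanged\n",
           p[0]);

    munmap(p, st.st_size);
    close(fd);
    return 0;
}
```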
What we end up with is something that is almost identical to what we already have - executable files and data files, objects/processes, and paging (and tricks to save memory). The only real differences are that "hibernate" is faster (you can turn the computer off without copying data from RAM to swap space and restore everything when the computer is turned back on without copying data from swap space back into RAM), swap space would be replaced with its opposite ("file system pages borrowed and used as RAM"), and there'd be no reason to have file system caches. These are all relatively superficial differences - they don't change how applications are written or how computers are used.
The main reason to want non-volatile RAM (rather than just a very fast SSD) has nothing to do with the OS or software at all - it's power consumption. For DRAM, when there's lots of RAM (many GiB, where only a small amount is actually being read/written at any point in time) you end up with a lot of power consumed just to refresh the contents of RAM (and SRAM is significantly worse, always consuming power). Non-volatile RAM (with many GiB, where a large amount isn't being read/written at any point in time) could consume significantly less power.
Cheers,
Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
@Brendan: how familiar are you with prior work on persistent systems such as KeyKOS or Coyotos? I expect that you have looked at them given your thoroughness, but I was curious about your opinion on them.
Also, I think that we need to draw a distinction between segmentation on the x86 and segmentation in general. The segmentation model of the x86 has evolved in a very different direction from anywhere else, but as I said, the main purpose of segmentation, regardless of platform, was unrelated to memory management as a general idea - it was a fix for a specific problem, that of addressing 2^n memory cells from registers and pointers narrower than n bits, a problem which no longer exists because the cost of wider addressing is trivial now.
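For anyone who hasn't met it, the 8086's version of that fix was nothing more than this arithmetic (a toy recomputation, not code from any real project):

```c
#include <stdio.h>
#include <stdint.h>

/* Real-mode 8086 addressing: a 16-bit segment and a 16-bit offset combine
 * into a 20-bit physical address as segment*16 + offset, so 20 address bits
 * can be produced from 16-bit registers. No protection is involved at all. */
static uint32_t phys_8086(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

int main(void)
{
    printf("F000:FFF0 -> %05X\n", (unsigned)phys_8086(0xF000, 0xFFF0)); /* reset vector   */
    printf("B800:0000 -> %05X\n", (unsigned)phys_8086(0xB800, 0x0000)); /* text video RAM */
    /* Many segment:offset pairs alias the same physical byte: */
    printf("1234:0010 -> %05X\n", (unsigned)phys_8086(0x1234, 0x0010));
    printf("1235:0000 -> %05X\n", (unsigned)phys_8086(0x1235, 0x0000));
    return 0;
}
```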
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
One thing segmentation could be useful for is services that need to be fast.
Take the following example with a DMA driver. With a microkernel, the DMA driver would likely be in its own process in order to gain the advantages of process isolation, and the other drivers that use the DMA driver would be in their own processes. The problem is that you want to be able to start DMA jobs quickly, and going through the kernel with IPC and scheduling is the long and slow way around. With a DMA driver that detour is usually unnecessary, as DMA blocks are often designed so that each channel can be controlled independently. Segmentation could solve this problem by allowing a segment of code to be executed by other processes - a "public code segment". The CPU of course needs to support this, by understanding that when execution is inside the "public code segment" it is also allowed to access that service's own code and data; thus the public code segment can act as a gateway.
In the case of the DMA driver, the public code segment would simply take a DMA job descriptor and set the parameters for the DMA channel. This is far faster than going through the kernel.
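A hypothetical sketch of how such a gateway call might look to the caller; the hardware support doesn't exist on current CPUs, so this is plain C that only models the job descriptor and the call path, and dma_job, dma_submit, and chan_regs are made-up names:

```c
/* Hypothetical sketch of the "public code segment" idea: the DMA driver
 * exposes one gateway routine that other drivers call directly, with no trip
 * through the kernel. */
#include <stdint.h>
#include <stdio.h>

struct dma_job {                /* descriptor the caller fills in */
    uint64_t src;
    uint64_t dst;
    uint32_t length;
    uint8_t  channel;
};

/* Simulated per-channel register block (would be MMIO in a real driver). */
static struct { uint64_t src, dst; uint32_t len; uint32_t start; } chan_regs[8];

/* The routine that would live in the public code segment: it validates the
 * descriptor and programs only the requested channel, so callers never see
 * the driver's internals or each other's channels. */
static int dma_submit(const struct dma_job *job)
{
    if (job->channel >= 8)
        return -1;                        /* bounds check at the gateway */
    chan_regs[job->channel].src   = job->src;
    chan_regs[job->channel].dst   = job->dst;
    chan_regs[job->channel].len   = job->length;
    chan_regs[job->channel].start = 1;    /* kick off the transfer */
    return 0;
}

int main(void)
{
    struct dma_job job = { .src = 0x1000, .dst = 0x2000,
                           .length = 512, .channel = 3 };
    if (dma_submit(&job) == 0)
        printf("channel %u programmed without entering the kernel\n",
               (unsigned)job.channel);
    return 0;
}
```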
One interesting part of this is security: how to make the functionality safe. For example, the caller process cannot be allowed to read the public code segment, only execute it. Should entry points be limited, and so on? The CPU also needs to understand which segment faulted - the caller's data, or the service's code or data.
Brendan wrote:What we end up with is something that is almost identical to what we already have - executable files and data files, objects/processes, and paging (and tricks to save memory). The only real differences are that "hibernate" is faster (you can turn the computer off without copying data from RAM to swap space and restore everything when the computer is turned back on without copying data from swap space back into RAM), swap space would be replaced with its opposite ("file system pages borrowed and used as RAM"), and there'd be no reason to have file system caches. These are all relatively superficial differences - they don't change how applications are written or how computers are used.
You forgot that it would likely result in filesystem APIs changing, with mmap-style usage becoming more commonplace (no need to restrict the way the CPU can access data from a file when the hardware isn't getting in the way). In fact, that may even encourage programs to make heavier use of temporary files (I've been considering this for Indigo, since 64KB of RAM is not much when dealing with user data and the cartridge can provide much larger non-volatile RAM).
But yeah, that'd be just an evolution. I really doubt files are ever going away, they're mainly a way to organize things. How they are stored doesn't affect that core concept, at most it only affects the design of the APIs that let programs access them.
I think people have a completely wrong idea of why an OS would use segmentation. Segmentation provides byte-granularity limit checking, while paging at best can prevent applications from messing up kernel memory.
The x86 segmentation implementation is the best memory protection design ever done, but something similar can be accomplished on x86-64 by using the higher 16 bits of the address as a segment.
And of course, segmentation should run on top of paging. Paging is good for managing physical memory.
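A toy, software-only model of that "upper 16 bits as a segment" idea - no current x86-64 CPU does this check for you, and all names here are made up:

```c
/* Toy model of "use the upper 16 bits of a 64-bit address as a segment":
 * the segment id selects a (base, limit) pair and the low 48 bits are checked
 * against the byte-granular limit before becoming a plain pointer. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

struct seg { char *base; uint64_t limit; };
static struct seg seg_table[1u << 16];    /* one entry per possible segment id */

#define MAKE_FAR_PTR(seg_id, off) (((uint64_t)(seg_id) << 48) | (uint64_t)(off))

static char *deref(uint64_t far_ptr)
{
    uint16_t id  = (uint16_t)(far_ptr >> 48);
    uint64_t off = far_ptr & 0xFFFFFFFFFFFFULL;      /* low 48 bits */
    if (off >= seg_table[id].limit) {
        fprintf(stderr, "segment %u: offset %llu out of bounds\n",
                (unsigned)id, (unsigned long long)off);
        exit(1);                                     /* our "protection fault" */
    }
    return seg_table[id].base + off;
}

int main(void)
{
    seg_table[1].base  = malloc(100);
    seg_table[1].limit = 100;                        /* byte-granular limit */

    *deref(MAKE_FAR_PTR(1, 10)) = 'x';               /* in bounds: fine */
    puts("in-bounds access ok");
    *deref(MAKE_FAR_PTR(1, 200)) = 'y';              /* out of bounds: caught */
    return 0;
}
```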
rdos wrote:The x86 segmentation implementation is the best memory protection design ever done.
This statement is Not Even Wrong. The x86 segmentation model isn't memory protection. Period. Nor is the paging system. The memory protection system works together with them, but they aren't part of it, and it isn't part of them.
Repeat after me: there is no memory protection in a stock PC running in real mode. There is, or at least can be, hardware memory protection in 16-bit and 32-bit protected mode, and in long mode, but neither segments nor paging are part of it.
As for other systems, well, 30+ years of OS dev experience or not, I can't see how anyone who has worked with a memory protection system on any other platform could say that (only platforms that had hardware memory protection count here, of course; things like the 68000 or Z8000 without an MMU add-on, the 6502, the 8080/Z80, the 6809, the 8051, and the AVR don't count any more than the original 8086 would in that regard).
The x86 memory protection system - regardless of the memory management used (though segments aren't really that, either, even in protected mode) - is one of the worst kludges imaginable, which I will admit at least makes it consistent with the rest of the x86 design.
Not that I expect you to believe me; like certain others here, you are under the delusion that the relationship between product quality and market success is linear, rather than a bell curve as it more often is. The general rule of thumb is that for any given market, all other things being equal (which is rarely true, but bear with me), the least acceptable product - the one that fits, but only barely, the primary market demands - will usually be the one which performs best in the market. The x86 didn't win because it was better; it won because it was just good enough to be usable, but not so good that its quality became an impediment for it.
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
Speaking of memory mapped files, segmentation together with paging would be a good fit for them. With virtual memory you have to know up front how much virtual address space the memory mapped file will need. You can expand and move the virtual region, but you always risk tailgating another virtual region when you do. With segmentation, the memory mapped file lives in its own segment and can be expanded and modified at any time, very easily.
Just take the recent discussion about virtual memory, viewtopic.php?f=15&t=32747, where a big question was how to manage virtual address space - usually with accelerator structures and code to slice, dice, and merge regions back together. With segmentation all of that is gone. Virtual memory is like laying out tarot cards, without the gaming or occult benefit.
Same thing with the heap: expanding it becomes trivial if you use a contiguous heap, with no worries about another region growing towards it from the opposite direction.
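For comparison, this is roughly what the flat-address-space workaround looks like on a POSIX system: reserve a generous range up front with PROT_NONE and commit pages as the region grows - exactly the kind of reservation guesswork a growable segment would make unnecessary (a minimal sketch, assuming a 64-bit Linux-style system):

```c
/* The usual flat-address-space workaround for a growable region: reserve a
 * generous range of virtual addresses up front (no physical memory yet) and
 * commit pages as the region grows. */
#include <stdio.h>
#include <sys/mman.h>

#define RESERVE (1UL << 32)        /* 4 GiB of address space, reservation only */

int main(void)
{
    char *region = mmap(NULL, RESERVE, PROT_NONE,
                        MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (region == MAP_FAILED) { perror("mmap"); return 1; }

    size_t committed = 0;

    /* "Grow" the region twice; nothing else can be mapped inside our
     * reservation, so growth never tailgates a neighbouring region. */
    for (int i = 0; i < 2; i++) {
        size_t grow = 1UL << 20;                       /* grow by 1 MiB */
        if (mprotect(region + committed, grow, PROT_READ | PROT_WRITE) != 0) {
            perror("mprotect");
            return 1;
        }
        committed += grow;
        region[committed - 1] = 1;                     /* new tail is usable */
    }

    printf("reserved %lu MiB, committed %zu MiB so far\n",
           RESERVE >> 20, committed >> 20);
    munmap(region, RESERVE);
    return 0;
}
```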
I can see a lot of possibilities with segmentation, and I would very much like to see new takes on it in new processor designs.
I'm not interested in market success. I'm interested in elegant solutions, and especially those that do not have all the problems of C and flat memory models, like accidental memory overwrites and corrupt heaps. I have many years of C/C++ application development behind me, and the worst problems are those related to pointers and the misuse of them, because these problems typically appear at unrelated places. 32-bit protected mode using segmentation mostly solves this issue in an elegant way. It would have been even more elegant if selectors were 32 bit, and the processor had segmentation optimizations.
The alternatives are micro-kernels and interpreted languages. Today, interpreted languages like Java are a lot more popular than C++, and the reason is that they solve the issues with memory corruption - something segmentation could solve in a much less expensive way, given a reasonable architecture with 32-bit selectors and descriptor caches.
Another segmentation problem in multicore OSes is that a selector which is no longer in use can still be sitting in some segment register, never to be referenced again, and that stale selector will generate a protection fault if an IRQ or thread switch happens to reload it. However, I've solved this issue by simply loading a null (zero) selector in the protection fault handler whenever an invalid selector turns out to be loaded.
I think you are looking in the wrong direction regarding languages like Java, Ruby, and Python; those are mostly used for applications work, and scripting in the latter cases.
For systems work, the hot languages right now are native-compiled languages such as Rust, D, and... well, C and C++, sad to say. Oh, and I guess Swift and Go, but I really can't say how much interest they have gotten outside of Apple and Google, respectively.
We all are aware of the faults of C, and even more of C++, for OS development. I haven't looked enough at Rust or D, but I get the impression that they exist as attempts to 'fix' C by patching over its warts rather than starting from scratch, and since the most common compilers for both are built on the same GNU Compiler Collection and LLVM back ends used for compiling Linux, they have precisely the same dependency on a flat memory model - which, as I have stated before, is an issue with the compiler, not the language (to repeat: there have been C compilers for p-mode x86 that do not have this limitation).
All I know about Swift is that Apple apparently created it in a fit of NIH over Objective-C++, and all I know about Go is that it exists.
And again: segmentation and paging are orthogonal to memory protection, as well as to application-level memory management. Similarly, automatic memory management is an orthogonal issue to interpretation vs compilation - native-compiled code can, and in modern languages often does, use automatic memory management systems such as auto-pointers or garbage collection. Similarly, automatic memory management and memory safety are separate issues - garbage collection is for ensuring memory gets reclaimed (e.g., reducing memory leaks), not for preventing buffer overflows or wild pointers (though it usually does have that effect as well).
You know I am a Lisp weenie, right? That I am intending to design a systems-programming Lisp dialect (or at least something heavily influenced by Lisp)? Well, contrary to the usual misunderstanding, both Scheme and Common Lisp are defined as compiled languages - but ones where a compile-and-go REPL is usually used, and which makes writing interpreters trivial, so people who have taken classes which wrote interpreters as class projects assumed they were really interpreted.
Let me redirect this, because I suspect I see where part of the problem lies: aside from x86, which CPUs with hardware memory protection (either built-in or add-on) have you used? You keep speaking of this as if x86 is the only thing in the universe, and... well, that doesn't make sense to me, especially for someone working in the hardware-control-system space.
No offense, but PCs have never seemed to me to be a good solution for that, anyway. There is a reason why eCos, Menuet, and QNX have never been more than niche systems, and even then they are usually just used as central controllers managing several microcontrollers in the devices themselves - no one deploys any of those just to run a single device. Pretty much everyone else designing systems for controlling things like fuel station pumps has gone with 8-bit microcontrollers connected to a central management console or a point-of-sale terminal, which (in the US, I can't speak for Europe) at the time you were first working on this usually meant an NCR or IBM cash-register/terminal, and today usually means a kiosked Windows PC or a fixed-purpose tablet.
Now, I am in no way an expert on this, and for all I know you've hit on a trick the rest of the industry missed; but somehow, I am skeptical. However, this is getting off-topic, and is probably a bit too personal as well.
Getting back to the actual question: what experience do you have with the memory protection systems - not the paging systems; as I said, paging and segmentation are orthogonal issues to that - in systems such as ARM, MIPS, 68000 (when matched to a 68451, a 68851, or some other MMU), PowerPC, IBM POWER Systems (that is, the minis or blade servers, as opposed to the related PowerPC), SPARC, Alpha, VAX, or any other systems with such memory protection?
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.