Re: Memory Segmentation in the x86 platform and elsewhere
Posted: Thu Apr 12, 2018 11:24 am
It just occurred to me that rdos might be conflating 'segmentation' with the concept of a 'modified Harvard architecture'.
Just to recap on this: the Harvard architecture, named after the Harvard Mark I electromechanical computer, is a type of stored-program computer in which the instruction store is physically separate from the data store. In the Mark I, this was done for practical reasons relating to how the instructions and data were routed to the CPU and ALU (which in most early systems were also physically separate units) - instructions would go to the CPU, computation data would go to the ALU, and the CPU would tell the ALU which operation to perform.
There was no straightforward way to transfer between the two memories. This wasn't seen as a problem, because the whole idea of a stored-program system was in its infancy, and it was assumed that the program store would always be the smaller of the two. Most of the other systems of the time (the Colossus, the Atanasoff-Berry Computer, the Zuse Z3, the ENIAC, and so forth) weren't stored-program systems at all (though ENIAC was later rebuilt as one), and the importance of that approach wasn't appreciated until around 1946.
The Von Neumann architecture (after Johnny von Neumann), which arose a bit later following the 'Summer Camp' conference in 1946 and was first used in the EDVAC and EDSAC computers, is the other common way to design a stored-program computer, and became almost but not quite universal in the 1950s and later. In this design, a single memory holds both the instructions and the data, and instructions are capable of modifying other instructions on the fly. This was risky, but sometimes useful, and in some really early systems it was necessary for basic operations such as function calls (as with most things being done for the first few times, the early designers were often guessing at what would or wouldn't be needed, and made a lot of mistakes - some of which got permanently enshrined in later systems).
As with the Harvard architecture, this was originally just an engineering solution - with memories based on things such as mercury delay lines, and a CPU and ALU built on vacuum tubes, it was easy to re-route the signals to the right part of the system, but expensive to build two separate memories.
There was a disagreement from the start about whether the Harvard design was safer than the Von Neumann design, apparently, but in the end, practical engineering issues trumped the questions about how safe self-modifying code was for the first and second generations of stored-program electronic computers.
By the time transistor-based systems with ferrite-core memories were making those engineering considerations moot, the Von Neumann approach had proven useful, if not necessarily as safe. Computer designers started trying to come up with ways to secure the instruction portion of memory part of the time, while still allowing the privileged system software to load programs as needed, or even monkey-patch code (e.g., in a combined loader/linker) before locking it down again in order to run the program securely.
This led to the 'Modified Harvard architecture', which is what we are really talking about when we discuss 'memory protection'. Any modern system with memory protection built into the CPU's memory management is, in effect, a Modified Harvard system (even though most introductory textbooks will still call it a 'Von Neumann' architecture). This would come to be standard on mainframes by 1970, and minis by around 1978 or so, but wouldn't start to supplant pure Von Neumann designs in microcomputers until the late 1980s.
Paging? Segmentation? Separate matters entirely. They each solve a different set of problems from memory protection, and from each other. Segmentation, as I said before, is about stuffing an m-bit address space into n address lines when n < m. Paging is about moving part of the data or instructions from a fast memory store to a slower one and back in a way that is transparent to the application programmers (that is, without having to explicitly use overlays and the like).
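To make the segmentation point concrete, here's the classic real-mode 8086 arithmetic - a 16-bit offset combined with a 16-bit segment value to reach a 20-bit (1 MiB) physical space. The function name is just mine for illustration:

Code:
#include <stdint.h>
#include <stdio.h>

/* Real-mode 8086 address translation: a 16-bit segment value and a
 * 16-bit offset are combined into a 20-bit physical address, letting
 * 16-bit registers reach a 1 MiB space. */
static uint32_t real_mode_phys(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;   /* segment * 16 + offset */
}

int main(void)
{
    /* 0xB800:0x0000 -> 0xB8000, the familiar text-mode video buffer. */
    printf("0x%05X\n", (unsigned)real_mode_phys(0xB800, 0x0000));
    return 0;
}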
All three overlap with yet another separate idea, virtual address spaces. While a VAS is often mistakenly thought to provide additional memory protection, this is not actually the case on the x86 - it is always possible to access other address spaces if the memory protection doesn't prevent it, because the separate address spaces are all built on top of either paging, or segmentation, or, in the case of the x86, both at the same time. However, by default the memory protection does prevent this for all non-privileged (i.e., user) code.
While a memory protection system may need to work in conjunction with whatever other memory management sub-systems exist on a CPU, and may even incorporate side properties of them in order to organize the memory being managed (more on this shortly), paging and segmentation are orthogonal concerns, both to memory protection and to each other. You can have memory protection without either segmentation or paging at all.
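As a sketch of that last point - protection with neither segmentation nor paging - think of a single base/limit (bounds-register) pair per task, checked on every access, in the style of early bounds registers or a simple MPU region. The structure and names below are mine, not any particular machine's:

Code:
#include <stdbool.h>
#include <stdint.h>

/* A minimal sketch of protection with no segmentation or paging at all:
 * one base/limit pair per task, checked on every memory access.
 * Entirely illustrative - not modeled on any specific hardware. */
struct bounds {
    uint32_t base;   /* lowest address the task may touch   */
    uint32_t limit;  /* size of the task's region, in bytes */
};

static bool access_ok(const struct bounds *b, uint32_t addr, uint32_t len)
{
    return addr >= b->base && (addr - b->base) + len <= b->limit;
}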
Caching adds some complexity to this, but since cache consistency is a problem anyway, those issues get resolved as part of the caching itself. Caches basically add a limited form of content-addressable memory, where the tag says which block of memory a given cache line is holding, and those cache blocks may or may not correspond to pages or segments - mind you, paging fits better, since cache blocks are of a fixed, small size which maps neatly onto similarly fixed-size pages (hence the performance difference sometimes seen when actively manipulating segmentation on the x86). However, no current system applies tagging to the entire memory space, nor are the cache's tags accessible to the system software - they are entirely internal and managed by the hardware.
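For what it's worth, the tag/index/offset split looks roughly like this - purely illustrative numbers for a direct-mapped cache with 64-byte lines and 1024 sets, and none of it is visible to software:

Code:
#include <stdint.h>

/* Sketch of how a cache derives its tag from an address, assuming a
 * direct-mapped cache with 64-byte lines and 1024 sets (illustrative
 * numbers only - the split is fixed by the hardware). */
#define LINE_BITS  6    /* 64-byte line -> offset within the line      */
#define SET_BITS   10   /* 1024 sets    -> which set the line maps to  */

static uint32_t cache_offset(uint32_t addr) { return addr & ((1u << LINE_BITS) - 1); }
static uint32_t cache_set(uint32_t addr)    { return (addr >> LINE_BITS) & ((1u << SET_BITS) - 1); }
static uint32_t cache_tag(uint32_t addr)    { return addr >> (LINE_BITS + SET_BITS); }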
However, since both paging and segmentation involve breaking the memory into separate blocks, and the memory protection has to do the same, the memory protection can simply use the blocks defined by the other sub-system rather than defining its own. This works out well, because the protection system has to check the validity of every memory access, while in most implementations both paging and segmentation are translating between effective addresses and physical memory locations on each and every main-memory access. Since the protection checks and the translations are both necessary every time, it is easiest to do them together as much as feasible - no point in repeating operations that overlap so heavily.
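A rough sketch of why the two passes merge so naturally, using the 32-bit x86 page-table entry layout (present, read/write, and user/supervisor bits sitting alongside the frame address): the walk that yields the physical address checks access rights in the same step. I'm glossing over details like CR0.WP, and the function name is mine:

Code:
#include <stdbool.h>
#include <stdint.h>

/* 32-bit x86-style PTE bits: protection and translation live together. */
#define PTE_PRESENT  (1u << 0)
#define PTE_WRITABLE (1u << 1)
#define PTE_USER     (1u << 2)

static bool translate(uint32_t pte, uint32_t vaddr, bool write, bool user,
                      uint32_t *paddr)
{
    if (!(pte & PTE_PRESENT))            return false;  /* not mapped      */
    if (write && !(pte & PTE_WRITABLE))  return false;  /* write protected */
    if (user  && !(pte & PTE_USER))      return false;  /* supervisor only */

    /* Protection passed; the same entry supplies the frame address. */
    *paddr = (pte & 0xFFFFF000u) | (vaddr & 0x00000FFFu);
    return true;
}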
To sum up: on the x86 in 32-bit protected mode, there is no difference whatsoever between the degree of protection one gets by actively using segments and the degree one gets by setting the segments to a flat virtual space. None. Period.
Segmentation only wins over paging if you are using separate segments for every individual data element - as in, every variable has its own segment. Even then, the only advantage is in how closely the segment size matches the object's size; you can do the same thing with pages, but since page sizes are fixed there is almost always a size mismatch.
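Quick arithmetic to show the mismatch I mean - a 100-byte object fenced exactly by a byte-granular segment limit versus rounded up to a 4 KiB page:

Code:
#include <stdio.h>

/* A byte-granular segment limit can fence an object exactly, while
 * page-granular protection rounds up to the page size, leaving a slop
 * of accessible-but-out-of-bounds bytes. */
#define PAGE_SIZE 4096u

int main(void)
{
    unsigned object = 100;                                   /* 100-byte object */
    unsigned paged  = (object + PAGE_SIZE - 1) / PAGE_SIZE * PAGE_SIZE;
    printf("segment limit: %u bytes, page-rounded: %u bytes, slop: %u bytes\n",
           object, paged, paged - object);                   /* slop: 3996      */
    return 0;
}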
This isn't practical for either segments or paging on the x86, in any case, because of how the page tables and segment registers work. Individually protecting every object would require a radical redesign, something along the lines of... well, a capability-addressing system.
Capability-based addressing would be part of the memory protection as well, being basically a more fine-grained form of the same memory protection, except that it checks the source of the access rather than the element being accessed. Since the burden of proof falls on the requester rather than on the state of the object, it turns the entire approach on its head, and becomes a lot more flexible and secure.
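If you want a feel for what that looks like, here's a very rough, entirely hypothetical sketch of a capability - the reference itself carries the base, length, and rights, and the check is against what the requester holds rather than a global table keyed by the object (no shipping CPU exposes exactly this):

Code:
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability: the reference carries its own rights. */
enum cap_rights { CAP_READ = 1, CAP_WRITE = 2, CAP_EXEC = 4 };

struct capability {
    uintptr_t base;     /* start of the object this capability names */
    uintptr_t length;   /* its exact size                            */
    unsigned  rights;   /* what the holder may do with it            */
};

static bool cap_allows(const struct capability *c, uintptr_t addr,
                       uintptr_t len, unsigned want)
{
    /* The requester must hold the rights AND stay within the object. */
    return (c->rights & want) == want &&
           addr >= c->base && (addr - c->base) + len <= c->length;
}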
My guess is that this is what rdos thinks segmentation is giving the system, but if so, he is mistaken. The memory protection system doesn't really provide that in any current CPU design, which is a damn shame, because it would do exactly what he (and several others, myself included) seem to be looking for.