Mostly it was seen in mainframe and minicomputer architectures from the 1960s and 1970s, though obviously Intel used it in the x86 line as well as in the iAPX 432 microprocessors. As you mention, some (but not all, according to Wikipedia) Z8000 models also used 7-bit segment numbers to extend memory addressing to a maximum of 8 MiB (128 segments of 64 KiB each). Again, going by Wikipedia, it was used in:
- Some Burroughs mainframes, specifically the B5000 and B6500, the former apparently being the first commercial system to use segmentation. Notably, both of these systems were designed with high-level languages in mind, specifically Algol: programmers were never supposed to access the underlying hardware directly or write assembly at all, and all programming, including systems programming, was meant to be done in Algol 60, though compilers for other languages did exist, IIUC.
- The General Electric GE-645 mainframe, which was the primary target and reference platform for Multics (and wow, apparently Multics was open-sourced in 2007, and emulated versions for various architectures are still being maintained... weird).
- The IBM System/38 and its successors, the AS/400, the iSeries, and System i. As with Multics, a lot of the software written for these is still supported under emulation.
- Prime Computer's Prime 400, though I gather that their other models didn't use segmentation.
That page claims that Stratus Technologies and Apollo Computer had segmented memory systems as well, but that seems incorrect.
The page for Stratus seems to indicate that they only built systems around existing CPUs and never designed their own, AFAICT; of the CPUs they did use, only the Intel Xeon was a segmented architecture.
According to the page on Apollo's systems, the only original ISA they developed was the PRISM architecture, a classic RISC design with paging but not segmentation; their earlier workstations all used either Motorola 680x0 CPUs or a proprietary bit-sliced implementation of the 68000 instruction set called the '2900'.
However, Hewlett-Packard did have a 32-bit segmented design, the FOCUS, though aside from being the first microprocessor on the market with a full 32-bit address space (which I suppose was made up of fixed 16-bit segments plus 16-bit offsets with no segment overlap, but I can't seem to find out), it doesn't seem to have really influenced anything and vanished pretty quickly. Interestingly, like the Burroughs systems, it was a pure stack machine with no programmer-accessible registers, though it didn't have the restrictions on assembly programming that the B5000 did. It also had a massive instruction set for its time - 220 instructions, comparable to the DEC VAX-11 (243 instructions for the smallest model, IIUC) and the 432 (this document says ~225 instructions, while this paper says 230). It seems to have had a less troubled history than the 432, but it was nonetheless a late CISC design that got swept away by the RISC revolution.
Interestingly, I am not seeing any segmented designs originating outside of the US at all (not counting derivatives such as the NEC V20 or the Soviet-era K1810VM86 - in fact, all the segmented systems designed outside of the US appear to be x86 clones), and the last new segmented architectures I can find any record of are the Intel iAPX 432, the HP FOCUS, and those Zilog Z8000 variants. This seems to fit with the fact that segmentation, while used for memory protection in some cases, was primarily a means of saving address bits and reducing code size - you could have a total addressable space of, say, 24, 32, or (for some older mainframes) 36 bits while only needing 8, 16, or 18 bits for the majority of address arguments - and it also allowed several clever tricks to reduce the total number of hardware address lines.
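To make the address-bit arithmetic concrete, here's a rough C sketch of the two address-formation schemes mentioned above (8086 real mode and the segmented Z8001); the particular segment and offset values are just made up for illustration:

```c
#include <stdint.h>
#include <stdio.h>

/* 8086 real mode: a 16-bit segment and a 16-bit offset combine into a
 * 20-bit physical address, with segments overlapping every 16 bytes. */
static uint32_t x86_real_mode_addr(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;            /* 20-bit result */
}

/* Z8001 style: a 7-bit segment number is simply concatenated with a
 * 16-bit offset, giving 128 non-overlapping 64 KiB segments (8 MiB). */
static uint32_t z8001_addr(uint8_t segment, uint16_t offset)
{
    return ((uint32_t)(segment & 0x7F) << 16) | offset;  /* 23-bit result */
}

int main(void)
{
    /* Most instructions only carry the 16-bit offset; the wider segment
     * part lives elsewhere, which is where the code-size saving comes from. */
    printf("8086  0x1234:0x0010          -> 0x%05X\n", x86_real_mode_addr(0x1234, 0x0010));
    printf("Z8001 seg 0x05, offset 0x0010 -> 0x%06X\n", z8001_addr(0x05, 0x0010));
    return 0;
}
```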
Which would also explain why they fell out of favor - not so much because of the problems of writing software with segmentation in mind, but because the cost of adding more address lines eventually dropped below that of the segmentation support needed to avoid them, and the price of memory fell enough that saving two bytes per address on most memory accesses just didn't seem worth bothering with, especially once more significant factors were dominating memory use and performance anyway (in particular, memory access speeds were rising much more slowly than CPU clock speeds, creating demand for chip real estate for caches and pipelining).
So, like RISC, segmented memory was primarily a pragmatic solution to the limits of the then-current technology.
(So was CISC, for that matter, which was mainly about providing a rich assembly programming environment at a time when compiler technology was still quite primitive - though of course it wasn't called CISC at the time; it was just called 'making a bigger and better instruction set', because that was what was assumed to lead to better performance and/or less expensive software.)
But unlike RISC, which was originally advocated for its performance (and for being easier to write compilers for - which IMAO is still a good enough reason to prefer it all by itself), segmentation's advantages as a design principle aren't really enough to carry it past the disappearance of the specific set of problems it was created to solve.
Designs like ARM continue mainly because they are cheaper to produce in volume, make better use of chip real estate (meaning they are better suited to SoC designs), and lend themselves to energy-efficient implementations at low-to-medium clock rates. Segmentation, however, doesn't seem to have enough of an edge over paging in terms of memory protection to justify the programming complexity it adds (though frankly, that complexity is heavily overstated; there's no real reason for application programmers to even be concerned with it, and, Linux aside, it is entirely possible to implement modern OSes on top of segmentation). It's still a lot of pain for no real gain, so most hardware manufacturers and OS devs don't bother.
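For anyone curious what 'segmentation as memory protection' actually looks like on the one segmented architecture still in wide use, here's a minimal C sketch that packs x86 protected-mode GDT descriptors - per-segment base, limit, and privilege level are the knobs the protection hangs off of. The specific bases and limits below are purely illustrative:

```c
#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

/* Pack an 8-byte x86 protected-mode segment descriptor (GDT entry):
 * a 32-bit base, a 20-bit limit, an access byte (present bit, privilege
 * level, type), and 4 flag bits (granularity, operand size). */
static uint64_t make_descriptor(uint32_t base, uint32_t limit,
                                uint8_t access, uint8_t flags)
{
    uint64_t d = 0;
    d  =  (uint64_t)(limit & 0x0000FFFF);         /* limit bits 0-15   */
    d |= ((uint64_t)(base  & 0x00FFFFFF)) << 16;  /* base bits 0-23    */
    d |= ((uint64_t)access)               << 40;  /* access byte       */
    d |= ((uint64_t)((limit >> 16) & 0xF)) << 48; /* limit bits 16-19  */
    d |= ((uint64_t)(flags & 0xF))         << 52; /* flags (G, D/B...) */
    d |= ((uint64_t)((base >> 24) & 0xFF)) << 56; /* base bits 24-31   */
    return d;
}

int main(void)
{
    /* A "flat" ring-0 code segment: base 0, 4 GiB limit, page granularity -
     * i.e. the degenerate setup most modern OSes actually load. */
    printf("flat code segment:  0x%016" PRIX64 "\n",
           make_descriptor(0x00000000, 0xFFFFF, 0x9A, 0xC));

    /* A hypothetical ring-3 data segment: base 0x00400000, 64 KiB limit,
     * byte granularity - closer to what genuinely segmented protection
     * per task or per object would look like. */
    printf("small data segment: 0x%016" PRIX64 "\n",
           make_descriptor(0x00400000, 0x0FFFF, 0xF2, 0x4));
    return 0;
}
```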