Re: Linux disdain
Posted: Sun Feb 14, 2021 1:33 pm
Getting back to the topic, complaints about the growing size and worsening performance of Unix (and OSes in general) are by no means new. In her dissertation on the Synthesis kernel in 1991, Dr Massalin had quite a lot to say about why this general tendency seems to occur (footnotes in the original elided; the full dissertation can be found in PDF form here). While her purpose was to argue for her own solutions to the problem of reducing latency while increasing extensibility, these points still stand even if you disagree with her vis-à-vis code synthesis.
Early kernels tended to be large, isolated, monolithic structures that were hard to maintain. IBM's MVS is a classic example. Unix initially embodied the "small is beautiful" ideal. It captured some of the most elegant ideas of its day in a kernel design that, while still monolithic, was small, easy to understand and maintain, and provided a synergistic, productive, and highly portable set of system tools. However, its subsequent evolution and gradual accumulation of new services resulted in operating systems like System V and Berkeley's BSD 4.3, whose large, sprawling kernels hearken back to MVS.
There are two common measures of performance: throughput and latency. Throughput is a measure of how much useful work is done per unit time. Latency is a measure of how long it takes to finish an individual piece of work. Traditionally, high performance meant increasing the throughput - performing the most work in the minimum time. But traditional ways of increasing throughput also tend to increase latency.
The classic way of increasing throughput is by batching data into large chunks which are then processed together. This way, the high overhead of initiating the processing is amortized over a large quantity of data. But batching increases latency because data that could otherwise be output instead sits in a buffer, waiting while it fills, causing delays. This happens at all levels. The mainframe batch systems of the 1960's made efficient use of machines, achieving high throughput but at the expense of intolerable latency for users and grossly inefficient use of people's time. In the 1970's, the shift toward timesharing operating systems made for a slightly less efficient use of the machine, but personal productivity was enormously improved. However, calls to the operating system were expensive, which meant that data had to be passed in big, buffered chunks in order to amortize the overhead.
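To make the batching tradeoff concrete (the sketch below is mine, not the dissertation's), compare two ways of emitting a log line in C: writing it straight to the descriptor versus letting stdio batch lines in a user-space buffer.

/* My illustration of batching vs. latency -- not code from the dissertation. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Unbatched: one write() system call per line.  The line is visible
 * immediately (low latency), but the per-call overhead is paid every time. */
void log_direct(int fd, const char *line) {
    write(fd, line, strlen(line));
}

/* Batched: stdio accumulates lines in a user-space buffer and issues one
 * large write() only when the buffer fills.  The syscall overhead is
 * amortized over many lines, but a line can sit invisible in the buffer
 * for a long time -- exactly the latency cost described above. */
void log_buffered(FILE *out, const char *line) {
    fputs(line, out);    /* usually no system call at all */
}

The same shape appears at every level she mentions, from stdio buffers up to batch job queues.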
This is still true today [...] In light of these large overheads, it is interesting to examine the history of operating system performance, paying particular attention to the important, low-level operations that are exercised often, such as context switch and system call dispatch. We find that operating systems have historically exhibited large invocation overheads.
As new applications demand more functionality, the tendency has been simply to layer on more functions. This can slow down the whole system because often the mere existence of a feature forces extra processing steps, regardless of whether that feature is being used or not. New features often require extra code or more levels of indirection to select from among them. Kernels become larger and more complicated, leading designers to restructure their operating systems to manage the complexity and improve understandability and maintainability. This restructuring, if not carefully done, can reduce performance by introducing extra layers and overhead where there was none before.
Instead of attacking the problem of high kernel overhead directly, performance problems are being solved with more buffering, applied in ever more ingenious ways to a wider array of services. Look, for example, at recent advances in thread management. A number of researchers begin with the premise that kernel thread operations are necessarily expensive, and go on to describe the implementation of a user-level threads package. Since much of the work is now done at the user-level by subscheduling one or more kernel-supplied threads, they can avoid many kernel invocations and their associated overhead.
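To give a rough flavor of what "subscheduling at the user level" means (my sketch, using the ucontext API that System V and later POSIX standardized, not anything from the dissertation or those papers): a user-level threads package saves and restores register contexts itself instead of asking the kernel's scheduler to switch for it.

/* Minimal user-level context switch.  The scheduling decision happens
 * entirely in user space; there is no trip through the kernel scheduler.
 * (glibc's swapcontext may still issue a signal-mask syscall internally,
 * but that is an implementation detail, not a reschedule.) */
#include <stdio.h>
#include <ucontext.h>

static ucontext_t main_ctx, thread_ctx;
static char thread_stack[64 * 1024];

static void user_thread(void) {
    puts("user-level thread running");
    swapcontext(&thread_ctx, &main_ctx);   /* "yield" back to main */
}

int main(void) {
    getcontext(&thread_ctx);
    thread_ctx.uc_stack.ss_sp   = thread_stack;
    thread_ctx.uc_stack.ss_size = sizeof thread_stack;
    thread_ctx.uc_link          = &main_ctx;
    makecontext(&thread_ctx, user_thread, 0);

    swapcontext(&main_ctx, &thread_ctx);   /* user-level "dispatch" */
    puts("back in the main context");
    return 0;
}

A real package wraps this in a run queue and parks blocked threads, but the core trick is just this: most "thread switches" never enter the kernel.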
But there is a tradeoff: increased performance for operations at the user level comes with increased overhead and latency when communicating with the kernel. One reason is that kernel calls no longer happen directly, but first go through the user-level code. Another is that optimizing kernel invocations is no longer deemed as important, since they occur less often.
These problems became apparent to several research teams, and a number of new system projects intended to address the problem were begun. For example, recognizing the need for clean, elegant services, the Mach group at CMU started with the BSD kernel and factored services into user-level tasks, leaving behind a very small kernel of common, central services. Taking a different approach, the Plan 9 group at AT&T Bell Laboratories chose to carve the monolithic kernel into three sub-kernels, one for managing files, one for computation, and one for user interfaces. Their idea is to more accurately and flexibly fit the networks of heterogeneous machines that are common in large organizations today.
There are difficulties with all these approaches. In the case of Mach, the goal of kernelizing the system by placing different services into separate user-level tasks forces additional parameter passing and context switches, adding overhead to every kernel invocation. Communication between the pieces relies heavily on message passing and remote procedure call. This adds considerable overhead despite the research that has gone into making them fast. While Mach has addressed the issues of monolithic design and maintainability, it exacerbates the overhead and latency of system services. Plan 9 has chosen to focus on a particular cut of the system: large networks of machines. While it addresses the chosen problem well and extends the productive virtues of Unix, its arrangement may not be as suitable for other machine topologies or features, for example, the isolated workstation in a private residence, or those with richer forms of input and output, such as sound and video, which I believe will be common in the near future.
A good operating system provides numerous useful services to make applications easy to write and easy to interconnect. To this end, it establishes conventions for packaging applications so that formats and interfaces are reasonably well standardized. The conventions encompass two forms: the model, which refers to the set of abstractions that guide the overall thinking and design; and the interface, which refers to the set of operations supported and how they are invoked. Ideally, we want a simple model, a powerful interface, and high performance. But these three are often at odds.
Witness the MVS I/O system, which has a complex model but offers a powerful interface and high performance. Its numerous options offer the benefit of detailed, precise control over each device, but with the drawback that even simple I/O requires complex programming.
Unix is at the other end of the scale. Unix promoted the idea of encapsulating I/O in terms of a single, simple abstraction. All common I/O is accomplished by reading or writing a stream of bytes to a file-like object, regardless of whether the I/O is meant to be viewed on the user's terminal, stored as a file on disk, or used as input to another program. Treating I/O in a common manner offers great convenience and utility. It becomes trivial to write and test a new program, viewing its output on the screen. Once the program is working, the output can be sent to the intended file on disk without changing a line of code or recompiling.
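The canonical demonstration of that model (mine, not hers) is a program that just writes bytes to descriptor 1 and has no idea whether a terminal, a disk file, or another program is on the other end:

/* Everything-is-a-byte-stream in one function call. */
#include <string.h>
#include <unistd.h>

int main(void) {
    const char *msg = "hello from the byte-stream model\n";
    write(STDOUT_FILENO, msg, strlen(msg));   /* fd 1: whatever it happens to be */
    return 0;
}

Compile it to, say, ./hello; then run ./hello to view the output on the terminal, ./hello > out.txt to store it in a file, or ./hello | wc -c to feed it to another program -- no code change, no recompile, exactly as she describes.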
But an oversimplified model of I/O brings with it a loss of precise control. This loss is not important for the great majority of Unix tools -- it is more than compensated for by the synergies of a diverse set of connectable programs. But other, more complex applications such as a database management system (DBMS) require more detailed control over I/O. Minimally, for a DBMS to provide reasonable crash recovery, it must know when a write operation has successfully finished placing the data on disk; in Unix, a write only copies the data to a kernel buffer, and the movement of data from there to disk occurs later, asynchronously, so in the event of an untimely crash, data still waiting in the buffers is lost. Furthermore, a well-written DBMS has a good idea of which areas of a file are likely to be needed in the future, and its performance improves if this knowledge can be communicated to the operating system; by contrast, Unix hides the details of kernel buffering, impeding such optimizations in exchange for a simpler interface.
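The escape hatch later Unixes grew for exactly this problem is fsync(2), which blocks until the kernel's buffers for the file have actually been flushed to the device. A minimal sketch (mine, not the dissertation's) of what a DBMS must do before it may call a record durable:

/* write() alone only reaches the kernel's buffer cache; fsync() forces the
 * buffered data out to disk before returning. */
#include <unistd.h>

int append_record(int fd, const void *rec, size_t len) {
    if (write(fd, rec, len) != (ssize_t)len)
        return -1;            /* at this point the data is only in kernel buffers */
    if (fsync(fd) != 0)
        return -1;
    return 0;                 /* only now is it safe to report the record committed */
}

Even this is a blunt instrument -- it flushes everything for the file, not just the record you care about -- which is part of why DBMS authors have long wanted finer control than the Unix model offers.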
Later versions of Unix extended the model, making up some of the loss, but these extensions were not "clean" in the sense of the original Unix design. They were added piecemeal as the need arose. For example, ioctl (for I/O controls) and the select system call help support out-of-band stream controls and non-blocking (polled) I/O, but these solutions are neither general nor uniform. Furthermore, the granularity with which Unix considers an operation "non-blocking" is measured in tens of milliseconds. While this was acceptable for the person-typing-on-a-terminal mode of user interaction of the early 1980's, it is clearly inappropriate for handling higher-rate interactive data, such as sound and video.
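For reference, here is roughly what the select-based polling she mentions looks like in practice (my sketch, not from the dissertation). The struct timeval argument promises microsecond resolution on paper, but the kernels of that era only honored it to the granularity of a scheduler tick -- hence the "tens of milliseconds" she complains about.

/* Wait until stdin has data or ~20 ms elapse, whichever comes first. */
#include <stdio.h>
#include <sys/select.h>
#include <unistd.h>

int main(void) {
    fd_set readable;
    struct timeval tv = { .tv_sec = 0, .tv_usec = 20000 };   /* request a 20 ms timeout */

    FD_ZERO(&readable);
    FD_SET(STDIN_FILENO, &readable);

    int ready = select(STDIN_FILENO + 1, &readable, NULL, NULL, &tv);
    if (ready > 0 && FD_ISSET(STDIN_FILENO, &readable))
        printf("stdin has data\n");
    else if (ready == 0)
        printf("timed out\n");
    return 0;
}

Fine for a user at a keyboard; hopeless for audio samples arriving every 125 microseconds.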