Microkernel

Microkernel (also abbreviated µK or uK) is the term describing an approach to Operating System design by which the functionality of the system is moved out of the traditional "kernel" into a set of "servers" that communicate through a "minimal" kernel, leaving as little as possible in "system space" and as much as possible in "user space".


Rationale

Microkernels were invented as a reaction to traditional "monolithic" kernel design, whereby all system functionality was put into one static program running in a special "system" mode of the processor. The rationale was that it would bring modularity to the system architecture, which would entail a cleaner system, easier to debug or dynamically modify, customizable to users' needs, and better performing.

Examples

Perhaps the best known example of microkernel design is Mach, originally developed at CMU, used in some free and some proprietary BSD Unix derivatives, and at the heart of the GNU HURD. Rumor has it that MICROS~1 Windows NT was originally meant to be a microkernel design (before growing into the bloated thing it is), but this has been denied by NT architect Dave Cutler. Other well-known microkernels include Chorus, QNX, VSTa, etc. The latest evolutions in microkernel design have led to things like the "nano-kernel" L4 or the "exokernel" Xok, where the kernel is shrunk ever further towards less functionality and less portability.

Opinionated History

At one time in the late 1980s and early 1990s, microkernels were the craze in official academic and industrial OS design, and anyone not submitting to the dogma was regarded as ridiculous (at least it seems to me from reading articles from OS conferences, or the Minix vs Linux flamefest; could people help confirm or refute this impression of mine?). But microkernels failed to deliver on their many promises of modularity, cleanliness, ease of debugging, ease of dynamic modification, customizability, and performance. This led some microkernel people to compromise by having "single-servers" that contain all the functionality, and pushing them inside kernel space (allegedly NT, hacked MkLinux), yielding the usual monolithic kernel under another name and with a contorted design. Other microkernel people instead took the even more radical view of stripping the kernel of everything but the most basic system-dependent interrupt handling and messaging capabilities, and putting the rest of the system functionality in libraries of system or user code, which again is not very different from monolithic systems like Linux that have well-delimited architecture-specific parts separated from the main body of portable code. With the rise of Linux came the possibility to benchmark monolithic versus microkernel variants of the same system, as well as to compare kernel development in various open monolithic and microkernel systems, and people were forced to acknowledge the practical superiority of "monolithic" design according to all testable criteria. Nowadays, microkernels are still the "official" way to design an OS, although you won't be laughed at anymore when you show up with a monolithic kernel. The sad truth about microkernels has been known for a long time, but as far as we know, no one in the academic world has ever dared raise any public theoretical criticism of the very concept of a microkernel. Instead, any attempt to analyze things rationally is dismissed as tackling an "obsolete" problem, including any attempt to assess whether the problem is indeed obsolete.

Reasoned Criticism

As people understood that kernels only introduce (design-time and run-time) overhead without adding any functionality that couldn't be better achieved without them (for several reasons such as efficiency, maintainability, and modularity), they tried to reduce kernel sizes as much as they could. The result is called a microkernel, which is pure overhead with no functionality at all. There has thus been a (now waning) craze in Operating System research and development to boast about using a microkernel.

I contend that microkernels are a deeply flawed idea: instead of removing the overhead, they concentrate and multiply it. The overall space/time cost of the OS is not reduced at all, since the functionality has only been moved out of the kernel into "servers"; only now there is additional overhead in space, in time, and in design, to manage the information flow of system services, which must now go from user to kernel and then from kernel to server. Because of the low abstraction level of microkernels, lots of low-level bindings must be written for the "servers" that provide functionality, so nothing is gained at the user/server interface either.
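As a rough illustration of that information flow (a sketch only: struct msg, msg_send(), msg_receive(), FS_SERVER, READ_REQUEST and fs_read() are made-up names, not any real microkernel's API), compare what a single read request costs in a monolithic design versus a microkernel design:

    #include <stddef.h>
    #include <string.h>

    /* Hypothetical declarations, for illustration only. */
    struct msg { int op; int fd; size_t len; char data[4096]; };
    enum { FS_SERVER = 1, READ_REQUEST = 2 };
    extern long fs_read(int fd, void *buf, size_t len);
    extern void msg_send(int server, const struct msg *m);
    extern void msg_receive(int server, struct msg *m);

    /* Monolithic path: one trap into the kernel, then a direct call. */
    long mono_read(int fd, void *buf, size_t len)
    {
        return fs_read(fd, buf, len);
    }

    /* Microkernel path: marshal the request, cross into the kernel, have it
     * relayed to the file server, then unmarshal the reply -- the same
     * functionality, plus extra protection-domain crossings and copies. */
    long micro_read(int fd, void *buf, size_t len)
    {
        struct msg req = { .op = READ_REQUEST, .fd = fd, .len = len };
        struct msg rep;
        msg_send(FS_SERVER, &req);       /* user -> kernel -> server */
        msg_receive(FS_SERVER, &rep);    /* server -> kernel -> user */
        memcpy(buf, rep.data, rep.len < len ? rep.len : len);
        return (long)rep.len;
    }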

As a result, microkernel-based systems are slower, bigger, harder to program, and harder to customize than monolithic kernels. The only valid rationale for them is that they encourage some modularity. However, the modularity microkernels enforce on system programmers is of a very low-level kind, which implies the overhead of (un)marshalling, as well as a total lack of consistency or trust between communicating servers. In comparison, "monolithic" systems can achieve arbitrary useful modularity with dynamically-loaded kernel code, allowing automatic enforcement of whatever consistency the system programming languages can express (for instance, strong static typing with module scoping in Modula-3-programmed SPIN and Standard ML-programmed Fox, or just weak typing with filtered global symbol matching in C-programmed Linux).
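For the Linux case just mentioned, the modularity in question looks like the following minimal skeleton of a dynamically loadable kernel module, using the standard module_init/module_exit interface; it can be loaded and unloaded at run time, yet runs in the same address space as the rest of the kernel and resolves its imports by global symbol matching:

    #include <linux/init.h>
    #include <linux/kernel.h>
    #include <linux/module.h>

    /* Minimal loadable module: inserted with insmod, removed with rmmod. */
    static int __init hello_init(void)
    {
            pr_info("hello: module loaded\n");
            return 0;
    }

    static void __exit hello_exit(void)
    {
            pr_info("hello: module unloaded\n");
    }

    module_init(hello_init);
    module_exit(hello_exit);

    MODULE_LICENSE("GPL");
    MODULE_DESCRIPTION("Minimal example of in-kernel modularity");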

Thinking that microkernels may enhance computational performance can only stem from a typically myopic analysis: indeed, at every place where functionality is implemented, things look locally simpler and more efficient. But if you look at the whole picture and sum the local effects of microkernel design all over the place, it is obvious that the global effect is complexity and bloat in as much as the design was followed, i.e. at every server barrier. For an analogy, take a big, heavy piece of beef, chop it into small morsels, wrap those morsels in hygienic plastic bags, and link those bags together with string; while each morsel is much smaller than the original piece of beef, the end result will be heavier than the beef by the weight of the plastic and string, and the smaller the morsels, the greater the relative overhead (i.e. the more someone boasts about the local simplicity achieved by his µK, the more global complexity he has actually added with respect to a similar design without a µK). Microkernels only generate artificial barriers between functionalities, and any simplicity in the servers is only the intrinsic simplicity of the provided functionality, which is independent of the existence of low-level barriers around it. Every part of a µK-based design is simpler (than a whole system), of course, because the design has butchered the system into small parts! But if one considers whole systems of the same functionality, the only thing a µK does is introduce stupid low-level barriers between services. The services are still there, and their intrinsic complexity isn't reduced: for every small part of a µK-based system, one could find a corresponding, smaller or equal part in a same-functionality non-µK system, namely the one that implements the same functionality without having to marshal data to cross barriers.
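To put rough numbers on the plastic-and-string analogy, here is a toy model (illustrative figures only, not a benchmark): hold the useful work constant, charge a fixed cost per server boundary crossing, and watch the share of pure overhead grow as the system is chopped finer:

    #include <stdio.h>

    /* Toy model: total cost = intrinsic work + per-boundary crossing cost.
     * The useful work does not shrink when spread over more servers;
     * only the overhead grows with the number of boundaries. */
    int main(void)
    {
        const double work = 1000.0;      /* intrinsic cost of the functionality */
        const double crossing = 50.0;    /* cost of one boundary crossing */

        for (int servers = 1; servers <= 16; servers *= 2) {
            double total = work + crossing * (double)servers;
            printf("%2d servers: total %.0f, overhead %4.1f%%\n",
                   servers, total, 100.0 * crossing * servers / total);
        }
        return 0;
    }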

Microkernels start from the (Right) idea of having modular high-level system design, and confuse the issue so as to end up with the (Wrong) idea of its naive implementation as a low-level centralized run-time module manager, which constitutes a horrible abstraction inversion. So they have system programmers manually emulate an asynchronous parallel actor model with coarse-grained C-programmed polling processes, instead of directly using a real fine-grained actor language with an optimizing compiler (Erlang, Mozart/Oz, Modula-3, some concurrent variant of Lisp or ML or Haskell, etc.). The discrepancy between the model and its naive and awkward implementation induces lots of overhead, which gets worked around with lots of stupid compromises, yielding a two-level programming system: objects are segregated into a finite set of servers and a kernel, with completely different programming models for combining objects within the same space and for combining objects across spaces. Performance gets so bad that most "basic resources" must be statically special-cased in the "microkernel" anyway, and people group as much functionality as they can into every server so as not to pay the price of inter-server communication during their interaction. Semantics also becomes very difficult to get right, since low-level interactions make a hell out of debugging the already complex concurrent actor model. In the end, people put the whole of the OS services into a monolithic "single-server", which completely defeats the whole purpose of a microkernel! As a result, everything gets both more complicated and slower! Of course, the very same conclusion holds for kernels in general; by pushing the idea of kernels to its limit, microkernels only end up proving the whole idea's inadequacy.
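The "coarse-grained C-programmed polling process" in question typically looks like the sketch below (receive(), reply() and the message tags are hypothetical names, not a real API): a server blocks on its mailbox and dispatches on a message tag, manually re-implementing what an actor language would express as an ordinary function call or pattern match:

    /* Hypothetical single-threaded server loop, for illustration only. */
    struct msg { int tag; int sender; long arg; long result; };
    enum { MSG_OPEN, MSG_READ, MSG_WRITE, MSG_SHUTDOWN };
    extern void receive(struct msg *m);
    extern void reply(int sender, const struct msg *m);

    void server_main(void)
    {
        struct msg m;
        for (;;) {
            receive(&m);                   /* block until a request arrives */
            switch (m.tag) {               /* manual dispatch on the message tag */
            case MSG_OPEN:  m.result = /* do_open(m.arg)  */ 0; break;
            case MSG_READ:  m.result = /* do_read(m.arg)  */ 0; break;
            case MSG_WRITE: m.result = /* do_write(m.arg) */ 0; break;
            case MSG_SHUTDOWN: return;
            default: m.result = -1; break;
            }
            reply(m.sender, &m);           /* send the result back */
        }
    }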

The only possible justification for a microkernel is not technical but political: a microkernel is the only way to allow, with any robustness, the existence of black-box proprietary third-party binary modules that access and provide deep system resources without anyone having to disclose source code. Microkernels are technically the worst possible organization for system code of the same functionality, and the fact that the proprietary closed-source development model encourages such horrors accounts for the deep evil behind that model. It has been suggested that a psychological reason behind the abstraction inversion is that, by a misled tradition, the "Operating System" community stubbornly refuses to mess with language issues (it claims "language independence") and sticks to designing system interfaces for bit-level languages; but we can also trace this wish for language "independence" back to the political issue of proprietary software, since it is what induces the eager clustering of computing into hermetic fields where no one can modify or adapt (proprietary) code from other fields, and forces people into the paranoid "trust no one, never cooperate: even if you want to, you can't" behavior.

The latest developments in microkernels (L4, Xok) amount to reducing as much as possible the semantics and overhead of the kernel, and putting everything in either servers or system libraries. The logical next step beyond these developments would be to reduce the microkernel to zero, naught, nada, and have everything in "modules" that constitute the system: high-level concepts without necessarily any obvious, one-to-one direct correspondence between high-level compile-time module boundaries and low-level run-time code barriers. Depending on the point of view, this leaves us either with "monolithic" systems, or with systems without a privileged kernel at all (such as systems built atop the Flux OSKit). Such is the right way, in our opinion: to provide high-level modular design, but without any kernel at all. Kernels are but a stubborn, straightforward, low-level implementation of module management through a centralized run-time message-passing agent. Tunes will provide an optimizing compiler so that local message passing, which is only a low-level model for the application of a function, will be completely inlined.
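The last point can be pictured as follows (a sketch, not actual Tunes code): what a microkernel phrases as a run-time message send to a server is, semantically, just the application of a function, and a compiler that knows both sides live in the same program can reduce it to a direct, inlinable call:

    /* Sketch only: when caller and "server" are compiled together, the
     * message send below carries no run-time machinery at all -- the
     * compiler can inline it down to the ordinary call it always was. */
    static inline long fs_read(int fd, char *buf, long len)
    {
        (void)fd; (void)buf; (void)len;
        /* ... the file-system "server" code, now just a function ... */
        return 0;
    }

    /* What a microkernel would phrase as send(FS_SERVER, READ, fd, buf, len)
     * followed by a receive() is here simply: */
    long do_read(int fd, char *buf, long len)
    {
        return fs_read(fd, buf, len);    /* "message passing", fully inlined */
    }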

See also Microkernel Debate.


Pages in this topic: Amoeba   chaOS   Chorus   Exokernel   hOp   Knewcleus   L4   Mach   MorphOS   Off   OpenBLT   Phantom OS   QNX   rtmk   Spoon   Topsy   VSTa   VxWorks   YCX2  


Also linked from: Abstraction Inversion   AdaOS   Andy Tanenbaum   Apostle   Io   Kernel   Microkernel Debate   Microkernel-based OS   OJOS   Orthogonal Persistence   Trotskyite Tunes