Safe Systems-Programming Language
Posted: Mon Oct 15, 2007 1:51 pm
Sorry about quoting myself, but that brings in the background to this topic. And I object to the safe languages on the basis that I can't kernel hack with them (why not a statically-typed safe language that allowed pointers in declared-unsafe sections, just like Object Pascal allows inlined assembler blocks?), and their lack of manual, byte-by-byte memory manipulation makes interfacing them to foreign code rather difficult.
I remember how easy it was to generate Delphi units for major C++ libraries (ran on same OS and same processor == WORKED, with as little as a non-default calling convention), and I cry when I read about "foreign function interfaces" of safer languages like Haskell or ML.
Colonel Kernel proposed that we should move towards better languages for writing systems software as well as user-land software, probably based on compile-time static checks rather than interpretation, JIT compiling, code scanning or run-time checks. A couple of others and I have objected that:
1) Such languages make peek-and-poke hardware banging difficult in a way that C, C++, Object Pascal, Objective C, and even unsafe C# do not.
2) Such languages tend to use garbage collection, which requires a language runtime library beyond what system hackers feel comfortable with. For example, manually specifying the lifetime of a closure can get very, very ugly.
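To make objection 1 concrete: "peek-and-poke" means reading and writing raw addresses directly. Below is a minimal sketch of what that looks like when the pointer work is confined to explicitly marked unsafe code. It is written in Rust, which is not one of the languages discussed in this thread and is used here only for illustration. The names `peek32`, `poke32` and `fake_register` are made up, and an ordinary local variable stands in for a real device register so the sketch can run in user space.

```rust
use core::ptr;

/// "Peek": read one 32-bit word from a raw address.
/// Safety: `addr` must be valid, aligned and readable.
unsafe fn peek32(addr: *const u32) -> u32 {
    // Volatile access keeps the compiler from eliding or reordering the read.
    unsafe { ptr::read_volatile(addr) }
}

/// "Poke": write one 32-bit word to a raw address.
/// Safety: `addr` must be valid, aligned and writable.
unsafe fn poke32(addr: *mut u32, value: u32) {
    unsafe { ptr::write_volatile(addr, value) }
}

fn main() {
    // Assumption: a plain variable plays the role of a memory-mapped register.
    let mut fake_register: u32 = 0;
    let reg = &mut fake_register as *mut u32;

    // The checker takes everything inside this block on trust.
    unsafe {
        poke32(reg, 0xDEAD_BEEF);
        assert_eq!(peek32(reg), 0xDEAD_BEEF);
    }
}
```

In a real kernel the address would come from the hardware documentation rather than from a local variable; the point is only that the raw access is fenced off from the checked code around it.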
Still, I'd like to see if someone can overcome those objections to produce a better systems language. In that spirit, I'd like to propose a few things:
A) A fully parametric static type system with types evaluated at "macro-evaluation" time. Type inference or Turing-complete type system optional.
B) Strict functional pass-by-value semantics with the sole exception of object-reference types being available. I'm not sure how to make those references scope-safe (though they can be type-checked, since they will have type). Programmers should still be able to allocate objects as normal, static variables, however.
C) Pointers allowed in explicitly-declared "unsafe" sections of code that the type and safety checkers assume correct. This will let kernel hackers shoot ourselves in the foot without sacrificing the compiler's help where we want it.
D) No object constructors. Instead, the converse of constructors: a function that will, given a reference to an appropriately-sized buffer of memory (type-checked?), construct an object in that space and return a reference to the object. This allows the construction of arbitrary structs, objects, and even closures (see below) while leaving the exact specifics of memory allocation up to the programmer (who can choose manual allocation or garbage collection themselves). I know everyone loves garbage collection, but it doesn't really work all that well at the kernel level.
E) Functional programming features. This means that a lambda-expression should create a type of anonymous function whose converse-of-a-constructor returns a reference to the function (possibly requiring memory to allocate a closure environment and copying, by value instead of reference, all required data into the closure). Since everything is statically-typed, we should have some kind of type-checked apply function. These will allow us to use more truly elegant algorithms in low-level systems work.
F) The language must work without a runtime library. That is, every possible statement or expression that this language can express must be compilable in the absence of a runtime library. That statement holds true for C and currently makes it the most popular systems language, because kernel coders can write only the bits of standard library they need instead of coding dummy functions for file access (Object Pascal, right there) just to make their kernel compile.
G) The ABSOLUTELY NECESSARY feature -- at least some types must be included in the language that map to things like bytes, words, longwords, 32-bit floats and other types that assembly understands. This is absolutely critical to interoperating with other languages (just like pointers), so put it in the unsafe sections if we like, but the language must have those types available.
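Points D, E and G can be sketched together, with the caveat that this is only an approximation in an existing language (Rust), not the language proposed here. Assumptions: `construct_in`, `DeviceHeader` and `make_adder` are hypothetical names invented for the example, and the value is built first and then moved into the caller's buffer, which is a weaker guarantee than true in-place construction. The code touches only `core` paths, in the spirit of point F, although the demo itself is an ordinary hosted program.

```rust
use core::mem::MaybeUninit;

/// Hypothetical "converse of a constructor" (point D): the caller supplies
/// correctly sized and typed storage, the function fills it and hands back
/// a reference to the finished object. Where the storage lives is the
/// caller's decision (stack, static, or any allocator).
fn construct_in<T>(slot: &mut MaybeUninit<T>, value: T) -> &mut T {
    slot.write(value)
}

/// Point G: fields that map directly onto machine-level types, laid out
/// the way C would lay them out.
#[repr(C)]
struct DeviceHeader {
    tag: u8,     // byte
    flags: u16,  // word
    length: u32, // longword
    scale: f32,  // 32-bit float
}

/// Point E: the closure copies `offset` into its own environment by value.
fn make_adder(offset: u32) -> impl Fn(u32) -> u32 {
    move |x| x + offset
}

fn main() {
    // Storage chosen by the caller; here it is simply on the stack.
    let mut header_slot = MaybeUninit::<DeviceHeader>::uninit();
    let header = construct_in(
        &mut header_slot,
        DeviceHeader { tag: 1, flags: 0x8000, length: 64, scale: 0.5 },
    );
    header.length += 16;
    assert_eq!(header.tag, 1);
    assert_eq!(header.flags, 0x8000);
    assert_eq!(header.length, 80);
    assert_eq!(header.scale, 0.5);

    // The closure environment goes into caller-provided storage as well.
    let mut closure_slot = MaybeUninit::uninit();
    let add_ten = construct_in(&mut closure_slot, make_adder(10));
    assert_eq!(add_ten(32), 42);
}
```

Nothing here needs a garbage collector: each object lives exactly as long as the buffer the caller provided. This sketch never runs destructors for objects placed in a slot, which is fine for plain data like this but would need handling for anything that owns resources.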
I also thought of having Lispy syntax (i.e., nearly no syntax) so that compile-time macros could do the work of most control structures, but I'm not sure how well that can work with the other stuff.
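As a rough illustration of macros standing in for control structures, here is a sketch using Rust's pattern-based macros rather than true Lisp macros, so it only hints at what a Lispy syntax would allow. The macros `unless!` and `repeat!` and the function `count_ticks` are invented for the example; both macros expand at compile time into ordinary `if` and `for` code, with no runtime support.

```rust
/// Hypothetical `unless` control structure: run the body when the
/// condition is false.
macro_rules! unless {
    ($cond:expr, $body:block) => {
        if !$cond $body
    };
}

/// Hypothetical `repeat` control structure: run the body `$count` times.
macro_rules! repeat {
    ($count:expr, $body:block) => {
        for _ in 0..$count $body
    };
}

/// Counts up to `limit` ticks, skipping every tick while `paused` is true.
fn count_ticks(limit: u32, paused: bool) -> u32 {
    let mut ticks = 0;
    repeat!(limit, {
        unless!(paused, {
            ticks += 1;
        });
    });
    ticks
}

fn main() {
    assert_eq!(count_ticks(5, false), 5);
    assert_eq!(count_ticks(5, true), 0);
}
```

The core language still has to provide `if` and `for` here; a Lisp-style design could push even those into macros over a smaller set of primitives.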
Just pipe dreaming.