Safe Systems-Programming Language

Posted: Mon Oct 15, 2007 1:51 pm
by Crazed123
And I object to the safe languages on the basis that I can't kernel hack with them (why not a statically-typed safe language that allowed pointers in declared-unsafe sections just like Object Pascal allows inlined assembler blocks?) and their lack of manual, byte-by-byte memory manipulation makes interfacing them to foreign code rather difficult.

I remember how easy it was to generate Delphi units for major C++ libraries (ran on same OS and same processor == WORKED, with as little as a non-default calling convention), and I cry when I read about "foreign function interfaces" of safer languages like Haskell or ML.
Sorry about quoting myself, but that brings in the background to this topic.

Colonel Kernel proposed that we should move towards better languages for writing systems software as well as user-land software, probably based on compile-time static checks rather than interpretation, JIT compiling, code scanning or run-time checks. Myself and a couple of others have objected that:

1) Such languages make peek-and-poke hardware banging difficult in a way that C, C++, Object Pascal, Objective C, and even unsafe C# do not.
2) Such languages tend to use garbage collection, which requires a language runtime library beyond what system hackers feel comfortable with. For example, manually specifying the lifetime of a closure can get very, very ugly.

Still, I'd like to see if someone can overcome those objections to produce a better systems language. In that spirit, I'd like to propose a few things:

A) A fully parametric static type system with types evaluated at "macro-evaluation" time. Type inference or Turing-complete type system optional.
B) Strict functional pass-by-value semantics with the sole exception of object-reference types being available. I'm not sure how to make those references scope-safe (though they can be type-checked, since they will have type). Programmers should still be able to allocate objects as normal, static variables, however.
C) Pointers allowed in explicitly-declared "unsafe" sections of code that the type- and safety- checkers assume correct. This will let kernel hackers shoot ourselves in the foot without sacrificing the compiler's help where we want it.
D) No object constructors. Instead, the converse of constructors: a function that will, given a reference to an appropriately-sized buffer of memory (type-checked?), construct an object in that space and return a reference to the object. This allows the construction of arbitrary structs, objects, and even closures (see below) while leaving the exact specifics of memory allocation up to the programmer (who can choose manual allocation or garbage collection themselves). I know everyone loves garbage collection, but it doesn't really work all that well at the kernel level.
E) Functional programming features. This means that a lambda-expression should create a type of anonymous function whose converse-of-a-constructor returns a reference to the function (possibly requiring memory to allocate a closure environment and copying, by value instead of reference, all required data into the closure). Since everything is statically-typed, we should have some kind of type-checked apply function. These will allow us to use more truly elegant algorithms in low-level systems work.
F) The language must work without a runtime library. That is, every possible statement or expression that this language can express must be compilable in the absence of a runtime library. That statement holds true for C and currently makes it the most popular systems language, because kernel coders can write only the bits of standard library they need instead of coding dummy functions for file access (Object Pascal, right there) just to make their kernel compile.
G) The ABSOLUTELY NECESSARY feature -- at least some types must be included in the language that map to things like bytes, words, longwords, 32-bit floats and other types that assembly understands. This is absolutely critical to interoperating with other languages (just like pointers), so put it in the unsafe sections if we like, but the language must have those types available.
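To make points C, D and G a little more concrete, here is a minimal sketch in Rust, a language that postdates this thread and is not the language being proposed. It only illustrates the ideas: machine-sized types, a "converse of a constructor" that builds an object in caller-supplied storage, and byte-level access confined to an explicit unsafe section. The names `Descriptor`, `construct_descriptor` and `peek_byte` are made up for this example.

```rust
use std::mem::MaybeUninit;

/// Fixed-layout record built only from machine-sized types (point G).
/// Hypothetical example type, not taken from any real kernel.
#[repr(C)]
#[derive(Debug, PartialEq)]
struct Descriptor {
    base: u32,
    limit: u16,
    flags: u8,
    present: u8,
}

/// "Converse of a constructor" (point D): the caller supplies the storage,
/// the function initialises it and hands back a typed reference.
fn construct_descriptor(
    slot: &mut MaybeUninit<Descriptor>,
    base: u32,
    limit: u16,
) -> &mut Descriptor {
    slot.write(Descriptor { base, limit, flags: 0, present: 1 })
}

/// Byte-by-byte peek, confined to an explicitly unsafe section (point C).
fn peek_byte(d: &Descriptor, offset: usize) -> u8 {
    assert!(offset < std::mem::size_of::<Descriptor>());
    let p = d as *const Descriptor as *const u8;
    // SAFETY: offset is bounds-checked above, and the repr(C) layout
    // u32 + u16 + u8 + u8 has no padding, so every byte is initialised.
    unsafe { p.add(offset).read() }
}

fn main() {
    // Storage lives on the stack here; no allocator or GC is involved.
    let mut slot = MaybeUninit::<Descriptor>::uninit();
    let d = construct_descriptor(&mut slot, 0x1000, 0xFFFF);
    d.flags = 0x9A;
    let flags_byte = peek_byte(d, 6); // flags sits at byte offset 6
    println!("{:?}, byte 6 = {:#x}", d, flags_byte);
}
```

Where the storage comes from (a static, the stack, a manual allocator, a collector) is left entirely to the caller, which is the property point D asks for.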

I also thought of having Lispy syntax (ie: nearly no syntax) so that compile-time macros could do the work of most control structures, but I'm not sure how well that can work with the other stuff.

Just pipe dreaming.

Posted: Mon Oct 15, 2007 3:58 pm
by Alboin
What about some higher-level data types that don't require any support (e.g., tuples)?

Posted: Mon Oct 15, 2007 5:01 pm
by Crazed123
Aren't tuples just structs written with numerical indices to the members instead of symbolic names? What's so great about that?

Posted: Mon Oct 15, 2007 5:05 pm
by Alboin
Crazed123 wrote:Aren't tuples just structs written with numerical indices to the members instead of symbolic names? What's so great about that?

(x, y, z) = function(a);
That is, instead of using pointers, or creating a new structure, which, IMO, are slightly ugly.
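For comparison, this is roughly what that looks like in a statically typed language with built-in tuples. The sketch is in Rust, and `split_rgb` is a made-up function used only to show several results coming back without out-pointers or a named struct.

```rust
/// Hypothetical function returning three results at once as a tuple.
fn split_rgb(pixel: u32) -> (u8, u8, u8) {
    ((pixel >> 16) as u8, (pixel >> 8) as u8, pixel as u8)
}

fn main() {
    // Destructure the result directly into three variables.
    let (r, g, b) = split_rgb(0x00FF8040);
    println!("{} {} {}", r, g, b); // prints "255 128 64"
}
```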

Posted: Mon Oct 15, 2007 5:27 pm
by Crazed123
OK, we could have structs replaced with tuples that can optionally include symbolic "tags" (ie: variable names) for the tuple members.

When you look at it that way, you wonder which idiot came up with structs that couldn't be used as tuples. It's just syntactic sugar for passing a struct or pointer around.

Posted: Mon Oct 15, 2007 6:12 pm
by Alboin
Because of B and E, normal functions wouldn't be needed, would they? One would simply assign a lambda function to a variable, and then call the variable. However, this would introduce another type, that is, function/lambda.

I think this would make objects easier as well: since functions are simply variables, we just create a tagged tuple, and set it up. Some things are yet unknown...

What about templates, or something of the sort?

Posted: Mon Oct 15, 2007 6:44 pm
by Crazed123
I was thinking that was part of what a parameterized type system means.

And the thing with assigning first-class-functions to variables is... that's exactly how Lisp-1 languages do it. Lisp-2s (like C and other Algol-based languages) have separate namespaces for functions and variables.

Making functions a first-class member of the language would actually mean introducing (am I naming it correctly?) a new type-class. The actual function signatures would designate specific types of functions (since we want a strongly- and statically-typed language to minimize run-time performance hits from safety checks).

The really big question is: what does that elaborate type and safety system look like? I haven't actually learned Haskell or ML or another language with a sufficiently strong, powerful type system, so I can't say. It needs to allow for types specialized on compile-time parameters, and also possibly for types specialized on previously computed and declared types. It will probably also find good use for "OR" types (where a value of type X can be, for example, any integer 0..9 OR NIL), and it will absolutely need the ability to represent function types.
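As a rough illustration of the three ingredients listed above, here is a sketch in Rust (again, not the language under discussion). All names are invented for the example. Note one limitation stated plainly: the 0..9 range is enforced only by the checked constructor `new`, not by the type itself, so this is weaker than a true range type.

```rust
/// An "OR" type: either a digit or NIL.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DigitOrNil {
    Digit(u8),
    Nil,
}

impl DigitOrNil {
    /// Checked constructor: values outside 0..=9 become Nil.
    fn new(n: u8) -> Self {
        if n <= 9 { DigitOrNil::Digit(n) } else { DigitOrNil::Nil }
    }
}

/// A type specialised on a compile-time parameter N.
struct Ring<const N: usize> {
    slots: [u8; N],
}

/// A function type written out in a signature: `fn(u8) -> u8`.
fn apply_twice(f: fn(u8) -> u8, x: u8) -> u8 {
    f(f(x))
}

fn succ(x: u8) -> u8 {
    x + 1
}

fn main() {
    // The compiler insists that both alternatives are handled.
    match DigitOrNil::new(7) {
        DigitOrNil::Digit(n) => println!("digit {}", n),
        DigitOrNil::Nil => println!("nil"),
    }
    let ring = Ring::<4> { slots: [0; 4] };
    println!("ring holds {} bytes", ring.slots.len());
    println!("{}", apply_twice(succ, 3)); // prints "5"
}
```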

But what about plain, old object types? Can tagged tuples with member functions (member functions being closures over a reference to the tuple itself, a "this pointer") replace a class-based object system with inheritance and virtual functions? I think CLOS, with its generic functions and methods, points us in a good direction here (provided the correct method and parameter types can be resolved at runtime).
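One simple approximation of "tagged tuples with member functions" is a record whose fields include function slots. The Rust sketch below passes the record explicitly as a `this` argument instead of closing over it, which sidesteps the self-reference problem but is not a true closure; `Shape`, `rect_area` and `tri_area` are hypothetical names.

```rust
/// A "tagged tuple" whose members include a function slot.
struct Shape {
    name: &'static str,
    width: u32,
    height: u32,
    /// Member function stored as data; it receives the record explicitly.
    area: fn(&Shape) -> u32,
}

fn rect_area(this: &Shape) -> u32 {
    this.width * this.height
}

fn tri_area(this: &Shape) -> u32 {
    this.width * this.height / 2
}

fn main() {
    let shapes = [
        Shape { name: "rect", width: 4, height: 3, area: rect_area },
        Shape { name: "tri", width: 4, height: 3, area: tri_area },
    ];
    // Same call site, different behaviour per record: a hand-rolled
    // stand-in for a virtual function call.
    for s in &shapes {
        println!("{}: {}", s.name, (s.area)(s));
    }
}
```

This gives per-object dispatch without inheritance; it does not attempt the CLOS-style generic functions mentioned above.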

The designers of this new language also have to look at non-local flow control. Should some kind of continuation support be included (I really don't think so!) so programmers can build their own flow control? A Java/C++/Object Pascal exception system (quite possibly)? Nothing but a goto statement (as in C)? Stack-unwinding operators as given in Common Lisp?

I think I'll go read up on Haskell's type system to start figuring out this whole type thing.

Posted: Tue Oct 16, 2007 12:15 am
by Colonel Kernel
Crazed123 wrote:Candy proposed that we should move towards better languages for writing systems software as well as user-land software, probably based on compile-time static checks rather than interpretation, JIT compiling, code scanning or run-time checks.
I think it was me, actually. With such a unique handle, you'd think I'd be mistaken for others less often... :P
Myself and a couple of others have objected that:

1) Such languages make peek-and-poke hardware banging difficult in a way that C, C++, Object Pascal, Objective C, and even unsafe C# do not.
Who should be doing the peeking and the poking? IMO, it should not be application developers. This leaves kernel developers and driver developers. With isolation based on type safety, you can remove driver developers from that group too. Take a look at "Solving the starting problem" here to get an idea of how it can be done. The basic idea is that a driver includes a manifest describing which I/O resources (ports, memory ranges, etc.) it should be given access to. The system, through a set of libraries that encapsulate access to these I/O resources (cheaply, because there are no privilege transitions), enforces these access policies. Even before that, at install time, the system can use the manifests to ensure that more than one driver doesn't try to claim the same resource (but this doesn't really have anything to do with the use of type-safe languages).
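A toy model of the manifest idea, sketched in Rust, may help. Everything here is an assumption made for illustration: `Manifest`, `conflicts` and `checked_port` are invented names, the port numbers are just examples, and no real I/O is performed. It is not the API of any actual system.

```rust
use std::ops::RangeInclusive;

/// Hypothetical driver manifest: the I/O port ranges a driver may touch.
struct Manifest {
    driver: &'static str,
    ports: Vec<RangeInclusive<u16>>,
}

impl Manifest {
    /// True if the manifest grants access to `port`.
    fn allows(&self, port: u16) -> bool {
        self.ports.iter().any(|r| r.contains(&port))
    }
}

/// Install-time check: do two manifests claim overlapping port ranges?
fn conflicts(a: &Manifest, b: &Manifest) -> bool {
    a.ports.iter().any(|ra| {
        b.ports
            .iter()
            .any(|rb| ra.start() <= rb.end() && rb.start() <= ra.end())
    })
}

/// Run-time check a trusted I/O library would make before touching hardware.
/// Here it only validates the port number; the actual access is omitted.
fn checked_port(m: &Manifest, port: u16) -> Result<u16, String> {
    if m.allows(port) {
        Ok(port)
    } else {
        Err(format!("{} may not touch port {:#x}", m.driver, port))
    }
}

fn main() {
    let serial = Manifest { driver: "serial", ports: vec![0x3F8..=0x3FF] };
    let kbd = Manifest { driver: "kbd", ports: vec![0x60..=0x60, 0x64..=0x64] };
    let rogue = Manifest { driver: "rogue", ports: vec![0x3FC..=0x400] };

    println!("serial vs kbd conflict: {}", conflicts(&serial, &kbd));
    println!("serial vs rogue conflict: {}", conflicts(&serial, &rogue));
    println!("{:?}", checked_port(&kbd, 0x64));
    println!("{:?}", checked_port(&kbd, 0x3F8));
}
```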

I still don't understand how you can object to a theoretical systems architecture because it prevents you from hacking, when no one will force you to use it. Seriously, it's bizarre.
2) Such languages tend to use garbage collection, which requires a language runtime library beyond what system hackers feel comfortable with.
If progress was measured in terms of what "system hackers" feel comfortable with, we'd still be using punch cards. I put my faith in PhDs who have done decades of research and engineers who put their ideas to the test and refine them in the real world. Hacking is fine for people who are happy with the status quo or have nothing better to do. Frankly I don't see the point in catering to such navel-gazing self-interest. How is that going to advance the state of the art?
Still, I'd like to see if someone can overcome those objections to produce a better systems language. In that spirit, I'd like to propose a few things:
I won't go through these things point-by-point... I'll just assume it's a personal wish-list and leave it at that.

I think you've largely missed the point of the idea though, which is this: The OSes of the future should be designed in such a way that the amount of code that you'd be inclined to write in such a special-purpose "systems programming language" is as small as possible. The reason is that such code is the most vulnerable part of the system. The principle of least privilege says we should not be granting things kernel-like powers unless they really, really need it.

If you really want to ponder type-safe kernel development though, you should take a look at Cyclone which provides region-based memory management as an alternative to garbage collection.

@Alboin: If you're interested in the whole "functions are objects, objects are functions" thing, you should take a look at Scala which IMO strikes a nice balance between OO and functional programming (I wouldn't try to use it for kernel dev though).

Posted: Tue Oct 16, 2007 9:19 am
by Solar
Colonel Kernel wrote:Who should be doing the peeking and the poking? IMO, it should not be application developers. This leaves kernel developers and driver developers.
Unfair assumption, IMHO. For example, many of the more involved applications (databases and graphics applications, to name two) benefit greatly from e.g. custom memory handling, implemented on top of the generic memory handling of the system. No matter how good your garbage collector or how customizable your memory management system, there will always be someone requiring that bit more performance or efficiency.
I still don't understand how you can object to a theoretical systems architecture because it prevents you from hacking, when no one will force you to use it. Seriously, it's bizarre.
This is called constructive criticism. Of course you can go ahead and build a "perfect" system, but you should actually be happy if people step up and tell you why it wouldn't be as "perfect" for others, before you learn it the hard and frustrating way at the end of a long development.

To put it in other words, I don't understand how you can object to a criticism when no one is forcing you to take it into account. (Note big smiley --> ) :wink:
If progress was measured in terms...
{Warning sounds} Leaving the ground of discussion, entering argument...
The OSes of the future should be designed in such a way that the amount of code that you'd be inclined to write in such a special-purpose "systems programming language" is as small as possible.
Who has the authority to define what "the OSes of the future", all of them, should look like?

Taking away pointers and adding garbage collection might be a good thing for some, but you are taking away a freedom to be replaced with a feature - which not everyone might be OK with.
The principle of least privilege says we should not be granting things kernel-like powers unless they really, really need it.
Uh-huh... and by designing a system in a way that it requires something to be written in a specific way (manifest, "safe" language etc.), you take away the ability to do it differently if one really, really needs it.

I am not saying that it's a bad tradeoff per se (IMHO it very much depends on the implementation), I just want to make you aware that it is actually a tradeoff.

Posted: Tue Oct 16, 2007 12:22 pm
by Candy
Crazed123 wrote:Candy proposed that we should move towards better languages for writing systems software as well as user-land software
Do you have a link to a quote from me saying that? I can't recall but then again I might be going mental.

Posted: Tue Oct 16, 2007 1:21 pm
by Combuster
Candy wrote:
Crazed123 wrote:Candy proposed that we should move towards better languages for writing systems software as well as user-land software
Do you have a link to a quote from me saying that? I can't recall but then again I might be going mental.
I thought I knew where you posted that, but it happened to be one of Solar's posts. Since Colonel Kernel claimed the idea as well, it might've been a misunderstanding.

Then again, maybe we just have to recall your "affinity" to C++0x :twisted:

Posted: Tue Oct 16, 2007 1:57 pm
by Alboin
Colonel Kernel wrote:If you really want to ponder type-safe kernel development though, you should take a look at Cyclone which provides region-based memory management as an alternative to garbage collection.
Cyclone looks really interesting. The only thing I see it missing are functional features.
Colonel Kernel wrote:@Alboin: If you're interested in the whole "functions are objects, objects are functions" thing, you should take a look at Scala which IMO strikes a nice balance between OO and functional programming (I wouldn't try to use it for kernel dev though).
I've looked at it, and found it too bulky for practical use (i.e., it needs .NET or the JVM). Hence my interest in this thread.

Posted: Tue Oct 16, 2007 8:31 pm
by Crazed123
I think it was Colonel Kernel who raised the better languages point. Forgive* me, I haven't slept much lately.

Now look, Colonel. No matter how low we put the barrier between "kernel hacking"/"device drivers" and things you can implement in your idealized safe language, someone somewhere still has to do kernel hacking and write device drivers.

My idea is that rather than condemning him to C hell while giving everyone above him a shiny new Managed Language to use, we should improve the safety, correctness and expressiveness of kernel-hacking languages to make his life easier.

I recommended a strong, expressive static type system because I think that a sufficiently strong type system can catch certain bugs. I recommend basic functional programming features because while monads, list comprehensions and folds may not appear in everyday work (and especially not in kernel hacking), anonymous functions; closures; application of arbitrary functions (apply and map type stuff); and even very primitive higher-order functions can make our kernel-hacker's life much easier.
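To show the kind of thing meant here, a small sketch in Rust: an anonymous function that copies its environment by value, a type-checked `apply`, and a primitive `map` over a fixed-size array, none of which needs a heap or a collector. `apply` and `map4` are names invented for this example.

```rust
/// Type-checked "apply": call any function-like value on one argument.
fn apply<A, B>(f: impl Fn(A) -> B, x: A) -> B {
    f(x)
}

/// A primitive higher-order function: map over a fixed-size array.
/// No heap allocation is involved.
fn map4(f: impl Fn(u32) -> u32, xs: [u32; 4]) -> [u32; 4] {
    [f(xs[0]), f(xs[1]), f(xs[2]), f(xs[3])]
}

fn main() {
    let page_size = 4096u32;
    // `move` copies page_size into the closure environment by value,
    // so the closure does not refer back to the local variable.
    let round_up = move |addr: u32| (addr + page_size - 1) / page_size * page_size;

    println!("{}", apply(&round_up, 5000)); // prints "8192"
    println!("{:?}", map4(&round_up, [0, 1, 4096, 4097])); // prints "[0, 4096, 4096, 8192]"
}
```

The closure environment here is a single `u32` stored inline in the closure value, so its lifetime is that of an ordinary local variable.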

Now, I've got some more ideas to add. First of all, I think Dylan-style syntax will probably work well, since it will also give us some starting points for a macro system. A macro system will allow neat little constructs like, for example, declaring a variable and naming it as a GC root (to a C-style library-based GC) in a macro so that the programmer can do that with one statement.

I don't know exactly about whether this language should have a class/object system, but I'd like to see something like Common Lisp's multimethods (though with single inheritance only).

And I've learned a bit about Haskell's type system, and I think that if we start with the C primitive types as primitive types, add tagged tuples, add functions, and finally possibly add some kind of classes that one can specialize (for the generic function system), then throw it all into a Haskell-like type system, we'll have something good going. Haskell's | type composition operator can serve as a way of creating enums or unions (depending on what we give it).

Does anyone have any ideas for how to create a strongly-typed safe(r) reference type?

* -- See? I actually forgot the "me" in "forgive me".

Posted: Tue Oct 16, 2007 9:03 pm
by Alboin
Crazed123 wrote:Does anyone have any ideas for how to create a strongly-typed safe(r) reference type?
Since a reference is guaranteed to be the data itself rather than a copy of it, it would make sense for a reference to work only with data of its underlying (unreferenced) type, or with other references of the same type. Also, just like in C++, I don't think they should be able to be reassigned or changed.

Posted: Tue Oct 16, 2007 10:26 pm
by Crazed123
Why shouldn't you be able to reassign or change references? That always seemed a frustrating limitation of C++ to me.

Edit: The fact that references are strongly typed as to what they can reference is a given.
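For what it's worth, both properties can coexist. The Rust sketch below (Rust being a later language, used here purely as an illustration) has references that are strongly typed yet can be reseated; the compiler additionally checks that a reference never outlives what it points at. `pick` is a made-up helper.

```rust
/// Returns one of two references; the local reference `r` is reseated
/// inside the function. The lifetime 'a ties the result to both inputs.
fn pick<'a>(first: &'a u32, second: &'a u32, use_second: bool) -> &'a u32 {
    let mut r: &u32 = first;
    if use_second {
        r = second; // reassigning the reference, not the data it points to
    }
    r
}

fn main() {
    let a = 1u32;
    let b = 2u32;

    let mut r: &u32 = &a;
    println!("{}", r); // prints "1"
    r = &b;
    println!("{}", r); // prints "2"

    println!("{}", pick(&a, &b, true)); // prints "2"

    // The line below would be rejected at compile time, because a
    // reference to u32 cannot be pointed at an f32:
    // let f = 3.0f32; r = &f;
}
```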