OS development languages
Posted: Tue Jul 04, 2006 3:02 pm
by mystran
What would I need to change in say Python, Visual Basic or Prolog to be able to program an OS with it?
I was doing some prototyping, writing servers for my operating system, and it occurred to me that my design doesn't just make programming easier with applicative languages like Scheme, but actually makes it rather painful to use anything without at least higher-order functions (with proper closures).
Since I've done some language design and implementation in the past, and I know there's been a bit of interest at some point, I thought I'd revisit the question. So what are the crucial features a language needs for an OS to be written in it?
I can immediately think of some things:
- ability to dereference addresses (to access arbitrary memory)
- should be able to manipulate raw data
- shouldn't require a garbage collector (for basic operation at least)
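For reference, this is roughly what the first two requirements look like in C. It is only a minimal sketch: `peek32` and `poke32` are hypothetical helper names, and a real kernel would also have to care about access width, alignment and ordering for the device in question.

```c
#include <stdint.h>

/* Hypothetical helpers: read and write a 32-bit word at a raw address.
   `volatile` stops the compiler from caching or eliding the access,
   which matters for memory-mapped device registers. */
static inline uint32_t peek32(uintptr_t addr)
{
    return *(volatile uint32_t *)addr;
}

static inline void poke32(uintptr_t addr, uint32_t value)
{
    *(volatile uint32_t *)addr = value;
}
```

Any language that can express these two operations, whether as syntax or as built-in primitives, covers the "dereference an address" requirement.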
But what else would one normally expect?
I mean, if I have a Turing-complete language Foo that can manipulate raw data at a given memory address (by using primitives?) and that is able to implement its own memory allocator, then do I have a theoretically acceptable language for OS implementation?
Being able to implement a memory allocator implies running at least some subset of the language without implicit allocations (plus raw memory access), but I guess there are some other basic requirements, right?
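To make "a subset without implicit allocations" concrete, here is a sketch in C of a fixed-size block allocator that lives entirely in a statically reserved array. The names (`pool_init`, `pool_alloc`, `pool_free`) and the sizes are made up for illustration; the point is only that nothing in it calls into a runtime.

```c
#include <stddef.h>

#define BLOCK_SIZE  64   /* bytes per block (assumed) */
#define BLOCK_COUNT 16   /* blocks in the static pool (assumed) */

/* A free block stores the link to the next free block inside itself,
   so the allocator needs no memory beyond the pool. */
union block {
    union block *next;
    unsigned char bytes[BLOCK_SIZE];
};

static union block pool[BLOCK_COUNT];
static union block *free_list;

/* Thread every block onto the free list. Call once at boot. */
void pool_init(void)
{
    free_list = NULL;
    for (size_t i = 0; i < BLOCK_COUNT; i++) {
        pool[i].next = free_list;
        free_list = &pool[i];
    }
}

/* Pop one block, or return NULL when the pool is exhausted. */
void *pool_alloc(void)
{
    union block *b = free_list;
    if (b != NULL)
        free_list = b->next;
    return b;
}

/* Push a block obtained from pool_alloc back onto the free list. */
void pool_free(void *p)
{
    union block *b = p;
    b->next = free_list;
    free_list = b;
}
```

A language whose compiler can emit code like this (plain loads, stores and calls, no hidden heap use) can bootstrap its own memory management.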
I think inline assembly would be a "nice to have" feature, but can't be considered as essential as long as some interfacing with assembler code is possible, right?
Thoughts?
Re:OS development languages
Posted: Tue Jul 04, 2006 3:44 pm
by Kemp
I personally take the view of using the most applicable language for the type of system you are developing for. x86 and the like tend to be easier to develop for with languages like C; then there are stack-based machines (suggest a language, I'm not familiar with these architectures and may have got the name wrong), machines designed so that LISP runs a lot faster and easier, etc. Of course, you can use what you want as long as it fulfils the basic requirements you listed in your post; you may have to write your own libraries for it though, or write a tool to process the output of the compiler, or even write your own compiler.
Re:OS development languages
Posted: Tue Jul 04, 2006 8:52 pm
by Colonel Kernel
I only have a somewhat mundane example of what you're looking for -- the "unsafe" subset of C#.
mystran wrote:ability to dereference addresses (to access arbitrary memory)
Any block of code marked with "unsafe" in C# can use pointers, which follow all the rules of C pointers (i.e. -- they can be cast, assigned to arbitrary values, manipulated with arithmetic, etc.).
- should be able to manipulate raw data
See above.
- shouldn't require a garbage collector (for basic operation at least)
Do you consider dynamic memory allocation to be a basic operation? If not, then unsafe C# fits this requirement. It has a "fixed" block that pins a block of GC'd memory so that it can be safely manipulated with pointers within the scope of the fixed block. I don't think there is a "new" operator that can allocate non-GC memory (except for the Exchange Heap allocator in Singularity, but that's not part of "standard" C#).
I'm not sure if this helps you much, since you're looking at languages with an execution model that can be radically different from C#...
Re:OS development languages
Posted: Tue Jul 04, 2006 10:02 pm
by proxy
you will also need the ability to write to I/O ports, but generally i would simply say you need:
- direct memory access
- i/o port access
things like GC and the like may require runtime components, but that's not gonna stop you from being _able_ to use the language for OS dev, it just makes it more inconvenient.
proxy
Re:OS development languages
Posted: Wed Jul 05, 2006 12:27 am
by Solar
Re:OS development languages
Posted: Wed Jul 05, 2006 12:34 am
by marcio.f
mystran wrote:I can immediately think of some things:
- ability to dereference addresses (to access arbitrary memory)
- should be able to manipulate raw data
- shouldn't require a garbage collector (for basic operation at least)
(...)
I think inline assembly would be a "nice to have" feature, but can't be considered as essential as long as some interfacing with assembler code is possible, right?
The D programming language supports all of that. It's even possible to not write any .asm files at all. (Note that writing inline assembly is still required, e.g. for IO, but is far easier than doing the same in C/C++.)
Re:OS development languages
Posted: Wed Jul 05, 2006 3:26 am
by mystran
Yeah I know of that, but I thought I'd bring the subject up anyway, because it's been a while since the last meaningful discussion about the subject.
As it is, I'm not necessarily looking for an existing language at all. I'm thinking more about what I need to design into a language, if I write a new one... and I actually do have a prototype (actually several) compiler up and running, and I'm thinking what I have to take care of in codegen.
I know of the Singularity/C# which is actually pretty nice, but I'm looking for even more expressive languages.
It seems there are some implementation choices even in compiled code that are fundamentally incompatible with the idea of avoiding a large runtime. For example, the only ways to implement first-class continuations without implicitly allocating memory for each non-tail call would be those that can use the stack until a continuation is captured, so that memory management is required only when using call/cc or equivalent.
General closures and things like passing "rest" arguments as lists (like modern Lisp/Scheme does at least observably) require allocation as well, but since neither is necessary for writing a memory manager, it should be sufficient to compile code in such a way that it's possible to avoid any implicit allocation.
Of course, one strategy for languages that can't live without a GC (say lazy languages maybe?) would be to statically allocate some region, and write a simple GC in assembler (or another less expressive language), but that tastes a bit like cheating to me. Alternatively, if the GC itself can run in bounded space, then it would be sufficient to have enough memory allocated that the system can survive until the first GC.
Nobody wants to manually manage closures. Try it, it's not nice.
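For anyone who hasn't tried it, this is roughly what a hand-managed closure looks like in C. It is a sketch with hypothetical names (`struct closure`, `make_adder`, `map_in_place`); the environment record that a compiler would normally build and allocate implicitly is spelled out, and its storage is the caller's problem.

```c
#include <stddef.h>

/* A closure is a code pointer plus the variables it captured. */
struct closure {
    int (*code)(void *env, int arg);
    void *env;
};

/* Environment for an "add n" closure; in a language with closures
   the compiler would build this record for us. */
struct adder_env {
    int n;
};

static int adder_code(void *env, int arg)
{
    struct adder_env *e = env;
    return e->n + arg;
}

/* The caller supplies the storage for the environment, so nothing
   is allocated implicitly; the caller must also keep it alive. */
struct closure make_adder(struct adder_env *storage, int n)
{
    storage->n = n;
    struct closure c = { adder_code, storage };
    return c;
}

static int call(struct closure c, int arg)
{
    return c.code(c.env, arg);
}

/* A higher-order function: apply a closure to every element in place. */
void map_in_place(struct closure f, int *xs, size_t len)
{
    for (size_t i = 0; i < len; i++)
        xs[i] = call(f, xs[i]);
}
```

Every captured variable needs a field, every closure needs someone to decide where its environment lives and for how long, which is exactly the bookkeeping a GC or an escape analysis would otherwise do.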
Anyway, thanks for input people, and any other comments welcome.
Re:OS development languages
Posted: Wed Jul 05, 2006 4:00 am
by Combuster
mystran wrote:
I can immediately think of some things:
- ability to dereference addresses (to access arbitrary memory)
- should be able to manipulate raw data
- shouldn't require a garbage collector (for basic operation at least)
Not that anybody would expect it, but I've experimented a bit with FreeBASIC and it seems to comply with the statements above (though I needed to use GNU ld to fix up the output).
Just make sure you stay away from strings, as they call the runtime string allocator (shame, it's the main reason I like BASIC over C).
Intel-syntax inline assembly is included, so you can skip the asm file if you try hard enough.
I'll probably write a rip-off runtime and have fun writing drivers in BASIC ( primer, probably ;D ), that is, if i ever get this ELF loader to work.....
Re:OS development languages
Posted: Fri Jul 14, 2006 1:04 pm
by kernel64
It's possible to use any language that can access raw memory, provided the compiler sources are available in case you need to hack in extensions (like accepting __asm { ... } and copying it verbatim to the output file, if the compiler generates assembler) and to make the OS-dependent library depend on your OS instead of someone else's.
I know it's possible to write a kernel in Pascal for example, and probably BASIC as well. So long as you paid attention to the calling convention and hacked up the runtime libraries, it's possible.
I have always wanted a nicer language for OS development, but I have no skills in compiler development. Personally I find Pascal code eminently readable and pleasurable. Given that GCC can be instructed to use a standalone environment, and that there's a GCC pascal compiler available, in theory it should be possible to write a runtime library for my OS with a few assembler files and implement the rest in Pascal.
Pointer notation is a pain in Pascal however, and that's something that I'd change if I had the skills to work on a Pascal compiler.
Generating good object code is challenging in the extreme, as I found out by following Jack Crenshaw's Let's Write a Compiler! tutorial.
I wanted to write a Pascal-like language for OS development, but gave up when I realised I needed to spend a couple of years researching and experimenting at the very least to get a decent compiler happening.
Re:OS development languages
Posted: Fri Jul 21, 2006 8:30 am
by kernel64
Keep an eye on this thread for my attempt at a hello, world kernel written in Pascal. ;D
I'll have to use GNU Pascal, since to my knowledge there's no other Pascal capable. But I don't think that's a hindrance at all.
Re:OS development languages
Posted: Fri Jul 21, 2006 10:28 am
by nick8325
Well, there's BitC, which looks like it might be useful (it's going to be the implementation language for Coyotos), but it doesn't seem to be working yet...
I think being able to run without a garbage collector is the most important thing - you could always add raw memory operations as built-in functions (although it might be trickier to make that type-safe).
The MLKit has region-based memory management. Instead of having one heap, it divides the heap into regions. The compiler allocates memory inside regions, but frees entire regions at a time (i.e. it never frees a single value in a region). The compiler can infer region information (it can decide what region a value should be in, and when to allocate and free regions, at compile-time) so there's no need for a garbage collector (but I think it still needs dynamic memory allocation and freeing, for the regions).
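The runtime side of regions is small. Here is a sketch of it in C, with the region decisions made by hand instead of inferred by a compiler as in the MLKit; the names (`struct region`, `region_alloc`, `region_free_all`) are hypothetical, and the backing memory is assumed to be suitably aligned.

```c
#include <stddef.h>

/* A region is a bump pointer over caller-provided memory. */
struct region {
    unsigned char *base;   /* assumed aligned for any object type */
    size_t size;
    size_t used;
};

void region_init(struct region *r, void *memory, size_t size)
{
    r->base = memory;
    r->size = size;
    r->used = 0;
}

/* Allocate `n` bytes inside the region, or return NULL if it is full.
   Individual values are never freed. */
void *region_alloc(struct region *r, size_t n)
{
    const size_t align = _Alignof(max_align_t);
    size_t start = (r->used + align - 1) / align * align;
    if (start > r->size || n > r->size - start)
        return NULL;
    r->used = start + n;
    return r->base + start;
}

/* Free every value in the region at once. */
void region_free_all(struct region *r)
{
    r->used = 0;
}
```

What region inference adds on top is the proof that no pointer into a region is still live when `region_free_all` runs; in C that remains the programmer's responsibility.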
There's a paper which gives a language which can define functions on lists without needing memory allocation. In most functional languages, you might have
Code:
cons : (a, [a]) -> [a]
uncons : [a] -> (a, [a])
(i.e. cons takes the head and tail of a list and puts them together, uncons takes a list and returns its head and tail.)
In the paper, they introduce resource types, which are pieces of memory. You have to pass cons a resource (i.e. an address for the new list to use), and the type system ensures that each resource is only used once (i.e. you can't make two lists using the same address - this is called linear typing). So it has:
Code:
cons : (resource, a, [a]) -> [a]
uncons : [a] -> (resource, a, [a])
uncons returns the resource so that once you've called it you're allowed to re-use the memory used by the list node. You can write functions like map without needing to allocate memory (since nothing in the language will need to allocate memory itself).
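A rough C rendering of the same idea, as a sketch only: C has no linear types, so the "each resource is used exactly once" rule is a convention here and not something the compiler checks. The names mirror the signatures above; `twice` is just an example function to map.

```c
#include <stddef.h>

/* A list node doubles as the "resource": a piece of memory that can
   hold exactly one cons cell. */
struct node {
    int head;
    struct node *tail;
};

/* cons : (resource, a, [a]) -> [a]
   Builds a cell in the memory the caller hands over. */
struct node *cons(struct node *resource, int head, struct node *tail)
{
    resource->head = head;
    resource->tail = tail;
    return resource;
}

/* uncons : [a] -> (resource, a, [a])
   Takes a non-empty list apart and hands its cell back for reuse. */
struct node *uncons(struct node *list, int *head, struct node **tail)
{
    *head = list->head;
    *tail = list->tail;
    return list;
}

/* map without allocation: each cell released by uncons is reused
   by cons for the corresponding cell of the result. */
struct node *map(int (*f)(int), struct node *list)
{
    if (list == NULL)
        return NULL;
    int head;
    struct node *tail;
    struct node *resource = uncons(list, &head, &tail);
    return cons(resource, f(head), map(f, tail));
}

/* Example function to map over a list. */
int twice(int x)
{
    return 2 * x;
}
```

The result of `map` occupies exactly the cells of its input, so the input list must not be used afterwards; that is the property the linear type system in the paper enforces statically.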
I'm sure there are other ways of doing it but those are the only ones I've seen...
Re:OS development languages
Posted: Fri Jul 21, 2006 12:28 pm
by NoItAll
kernel64 wrote:
Keep an eye on this thread for my attempt at a hello, world kernel written in Pascal. ;D
I'll have to use GNU Pascal, since to my knowledge there's no other Pascal capable. But I don't think that's a hindrance at all.
What about Turbo Pascal or Free Pascal?
Re:OS development languages
Posted: Fri Jul 21, 2006 2:19 pm
by paulbarker
I have been thinking about this a fair bit, not designing a language for kernel programming but designing a language for applications programming. I'm sticking to C for my kernel. But here are a few points that may be useful:
- 'Naked' function attribute, meaning don't generate prolog/epilog. This, along with a good inline assembly syntax, means you won't need any asm source files.
- Support for types that match the processor. Think about adding vector types and the like as the language matures.
- Non-mangled function names. Even if you support overloading, you'll save a lot of time if a function named 'main' in your language is named 'main' (or maybe '_main') as seen from other languages like C and asm.
- Interoperability with C. Not all people will like your language, no matter how good it is. Also consider that you might one day need to include existing drivers written in C.
- Extensibility. You don't want a kernel written in language X, system programs in Y, user applications in Z and so on if it's not necessary. I like the idea of a language with 'unmanaged' mode for systems programming and 'managed' mode for applications programming (with garbage collection and the like).
- Separation of language and library. Eg. is 'print()' a keyword or a library function?
And lastly, the most important property a language needs for kernel programming: predictability.
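As a small illustration of the "types that match the processor" and predictability points, here is a C sketch that describes an 8-byte x86 segment descriptor with fixed-width types and checks its size at compile time. `struct gdt_entry` and `gdt_make` are hypothetical names chosen for this example.

```c
#include <stdint.h>

/* One 8-byte x86 segment descriptor, laid out field by field. */
struct gdt_entry {
    uint16_t limit_low;    /* limit bits 0..15 */
    uint16_t base_low;     /* base bits 0..15 */
    uint8_t  base_mid;     /* base bits 16..23 */
    uint8_t  access;       /* present bit, privilege level, type */
    uint8_t  granularity;  /* limit bits 16..19 (low nibble), flags (high nibble) */
    uint8_t  base_high;    /* base bits 24..31 */
};

/* Fail the build if the compiler's layout is not what the CPU expects. */
_Static_assert(sizeof(struct gdt_entry) == 8, "descriptor must be 8 bytes");

/* Split a base, a 20-bit limit and the flag bits into descriptor fields. */
struct gdt_entry gdt_make(uint32_t base, uint32_t limit,
                          uint8_t access, uint8_t flags)
{
    struct gdt_entry e;
    e.limit_low   = (uint16_t)(limit & 0xFFFF);
    e.base_low    = (uint16_t)(base & 0xFFFF);
    e.base_mid    = (uint8_t)((base >> 16) & 0xFF);
    e.access      = access;
    e.granularity = (uint8_t)(((limit >> 16) & 0x0F) | ((flags & 0x0F) << 4));
    e.base_high   = (uint8_t)((base >> 24) & 0xFF);
    return e;
}
```

A language for kernel work needs some equivalent of both halves: integer types of exact width, and a way to state (and have checked) the in-memory layout of a record.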
Re:OS development languages
Posted: Fri Jul 21, 2006 9:09 pm
by kernel64
NoItAll wrote:
What about Turbo Pascal or Free Pascal?
Not sure, haven't researched those. GNU Pascal is the most capable and most suitable as far as I know. GNU Pascal also uses the C calling conventions, and can call varargs C functions. It also supports inline assembler.
Re:OS development languages
Posted: Sat Jul 22, 2006 1:56 am
by marcio.f
paulbarker wrote:I have been thinking about this a fair bit, not designing a language for kernel programming but designing a language for applications programming. I'm sticking to C for my kernel. But here are a few points that may be useful:
Take a look at the D programming language.
paulbarker wrote:- 'Naked' function attribute, meaning don't generate prolog/epilog. This, along with a good inline assembly syntax, means you won't need any asm source files.
From D x86 Inline Assembler:
naked: Causes the compiler to not generate the function prolog and epilog sequences. This means such is the responsibility of inline assembly programmer, and is normally used when the entire function is to be written in assembler.
paulbarker wrote:- Support for types that match the processor. Think about adding vector types and the like as the language matures.
It does, from Types:
void, bool, byte, ubyte, int, uint, float, double, char, pointers, arrays, just to name a few.
paulbarker wrote:- Non-mangled function names. Even if you support overloading, you'll save a lot of time if a function named 'main' in your language is named 'main' (or maybe '_main') as seen from other languages like C and asm.
No problem, from Linkage Attribute:
D provides an easy way to call C functions and operating system API functions, as compatibility with both is essential.
E.g.: extern (C) void main() // GRUB can call it directly
paulbarker wrote:- Interoperability with C. Not all people will like your language, no matter how good it is. Also consider that you might one day need to include existing drivers written in C.
Interfacing to C
paulbarker wrote:- Extensibility. You don't want a kernel written in language X, system programs in Y, user applications in Z and so on if it's not necessary. I like the idea of a language with 'unmanaged' mode for systems programming and 'managed' mode for applications programming (with garbage collection and the like).
Quoting again:
D is statically typed, and compiles direct to native code. It's multiparadigm: supporting imperative, object oriented, and template metaprogramming styles. It's a member of the C syntax family, and its look and feel is very close to C++'s.
Major features of D:
- Object Oriented Programming: classes, operator overloading
- Modules
- Templates
- Built-in associative arrays (hashmaps)
- Nested functions
- Function literals
- Closures
- Automatic (by means of Garbage Collection) or explicit memory management
- Inline assembler
- Contracts
- Exceptions
- etc...