Choosing the right language for kernel development

Questions about which tools to use, bugs, the best way to implement a function, etc. should go here. Don't forget to see if your question is answered in the wiki first! When in doubt post here.
Love4Boobies
Member
Posts: 2111
Joined: Fri Mar 07, 2008 5:36 pm
Location: Bucharest, Romania

Choosing the right language for kernel development

Post by Love4Boobies »

Hi.

There are probably many threads on this, but I failed to come across them. There are low-level, middle-level, and high-level languages (and probably some that fall somewhere in between). The lower the level, the easier it is to get access to what you need; the higher the level, the easier it is to program (and maintain) in a bug-free way. So which do you think is the right choice?

I'm not doing a poll here, I'm looking for serious reasons. I'm well aware of projects like Singularity and SharpOS. Do you think C makes a better choice? C++? Java? (If your favorite language isn't in the C family, just pick the level it fits in).

I'm not very experienced with C#, for instance, but I thought about making a kernel in C# and translating the bytecode from CIL to x86 (NOT at run-time). That would mean it's safe, yet applications would still use normal hardware protection, as in a monolithic kernel. Sort of like a safe monolithic kernel. Someone just looking at the code produced would say that it's monolithic, but it would actually be a microkernel. What do you think about this?

Does C# require a lot of setting up?
"Computers in the future may weigh no more than 1.5 tons.", Popular Mechanics (1949)
[ Project UDI ]
Hery
Member
Posts: 65
Joined: Sat Dec 04, 2004 12:00 am

Re: Choosing the right language for kernel development

Post by Hery »

You really won't get any noticeable benefit from using C# instead of any other language. I don't think it is a better or worse solution, although using C# compiled directly to x86 for a kernel is a bit strange.

C# doesn't give you anything you can't do in other languages; it's just a way of writing your code. Solutions such as smart pointers make any language "safe" in the same way in which Java, C#, Python, etc. are safe. However, there are some parts of a kernel that can't be safe in those languages, though you can still use smart pointers there.
Love4Boobies wrote: Someone just looking at the code produced would say that it's monolithic, but it would actually be a microkernel.
Sorry, but I didn't get the point of that. Using C# wouldn't make your kernel a microkernel.

To conclude, there are no wrong or right languages for OS dev. Some are used more often, some less, but innovation is always desired :D
SirStorm25
Posts: 11
Joined: Sun Jul 20, 2008 4:06 pm

Re: Choosing the right language for kernel development

Post by SirStorm25 »

Well, with the operating system I'm working on, we're using Assembly and Pascal. The two languages work quite well together, as you can mix Asm into functions and procedures. If you can read and write (translate between) C and Pascal, then with a little Asm you could easily create an OS in Pascal.

However, if you want to use C, most tutorials on this website use it, which is a great start and helps with further development of your OS. You could also create an OS purely in assembly, which might require a bit of studying, but once you've learnt it, it becomes pretty easy to develop an OS with!

Hope I've Helped!
Regards

SirStorm25. :D
010010000110100100100000010101000110100001100101011100100110010100100001
Can you read binary???

Current project I'm involved in:

The Free Pascal Operating System Project :D
bewing
Member
Posts: 1401
Joined: Wed Feb 07, 2007 1:45 pm
Location: Eugene, OR, US

Re: Choosing the right language for kernel development

Post by bewing »

The way I see it, when you take even one step away from Assembly, you gain one potentially big advantage: portability.
If you code in ASM, you gain big piles of efficiency improvements.
(In fact, you can also capture some of each benefit by intentionally switching languages at the end of the project -- either hand-assembling your finished HLL code, or recoding your ASM in an HLL.)
It is very true that you can code faster in an HLL. From my experience, though, I disagree about the "fewer bugs" thing.

So, it really makes a difference according to your priority list. You need to decide in advance the priority of the following:
1. How important is code portability?
2. How important is system efficiency?
3. How soon do you need a functional product?

In my case, I don't care at all about #1, #2 is a HUGE priority for me, and #3 is a moderate priority. So I'm coding the main part of the kernel in ASM, and the rest in C.
Colonel Kernel
Member
Posts: 1437
Joined: Tue Oct 17, 2006 6:06 pm
Location: Vancouver, BC, Canada

Re: Choosing the right language for kernel development

Post by Colonel Kernel »

Hery wrote: You really won't get any noticeable benefit from using C# instead of any other language. I don't think it is a better or worse solution, although using C# compiled directly to x86 for a kernel is a bit strange.
Why is it strange? Singularity does it. So does any .NET app that's been run through ngen. It's just code in the end.
Hery wrote: C# doesn't give you anything you can't do in other languages; it's just a way of writing your code. Solutions such as smart pointers make any language "safe" in the same way in which Java, C#, Python, etc. are safe.
No, that's not what "safe" means in this context. "Safe" means that you can't stomp all over memory just by having made a typo. There is no pointer arithmetic allowed in C# (except in blocks marked as "unsafe"), and no casting other than simple numerical type conversion and class hierarchy up/down casts, which can be checked at run-time. That's what makes C# "safe".

All smart pointers do is avoid memory leaks by automating reference counting, which is what the garbage collector is there to do for C# programmers. This is actually the elephant in the room when talking about writing a kernel in C# -- How are you going to implement a highly concurrent real-time garbage collector in your kernel? The Singularity guys have done it because they probably all have PhDs. ;)
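
To make that concrete, here is a minimal sketch of such a reference-counted smart pointer (an illustration only, not code from any of the projects discussed):

Code: Select all

// A minimal reference-counted smart pointer -- an illustrative sketch,
// not code from any real kernel.
template <typename T>
class RefPtr {
public:
    explicit RefPtr(T* p = nullptr) : ptr_(p), count_(p ? new long(1) : nullptr) {}
    RefPtr(const RefPtr& other) : ptr_(other.ptr_), count_(other.count_) {
        if (count_) ++*count_;          // one more owner
    }
    RefPtr& operator=(RefPtr other) {   // copy-and-swap handles self-assignment
        swap(other);
        return *this;
    }
    ~RefPtr() { release(); }
    T* operator->() const { return ptr_; }
    T& operator*()  const { return *ptr_; }
private:
    void release() {
        if (count_ && --*count_ == 0) { // last owner frees the object
            delete ptr_;
            delete count_;
        }
    }
    void swap(RefPtr& other) {
        T* p = ptr_;      ptr_ = other.ptr_;     other.ptr_ = p;
        long* c = count_; count_ = other.count_; other.count_ = c;
    }
    T*    ptr_;
    long* count_;
};

Note that this only automates cleanup: nothing stops code from also holding a raw T* and using it after the count reaches zero, which is exactly why it falls short of "safety" in the C# sense.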
Love4Boobies wrote: Someone just looking at the code produced would say that it's monolithic, but it would actually be a microkernel.
Hery wrote: Sorry, but I didn't get the point of that. Using C# wouldn't make your kernel a microkernel.
I agree. The one does not imply the other. Singularity is a microkernel because it is structured as a set of co-operating processes that communicate via message passing. The fact that the processes are isolated via type-safe languages (C#, Sing#) and code verification instead of (or in addition to) an MMU is orthogonal to the way the system is structured. The Singularity kernel certainly could have been written in any language and still implement a system using software-isolated processes.

As for my reasons about what language to choose, I would say it depends a lot on what your goals are. If your goal is to learn x86 asm, then use that. If it's to learn C#, try doing regular app development in C# first. :) In my case, I chose C because I wanted portability and at least a bit of abstraction, while still having a lot of control over the libraries used in my OS. I chose C over C++ because I wasn't interested in implementing a lot of run-time support for the language. My goal was to learn about kernel implementation issues, not language run-time implementation issues.
Top three reasons why my OS project died:
  1. Too much overtime at work
  2. Got married
  3. My brain got stuck in an infinite loop while trying to design the memory manager
Don't let this happen to you!
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: Choosing the right language for kernel development

Post by Brendan »

Hi,

The idea that HLL code is more portable can be a complete misconception when you're doing very low level stuff. For example, imagine you write HLL code to handle 80x86 paging structures, and you compile this code on a completely different architecture - will it work? Of course not - a different architecture will have different paging structures, and you'll need to write new code to handle the new architecture's paging structures. In this case using a HLL doesn't make it any more portable. Things like CPU detection and configuration, task switching code and exception handling are all "architecture specific" regardless of which language you use, and so is some of the rest of the hardware (PIC, PIT, RTC, APICs, etc) because other architectures have different hardware.
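
For illustration, here is the kind of inherently architecture-specific code in question -- a sketch of 32-bit (non-PAE) 80x86 page directory entries, where the bit layout is dictated by the CPU itself, so there is nothing for a HLL to make portable:

Code: Select all

#include <stdint.h>

// 32-bit 80x86 page directory entry (non-PAE). The bit layout is fixed
// by the CPU, so recompiling this for another architecture is meaningless.
typedef uint32_t pde_t;

enum {
    PDE_PRESENT  = 1u << 0,  // the page table is present in memory
    PDE_WRITABLE = 1u << 1,  // writes are allowed
    PDE_USER     = 1u << 2,  // accessible from user mode
};

// Build a PDE that points at a 4 KiB-aligned page table.
static inline pde_t make_pde(uint32_t table_phys, uint32_t flags) {
    return (table_phys & 0xFFFFF000u) | (flags & 0xFFFu);
}

// The PDE index for a linear address is its top 10 bits.
static inline uint32_t pde_index(uint32_t linear) {
    return linear >> 22;
}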

What you could do is write architecture dependent code for each architecture and then build abstraction layer/s on top of that (either one huge abstraction layer or lots of small abstraction layers). In this case you can have portable code on top of the abstraction layers, and then hope that the compiler will be able to remove all the inefficiency that this introduces. However, for something like a micro-kernel there wouldn't be much portable code on top of the abstraction layers, so you'd end up with extra work designing and writing abstraction layers, and extra inefficiency caused by the abstraction layers, with no real benefit.
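
Such an abstraction layer might look like the following sketch (every name in it is invented for illustration):

Code: Select all

#include <stdint.h>

// A hypothetical arch abstraction layer (all names invented for this
// sketch). Portable code calls these declarations; arch/x86/, arch/ppc/
// etc. each provide their own implementation, and only that code ever
// touches real hardware.
namespace arch {

struct TaskState;  // opaque; defined differently per architecture

// Map one physical page at a virtual address in the current address space.
bool map_page(uintptr_t virt, uintptr_t phys, bool writable);

// Save one task's registers and load another's; the guts are necessarily asm.
void switch_task(TaskState* from, TaskState* to);

}  // namespace arch

For a micro-kernel, nearly all of the interesting code sits below these declarations, which is why the layer buys so little here.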

Applications can be portable, so there's a good reason to use a high level language for them. Some device drivers can also be portable (e.g. drivers for PCI and USB devices, as these devices can be used in different architectures) so it can make sense to use a high level language for them too. Other devices don't exist in other architectures though - for example, why bother writing a portable PS/2 keyboard and mouse driver when 80x86 is the only architecture that has PS/2 keyboards and PS/2 mice?

One approach would be to write micro-kernels in pure assembly language (because efficiency matters and portability doesn't really exist anyway), and then write some device drivers in assembly, and write everything else in a portable high level language. That way porting the OS would involve writing a new micro-kernel and some device drivers (that need to be written anyway) and then porting everything else.

The other thing people often don't consider is the tools themselves. Once your OS is working nicely, how much fun is it going to be trying to port GCC (or any other compiler) to your OS? If you're writing "yet another boring Unix clone" (that nobody will ever use because there's plenty of good established alternatives) then it'd be easy to port GCC to your OS. In my case, the OS uses asynchronous messaging for everything, has different linear address space characteristics and it doesn't support file I/O in the same way that most OSs do. Porting something like GCC would be a massive problem - it'd be easier for me to write my own compiler; but it'd be even easier if I didn't use any high level languages and only needed to write my own assembler/s...


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
quok
Member
Posts: 490
Joined: Wed Oct 18, 2006 10:43 pm
Location: Kansas City, KS, USA

Re: Choosing the right language for kernel development

Post by quok »

Colonel Kernel wrote:
Love4Boobies wrote: Someone just looking at the code produced would say that it's monolithic, but it would actually be a microkernel.
Hery wrote: Sorry, but I didn't get the point of that. Using C# wouldn't make your kernel a microkernel.
I agree. The one does not imply the other. Singularity is a microkernel because it is structured as a set of co-operating processes that communicate via message passing. The fact that the processes are isolated via type-safe languages (C#, Sing#) and code verification instead of (or in addition to) an MMU is orthogonal to the way the system is structured. The Singularity kernel certainly could have been written in any language and still implement a system using software-isolated processes.
By that definition of a microkernel, Windows NT and its derivatives qualify. NT is made up of a bunch of co-operating processes all communicating via message passing; however, most of them run in kernel space (which, according to some people, makes Windows NT more of a monolithic kernel). At least that's my understanding after reading the Wikipedia article on the Windows NT architecture. That said, another Wikipedia article on Windows NT says that NT 3.1 was a more traditional microkernel, and that as new versions were released, more and more processes were moved into kernel space.
Colonel Kernel wrote: As for my reasons about what language to choose, I would say it depends a lot on what your goals are. If your goal is to learn x86 asm, then use that. If it's to learn C#, try doing regular app development in C# first. :) In my case, I chose C because I wanted portability and at least a bit of abstraction, while still having a lot of control over the libraries used in my OS. I chose C over C++ because I wasn't interested in implementing a lot of run-time support for the language. My goal was to learn about kernel implementation issues, not language run-time implementation issues.
My kernel is using C and asm as well. I struggled a bit with deciding which languages to use, but I went for this more traditional model for pretty much the same reasons: I wanted some portability and as much control as possible, without having to bother with implementing a lot of run-time support. Also, I had to learn asm as it was, and learning one language was enough. I do know C++ and had a couple of stints as a C++ developer, but I'm more familiar with C, so I stuck with that. No matter what language one uses for a kernel project, though, I would recommend knowing C and asm (and being capable of reading and translating between both AT&T and Intel syntaxes), if only because so many tutorials and so much sample/reference code are written in these languages. Since my project is open source, using this same combination would also allow me to copy bits of code from other projects like OpenBSD if I had the need (but then I wouldn't be learning nearly as much).
neon
Member
Posts: 1567
Joined: Sun Feb 18, 2007 7:28 pm

Re: Choosing the right language for kernel development

Post by neon »

Love4Boobies wrote: Do you think C makes a better choice? C++? Java? (If your favorite language isn't in the C family, just pick the level it fits in).
I have used both C and C++ in two different OS projects (one in C, my current one in C++), and I personally prefer C++ over C. The main reason is that I can use the OOP features of C++ (classes, virtual functions, inheritance, et al.), which I prefer. The only downside is no support for RTTI, and anything that relies on it, but I consider this a small loss compared to the OOP support that I get.

I just feel that I can create a more sophisticated and robust design using C++ than I can with C, hence why I prefer C++. (Both my bootloader and kernel use C++.)
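
As a rough sketch of this kind of design (all names invented for the example) -- note that virtual dispatch only needs a compiler-generated vtable, while it's RTTI (dynamic_cast/typeid) and exceptions that demand real run-time support:

Code: Select all

#include <stddef.h>
#include <stdint.h>

// Sketch of an OOP kernel interface (names invented for this example).
// Virtual calls cost only a vtable lookup, which the compiler emits for
// free; no extra run-time library is needed for them.
class BlockDevice {
public:
    virtual ~BlockDevice() {}
    virtual bool read(uint64_t lba, void* buf, size_t count) = 0;
    virtual bool write(uint64_t lba, const void* buf, size_t count) = 0;
};

class AtaDisk : public BlockDevice {
public:
    // Definitions (the actual port I/O) would live in the driver's .cpp.
    bool read(uint64_t lba, void* buf, size_t count) override;
    bool write(uint64_t lba, const void* buf, size_t count) override;
};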

I also recommend the same principle when choosing a language: don't choose one because it is the most used; rather, choose the language that best suits your design goals.
OS Development Series | Wiki | os | ncc
char c[2]={"\x90\xC3"};int main(){void(*f)()=(void(__cdecl*)(void))(void*)&c;f();}
Colonel Kernel
Member
Posts: 1437
Joined: Tue Oct 17, 2006 6:06 pm
Location: Vancouver, BC, Canada

Re: Choosing the right language for kernel development

Post by Colonel Kernel »

Brendan wrote:The idea that HLL code is more portable can be a complete misconception when you're doing very low level stuff. For example, imagine you write HLL code to handle 80x86 paging structures, and you compile this code on a completely different architecture - will it work? Of course not - a different architecture will have different paging structures, and you'll need to write new code to handle the new architecture's paging structures. In this case using a HLL doesn't make it any more portable.
That's true, but you're ignoring all the other benefits of HLLs (readability, maintainability, etc.). I think you over-estimate the amount of such "low-level stuff" involved in OS dev, perhaps because it's where you like to spend your time. ;)
Brendan wrote:However, for something like a micro-kernel there wouldn't be much portable code on top of the abstraction layers, so you'd end up with extra work designing and writing abstraction layers, and extra inefficiency caused by the abstraction layers, with no real benefit.
And yet there are a number of microkernels written in C/C++ (L4, QNX Neutrino, Mach, to name a few). Have a look at the QNX source code and tell me how much of it you think is architecture-specific. :)

Also, the "extra inefficiency caused by the abstraction layers" is a property of the design, not the language chosen. There are also plenty of efficient abstractions out there (e.g. -- pretty much anything done at compile-time).
quok wrote: By that definition of a microkernel, Windows NT and its derivatives qualify. NT is made up of a bunch of co-operating processes all communicating via message passing; however, most of them run in kernel space (which, according to some people, makes Windows NT more of a monolithic kernel).
Yes and no. The kernel-mode portions of NT just call each other directly -- there is no message-passing involved. However, they are modular and have well-defined boundaries, so it wouldn't be fair to call it monolithic. Message-passing is used between the kernel, the sub-system processes like csrss.exe, lsass.exe, all the services (often hosted in svchost.exe), user-mode device drivers (as of Vista IIRC), and the application processes that use them. It's a stretch to call it "micro" though, given the size and complexity of the kernel.
Top three reasons why my OS project died:
  1. Too much overtime at work
  2. Got married
  3. My brain got stuck in an infinite loop while trying to design the memory manager
Don't let this happen to you!
Hery
Member
Posts: 65
Joined: Sat Dec 04, 2004 12:00 am

Re: Choosing the right language for kernel development

Post by Hery »

Colonel Kernel wrote: Why is it strange? Singularity does it. So does any .NET app that's been run through ngen. It's just code in the end.
It's against the idea of such technologies. Singularity in general is compiled just in time; ngen is not the default way of compiling .NET programs.
Colonel Kernel wrote: No, that's not what "safe" means in this context. "Safe" means that you can't stomp all over memory just by having made a typo. There is no pointer arithmetic allowed in C# (except in blocks marked as "unsafe"), and no casting other than simple numerical type conversion and class hierarchy up/down casts, which can be checked at run-time. That's what makes C# "safe".
Maybe I didn't express myself clearly enough: you'd probably need to improve smart pointers a bit, but it is not impossible. You can achieve the same safety using (improved) smart pointers in any other language. Everything depends on the programmer; you can do "unsafe" things in C#, and "safe" things in C/C++, etc.
Colonel Kernel wrote: I chose C over C++ because I wasn't interested in implementing a lot of run-time support for the language.
If you don't need RTTI and exceptions immediately, writing a kernel in C++ doesn't require many run-time support functions. I think there are other problems with C++ that encourage using C instead. :D
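
For instance, building with GCC's -fno-rtti and -fno-exceptions, the run-time support typically boils down to a handful of stubs like these (a sketch; kmalloc, kfree and panic stand in for whatever routines the kernel itself provides):

Code: Select all

#include <stddef.h>

// Roughly the entire C++ run-time a freestanding kernel needs when built
// with -fno-rtti -fno-exceptions (GCC). kmalloc/kfree/panic are assumed
// to be the kernel's own allocator and panic routines.
extern "C" void* kmalloc(size_t size);
extern "C" void  kfree(void* p);
extern "C" void  panic(const char* msg);

void* operator new(size_t size)            { return kmalloc(size); }
void* operator new[](size_t size)          { return kmalloc(size); }
void  operator delete(void* p) noexcept    { kfree(p); }
void  operator delete[](void* p) noexcept  { kfree(p); }

// The compiler emits calls to this if a pure virtual function is ever
// invoked (e.g. through a half-constructed object).
extern "C" void __cxa_pure_virtual() { panic("pure virtual call"); }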

I'm using C++ in my project, despite the fact that I think object-oriented C would be better :D I'm also considering adding some scripting language (probably as loadable modules), but that's still at an early design stage :D
Colonel Kernel
Member
Posts: 1437
Joined: Tue Oct 17, 2006 6:06 pm
Location: Vancouver, BC, Canada

Re: Choosing the right language for kernel development

Post by Colonel Kernel »

Hery wrote:
Colonel Kernel wrote: Why is it strange? Singularity does it. So does any .NET app that's been run through ngen. It's just code in the end.
It's against the idea of such technologies. Singularity in general is compiled just in time; ngen is not the default way of compiling .NET programs.
Singularity does not use JIT compilation for anything! This is a very common misconception about how it works. I wish more people would actually read the research papers before making such assumptions.

The kernel is compiled to native code by Bartok, and applications are compiled before the first time they're run. There is no dynamic loading in Singularity. Bartok combines all required libraries together with the executable at link-time (not load-time) and does a lot of inter-procedural optimizations that are very difficult, if not impossible, to do with JIT.

There is nothing about the C# language itself that requires JIT.
Hery wrote:
Colonel Kernel wrote: No, that's not what "safe" means in this context. "Safe" means that you can't stomp all over memory just by having made a typo. There is no pointer arithmetic allowed in C# (except in blocks marked as "unsafe"), and no casting other than simple numerical type conversion and class hierarchy up/down casts, which can be checked at run-time. That's what makes C# "safe".
Maybe I didn't express myself clearly enough: you'd probably need to improve smart pointers a bit, but it is not impossible. You can achieve the same safety using (improved) smart pointers in any other language. Everything depends on the programmer; you can do "unsafe" things in C#, and "safe" things in C/C++, etc.
No, everything does not depend on the programmer. I'm talking about "safety" as a property of language type systems that can be mathematically proven, not as a general quality of software. Yes, you can do unsafe things in C#, but only in blocks of code explicitly marked as "unsafe". If you try to use pointers outside an "unsafe" block in C#, you will get a compile error. You can run any IL code through a verifier that will tell you whether or not it violates type safety, giving the system the ability to guarantee freedom from memory corruption errors. This is all automatic and requires no extra discipline on the part of C# programmers. There is no such enforcement in C or C++.
Top three reasons why my OS project died:
  1. Too much overtime at work
  2. Got married
  3. My brain got stuck in an infinite loop while trying to design the memory manager
Don't let this happen to you!
rdos
Member
Posts: 3329
Joined: Wed Oct 01, 2008 1:55 pm

Re: Choosing the right language for kernel development

Post by rdos »

Assembler and C++ are the optimal combination. C is a useless "middle ground". You can do everything faster in assembler than in C, and object-oriented interfaces are much more easily constructed in C++ than in C. Therefore, C has no real advantage anywhere.

I've coded everything in the kernel in assembler, and then I provide a C++ class interface to it for user-mode applications to use. I think this is optimal. Some kernel functions (especially the file systems and the TCP/IP stack) were hard to code in assembler, but so what? Writing an OS is a challenge, and so it should be. :D

The "handle" concept that is very useful for isolating kernel-code also maps perfectly to objects in C++. The C++ wrappers often isolate the handle as a private variable, and export the kernel-functions that uses the handle internally.
rdos
Member
Posts: 3329
Joined: Wed Oct 01, 2008 1:55 pm

Re: Choosing the right language for kernel development

Post by rdos »

Colonel Kernel wrote:No, everything does not depend on the programmer. I'm talking about "safety" as a property of language type systems that can be mathematically proven, not as a general quality of software. Yes, you can do unsafe things in C#, but only in blocks of code explicitly marked as "unsafe". If you try to use pointers outside an "unsafe" block in C#, you will get a compile error. You can run any IL code through a verifier that will tell you whether or not it violates type safety, giving the system the ability to guarantee freedom from memory corruption errors. This is all automatic and requires no extra discipline on the part of C# programmers. There is no such enforcement in C or C++.
Isolation is done by using handles in kernel code and only exporting those to user mode. This concept is 100% safe and doesn't need any compiler support. Isolation within the kernel is best achieved with hardware mechanisms (paging, segmentation, and whatever else is present). No language that supports pointers can be considered "safe".
Walling
Member
Posts: 158
Joined: Mon Dec 04, 2006 6:06 am
Location: Berlin, Germany

Re: Choosing the right language for kernel development

Post by Walling »

rdos wrote: Isolation is done by using handles in kernel code and only exporting those to user mode. This concept is 100% safe and doesn't need any compiler support. Isolation within the kernel is best achieved with hardware mechanisms (paging, segmentation, and whatever else is present). No language that supports pointers can be considered "safe".
Can you mathematically prove that user-mode programs can't use (malicious) handles that the kernel doesn't want? Can you prove that any handle used does what it's supposed to do, or else fails (type error)? You would have to insert runtime checks into the kernel's system calls for that. Using C# or any other type-safe language, the compiler can prove that a function is only supplied valid objects/handles, and hence no runtime check is necessary.
rdos
Member
Posts: 3329
Joined: Wed Oct 01, 2008 1:55 pm

Re: Choosing the right language for kernel development

Post by rdos »

Walling wrote: Can you mathematically prove that user-mode programs can't use (malicious) handles that the kernel doesn't want?
Absolutely. The handle is only a number. This number is used to look up the kernel data. The handle can also be "typed", so the kernel knows the handle passed is associated with the intended data type. The typing is kept in the kernel.
Walling wrote: Can you prove that any handle used does what it's supposed to do, or else fails (type error)?
The handle is like the "this" pointer in C++. It refers to the object and is passed as a parameter to "methods". This means that all the user-level code knows about the object data in the kernel is the handle, and nothing else. True encapsulation that is unbreakable, unlike C++'s encapsulation.
Walling wrote: You would have to insert runtime checks into the kernel's system calls for that.
Of course. Every API function needs to dereference the handle.
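
On the kernel side, that check might look something like this sketch (all names invented for the example):

Code: Select all

#include <stddef.h>

// Kernel-side sketch of typed handles (all names invented). A handle is
// just an index; the kernel validates both the index and the type tag on
// every system call before touching the object.
enum HandleType { HANDLE_FREE = 0, HANDLE_FILE, HANDLE_SOCKET };

struct HandleEntry {
    HandleType type;
    void*      object;   // kernel-private data, never exposed to user mode
};

static const int kMaxHandles = 1024;
static HandleEntry handle_table[kMaxHandles];

// Every API function dereferences its handle through a check like this.
static void* lookup_handle(int handle, HandleType expected) {
    if (handle < 0 || handle >= kMaxHandles) return NULL;   // bogus number
    if (handle_table[handle].type != expected) return NULL; // wrong type
    return handle_table[handle].object;
}
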
Walling wrote: Using C# or any other type-safe language, the compiler can prove that a function is only supplied valid objects/handles, and hence no runtime check is necessary.
Not between kernel and user-level. Only within the same program. It is easy to break any type-safe language with assembler code.