Well, I was drafting this, but since the topic came up:
Hi all!
After several months of saying relatively little, I'd like to tell you about my project, and my progress. Then I'd like to ask your opinion on my latest work.
I guess I should begin by better introducing myself. I've just started my career as a programmer after finishing my CS degree. While I was in school, I figured that I should study the things I'd never bother to learn on my own, which is why I was just two credits shy of a math minor. Thus, I am just getting back to treating programming as an end in itself (rather than a means to an end).
I've wanted to write my own language since I was 16, but I didn't learn how until a little later, and I lacked the time to do anything about it (due to the degree and work) until a few months ago. By then I had taught myself enough x86 assembly to implement whatever I wanted (poorly).
Of course, I still have plenty to learn. Once I was satisfied that I knew the basics of what I wanted in a language, I shifted my thoughts to being more domain specific. And I became curious about the requirements for an OS developing language.
So I started learning how to develop an OS. I spent my bus-ride into work (slightly less than two hours per day) reading the Intel and AMD manuals. I wrote a protected mode bootloader, and eventually came here when I wanted to expand beyond that.
Long story short, I wrote a "scaffold" OS that pretty much did the basics. The only component of any interest was the randomizing stuff (I figured that by randomizing certain things the chances of a bug surviving due to dumb luck would be minimized). As an aside, would anyone want me to make a wiki page on the topic of PRNGs?
Anyway, I learned the basics, then set out to write a language. But I've independently re-discovered someone else's work before, which prompted me to first learn languages that have been used for OS development in the past (or that supposedly have a feature I'd want). I learned Scheme (in lieu of LISP), Erlang, Forth, LLVM, and I've skimmed a few more. I should really relearn D (I learned it back in 2002).
I also tried to expose myself to different techniques, such as Quajects. In retrospect, I probably should have done this step years ago. My hope was that I'd find something spectacular for OS dev that isn't widely known (there aren't many OS devs not using C/C++/asm). After all that searching I didn't come back with anything of note.
I've decided on making yet another C-like language. At the moment, the two are so similar that I could almost accomplish my current goals by using typedefs, macros, and altering the C library. But since my primary interest is in language design, I'd much rather make my own than port/clean existing code (such as pcc).
I've already written a regular expression tokenizer and am currently working on an LALR(1) syntax analyzer (from scratch) in C. Soon I'll be writing the grammar/libraries to bind the four.
But before I go any further, I'd like to get some input.
Question 0: From my experience, I believe that the primary operations of a micro-kernel can be described as "moving data" in contrast to most applications, which can be described as doing data analysis/manipulation. In your experience, is this observation correct? If not, what circumstances negate my observation?
* If a lot of people say that data manipulation frequently occurs, I would consider adding anonymous functions, to allow map and fold. But I could be dissuaded from this.
* At the moment, I'm only going to write an x86, 64-bit version, although I will want to add more architectures later. For now, I'll need to write the long-mode code in assembly.
* By default integers will be unsigned.
* An integer's size will always be relative to the current computer. For example, on a 32-bit machine "int" will refer to a 4-byte number, and an 8-byte number on a 64-bit machine. There will also be absolute integers (byte, word, dword, qword) for cases like ASCII strings that are just bytes.
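For anyone who wants something concrete to react to, here is a rough C sketch of the integer scheme described above. The names (byte, word, dword, qword, uint_native, ascii_length) are placeholders chosen for illustration, not the final syntax, and the sketch assumes a target where uintptr_t exists and is as wide as a pointer, which holds on common x86 platforms but is not guaranteed by the C standard.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Absolute integers: the size never changes between machines. */
typedef uint8_t  byte;   /* always 1 byte  */
typedef uint16_t word;   /* always 2 bytes */
typedef uint32_t dword;  /* always 4 bytes */
typedef uint64_t qword;  /* always 8 bytes */

/* Relative integer (hypothetical name): unsigned by default and as wide
   as a pointer, so 4 bytes on a typical 32-bit target and 8 bytes on a
   typical 64-bit one. */
typedef uintptr_t uint_native;

/* Compile-time checks of the assumptions stated above. */
static_assert(sizeof(byte)  == 1, "byte must be 1 byte");
static_assert(sizeof(word)  == 2, "word must be 2 bytes");
static_assert(sizeof(dword) == 4, "dword must be 4 bytes");
static_assert(sizeof(qword) == 8, "qword must be 8 bytes");
static_assert(sizeof(uint_native) == sizeof(void *),
              "native integer is assumed to track pointer width");

/* ASCII strings are just bytes: count them up to the terminating zero. */
static inline uint_native ascii_length(const byte *s) {
    uint_native n = 0;
    while (s[n] != 0) {
        n++;
    }
    return n;
}
```

The static_assert lines are there so that a port to a machine breaking one of these assumptions fails at compile time rather than at run time.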
Question 1: Have you ever required an integer larger than the processor's bit length? If so, why and how much larger?
Question 2: Have you ever used floating point in your kernel? If yes, why?
* One characteristic that distinguishes languages is whether or not they keep meta-data about variables. For an OS development language, I do not think meta-data is appropriate, with one exception: malloc should store how much space a pointer has. I must admit I'm still figuring out how this can be used in conjunction with pointer arithmetic, as well as a few edge cases. My hope is that this will help prevent buffer overflows.
* I want to create a "semi-debug" mode. This mode will turn on checks (deep down in the language) that you wouldn't dream of running in normal mode (mostly due to expense). My current thoughts on the topic are that there would be a "checklist" to avoid irrelevant cases. My hope is that this will aid debugging on real hardware.
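To make the last two points less abstract, here is a minimal C sketch of an allocator that remembers each block's size, plus a bounds check that can be switched on or off through a "checklist" bitmask. Everything here (alloc_header, sized_alloc, sized_length, sized_free, CHECK_BOUNDS, enabled_checks, checked_store) is a made-up name for illustration; it is layered on the standard malloc and is only one possible way to do it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical header stored immediately before every block handed out. */
typedef union {
    size_t size;        /* usable bytes that follow the header */
    max_align_t align;  /* pads the header so the payload stays aligned */
} alloc_header;

/* Allocate `size` usable bytes and record the size in front of them. */
void *sized_alloc(size_t size) {
    if (size > SIZE_MAX - sizeof(alloc_header)) {
        return NULL;    /* header + payload would overflow size_t */
    }
    alloc_header *h = malloc(sizeof *h + size);
    if (h == NULL) {
        return NULL;
    }
    h->size = size;
    return h + 1;       /* the caller only ever sees the payload */
}

/* Recover the recorded size; `p` must be a pointer from sized_alloc. */
size_t sized_length(const void *p) {
    assert(p != NULL);
    return ((const alloc_header *)p - 1)->size;
}

void sized_free(void *p) {
    if (p != NULL) {
        free((alloc_header *)p - 1);
    }
}

/* The "checklist": each bit enables one expensive check. */
enum { CHECK_BOUNDS = 1u << 0 };
static unsigned enabled_checks = CHECK_BOUNDS;

/* Store one byte; returns 0 on success, -1 if the bounds check fails. */
int checked_store(void *base, size_t offset, unsigned char value) {
    if ((enabled_checks & CHECK_BOUNDS) && offset >= sized_length(base)) {
        return -1;
    }
    ((unsigned char *)base)[offset] = value;
    return 0;
}
```

Note that the size can only be found from the pointer sized_alloc returned. After pointer arithmetic the header is no longer at a known offset, which is exactly the edge case I haven't solved yet.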
Question 3: How would you want language-level errors to be handled? Try/catch? Assert? Conditions? If+code? I have very little preference on this issue.
* I want to have a relatively large built-in library. Not that I want it to have a Java-size library, but I've maintained C code that has multiple implementations of the same generic data structure within a single codebase!
Question 4: My opinion of Object Orientation varies from time to time. At the moment, I think it is overrated. And I don't see much value in it being used to develop a kernel. Have you ever used object orientation in a kernel in such a way that would be non-trivial to implement in a non-OO language?
* Despite the previous question, I still am a fan of operator and function overloading. However, I think binding should happen at compile time for an OS language. I must admit that I am afraid this feature could be terribly misused.
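As a point of comparison, C11 can already approximate compile-time function overloading with _Generic, which selects a function from the static type of an expression. The sketch below is only meant to show what "binding at compile time" looks like in practice; vec2, make_vec2, add_u32, add_vec2, add, and add_demo are hypothetical names, and C itself has no operator overloading.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-component vector used as the second overload target. */
typedef struct { uint32_t x, y; } vec2;

static inline vec2 make_vec2(uint32_t x, uint32_t y) {
    vec2 v = { x, y };
    return v;
}

static inline uint32_t add_u32(uint32_t a, uint32_t b) {
    return a + b;
}

static inline vec2 add_vec2(vec2 a, vec2 b) {
    return make_vec2(a.x + b.x, a.y + b.y);
}

/* The compiler picks the function from the static type of the first
   argument, so there is no run-time dispatch. A type that is not
   listed is rejected at compile time. */
#define add(a, b) _Generic((a), \
        uint32_t: add_u32,      \
        vec2:     add_vec2)((a), (b))

/* Small self-check showing both overloads in use. */
static inline void add_demo(void) {
    uint32_t n = add((uint32_t)2, (uint32_t)3);
    vec2 v = add(make_vec2(1, 2), make_vec2(3, 4));
    assert(n == 5);
    assert(v.x == 4 && v.y == 6);
}
```

One wart worth noticing: a plain literal such as 2 has type int, so add(2, 3) does not compile with this macro. That kind of surprise is part of why I'm wary of the feature being misused.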
Question 5: Assuming and ignoring a Hardware Abstraction Layer, how much assembly have you used? Do you work with multiple architectures? How would you want to organize this?
Question 6: I haven't really talked much about parallel programming, but I do want it to be embedded into the core of my language. What features would you want? Right now I like Cilk as a base.
General Question: What would you change in C?
What I want is a language that is meant to control hardware, with a few different assumptions than C (ones that were made back in the early 70s due to low-powered hardware). This base language will more than likely lead to a more ambitious one as I use it (using a language with the intent of changing it, in my limited experience, is one of the best ways of creating a new language).
Thank you for any comments given. I tend to consider a lot of points of view, hence I will spend a considerable amount of time pondering your input. Any additional suggestions on topics that I have not brought up (or more than likely not thought of) would be appreciated. I plan on resuming work on this project in a couple of days. I'll post something when it is worth showing.