Intermediate projects before picking OSdev ? (C, C++, asm)

~
Member
Posts: 1226
Joined: Tue Mar 06, 2007 11:17 am
Libera.chat IRC: ArcheFire

Re: Intermediate projects before picking OSdev ? (C, C++, as

Post by ~ »

davidv1992 wrote:I would suggest against trying to compile C, or even any reasonable subset of C. Because of its design heritage, C has some major gotchas in the lexing/parsing department (especially around user defined types) that you probably don't want to try your hand at for a first compiler. Besides that, the kind of familiarity one gets with a language when building a compiler for it is quite different from the kind one gets (and needs) for actually building projects in it. You get a really good idea of all the edge cases that a language supports, but won't learn as much about proper design of a larger software project, or good habits around building maintainable code in it, both of which are key for getting into something as complex as osdev in the long run.

If you want to become more familiar with C through this project, the best option probably would be to program the compiler itself in C. Do note that this makes your job harder, as you will be missing any object-oriented tools that can be very useful, but it is perfectly doable. For what language to compile, my suggestion would be to go for something simple. There are various languages designed specifically for teaching compiler construction (for the project I did for university, the goal was to compile SPL, a language defined somewhat during the course, see the code.pdf target produced by this repo: https://github.com/davidv1992/SPLCompiler). Such languages can also be found in instructional books on the topic of compiler construction, such as A.W. Appel's Modern Compiler Implementation in C or its variants. It is also an option to construct your own simple language by choosing a number of language features you find interesting.

As for learning assembly during a project like this, a good option is to try to get your compiler to the point where it outputs assembly for an architecture you are interested in. This will teach you a lot about the assembly language for a platform, as well as the broader ecosystem surrounding it.
I'd recommend getting some good C and x86 NASM interpreters, so that you can interpret C code instead of compiling it.

It's practically guaranteed that you will then learn C and Assembly (if NASM interpreters exist) as easily as if it was JavaScript. An interpreter closes the gap between having to configure the right compiler/assembler/toolchain and actually running and debugging the code: you can run it freely and immediately, which makes it possible to spot and correct bugs fast. That way you become capable of understanding and correcting even complex code in the same way you would inspect a JavaScript or PHP application.

Shipping an OS with pure source code interpreters for C, C++, x86 NASM Assembly (16, 32, 64-bit), would be a key learning tool to trivialize those languages without mutilating their complexity or scope.
YouTube:
http://youtube.com/@AltComp126

My x86 emulator/kernel project and software tools/documentation:
http://master.dl.sourceforge.net/projec ... 7z?viasf=1
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Post by Brendan »

Hi,
orion40 wrote:Of course I don't aim to rival nasm or gcc, they do a perfectly good job. My goal is to learn. How far should I go ? Should I just stick with "just" a compiler ? Is trying to make most of the toolchain myself realistic ?
Creating your own toolchain is entirely realistic; but it's also likely to be a massive waste of time (unless your goal is to learn how to create a toolchain). There are far more efficient ways to learn C and far more efficient ways to learn information that's useful for OS design and implementation.

My suggestions are:

a) Forget about the sheer idiocy of wasting ages writing any compiler, interpreter or toolchain (and then coming back here after several decades of learning nothing useful and having to ask "I wish to learn about the more complex parts of an OS" again).

b) Write a utility in C that reads a text file, extracts "words" from the text file, then inserts each word into a sorted linked list. Reason: This will help you learn about pointers, memory management, linked lists and sorting.

c) Write a utility in C that reads a text file, extracts "words" from the text file, then adds each word to a hash table and increments a "number of times this word was seen" counter if the word is already in the hash table; and once that is done print a "number of times each word occurred" table. Reason: This will help you learn about code re-use (recycle the first half from the previous utility) and hash tables (which helps you to understand more complex structures later, like "hash trees" and how CPU's paging works).

d) Modify the previous utility to use multiple threads; such that one "master thread" reads the file, extracts words, then (for each word extracted) tells one of 4 other "worker threads" to add the word to the hash table (where "which thread is told to add the word to the hash table" is done in order - e.g. "worker thread number to tell to add the word = word number in file % 4"); and once that is done the "master thread" should print a "number of times each word occurred" table. Reason: This will help you learn about concurrency (locking, producer/consumer queues, race conditions, etc), which is important for schedulers, multi-CPU, etc.

e) Show people that are good at C your code for the multi-threaded utility, and ask them to find problems and suggest improvements. Reason: You'll gain more from their advice than you would from writing more in C.

f) Start writing an OS. Anything that you didn't learn from writing the 3 utilities above doesn't matter enough (for OS development) to bother with; and you'll learn it just as fast if you learn it while writing an OS.
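
A minimal sketch of what exercise b) might look like, offered as an illustration rather than as the code Brendan had in mind. It assumes a "word" is a run of ASCII alphanumeric bytes, takes its input as a string instead of a file to stay short, and keeps duplicates in the list. The names `word_node`, `list_insert_sorted`, `list_insert_words` and `list_free` are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* One node of a singly linked list of words, kept in strcmp() order. */
struct word_node {
    char *word;
    struct word_node *next;
};

/* Insert a copy of `word` at its sorted position; duplicates are kept.
 * Returns 0 on success, -1 if memory allocation fails. */
int list_insert_sorted(struct word_node **head, const char *word)
{
    assert(head != NULL && word != NULL);

    struct word_node *node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    node->word = malloc(strlen(word) + 1);
    if (node->word == NULL) {
        free(node);
        return -1;
    }
    strcpy(node->word, word);

    /* Walk a pointer-to-pointer so inserting at the head needs no special case. */
    struct word_node **link = head;
    while (*link != NULL && strcmp((*link)->word, word) < 0)
        link = &(*link)->next;
    node->next = *link;
    *link = node;
    return 0;
}

/* Split `text` into words (assumption: runs of ASCII alphanumeric bytes)
 * and insert each one. Words longer than the buffer are truncated. */
int list_insert_words(struct word_node **head, const char *text)
{
    char buf[64];
    size_t len = 0;

    for (const char *p = text; ; p++) {
        if (*p != '\0' && isalnum((unsigned char)*p)) {
            if (len < sizeof buf - 1)
                buf[len++] = *p;
        } else {
            if (len > 0) {          /* a word just ended: flush it */
                buf[len] = '\0';
                if (list_insert_sorted(head, buf) != 0)
                    return -1;
                len = 0;
            }
            if (*p == '\0')
                break;
        }
    }
    return 0;
}

/* Release every node and the string it owns. */
void list_free(struct word_node *head)
{
    while (head != NULL) {
        struct word_node *next = head->next;
        free(head->word);
        free(head);
        head = next;
    }
}
```

Reading the file (for example line by line with `fgets`) and feeding each line to `list_insert_words` is left out on purpose; that part, plus deciding what to do about duplicates and case, is where most of the learning in the exercise happens.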


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Solar
Member
Posts: 7615
Joined: Thu Nov 16, 2006 12:01 pm
Location: Germany

Post by Solar »

I second most of what Brendan wrote, it's sound advice.

And if you write the utility in a way that's Unicode-aware (with regard to the definition of "words", collation / sorting, normalization of combining characters etc.), that'd be awesome.

(Three quarters kidding. While certainly of importance when writing OS code, like file system drivers etc., it's a very involved subject, and from my experience, 9 out of 10 developers don't have a clue in this field anyway. :twisted: )
Every good solution is obvious once you've found it.
davidv1992
Member
Posts: 223
Joined: Thu Jul 05, 2007 8:58 am

Post by davidv1992 »

Brendan wrote:Hi,
orion40 wrote:Of course I don't aim to rival nasm or gcc, they do a perfectly good job. My goal is to learn. How far should I go ? Should I just stick with "just" a compiler ? Is trying to make most of the toolchain myself realistic ?
Creating your own toolchain is entirely realistic; but it's also likely to be a massive waste of time (unless your goal is to learn how to create a toolchain). There are far more efficient ways to learn C and far more efficient ways to learn information that's useful for OS design and implementation.

My suggestions are:

a) Forget about the sheer idiocy of wasting ages writing any compiler, interpreter or toolchain (and then coming back here after several decades of learning nothing useful and having to ask "I wish to learn about the more complex parts of an OS" again).

b) Write a utility in C that reads a text file, extracts "words" from the text file, then inserts each word into a sorted linked list. Reason: This will help you learn about pointers, memory management, linked lists and sorting.

c) Write a utility in C that reads a text file, extracts "words" from the text file, then adds each word to a hash table and increments a "number of times this word was seen" counter if the word is already in the hash table; and once that is done print a "number of times each word occurred" table. Reason: This will help you learn about code re-use (recycle the first half from the previous utility) and hash tables (which helps you to understand more complex structures later, like "hash trees" and how CPU's paging works).

d) Modify the previous utility to use multiple threads; such that one "master thread" reads the file, extracts words, then (for each word extracted) tells one of 4 other "worker threads" to add the word to the hash table (where "which thread is told to add the word to the hash table" is done in order - e.g. "worker thread number to tell to add the word = word number in file % 4"); and once that is done the "master thread" should print a "number of times each word occurred" table. Reason: This will help you learn about concurrency (locking, producer/consumer queues, race conditions, etc), which is important for schedulers, multi-CPU, etc.

e) Show people that are good at C your code for the multi-threaded utility, and ask them to find problems and suggest improvements. Reason: You'll gain more from their advice than you would from writing more in C.

f) Start writing an OS. Anything that you didn't learn from writing the 3 utilities above doesn't matter enough (for OS development) to bother with; and you'll learn it just as fast if you learn it while writing an OS.
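
For exercise c), the counting part could be sketched roughly as follows. This is only one possible shape, under stated assumptions: a fixed number of buckets, separate chaining for collisions, and 32-bit FNV-1a as the hash function (any reasonable string hash would do). The names `word_table`, `table_add`, `table_count`, `table_print` and `table_free` are hypothetical, and word extraction from the file is assumed to happen elsewhere.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BUCKET_COUNT 1024   /* arbitrary fixed size; a real table might resize */

struct entry {
    char *word;
    unsigned long count;    /* number of times this word was seen */
    struct entry *next;     /* chain of entries sharing the same bucket */
};

/* Must be zero-initialised before use (e.g. with calloc or = {0}). */
struct word_table {
    struct entry *buckets[BUCKET_COUNT];
};

/* 32-bit FNV-1a hash of a NUL-terminated string. */
static uint32_t hash_word(const char *s)
{
    uint32_t h = 2166136261u;
    for (; *s != '\0'; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    return h;
}

/* Record one occurrence of `word`.
 * Returns the new count, or 0 if memory allocation fails. */
unsigned long table_add(struct word_table *t, const char *word)
{
    assert(t != NULL && word != NULL);

    struct entry **bucket = &t->buckets[hash_word(word) % BUCKET_COUNT];
    for (struct entry *e = *bucket; e != NULL; e = e->next)
        if (strcmp(e->word, word) == 0)
            return ++e->count;      /* already present: bump the counter */

    struct entry *e = malloc(sizeof *e);
    if (e == NULL)
        return 0;
    e->word = malloc(strlen(word) + 1);
    if (e->word == NULL) {
        free(e);
        return 0;
    }
    strcpy(e->word, word);
    e->count = 1;
    e->next = *bucket;              /* push onto the front of the chain */
    *bucket = e;
    return 1;
}

/* Look up how often `word` was seen; 0 if it was never added. */
unsigned long table_count(const struct word_table *t, const char *word)
{
    const struct entry *e = t->buckets[hash_word(word) % BUCKET_COUNT];
    for (; e != NULL; e = e->next)
        if (strcmp(e->word, word) == 0)
            return e->count;
    return 0;
}

/* Print one "count word" line per entry, in bucket order (not sorted). */
void table_print(const struct word_table *t)
{
    for (size_t i = 0; i < BUCKET_COUNT; i++)
        for (const struct entry *e = t->buckets[i]; e != NULL; e = e->next)
            printf("%8lu %s\n", e->count, e->word);
}

/* Free all entries; the table struct itself belongs to the caller. */
void table_free(struct word_table *t)
{
    for (size_t i = 0; i < BUCKET_COUNT; i++) {
        struct entry *e = t->buckets[i];
        while (e != NULL) {
            struct entry *next = e->next;
            free(e->word);
            free(e);
            e = next;
        }
        t->buckets[i] = NULL;
    }
}
```

The bucket chains are themselves linked lists, which is where the code re-use from exercise b) tends to show up. For exercise d), every access to the table would additionally need some form of locking, which this sketch does not attempt.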


Cheers,

Brendan
Although points b) through e) are reasonable suggestions for exercises, I don't agree with the advice in parts a) and f). The exercises suggested fail on one important point, in that they don't teach the important skills of dealing with a larger project, setting realistic goals and knowing when and when not to stick to those. Those are very important skills to have when starting OS development if one wants to get anywhere. Furthermore, they don't really do anything on the front of learning to deal with assembly.

A moderate size project, like a compiler for a simple language (outputting to assembly language) will teach those skills. While I agree that a complete toolchain is overkill, and will take too long to be practical, I stand by my point that, choosing a limited, simple language, a compiler can be written in free time in the span of a few months. This gives a shorter learning cycle than OSdev, where most people will need at least a year or two to get anywhere close to something that resembles a functional kernel, whilst still giving a valuable exercise in how to design a larger project.

Furthermore, a compiler deals with hardware in a low-level enough fashion that you acquire a number of concepts that are useful to understand when building an OS, such as how a processor runs code and what ABIs are. It is also a reasonable way to get an understanding of assembly, how one can interact between assembly and code compiled from other languages, and how to debug it, without writing large programs in assembly by hand. I found this a valuable experience to have when I started to write and debug my kernel.
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Post by Brendan »

Hi,
davidv1992 wrote:Although points b) through e) are reasonable suggestions for exercises, I don't agree with the advice in parts a) and f). The exercises suggested fail on one important point, in that they don't teach the important skills of dealing with a larger project, setting realistic goals and knowing when and when not to stick to those. Those are very important skills to have when starting OS development if one wants to get anywhere.
Given the choice between "write a large project to get experience writing large projects (but make it an OS so that you also get other relevant experience/knowledge)" and "write a large project to get experience writing large projects (but make it something where you won't get much other relevant experience/knowledge)"; why would anyone advocate an incredibly idiotic choice?
davidv1992 wrote:Furthermore, they don't really do anything on the front of learning to deal with assembly.
To write an OS; the amount of knowledge/experience you need with assembly language programming is mostly limited to the ability to cut&paste a handful of inline assembly macros from the wiki or anywhere else (unless you're hoping to write an OS in 100% assembly language, but in that case you can completely forget about C).
davidv1992 wrote:Furthermore, a compiler deals with hardware in a low-level enough fashion that you acquire a number of concepts that are useful to understand when building an OS, such as how a processor runs code and what ABIs are. It is also a reasonable way to get an understanding of assembly, how one can interact between assembly and code compiled from other languages, and how to debug it, without writing large programs in assembly by hand. I found this a valuable experience to have when I started to write and debug my kernel.
I found that writing operating systems gave me experience/knowledge that was valuable when I started to write a compiler (things like experience with large projects, assembly language, ABIs, executable file formats, etc). Perhaps the original poster should start by writing an OS as a slow and inefficient way to learn about compilers (and then write a compiler as a slow and inefficient way to learn about operating systems after that).


Cheers,

Brendan
davidv1992
Member
Posts: 223
Joined: Thu Jul 05, 2007 8:58 am

Post by davidv1992 »

Brendan wrote: Given the choice between "write a large project to get experience writing large projects (but make it an OS so that you also get other relevant experience/knowledge)" and "write a large project to get experience writing large projects (but make it something where you won't get much other relevant experience/knowledge)"; why would anyone advocate an incredibly idiotic choice?
The main reason why I would advocate going for a compiler first (or possibly some other reasonably largish project) is that it takes quite a bit less time than an OS to develop (3 months vs 1-2 years). You get all the experience of working on a larger project without at the same time having to ditch a lot of the comforts that come from programming purely in userspace. It is a lot easier to first deal with complexity in an environment where you have the full power of something like gdb or the Visual Studio debugger available, with things like reverse debugging. Furthermore, the reduced time to get stuff that actually does interesting things significantly reduces the cycle time when developing, leading to more opportunities for learning.
Brendan wrote: To write an OS; the amount of knowledge/experience you need with assembly language programming is mostly limited to the ability to cut&paste a handful of inline assembly macros from the wiki or anywhere else (unless you're hoping to write an OS in 100% assembly language, but in that case you can completely forget about C).
I personally found it rather useful when debugging to be competent at reading assembly language instructions, and to be able to trace (at least at a somewhat global level) what parts of my kernel functions they came from and what they do. This might be partly due to the fact that I primarily use my exception handlers and the Bochs debugger to debug my code, but I find those a lot more useful for debugging some of the really low level stuff.

Also, copy-pasting even simple assembly might not be the best idea for those chunks of assembly code one cannot avoid, as anything going wrong there can already be very tricky to debug, and not understanding what those pieces do precisely can make that bad situation even worse. Also, I find that I do more than just little snippets in assembly, particularly in the very early boot code (I'm currently at ~300 lines of assembly across various files), even though I'm using GRUB as the bootloader, but again that might just be the way I decided to implement things.
Kevin
Member
Posts: 1071
Joined: Sun Feb 01, 2009 6:11 am
Location: Germany

Post by Kevin »

Brendan wrote:e) Show people that are good at C your code for the multi-threaded utility, and ask them to find problems and suggest improvements. Reason: You'll gain more from their advice than you would from writing more in C.
Another related piece of advice for later, when you have already played around enough to get something working: a good way to improve from there is to contribute to a project of someone else who is better at C (or programming in general) than you.
Developer of tyndur - community OS of Lowlevel (German)
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Post by Brendan »

Hi,
davidv1992 wrote:
Brendan wrote:Given the choice between "write a large project to get experience writing large projects (but make it an OS so that you also get other relevant experience/knowledge)" and "write a large project to get experience writing large projects (but make it something where you won't get much other relevant experience/knowledge)"; why would anyone advocate an incredibly idiotic choice?
The main reason why I would advocate going for a compiler first (or possibly some other reasonably largish project) is because it takes quite a bit less time than an OS to develop (3 months vs 1-2 years). You get all the experience of working on a larger project without at the same time having to ditch a lot of the comforts that come from programming purely in userspace.
One of the reasons I'd advocate going for an OS first is that it gives you experience working on a larger project that you won't get from a smaller/faster project (like a simple compiler). Another reason is that you'll be able to recycle parts of the older code when you want to write a better OS (which you won't be able to do if the older code was for a compiler).
davidv1992 wrote:It is a lot easier to first deal with complexity in an environment where you have the full power of something like gdb or the Visual Studio debugger available, with things like reverse debugging.
Yes; it is a lot easier to first deal with complexity in an environment like "Qemu + GDB" where you have the full power of something like GDB for debugging your OS. There are even tools (e.g. VisualKernel?) that integrate "Qemu + GDB" into Visual Studio if that's what you want.
davidv1992 wrote:Furthermore, the reduced time to get stuff that actually does interesting things significantly reduces the cycle time when developing, leading to more opportunities for learning.
Yes; the reduced time to get to stuff that actually does interesting things (scheduling, memory management, PCI enumeration, etc) reduces the cycle time when developing, leading to more opportunities for learning boring stuff that doesn't help much with an OS project (lexing, spending a month trawling through Intel manuals entering opcodes into a big table buried in a compiler's back-end, spending another 2 months tweaking "peep-hole optimiser" patterns, etc).
davidv1992 wrote:
Brendan wrote:To write an OS; the amount of knowledge/experience you need with assembly language programming is mostly limited to the ability to cut&paste a handful of inline assembly macros from the wiki or anywhere else (unless you're hoping to write an OS in 100% assembly language, but in that case you can completely forget about C).
I personally found it rather useful when debugging to be competent at reading assembly language instructions, and to be able to trace (at least at a somewhat global level) what parts of my kernel functions they came from and what they do. This might be partly due to the fact that I primarily use my exception handlers and the Bochs debugger to debug my code, but I find those a lot more useful for debugging some of the really low level stuff.
I personally found it rather useful to have an early background in digital electronics, because this helped me understand what a (6502) CPU was actually doing, which helped me learn (6502) machine code, which helped me learn 6502 assembly language, which helped me learn 16-bit 80x86 assembly, which helped me learn 32-bit and 64-bit assembly, which helped me write OSs in 100% assembly and helped me understand what the Bochs debugger is telling me.

Of course the only important part of my paragraph is "I personally". Most people here write code in C or some other higher level language, and use tools like GDB, and would therefore gain very little from spending a huge amount of time trying to follow the path I did. That is why I don't suggest "learn digital electronics" when people say they want to write an OS in C.
davidv1992 wrote:Also, copy-pasting even simple assembly might not be the best idea for those chunks of assembly code one cannot avoid, as anything going wrong there can already be very tricky to debug, and not understanding what those pieces do precisely can make that bad situation even worse.
Sure; but spending a few years writing a compiler without knowing if you'll ever need that information is a whole lot less efficient than spending 10 minutes to ask someone if you ever actually need to.
davidv1992 wrote:Also, I find that I do more than just little snippets in assembly, particularly in the very early boot code (I'm currently at ~300 lines of assembly across various files), even though I'm using GRUB as the bootloader, but again that might just be the way I decided to implement things.
For their first OS; about 90% of people will use something like GRUB and won't write any of their early boot code. I'd like to think that eventually most people will write their own boot code (e.g. for their second or third OS); but even then it's more likely that they'll write a UEFI boot loader in C.


Cheers,

Brendan
Kevin
Member
Posts: 1071
Joined: Sun Feb 01, 2009 6:11 am
Location: Germany

Post by Kevin »

Brendan wrote:For their first OS; about 90% of people will use something like GRUB and won't write any of their early boot code. I'd like to think that eventually most people will write their own boot code (e.g. for their second or third OS); but even then it's more likely that they'll write a UEFI boot loader in C.
I have hardly ever seen anyone going from GRUB to their own bootloader in the second or third OS. Usually those who start with an existing bootloader stay there; only very few desperately want additional features that existing bootloaders can't provide.

But I've seen quite a few people going the other direction because after doing their own bootloader (usually because they think an OS is incomplete without it, because all those bad old tutorials hack some boot code together) they noticed that it doesn't really add anything except a few bugs.
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Post by Brendan »

Hi,
Kevin wrote:
Brendan wrote:For their first OS; about 90% of people will use something like GRUB and won't write any of their early boot code. I'd like to think that eventually most people will write their own boot code (e.g. for their second or third OS); but even then it's more likely that they'll write a UEFI boot loader in C.
I have hardly ever seen anyone going from GRUB to their own bootloader in the second or third OS. Usually those who start with an existing bootloader stay there; only very few desperately want additional features that existing bootloaders can't provide.

But I've seen quite a few people going the other direction because after doing their own bootloader (usually because they think an OS is incomplete without it, because all those bad old tutorials hack some boot code together) they noticed that it doesn't really add anything except a few bugs.
I'd like to think that eventually most people will want additional features that existing bootloaders can't provide (e.g. for their second or third or fourth or .... millionth OS); but even then it's more likely that they'll write a UEFI boot loader in C.

I'd also like to think that eventually I'll be extremely rich and live until I'm at least 200 years old; and ACPI will be replaced by sane hardware standards, and SMM will cease to exist, and Unix will be forgotten.


Cheers,

Brendan
Schol-R-LEA
Member
Posts: 1925
Joined: Fri Oct 27, 2006 9:42 am
Location: Athens, GA, USA

Post by Schol-R-LEA »

Brendan wrote: I'd like to think that eventually most people will write their own boot code (e.g. for their second or third OS); but even then it's more likely that they'll write a UEFI boot loader in C.
And I would like to think that by the time a person has finished writing their first (and most likely only, for anyone who isn't as obsessed as you are) OS, they have enough sense to realize that the only people who should be writing a boot loader are those who make boot loaders for other people to use, and not bother with such a pointless task that teaches nothing about OS dev.

And that is coming from one who does intend to write one. Not because it is in any way a useful or interesting task in itself, or because it furthers my OS dev ambitions, but because there is a need for a better one than GRUB (mind you, the real intent of it is not as a boot loader, but as a hypervisor, it's just that it would take over the early parts of the boot cycle for the client systems as well. Still, it would be usable as one, since my intent is that it could be configured just to boot a system to bare metal without the hypervisor if actually necessary for, say, running certain Windows games).
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
simeonz
Member
Posts: 360
Joined: Fri Aug 19, 2016 10:28 pm

Post by simeonz »

Writing a driver for another OS, or multiple OSes, might interest the OP. Or writing a disassembler, ELF dependency walker/viewer, code injector/tracer, even some kind of debugger. It still would be easier than writing a C compiler and will provide some of the learning benefits.

If algorithmic skills are required, there are many learning projects to do and I myself hope to spend some time on a few of those. E.g., regex engine, big matrix calculator, graphics interpreter, small minimalistic db. Parallelism and cache-oblivious/aware algorithms are useful accents.
orion40
Posts: 11
Joined: Tue Jun 13, 2017 12:37 pm

Post by orion40 »

Thanks everyone for the replies, so far I've decided to stick to this little project :
http://www.buildyourownlisp.com/

It seems short enough, I'll see what I'll do next. Making a dependency walker might be nice, as well as other kinds of tools as suggested by the previous poster.

I don't have enough time for now so maybe I'll skip some stuff and go directly toward OSdev. Even if my first OS sucks, well, I've discarded existing work in the past to rewrite it in a much better way. I don't really mind; at the end of the day, perfect or not, it's still more experience.