I try to get this out to all new members, to get everyone started on the same footing. I hope this helps. I usually send it by PM, but I try to post it to a thread periodically so that anyone who doesn't check their PMs might see it anyway.
----------------------------------
The first thing I want to say is this: if you aren't already using version control for all software projects you are working on, drop everything and start to do that now. Set up a VCS such as Git, Subversion, Mercurial, Bazaar, or what have you - which you use is less important than the fact that you need to use it. Similarly, setting up your repos on an offsite host such as Gitlab, Github, Sourceforge, CloudForge, or BitBucket should be the very first thing you do whenever you start a new project, no matter how large or small it is.
If nothing else, it makes it easy to share your code with us on the forum, as you can just post a link, rather than pasting oodles and oodles of code into a post.
Once you have that out of the way (if you didn't already), you can start to consider the OS specific issues.
If you haven't already, I would strongly advise you to read the introductory material in the wiki:
After this, go through the material on the practical aspects of
running an OS-dev project:
I strongly suggest that you read through these pages in detail, along with the appropriate ones to follow, before doing any actual development. These pages should ensure that you have at least the basic groundwork for learning about OS dev covered.
This brings you to your first big decision: which platform, or platforms, to target. Commonly options include:
- x86 - the CPU architecture of the stock PC desktops and laptops, and the system which remains the 'default' for OS dev on this group. However, it is notoriously quirky, especially regarding Memory Segmentation, and the sharp divisions between 16-bit Real Mode, 16-bit and 32-bit Protected Modes, and 64-bit Long Mode.
- ARM - a RISC architecture widely used on mobile devices and for 'Internet of Things' and 'Maker' equipment, including the popular Raspberry Pi and Beagleboard single board computers. While it is generally seen as easier to work with that x86, most notably in the much less severe differences in between the 32-bit and 64-bit modes and the lack of memory segmentation, the wiki and other resources don't cover it nearly as well (though this is changing over time as it becomes more commonly targeted).
- MIPS, another RISC design which is slightly older than ARM. It is one of the first RISC design to come out, being part of the reason the idea caught on, and is even simpler than ARM in terms of programming, though a bit tedious when it comes to assembly programming. While it was widely used in workstations and game consoles in the 1990s, it has declined significantly due to mismanagement by the owners of the design, and is mostly seen in devices such as routers. There are a handful of System on Chip single-board computers that use it, such as the Creator Board and the Onion Omega2, and manufacturers in both China and Russia have licensed the ISA with the idea of breaking their dependence on Intel. Finding good information on the instruction set is easy, as it is widely used in courses on assembly language and computer architecture and there are several emulators that run MIPS code, but finding usable information on the actual hardware systems using it is often difficult at best.
- RISC-V is an up and coming open source hardware ISA, but so far is Not Ready For Prime Time. This may change in the next few years, though.
You then need to decide which
Language to use for the kernel. For most OS-Developers this means knowing and using C; while other languages can be used, it is important to know how to read C code, even if you don't use C, as most OS examples are written in it. You will also need to know at least some assembly language for the target platform, as there are always parts of the kernel and the device drivers which cannot be done in high-level languages.
You further need to choose the compiler, assembler, linker, build tool, and support utilities to use - what is called the 'toolchain' for your OS. For most platforms, there aren't many to choose from, and the obvious choice would be
GCC and the
Binutils toolchain due to their ubiquity. However, on the Intel x86 platform, it isn't as simple, as there are several other toolchains which are in widespread use for it, the most notable being the Microsoft one - a very familiar one to Windows programmers, but one which presents problems in OSDev. The biggest issue with Visual Studio, and with proprietary toolchains in general, is that using it rules out the possibility of your OS being "self-hosting" - that is to say, being able to develop your OS in the OS itself, something most OSdevs do want to eventually be able to do. The fact that
Porting GCC to your OS is feasible, whereas porting proprietary x86 toolchains isn't, is a big factor in the use Binutils and GCC, as it their deep connection to Linux and other Unix derivatives.
Regardless of the high-level language you use for OS dev (if any), you will still need to use
assembly language, which means
choosing an assembler. If you are using Binutils and GCC, the obvious choice would be
GAS, but for x86 especially, there are other assemblers which many OSdevs prefer, such as Netwide Assembler (
NASM) and Flat Assembler (
FASM).
The important thing here is that assembly language syntax varies more among the x86 assemblers than it does for most other platforms, with the biggest difference being that between the Intel syntax used in the majority of x86 assemblers, and the AT&T syntax used in GAS. You can see an overview of the differences on the somewhat misnamed wiki page
Opcode syntax. While it is possible to coax GAS to use the Intel syntax using the
.intel_syntax noprefix directive, the opposite is generally not true for Intel-based assemblers such as NASM, and even with that directive, GAS is still quite different from other x86 assemblers in other regards.
It is still important to understand that the various Intel syntax assemblers -
NASM,
FASM, and
YASM among others - have differences in how they handle indexing, in the directives they use, and in their support for features such as macros and defining data structures. While most of these follow the general syntax of Microsoft Assembler (
MASM), they all diverge from it in various ways.
Once you know which platform you are targeting, and the toolchain you want to use, you need to understand them. You should read up on the core technologies for the platform. Assuming that you are targeting the PC architecture, this would include:
This leads to the next big decision: which
Bootloader to use. There are a number of different
standard bootloaders for x86, with the most prominent being
GRUB. We strong recommend against
Rolling Your Own Bootloader, but it is an option as well.
You need to consider what kind of
File System to use. Common ones used when starting out in OS dev include:
We generally don't recommend designing your own, but as with boot loaders, it is a possibility as well.
While this is a lot of reading, it simply reflects the due diligence that any OS-devver needs to go through in order to get anywhere. OS development, even as a simple project, is not amenable to the Stack Overflow cut-and-paste model of software development; you really need to understand a fair amount of the concepts and principles before writing any code, and the examples given in tutorials and forum posts generally are exactly that. Copying an existing code snippet without at least a basic idea of what it is doing simply won't do. While learning itself is an iterative process - you learn one thing, try it out, see what worked and what didn't, read some more, etc. - in this case a basic foundation is needed at the start. Without a solid understanding of at least some of the core ideas before starting, you simply can't get very far in OS dev.
Hopefully, this won't scare you off; it isn't nearly as bad as it sounds. It just takes a lot of patience and a bit of effort, a little at a time.