Why so many custom-build toolchains?

Post by anta40 »

I guess I spent too much time lurking on the OSDev board.
Time to start studying and coding.

I looked at some hobby OSes on GitHub, and one thing they have in common is the need to 'build your own GCC'.
I don't understand it. Assuming you are only targeting x86, isn't the standard GCC enough?
Of course there's -ffreestanding, and you can always implement some features in assembly.

Perhaps I'm missing something here.

Thank you :)

Post by Korona »

For freestanding code, using an existing GCC works as long as it is configured similarly. However, using its libgcc will almost never work (e.g. using a Linux libgcc will break without glibc) as libgcc calls malloc()/free() if they are available at libgcc build time. Furthermore, keep in mind that not all code is freestanding code. If you are compiling user-space programs for your OS, you will almost surely need a new GCC. For example, an OS-specific GCC will define __youros__ instead of __linux__, and it will not try to use features that are not actually available on your OS (e.g. dynamic linking).
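
To illustrate the point about predefined macros, here is a minimal sketch of how portable code typically discovers its target OS. The macro __youros__ is hypothetical (it stands for whatever an OS-specific GCC port would predefine), and the function names are made up for this example:

```c
#include <stddef.h>
#include <string.h>

/* Reports which OS the compiler claims to target, based on predefined macros.
   __youros__ is a hypothetical macro: only a GCC ported to "YourOS" would
   define it; a stock system compiler never does. */
const char *target_os_name(void)
{
#if defined(__youros__)
    return "youros";
#elif defined(__linux__)
    return "linux";
#elif defined(__FreeBSD__)
    return "freebsd";
#elif defined(_WIN32)
    return "windows";
#elif defined(__APPLE__)
    return "darwin";
#else
    return "unknown";   /* e.g. a bare *-elf target with no OS macro */
#endif
}

/* Returns 1 when the compiler targets the OS with the given name, else 0. */
int targets_os(const char *name)
{
    if (name == NULL)
        return 0;
    return strcmp(target_os_name(), name) == 0;
}
```

Ported software that tests __linux__ takes the Linux code path when built with a Linux system compiler, even if the resulting binary is meant for a different OS. That is the kind of mismatch an OS-specific toolchain avoids.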
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].

Post by Solar »

Try this:

    gcc -dumpmachine

It will give something like this:

    x86_64-linux-gnu

Usually, only one out of three will match YourOS -- the x86_64... and the other two are not just for show.
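
To make the "one out of three" remark concrete, here is a small sketch that splits such a target string at its dashes. The struct and function names are invented for this example, and it deliberately does not interpret the fields (real triplets may omit the vendor field or have four parts):

```c
#include <stddef.h>
#include <string.h>

#define MAX_FIELDS 4
#define FIELD_LEN  32

/* Holds the dash-separated fields of a target string. */
struct target_fields {
    int  count;
    char field[MAX_FIELDS][FIELD_LEN];
};

/* Splits a string such as "x86_64-linux-gnu" at each '-'.
   Returns 0 on success, -1 on bad input (NULL, empty field,
   too many fields, or a field that does not fit). */
int split_target(const char *s, struct target_fields *out)
{
    if (s == NULL || out == NULL)
        return -1;
    out->count = 0;
    for (;;) {
        const char *dash = strchr(s, '-');
        size_t len = dash ? (size_t)(dash - s) : strlen(s);
        if (out->count == MAX_FIELDS || len == 0 || len >= FIELD_LEN)
            return -1;
        memcpy(out->field[out->count], s, len);
        out->field[out->count][len] = '\0';
        out->count++;
        if (dash == NULL)
            return 0;
        s = dash + 1;   /* continue after the dash */
    }
}
```

For "x86_64-linux-gnu" this yields the fields x86_64, linux and gnu; for a typical OSDev cross-compiler target such as "i686-elf" it yields just i686 and elf.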
Every good solution is obvious once you've found it.

Post by simeonz »

It never became perfectly clear to me either, to be honest. :) The wiki is very adamant about using a cross-compiler, but I am not sure what exactly differs between a hosted Linux target and a freestanding ELF one that cannot be controlled from the command line. I am curious more than anything, because making a comparison like that can also provide insight into the toolchain and runtime. Here are the things I can think of:
  • The start files (i.e. crti.o, crtn.o, crtbegin.o, crtend.o) could have different contents (or be absent), but they are not relevant when using -nostartfiles or -nostdlib.
  • The language runtime in libgcc and libstdc++ may vary, but it is not relevant with -nostdlib.
  • The default search paths for system libraries and headers may have to be modified by using --sysroot, -nostdinc, -isystem.
  • Depending on the binutils configuration as well, certain aspects like the choice between .init or .init_array for hooking the initialization funclets will vary. If both are supported on the target or neither is used in the executable, this could be unimportant. Otherwise, I don't think that the choice can be controlled at compile-/link-time - it is burned-in when building binutils.
  • It could affect the built-in specs, which govern how the gcc driver treats the options and maps them to the auxiliary build tools, such as the assembler and linker. But the spec files usually govern the defaults primarily, which can be overridden. And besides, you don't have to link using the gcc driver.
  • The built-in macros will be determined by the target, which will change the behavior of certain headers. This can be worked around with -undef, -U, and -D.
  • The assumed behavior of the standard library routines may change (from undefined to defined), but this can be controlled with -ffreestanding and -fno-builtin.
I don't believe that the code generation will be impacted.

Edit: crti.o, crtn.o, crt1.o come from glibc, but they are suppressed with -nostartfiles all the same.
Edit2: Come to think of it, other things are affected too, like the list of linker emulations supported by binutils (which you select with -m). The hosted environment should have more comprehensive support, rather than more restrictive. There is also something called multilib mapping, which I think is burned in. Honestly, I am not sure how it works, but it sounds like it shouldn't be relevant here.
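
As a small illustration of what -ffreestanding changes from the code's point of view, here is a sketch with made-up function names. It relies on two facts: the standard macro __STDC_HOSTED__ is 0 in freestanding mode, and GCC may still emit calls to memcpy, memmove, memset and memcmp, so a freestanding environment has to supply them itself. The name k_memset is hypothetical and is used only so the demo does not clash with the libc version in a hosted build; a real kernel would have to provide the routine under the name memset.

```c
#include <stddef.h>   /* one of the headers available even when freestanding */

/* Returns 1 for a hosted build, 0 when compiled with -ffreestanding. */
int is_hosted_build(void)
{
#if defined(__STDC_HOSTED__) && __STDC_HOSTED__ == 1
    return 1;
#else
    return 0;
#endif
}

/* Byte-wise fill, the kind of routine a kernel supplies for itself
   because no C library is linked in. */
void *k_memset(void *dst, int value, size_t n)
{
    unsigned char *p = dst;
    while (n--)
        *p++ = (unsigned char)value;
    return dst;
}
```

None of this depends on the target triplet, which fits the observation above that most of these differences can be handled with flags and a little support code.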

Post by nielsd »

There's something else that hasn't been said yet.
If you have multiple people with different versions of the compiler, you might run into problems where the code generation differs.
A custom toolchain is also important when you want an OS-specific toolchain. For example, you don't want to have to modify Makefiles with a lot of compiler flags when you port software and need to compile it on your host system.
osdev project, goal is to run wasm as userspace: https://github.com/kwast-os/kwast

Post by simeonz »

It also occurred to me that the Linux kernel is built with the distribution's standard toolchain, and it runs in a freestanding environment. They do all the hacks necessary to adjust the paths, remove the startup files, disable vectorization, disable the red zone, etc. So, although it may not be the most elegant solution, it is not bad enough to be deemed inadequate for kernel builds. Now, a hosted compiler for a custom OS is a different story.

Post by Korona »

The Linux guys do not link to libgcc, (AFAICS) basically because Torvalds does not trust the GCC developers to not **** up. Instead they reimplement functions like long division. You can do this and get away with the Linux GCC, but it will be more work and you have to know what you're doing. Also note that the Linux userland does use an OS-specific toolchain. You cannot build the Linux userland with e.g. a FreeBSD GCC, no matter what flags you pass to it.
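
To give an idea of what "reimplementing long division" means in practice, here is a minimal sketch of unsigned 64-bit division by shift and subtract. It is not the Linux implementation, and the name udiv64 is invented for this example. On 32-bit x86, GCC normally emits a call into libgcc (for example __udivdi3) for this operation, so a kernel that does not link libgcc has to provide something along these lines:

```c
#include <stdint.h>

/* Restoring (shift-subtract) division: returns n / d and stores n % d in
   *rem when rem is non-NULL. Division by zero is not trapped in this
   sketch; it simply returns 0 with a remainder of 0. */
uint64_t udiv64(uint64_t n, uint64_t d, uint64_t *rem)
{
    uint64_t q = 0, r = 0;

    if (d == 0) {
        if (rem)
            *rem = 0;
        return 0;
    }
    for (int bit = 63; bit >= 0; bit--) {
        /* Bring down the next bit of the dividend. Before the shift, r
           holds at most the top 63 bits of n, so it cannot overflow. */
        r = (r << 1) | ((n >> bit) & 1u);
        if (r >= d) {
            r -= d;
            q |= (uint64_t)1 << bit;
        }
    }
    if (rem)
        *rem = r;
    return q;
}
```

For example, udiv64(100, 7, &r) returns 14 and leaves 2 in r. The loop only uses shifts, comparisons and subtraction, so it needs no runtime support, at the cost of being much slower than a tuned implementation.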

Post by simeonz »

Korona wrote:The Linux guys do not link to libgcc, (AFAICS) basically because Torvalds does not trust the GCC developers to not **** up. Instead they reimplement functions like long division. You can do this and get away with the Linux GCC but it will be more work and you have to know what you're doing.
I would actually like to ask: is there some kind of official position from the GCC guys (written or spoken) on the intended usage context of the ELF targets, like i686-elf and x86_64-elf? I mean, are they intended for environments without a red zone and without demand-saving of the FPU and SSE context, or merely for targets without libc and dynamic loader support? The former would be a narrowing of the amd64 ABI, which seems counterintuitive for a name like x86_64-elf. For i686-elf, libgcc is, I think, compiled without vectorization and red zone, but that seems to be implied by the i686 architecture and ABI, not by the target as such. I haven't checked x86_64-elf, because I don't have a build readily available, but in any case: is there some kind of official statement on the assumptions those targets make?

Post by Korona »

I don't think there is a written statement, but you can look at the source: *-elf libgcc is compiled with the default compiler flags IIRC. That means that the x86_64 libgcc does assume a red-zone. However, you should be able to change that by changing the default spec file definitions in the target headers (i.e. the gcc/config directory). I have not verified whether libgcc keeps working if you do that but it is on my TODO list.

Post by simeonz »

Korona wrote:I don't think there is a written statement, but you can look at the source: *-elf libgcc is compiled with the default compiler flags IIRC. That means that the x86_64 libgcc does assume a red-zone.
I see. I didn't find -mno-red-zone anywhere in the GCC and libgcc configuration files, but wanted another opinion, in case the story is more convoluted (i.e. implied options from -fbuilding-libgcc, etc). Anyway, if that is the case, then the ELF targets are designed for an unspecified System V ABI-compliant environment. If anyone wants to use them in a kernel context, it is their responsibility to know their job. Fair enough. Thanks for clarifying.
Korona wrote:However, you should be able to change that by changing the default spec file definitions in the target headers (i.e. the gcc/config directory).
Actually, it appears that the wiki has a page about that. They even demonstrate multilib mapping.

Post by Solar »

simeonz wrote:The wiki is very adamant about using a cross-compiler, but I am not sure what exactly varies between a hosted linux target and a freestanding elf one, which cannot be controlled from the command line.
One, those command line options varied, depending on host and target. The forum was a very busy place with all the questions about "why is this not working" and "why is that not working", with people being asked what their setup was and getting explanations of why they had to set their command line up just so. (Remember, there are MinGW users out there, and Cygwin users as well...) Going cross-compiler leveled the playing field for everybody. It's kind of the OSDev variant of Stack Overflow's "Minimal, Complete, and Verifiable Example". It also massively reduced the number of people going some "custom" way with e.g. DJGPP etc.; it reduced the "which toolchain is best" discussions significantly and also made many Wiki entries much simpler.

Two, another rather common issue was people doing #include <stdio.h> and then asking why printf() wasn't working in their boot menu, or doing #include <stdlib.h> and asking why malloc() did not work as expected, or building to a.out and trying to execute that as a bootloader... Following the rule "fail early, fail loudly", a cross-compiler setup slaps you for trying to work with what isn't there, much more so than a system compiler bent to your will.

Three, in many cases it is much easier to influence what compiler in which version you are using with a cross-compiler setup than with your system compiler. Fiddling with the system compiler can render your whole system useless, while you can do anything to your cross-compiler without any risk to your "other" build chains.

Four, en route to making your system self-hosting, at some point you basically have to go into the fun that is compiler-building, for bootstrapping. Why not make the first step toward your native build tools right at the beginning?

I hope that clears things up a bit.

Post by simeonz »

Solar wrote:I hope that clears things up a bit.
That's fine. I genuinely believed that there might have been technical arguments, and consequently became curious what they were or how someone learned about them. Whichever the case, I am not arguing against it being a clean solution. There is one obvious downside: you don't automatically update the ordinary compiler when you update the freestanding one. But that is not game-breaking.

It's clear now. Thanks.

Post by Solar »

Not updating the compiler automatically can actually be a benefit, too. See, your system compiler gets updated only after the distro maintainers have (hopefully) checked that everything will still be shiny after the update... but of course they are only testing the distro, not your OS...

YourOS should have its own compiler update schedule. Imagine the pain when you're in the middle of some involved work when the system compiler gets updated, and you are sitting there with a broken build and have to figure out what is due to your changes and what is due to your compiler having been updated.

You also cannot easily fall back to previous versions with the system compiler, whereas setting up a newer cross-compiler version, testing it against YourOS, and then either updating or switching back to the old version is very easy.

While GCC is rather stable for now, there have been ABI-breaking changes in the past (anyone remember the fun we had during the 3.x releases?), and there are likely to be more in the future. I feel you'd need control rather than automatic updates...
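
One way to notice an unexpected compiler change early is to check the version macros that GCC (and compatible compilers such as Clang) predefine. The following is only a sketch; the function names and the 9.x window in the comment are arbitrary illustrations, not recommendations:

```c
/* Encodes major.minor.patch as a single number, e.g. 9.3.0 -> 90300. */
#if defined(__GNUC__)
#define COMPILER_VERSION \
    (__GNUC__ * 10000L + __GNUC_MINOR__ * 100L + __GNUC_PATCHLEVEL__)
#else
#define COMPILER_VERSION 0L   /* unknown compiler */
#endif

/* Version of the compiler that built this translation unit. */
long compiler_version(void)
{
    return COMPILER_VERSION;
}

/* Returns 1 if version lies in the inclusive range [min, max], else 0. */
int version_in_range(long version, long min, long max)
{
    return version >= min && version <= max;
}

/* A project could turn the check into a hard build failure, for example:
 *
 *   #if COMPILER_VERSION < 90000L || COMPILER_VERSION >= 100000L
 *   #error "YourOS is only tested with GCC 9.x"
 *   #endif
 */
```

With a dedicated cross-compiler such a guard rarely fires, because the version only changes when you rebuild the toolchain yourself.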

Post by simeonz »

Solar wrote:Not updating the compiler automatically can actually be a benefit, too. See, your system compiler gets updated only after the distro maintainers have (hopefully) checked that everything will still be shiny after the update... but of course they are only testing the distro, not your OS...
I didn't make myself very clear here. You are right. I meant that if a person wants the newer language features or some bug fix, whether they build the compiler themselves or use an unofficial repository, they would likely want both toolchains updated. It would be unusual to want C++17 or the -fno-plt option for just one of them. If the case is exceptional, no one is restrained from doing the right thing and building two separate compilers. Although you could still build them both as hosted and provide the necessary switches later. It is a choice.

Post by Velko »

What about LLVM? The Wiki page does not say much, but the impression is that there is no need to build one for each target.

I imagine that some effort is needed to obtain an OS Specific Toolchain for userspace, but for pure kernel work, aren't the pre-built, pre-packaged versions enough?
If something looks overcomplicated, most likely it is.