
Subtleties dealing with cross compilers, libraries, etc.

Posted: Sat Mar 28, 2020 10:52 pm
by sunnysideup
I'm reading this: https://wiki.osdev.org/Meaty_Skeleton. I've already built a cross compiler, roughly following Barebones, and the cross compiler executable is present in ~/opt/cross. One question that I have is the difference between ~/opt/cross/bin and ~/opt/cross/i686-elf/bin. Also in this case, is ~/opt/cross/lib the directory where the standard library is supposed to be located? And ~/opt/cross/lib/gcc the place where the libgcc objects are present?

I have a few other questions -
0. The GCC documentation explicitly states that libgcc requires the freestanding environment to supply the memcmp, memcpy, memmove, and memset functions, as well as abort on some platforms. We will satisfy this requirement by creating a special kernel C library (libk) that contains the parts of the user-space libc that are freestanding (doesn't require any kernel features) as opposed to hosted libc features that need to do system calls.
Alright, I understand that libgcc is a 'private library' that is used by gcc, i.e. It has significance only during the compilation process, when gcc is in use, kind of like a helper for gcc. Is this right? I understand that the machine I run gcc on is called the build machine, and the host machine is my own OS. What is the freestanding environment specified here? Since gcc runs on the build machine, it must use the build machine's libgcc, I guess? Where does freestanding come into the picture?

Re: Subtleties dealing with cross compilers, libraries, etc.

Posted: Sat Mar 28, 2020 11:35 pm
by nullplan
sunnysideup wrote:One question that I have is the difference between ~/opt/cross/bin and ~/opt/cross/i686-elf/bin.
Only the file names. In $prefix/bin, you get i686-elf-gcc, whereas in $prefix/i686-elf/bin, you get gcc. It's the same GCC, tho.
sunnysideup wrote:Also in this case, is ~/opt/cross/lib the directory where the standard library is supposed to be located?
No, that would be $prefix/i686-elf/lib (and the include files in $prefix/i686-elf/include). The general idea is that $prefix/ contains binaries for the host machine, but $prefix/$machtype contains binaries for the target. Obvious exception in case of $prefix/$machtype/bin.
sunnysideup wrote:And ~/opt/cross/lib/gcc the place where the libgcc objects are present?
Well, yes, that is there. However, it also contains a ton of CRT files, libraries to link in, and GCC-internal header files (like <stddef.h>).
sunnysideup wrote:Alright, I understand that libgcc is a 'private library' that is used by gcc, i.e. It has significance only during the compilation process, when gcc is in use, kind of like a helper for gcc. Is this right?
No. libgcc contains runtime support routines for code generated by GCC. For instance, if you are developing for a 32-bit architecture, and then you divide a 64-bit number by either another 64-bit number or a 32-bit one, the compiler will generate a call to __udivdi3 (or similar), and that is implemented in libgcc. Also if you are developing for an architecture without hardware float conversion support (e.g. PowerPC), and you convert an integer to a floating-point number, the compiler will probably call a support function to do the actual conversion.
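To make the 64-bit division case concrete, here is a minimal sketch. The function name ticks_to_ms is made up for illustration and is not from the tutorial. When code like this is compiled for a 32-bit target such as i686-elf, GCC typically emits a call to __udivdi3 instead of inline division code, and the linker has to resolve that symbol from libgcc.

```c
#include <stdint.h>

/* Hypothetical helper, named for illustration only.
 * The caller is assumed to pass a non-zero ticks_per_ms. */
uint64_t ticks_to_ms(uint64_t ticks, uint64_t ticks_per_ms)
{
    /* On a 32-bit target there is no single instruction for a
     * 64-bit by 64-bit division, so GCC typically lowers this to
     * a call to __udivdi3, which lives in libgcc. */
    return ticks / ticks_per_ms;
}
```

When linking with -nostdlib, libgcc is not pulled in automatically, so a call like this usually ends in an undefined reference to __udivdi3 unless -lgcc is added to the link line.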
sunnysideup wrote:What is the freestanding environment specified here?
A term from the C standard. The C standard recognizes two types of implementation, namely the hosted and the freestanding implementations. The hosted ones have to provide the entire library from Chapter 7 (i.e. printf() and stuff), while the freestanding implementation only has to provide a small number of headers which only declare types and macros, not functions (stuff like <limits.h> and <float.h>). You can put GCC in freestanding mode by supplying the "-ffreestanding" option. An OS kernel is sort of the prototypical application of that, but for instance, if you are developing a standard C library in C, you have to compile it in freestanding mode as well.
Anyway, the gcc manual states explicitly that even in a freestanding environment, these four memory functions must be available, since GCC can generate calls to them in any mode.
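
As a rough sketch of what a libk has to supply to meet that requirement, the four functions can be written in plain C with nothing but <stddef.h>, which a freestanding implementation provides. The k_ prefix is an assumption made here so the sketch can also be compiled on a hosted system without clashing with the host libc; in a real libk the functions would carry the standard names memcpy, memmove, memset and memcmp.

```c
#include <stddef.h>

/* Copy n bytes; the two regions must not overlap. */
void *k_memcpy(void *restrict dst, const void *restrict src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return dst;
}

/* Copy n bytes; the regions may overlap, so pick a safe direction. */
void *k_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    if (d < s) {
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
    } else {
        for (size_t i = n; i != 0; i--)
            d[i - 1] = s[i - 1];
    }
    return dst;
}

/* Fill n bytes with the low 8 bits of value. */
void *k_memset(void *dst, int value, size_t n)
{
    unsigned char *d = dst;
    for (size_t i = 0; i < n; i++)
        d[i] = (unsigned char)value;
    return dst;
}

/* Compare n bytes as unsigned chars; returns <0, 0 or >0. */
int k_memcmp(const void *a, const void *b, size_t n)
{
    const unsigned char *x = a;
    const unsigned char *y = b;
    for (size_t i = 0; i < n; i++) {
        if (x[i] != y[i])
            return x[i] < y[i] ? -1 : 1;
    }
    return 0;
}
```

One thing to watch for when these carry the standard names: at higher optimization levels GCC may recognize such a loop and replace it with a call to the very function being defined. Building the file with -fno-tree-loop-distribute-patterns is a common way to avoid that.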

Re: Subtleties dealing with cross compilers, libraries, etc.

Posted: Sat Mar 28, 2020 11:47 pm
by sunnysideup
So a few routines from libgcc would be statically linked into the binary that I would produce using a cross compiler. That is, if my cross compiler produces code that runs on a 32-bit machine, and I have a 64-bit division, as per your example, then the routine __udivdi3 should be present in the final binary that gets output?

Re: Subtleties dealing with cross compilers, libraries, etc.

Posted: Sun Mar 29, 2020 12:06 am
by nullplan
Correct. And that routine comes from libgcc.

Re: Subtleties dealing with cross compilers, libraries, etc.

Posted: Sun Mar 29, 2020 12:31 am
by sunnysideup
"Normally when you compile programs for your local operating system, the compiler locates development files such as headers and libraries in system directories such as:

/usr/include
/usr/lib
These files are of course not usable for your operating system. Instead you want to have your own version of these directories that contains files for your operating system:

/home/bwayne/myos/sysroot/usr/include
/home/bwayne/myos/sysroot/usr/lib"

Alright I think this ^ paragraph describes the location of the standard library and headers.

So in our case, is sysroot $PREFIX/i686-elf ??

Also, what does this mean?
nullplan wrote:No, that would be $prefix/i686-elf/lib (and the include files in $prefix/i686-elf/include). The general idea is that $prefix/ contains binaries for the host machine, but $prefix/$machtype contains binaries for the target
I'm confused about the terminology - build, host and target. I thought target was used only when building a compiler...

Re: Subtleties dealing with cross compilers, libraries, etc.

Posted: Sun Mar 29, 2020 3:58 am
by nullplan
sunnysideup wrote:So in our case, is sysroot $PREFIX/i686-elf ??
If that is the prefix you set on all your cross-compiled projects, yes. Also, small thing to note, but I don't use /usr if I can help it. In my system, it is merely a symlink to ".". /usr only exists because Thompson and Ritchie ran out of space on their root disk back in the '70s, and I don't think that is sufficient justification for another root directory layer. Also, I think the distinction between /bin and /sbin is rather arbitrary, so one is merely a symlink to the other in my system.

In case of i686-elf, I do not actually have a standard library, nor do I plan to ever have one. Once my OS is far enough along to actually host binaries, I shall compile GCC again for my OS, and then that one will have a standard library. It is possible I'm overthinking this, tho. Still plenty of time left before I have to decide.
sunnysideup wrote:I'm confused about the terminology - build, host and target. I thought target was used only when building a compiler...
Right, I did mean "host". That is the machine you are targeting with your cross-compiler, and the machine that will host the built binaries. That's why I got confused.

Re: Subtleties dealing with cross compilers, libraries, etc.

Posted: Fri Apr 17, 2020 9:13 am
by sunnysideup
I'm still experimenting with the meaty skeleton code.
Here's a question. Why does one need startup files (crt*.o) when linking to make the kernel??
I've followed the tutorial and the layout of the kernel ELF binary is as follows:

myos.kernel:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000622  00100000  00100000  00001000  2**12
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .init         0000000f  00100622  00100622  00001622  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  2 .fini         0000000a  00100631  00100631  00001631  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  3 .eh_frame     000001c0  0010063c  0010063c  0000163c  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .rodata.str1.1 00000016  001007fc  001007fc  000017fc  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  5 .data         00000004  00101000  00101000  00002000  2**12
                  CONTENTS, ALLOC, LOAD, DATA
  6 .ctors        00000008  00101004  00101004  00002004  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  7 .dtors        00000008  0010100c  0010100c  0000200c  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  8 .bss          00004030  00102000  00102000  00002014  2**12
                  ALLOC
  9 .debug_line   000004e7  00000000  00000000  00002014  2**0
                  CONTENTS, READONLY, DEBUGGING
 10 .debug_info   000009b5  00000000  00000000  000024fb  2**0
                  CONTENTS, READONLY, DEBUGGING
 11 .debug_abbrev 00000508  00000000  00000000  00002eb0  2**0
                  CONTENTS, READONLY, DEBUGGING
 12 .debug_aranges 00000110  00000000  00000000  000033b8  2**3
                  CONTENTS, READONLY, DEBUGGING
 13 .debug_str    00000456  00000000  00000000  000034c8  2**0
                  CONTENTS, READONLY, DEBUGGING
 14 .debug_ranges 00000108  00000000  00000000  00003920  2**3
                  CONTENTS, READONLY, DEBUGGING
 15 .comment      00000011  00000000  00000000  00003a28  2**0
                  CONTENTS, READONLY
 16 .debug_loc    0000048b  00000000  00000000  00003a39  2**0
                  CONTENTS, READONLY, DEBUGGING
The linker script provides ENTRY(_start), which is present in the .init section. Well, wouldn't it be simpler to just provide an entry symbol in .text and ignore the crt*.o objects when linking kernel modules? Also, note that the GRUB multiboot header is in .text. Therefore, it seems really unnecessary to manually specify the crt*.o objects (we use -nostdlib).

Re: Subtleties dealing with cross compilers, libraries, etc.

Posted: Fri Apr 17, 2020 12:30 pm
by nullplan
sunnysideup wrote:Here's a question. Why does one need startup files (crt*.o) when linking to make the kernel??
Depends. Old C++ compilers might be using _init and _fini to register global constructors and destructors. In that case, you need crti.o and crtn.o before and after the list of object files to paste the sections together correctly. The older mechanism worked like this: All code that must be run at startup is put into a section ".init". crti.o also contains a section ".init" containing the symbol "_init" and possibly a function prologue if necessary. crtn.o then contains a section ".init" with a function epilogue. If the linker pastes the ".init" sections together in the order it found them, this will generate one large initialization function. Similar for ".fini".

However, this approach invites problems with reordering of sections. The busybox build system had to be changed to stop sorting input sections by alignment, because it would break this old system. To solve the problem, the ".init_array" system was created. Now all the init code is put into ".text", pointers to it are in ".init_array", and the linker will generate the symbols to iterate over this array. And this will remain stable after sorting.
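
The iteration over ".init_array" can be sketched roughly as below. This is an illustration under assumptions, not the code any particular libc or kernel uses: run_init_array, init_a, init_b and demo_init_array are made-up names, and the array is an ordinary C array standing in for the table the linker would assemble. In a real build, the bounds would typically be the symbols __init_array_start and __init_array_end, which GNU ld's default scripts define; a custom kernel linker script would have to define them itself.

```c
#include <stddef.h>

typedef void (*init_fn)(void);

/* Hypothetical helper: call every function pointer in [start, end),
 * in the order the pointers appear in the array. */
void run_init_array(init_fn *start, init_fn *end)
{
    for (init_fn *fn = start; fn != end; fn++)
        (*fn)();
}

/* Two stand-in initializers that record the order they ran in. */
static int init_log[2];
static int init_count;

static void init_a(void) { init_log[init_count++] = 1; }
static void init_b(void) { init_log[init_count++] = 2; }

/* Stand-in for the pointer table the linker would build from the
 * ".init_array" input sections of all object files. */
init_fn demo_init_array[] = { init_a, init_b };
```

Because the table holds only pointers and the code itself sits in ".text", sorting or reordering code sections no longer breaks anything, which is the stability property described above.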
sunnysideup wrote:The linker script provides entry(_start), which is present in the .init section.
The linker script merely references that entry point. It is defined in kernel/arch/i386/boot.S, and placed into section ".text". If you were actually linking in crt1.o, this could not work for a kernel.