[SOLVED] GCC Object Output Format

Posted: Thu Feb 25, 2016 1:37 pm
by BASICFreak
Hello again fellow OSDevs,

I have spent the past 2 days trying to get GCC to play nice with my code/data layout. I've decided I want to be "lazy" and mix C in my "OSLoader".

So, for a few modules this works fine - e.g. when the C code is at the highest processing mode (32/64 bit) of the binary, but this is not always the case.

My binaries are a mix of 16/32/64-bit code. NASM will output any and all of the source into an elf64-x86-64 object with no issues, and LD links everything together nicely here; BUT I have a few places where I need 32-bit C code alongside 16/64-bit ASM code, and GCC, when compiling 32-bit code, only outputs elf32-i386 objects, which LD will not link alongside elf64-x86-64 objects.

So, before I create a utility to convert the format of the objects myself - I wanted to ask here if there is a simple way to get GCC to output an elf64-x86-64 object when compiling 32-bit code, or another simple tool that can convert these elf32-i386 objects to elf64-x86-64. (I've looked into objcopy, but it didn't seem to have options to do this - though I could have missed it.)

I hope I explained this clearly enough :wink: (but if not ask and I'll do my best to explain better)


Best regards,

B!

Re: GCC Object Output Format

Posted: Thu Feb 25, 2016 2:19 pm
by iansjack
There are hacks to do what you want but, in my opinion, it's better to keep the 32- and 64-bit code separate. I use Grub which loads the 32-bit code as the main kernel and the 64-bit code as a separate module. It works beautifully and it just seems more elegant this way. If you want to roll your own boot loader, just make it act the same way.

Re: GCC Object Output Format

Posted: Thu Feb 25, 2016 4:53 pm
by BASICFreak
iansjack wrote:There are hacks to do what you want but, in my opinion, it's better to keep the 32- and 64-bit code separate. I use Grub which loads the 32-bit code as the main kernel and the 64-bit code as a separate module. It works beautifully and it just seems more elegant this way. If you want to roll your own boot loader, just make it act the same way.
I got a quote for this :D
Adam Savage wrote:I reject your reality and substitute my own.
(^ that's not meant offensively)

Yes, I fully comprehend and understand what you are saying; but the design I have does not allow for this.

I have thrown out every standard concept of POSIX and the standard way an OS is loaded. Which means that most everything in my code would be considered a "hack" by most everyone on here.

So I found myself in this situation due to a "chicken and egg" scenario, which I shall *attempt* to explain:

There are 3 stages to my OSLoader:
Stage 1 - MBR/VBR, This just loads Stage 1.5 and has a public "ReadFile" function.
Stage 1.5 - The First Binary ["SysInit.bin"] <- This is where the "chicken & egg" comes in
Stage 2 - The linker

So, Stage 1.5 does the following (may not be in order):
1. Checks CPUID to determine x86 or AMD64
2. Installs proper GDT for the system
3. Loads a less limited driver for the boot device (main entry point is 32-bits, but is still allowed to use BIOS for I/O) <- This is no issue as I can get NASM to output elf32-i386 to link to the C source
4. Loads Stage 2
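
For step 1 in the list above, the x86 versus AMD64 decision usually comes down to one feature bit. The following is only a sketch in Python of how the CPUID values could be decoded, not the loader's actual code: the function name and parameters are hypothetical, and it assumes the raw register values have already been read by the real-mode or 32-bit code.

```python
# Hypothetical helper: decodes CPUID results that the loader is assumed
# to have read already. It does not execute CPUID itself.
EXT_FEATURE_LEAF = 0x80000001   # extended processor info and feature bits
LONG_MODE_BIT = 29              # EDX bit 29 (LM) of that leaf


def supports_long_mode(max_extended_leaf: int, ext_feature_edx: int) -> bool:
    """Return True if the given CPUID values indicate long mode support.

    max_extended_leaf: EAX returned by CPUID leaf 0x80000000.
    ext_feature_edx:   EDX returned by CPUID leaf 0x80000001.
    """
    # Without the extended feature leaf there is no long mode bit to test.
    if max_extended_leaf < EXT_FEATURE_LEAF:
        return False
    return bool((ext_feature_edx >> LONG_MODE_BIT) & 1)
```

A loader following this logic would pick the 64-bit GDT and memory managers only when the function returns True, and fall back to the 32-bit path otherwise.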

Due to Stage 2 requiring malloc and vmm_map to properly link everything without carrying the relocation tables around, I have decided to load and initialize both memory managers (between steps 3 and 4 above).

Now this will load in the memory managers based on the detected CPU, but as we know, to get to Long Mode you must have a page directory - so I want the initialization function to be 32-bit (which is fine if I create this function in ASM - but since it is only called once, I really don't want to spend the time and would prefer C here).


I thought about having Stage 1.5 create a temporary PDIR and load the PMM and VMM in stage 2, but the issue here is the hackish way I would have to load and link to the current running BINARY (not elf) - which I have done with a variable though I hate hard-coding function locations where I need not.

(I hope that made sense, as I was distracted many times while trying to write this...)

I understand what I am doing, I just don't think I'm able to properly explain it :)

Stage 2 takes the objects, either elf32-i386 or elf64-x86-64, without my having to send them through LD - so for anything after Stage 1.5 the issue does not exist.



But would you have any information on these "hacks" you mentioned?



Best regards,

B!

Re: GCC Object Output Format

Posted: Fri Feb 26, 2016 1:07 am
by iansjack
Here is an excerpt from my Makefile when I used this braindead way of working:

Code:

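# Compile mem32.c to 32-bit assembly, prepend the .code32 directive from
# code32.s, then assemble the combined file with $(AS) in its usual output format.
mem32.o: mem32.c $(INC)/memory.h
	$(CC) -m32 -D CODE_32 $(CFLAGS) $(CPPFLAGS) $(INCLUDES) -S mem32.c
	cat code32.s mem32.s >tmem32.s
	$(AS) tmem32.s -o mem32.o
	rm tmem32.s mem32.s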
code32.s contains the single line:

Code:

   .code32
I still don't recommend it, but it works.

Re: GCC Object Output Format

Posted: Fri Feb 26, 2016 1:11 am
by Techel
What about loading needed files into memory at boot when in real mode?

Re: GCC Object Output Format

Posted: Fri Feb 26, 2016 2:46 am
by Kevin
I would avoid trying to have both 32 and 64 bit code in the binary at the same level to protect myself from confusing the two and accidentally calling a function in the wrong part. What you could do is compile one part into a separate binary so that you have a 32 bit and a 64 bit binary, and then you can use the linker to embed one into the other as a binary blob.

Re: GCC Object Output Format

Posted: Fri Feb 26, 2016 3:02 am
by iansjack
I agree that it's an ugly hack to be avoided. But you can guard against the situation that you envisage by naming all 32-bit functions something like foo32. You would then have to be pretty dozy to make the mistake.

[SOLVED] GCC Object Output Format

Posted: Fri Feb 26, 2016 12:30 pm
by BASICFreak
EDIT 2:
Use objcopy to convert the object: "objcopy -O elf64-x86-64 in.o out.o" - It's cleaner, easier, safer, and better all around... #-o
Do not use anything I mentioned below!
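
To confirm what objcopy produced, the ELF header of the converted object can be inspected. This is a minimal sketch in Python, assuming only the standard ELF header layout (class byte at offset 4, data encoding at offset 5, e_machine at offset 18); the function name is hypothetical, and `readelf -h` or `file` report the same information.

```python
import struct

ELF_MAGIC = b"\x7fELF"
EI_CLASS_NAMES = {1: "ELF32", 2: "ELF64"}      # e_ident[EI_CLASS]
E_MACHINE_NAMES = {3: "i386", 62: "x86-64"}    # EM_386, EM_X86_64


def describe_elf_header(header: bytes) -> str:
    """Return a short 'class / machine' description of an ELF header."""
    if len(header) < 20 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    elf_class = header[4]
    # e_ident[EI_DATA]: 1 means little endian, 2 means big endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine sits at offset 18 in both ELF32 and ELF64 headers.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    class_name = EI_CLASS_NAMES.get(elf_class, "unknown class")
    machine_name = E_MACHINE_NAMES.get(machine, "machine %d" % machine)
    return class_name + " / " + machine_name
```

Reading the first 20 bytes of in.o and out.o and passing them to this function should show the class changing from ELF32 to ELF64 after the conversion.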



ORIGINAL:

Thanks for all the input (and [constructive] criticism),

@iansjack, I attempted your method - but GAS is just as braindead as GCC (complaining about variables pushed onto the stack not being 8-byte [64-bit] aligned).

So for future me and anyone else that may come across this issue I have finally found a working solution:

A program called objconv from http://www.agner.org/optimize/ (about halfway down the page) was required.

Though this program couldn't convert straight from elf32 to elf64, it could convert the elf32 object to NASM syntax (with only two issues).

The only downside is this cannot *easily* be automated.

So after disassembling the ELF32 object to NASM, one must remove the .eh_frame section (unless omitted by the gcc flag "-fno-asynchronous-unwind-tables"), add "bits 32" to the top, and change the global attributes from "global [FUNCTIONNAME]: function" to "global [FUNCTIONNAME]" (without the ": function").

Finally it can be sent to NASM: "nasm -felf64 in.s -o out.o"
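
The manual clean-up steps described above could in principle be scripted. The following Python sketch is an illustration only: the function name is hypothetical, and it assumes the disassembly marks sections with lines beginning "SECTION <name>" and declares functions as "global <name>: function", which should be checked against real objconv output before relying on it.

```python
import re

# Assumed line shapes in the disassembly (not verified against objconv):
#   global name: function
#   SECTION .eh_frame ...
GLOBAL_FUNCTION = re.compile(
    r"^(\s*global\s+[^\s:]+)\s*:\s*function\s*$", re.IGNORECASE
)
SECTION_START = re.compile(r"^\s*section\s+(\S+)", re.IGNORECASE)


def prepare_for_nasm(listing: str) -> str:
    """Apply the three manual fixes to a disassembled NASM listing."""
    out = ["bits 32"]                 # fix 1: force 32-bit code generation
    skipping = False
    for line in listing.splitlines():
        section = SECTION_START.match(line)
        if section:
            # fix 2: drop everything inside the .eh_frame section
            skipping = section.group(1) == ".eh_frame"
        if skipping:
            continue
        # fix 3: strip the ": function" suffix from global declarations
        out.append(GLOBAL_FUNCTION.sub(r"\1", line))
    return "\n".join(out) + "\n"
```

The returned text would be written to in.s and assembled with the nasm command quoted above.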

(I may modify the objconv source to not place the ": function" on the global attributes, which would make this very easily automatable. <- is that a real word?)


That being said, you only need to convert it to ELF64 to have LD place relocation tables.
LD will complain about mixing ELF32 and ELF64 input, but if you do not need relocation you can simply pass LD "-noinhibit-exec" and it will still create the binary.



Best regards,

B!



EDIT / UPDATE: So I did change the source of objconv, for anyone interested:
comment out line 2817 of disasm2.cpp (or properly modify the if-else statement :roll:)
Then the only thing to add to the output source is "bits 32".