Mixing 32-bit and 64-bit code?

Posted: Sun Jun 26, 2016 8:20 pm
by eof
I'm trying to figure out how to mix 32-bit and 64-bit code. What I would like to accomplish is the following:

1. Have 32-bit code that GRUB jumps to written in assembly, which sets up my stack and calls the code in (2).

2. Code written in C which sets up page tables and returns back to (1), which enables long mode and makes the jump to (3).

3. Long mode entry point is 64-bit code written in C.

For simplicity, I would like for all this code to be linked into the same ELF binary. The main problem is that the code in (3) generates an object file of type elf64-x86-64, while if I compile the code in (2) with -m32, I get a 32-bit ELF target. I can't seem to be able to get the linker to combine these into the same output file. My current solution has been to write the code in (2) also in assembly and just set .code32 in my assembly files while having the assembler generate an elf64-x86-64 object file.

This has to be a problem many others have stumbled on, so how can I mix 32-bit and 64-bit code generated from a high-level language in the same binary? If there's no way of doing this easily, is there a way to get GRUB to load two ELF files at boot, so I can make the first one 32-bit and the second 64-bit and then jump from the first into the second to enable long mode?

Re: Mixing 32-bit and 64-bit code?

Posted: Sun Jun 26, 2016 9:00 pm
by BrightLight
eof wrote:If there's no way of doing this easily, is there a way to get GRUB to load two ELF files at boot, so I can make the first one 32-bit and the second 64-bit and then jump from the first into the second to enable long mode?
TBH, I have no experience with your main question, but I do know how to do this.
You basically load a 32-bit kernel as the main multiboot kernel, and you load the 64-bit kernel as a GRUB module. The 32-bit kernel does all the initialization (stack, paging, long mode) and maps the 64-bit kernel at the address it expects to be loaded at. Then, it probably passes information to it and calls it.
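
As a rough illustration of the module approach, the sketch below shows how a 32-bit stub might locate the 64-bit kernel image that GRUB loaded as a module. The field layout follows the Multiboot 1 information structure; the type and function names (mbi_head, mb_module, mb_get_module, mb_module_size) are made up for this example, and treating mod_end as one past the last byte is an assumption about how the boot loader fills it in.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit 3 of `flags` says mods_count and mods_addr are valid (Multiboot 1). */
#define MBI_FLAG_MODS (1u << 3)

/* Leading fields of the Multiboot 1 information structure. */
struct mbi_head {
    uint32_t flags;
    uint32_t mem_lower;
    uint32_t mem_upper;
    uint32_t boot_device;
    uint32_t cmdline;
    uint32_t mods_count;  /* number of modules loaded */
    uint32_t mods_addr;   /* physical address of the module array */
};

/* One entry of the module array. */
struct mb_module {
    uint32_t mod_start;   /* physical start address of the module */
    uint32_t mod_end;     /* end address (assumed: one past the last byte) */
    uint32_t cmdline;     /* physical address of the module string */
    uint32_t reserved;
};

/* Return module `index`, or NULL if no such module was loaded.
 * `mods` is the module array; in a 32-bit kernel it would be
 * (const struct mb_module *)(uintptr_t)mbi->mods_addr. */
static const struct mb_module *mb_get_module(const struct mbi_head *mbi,
                                             const struct mb_module *mods,
                                             uint32_t index)
{
    if (!(mbi->flags & MBI_FLAG_MODS) || index >= mbi->mods_count)
        return NULL;
    return &mods[index];
}

/* Size in bytes of the loaded module image. */
static uint32_t mb_module_size(const struct mb_module *m)
{
    return m->mod_end - m->mod_start;
}
```

The module array is passed in separately only so the lookup can be exercised outside a kernel; the 32-bit stub would still have to parse the module's ELF headers and map its segments before jumping to it.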

Re: Mixing 32-bit and 64-bit code?

Posted: Sun Jun 26, 2016 10:37 pm
by ~
eof wrote:I'm trying to figure out how to mix 32-bit and 64-bit code. What I would like to accomplish is the following:

1. Have 32-bit code that GRUB jumps to written in assembly, which sets up my stack and calls the code in (2).

2. Code written in C which sets up page tables and returns back to (1), which enables long mode and makes the jump to (3).

3. Long mode entry point is 64-bit code written in C.

For simplicity, I would like for all this code to be linked into the same ELF binary. The main problem is that the code in (3) generates an object file of type elf64-x86-64, while if I compile the code in (2) with -m32, I get a 32-bit ELF target. I can't seem to be able to get the linker to combine these into the same output file. My current solution has been to write the code in (2) also in assembly and just set .code32 in my assembly files while having the assembler generate an elf64-x86-64 object file.

This has to be a problem many others have stumbled on, so how can I mix 32-bit and 64-bit code generated from a high-level language in the same binary? If there's no way of doing this easily, is there a way to get GRUB to load two ELF files at boot, so I can make the first one 32-bit and the second 64-bit and then jump from the first into the second to enable long mode?
You need assembly code that can automatically select either a specific register/word size or the maximum available register/word size, and that also selects the address-size and operand-size overrides that are correct for every CPU mode, all from the exact same source files.

See a project I have made called "x86 Portable". It's a NASM/YASM assembly header file with definitions. It uses names like "wideax" and "wideword" to select the maximum architecture register width automatically across all x86 CPU modes.

It is configured with the "__PLATFORMBITS_" definition, which uses:

16 -- pure 16-bit code
1632 -- 386+ 16/32-bit code ready for Unreal Mode
32 -- 32-bit code
64 -- 64-bit code

It defines registers _r8 to _r15 as full, automatically sized machine words that are present in 16-, 32- and 64-bit modes, so they can be used in word-manipulating instructions without breaking 64-bit code when it is ported to 32 bits, for example. It lets you select 32- or 64-bit registers or memory variables when you need them, but you must design your code so that it uses those automatically sized, full-width variables at assembly time wherever the program's algorithm really needs them:

0000000__x86_Portable.asm


0000000__X86_Portable.asm -- Source Code Text Recording

Re: Mixing 32-bit and 64-bit code?

Posted: Sun Jun 26, 2016 11:02 pm
by alexfru
~ wrote:
eof wrote:I'm trying to figure out how to mix 32-bit and 64-bit code. What I would like to accomplish is the following:

1. Have 32-bit code that GRUB jumps to written in assembly, which sets up my stack and calls the code in (2).

2. Code written in C which sets up page tables and returns back to (1), which enables long mode and makes the jump to (3).

3. Long mode entry point is 64-bit code written in C.
...
You need assembly code that can automatically select either a specific register/word size or the maximum available register/word size, and that also selects the address-size and operand-size overrides that are correct for every CPU mode, all from the exact same source files.
Did you not read item 2 on the list?

Re: Mixing 32-bit and 64-bit code?

Posted: Sun Jun 26, 2016 11:21 pm
by ~
alexfru wrote:
~ wrote:
eof wrote:I'm trying to figure out how to mix 32-bit and 64-bit code. What I would like to accomplish is the following:

1. Have 32-bit code that GRUB jumps to written in assembly, which sets up my stack and calls the code in (2).

2. Code written in C which sets up page tables and returns back to (1), which enables long mode and makes the jump to (3).

3. Long mode entry point is 64-bit code written in C.
...
You need assembly code that can automatically select either a specific register/word size or the maximum available register/word size, and that also selects the address-size and operand-size overrides that are correct for every CPU mode, all from the exact same source files.
Did you not read item 2 on the list?
I thought about implementing C header files for the same sort of low-level portability.

I haven't developed them so it would be necessary to think about how to implement such thing in C.
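
For what it's worth, a very small C sketch of the "widest native word" idea could look like the following. It is not part of x86 Portable; the names WIDEWORD_BITS, wideword_t and wideword_align_up are invented here, and it assumes that pointer width is an acceptable stand-in for the machine word width (which is not true on every ABI).

```c
#include <stdint.h>

/* Derive the word width from the pointer width of the target. */
#if UINTPTR_MAX == 0xFFFFu
#define WIDEWORD_BITS 16
#elif UINTPTR_MAX == 0xFFFFFFFFu
#define WIDEWORD_BITS 32
#elif UINTPTR_MAX == 0xFFFFFFFFFFFFFFFFu
#define WIDEWORD_BITS 64
#else
#error "unsupported pointer width"
#endif

/* Hypothetical C counterpart of an automatically sized machine word. */
typedef uintptr_t wideword_t;

/* Round `value` up to a multiple of `align` (align must be a power of two).
 * The same source compiles to 32-bit or 64-bit arithmetic as needed. */
static wideword_t wideword_align_up(wideword_t value, wideword_t align)
{
    return (value + align - 1) & ~(align - 1);
}
```

Compiled with -m32, wideword_t is 32 bits wide; compiled for x86-64, it is 64 bits wide, without any change to the source.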

Anyway, a program is probably badly designed if its lowest-level code isn't packed into a platform-dependent, platform-supporting engine of its own, kept apart from the rest of the code. Most of the application should be platform-independent and should just call the low-level functions to glue the whole application or library together.

The code that changes across platforms should surely be put in a set of standard ever-present functions that will internally change. The main application should never handle platform-dependent details that definitely cannot be made portable through all existing hardware.

Doing so will give the program a dramatically longer and more robust life cycle.

So I would redesign point 2 to move ALL platform-dependent details into a single low-level layer of the application, and make the rest, the vast majority of it, fully platform-independent and portable, truly as portable as a scripting language like JavaScript, but written in C.

Re: Mixing 32-bit and 64-bit code?

Posted: Mon Jun 27, 2016 1:21 am
by Velko
eof wrote:1. Have 32-bit code that GRUB jumps to written in assembly, which sets up my stack and calls the code in (2).

2. Code written in C which sets up page tables and returns back to (1), which enables long mode and makes the jump to (3).

3. Long mode entry point is 64-bit code written in C.
That's almost identical to my kernel's startup routine. There are a few more things to take care of in 64-bit assembly before launching the C code, but that's it.

I compile 32-bit C code using -m32 and then objcopy the result into elf64-x86-64 format:

x86_64-pc-myos-gcc -o tmp_paging32.o -c -m32 -ffreestanding tmp_paging32.c
x86_64-pc-myos-objcopy -O elf64-x86-64 -j .lowtext32 tmp_paging32.o tmp_paging64.o
I marked the functions in the C file with __attribute__((section(".lowtext32"))), for better control over where they are linked.

You have to be careful with relocations, though. Do not reference anything external from this code. If you need some memory addresses or something, you can pass them as input parameters.
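
To make the constraints above concrete, here is a minimal sketch of what such a 32-bit C file might contain. It is not Velko's actual code: the function name setup_identity_map, the choice to identity-map the first 1 GiB with 2 MiB pages, and the LOWTEXT32 macro are assumptions made for illustration. The function takes every address it needs as a parameter and references no globals or external symbols, so it should not produce relocations against anything outside itself.

```c
#include <stdint.h>

/* Place the function in .lowtext32 on ELF targets (section name from the
 * post above); on other object formats the attribute is simply dropped. */
#ifdef __ELF__
#define LOWTEXT32 __attribute__((section(".lowtext32")))
#else
#define LOWTEXT32
#endif

#define PTE_PRESENT  0x001ull
#define PTE_WRITABLE 0x002ull
#define PTE_HUGE     0x080ull  /* PS bit: entry maps a 2 MiB page */

/* Identity-map the first 1 GiB using 2 MiB pages.
 * pml4, pdpt and pd must each point to a 4 KiB-aligned table of 512
 * entries, supplied by the caller (for example the assembly stub). */
LOWTEXT32
void setup_identity_map(uint64_t *pml4, uint64_t *pdpt, uint64_t *pd)
{
    for (int i = 0; i < 512; i++) {
        pml4[i] = 0;
        pdpt[i] = 0;
        /* Entry i maps physical address i * 2 MiB. */
        pd[i] = ((uint64_t)i << 21) | PTE_PRESENT | PTE_WRITABLE | PTE_HUGE;
    }
    /* Link the three levels: PML4[0] -> PDPT, PDPT[0] -> PD. */
    pml4[0] = (uint64_t)(uintptr_t)pdpt | PTE_PRESENT | PTE_WRITABLE;
    pdpt[0] = (uint64_t)(uintptr_t)pd | PTE_PRESENT | PTE_WRITABLE;
}
```

After this returns, the assembly stub would load the address of pml4 into CR3 before enabling long mode. Whether a given compiler emits hidden external references (for example for stack protection or position-independent code) depends on its configuration, so it is worth inspecting the object file's relocations after the objcopy step.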

Re: Mixing 32-bit and 64-bit code?

Posted: Mon Jun 27, 2016 1:30 am
by iansjack
omarrx024 wrote:
eof wrote:If there's no way of doing this easily, is there a way to get GRUB to load two ELF files at boot, so I can make the first one 32-bit and the second 64-bit and then jump from the first into the second to enable long mode?
TBH, I have no experience with your main question, but I do know how to do this.
You basically load a 32-bit kernel as the main multiboot kernel, and you load the 64-bit kernel as a GRUB module. The 32-bit kernel does all the initialization (stack, paging, long mode) and maps the 64-bit kernel at the address it expects to be loaded at. Then, it probably passes information to it and calls it.
Yes. That's by far the easiest and cleanest way to do it if you are using GRUB. I pass the necessary information between modules in variables stored in a fixed location in low memory. It works a treat.
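
One way to picture the "variables stored in a fixed location" idea is a small handoff record that both binaries agree on. The post gives no details, so everything in this sketch is hypothetical: the struct boot_handoff layout, the magic value, and the function names. It only illustrates using fixed-width fields so that the 32-bit and 64-bit compilers lay the record out identically.

```c
#include <stdint.h>

#define HANDOFF_MAGIC 0xB007C0DEu  /* arbitrary tag chosen for this example */

/* Shared between the 32-bit loader and the 64-bit kernel. Two 32-bit
 * fields come first so the 64-bit fields start at offset 8 under both
 * the i386 and x86-64 ABIs, with no compiler-inserted padding. */
struct boot_handoff {
    uint32_t magic;           /* HANDOFF_MAGIC once the record is complete */
    uint32_t multiboot_info;  /* physical address of the Multiboot info */
    uint64_t kernel_entry;    /* virtual entry point of the 64-bit kernel */
    uint64_t pml4_phys;       /* physical address of the top-level page table */
};

_Static_assert(sizeof(struct boot_handoff) == 24, "layout must match in both modes");

/* 32-bit side: fill in the record; the magic is written last. */
static void handoff_write(volatile struct boot_handoff *h, uint32_t mbi,
                          uint64_t entry, uint64_t pml4)
{
    h->multiboot_info = mbi;
    h->kernel_entry = entry;
    h->pml4_phys = pml4;
    h->magic = HANDOFF_MAGIC;
}

/* 64-bit side: check that the loader really left a record behind. */
static int handoff_valid(const volatile struct boot_handoff *h)
{
    return h->magic == HANDOFF_MAGIC;
}
```

In a real kernel both sides would cast an agreed low-memory address to a volatile struct boot_handoff pointer; which address is safe depends on the memory map and is left open here.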

Re: Mixing 32-bit and 64-bit code?

Posted: Mon Jun 27, 2016 1:34 am
by iansjack
~ wrote:Anyway, a program is probably badly designed if its lowest-level code isn't packed into a platform-dependent, platform-supporting engine of its own, kept apart from the rest of the code. Most of the application should be platform-independent and should just call the low-level functions to glue the whole application or library together.
That is absolutely wrong. I'd say that you write as much code as you can in C, using platform-dependent assembler only for those bits that absolutely cannot be handled by C.

Check out some industrial strength operating systems (Linux, FreeBSD) and see what they do.

Re: Mixing 32-bit and 64-bit code?

Posted: Mon Jun 27, 2016 5:50 am
by ~
iansjack wrote:
~ wrote:Anyway, a program is probably badly designed if its lowest-level code isn't packed into a platform-dependent, platform-supporting engine of its own, kept apart from the rest of the code. Most of the application should be platform-independent and should just call the low-level functions to glue the whole application or library together.
That is absolutely wrong. I'd say that you write as much code as you can in C, using platform-dependent assembler only for those bits that absolutely cannot be handled by C.

Check out some industrial strength operating systems (Linux, FreeBSD) and see what they do.
I think that Linux and the rest of the open source projects are currently just massive tests of computer science concepts, meant to select and develop the best ones. They are ready for daily real-world use and are industrial grade, but they are still experiments on the way to a better code base.

So we could say that they, Linux included, are slightly broken or unoptimized, considering that you can implement the low-level layer of each application as a machine engine (written in C and assembly). Then the main application needs nothing but its main languages and no longer has to handle any non-portable machine details itself.

If you see that you need to use Assembly or make important and abundant optimizations for a set of functions, the right thing to do would be to write them fully in portable assembly for all CPUs and architectures, and then just call them from clean, fully portable, regular application code.

The application will become much more portable and clean, and will survive much longer this way. It will gain more portability than is evident right away. For example, GRUB and the Multiboot specification could be ported to platforms other than x86 much more cleanly and readably.

Re: Mixing 32-bit and 64-bit code?

Posted: Mon Jun 27, 2016 5:54 am
by iansjack
~ wrote: If you see that you need to use Assembly or make important and abundant optimizations for a set of functions, the right thing to do would be to write them fully in portable assembly for all CPUs and architectures, and then just call them from clean, fully portable, regular application code.
Adding an extra layer of complexity is hardly optimizing functions. If you think that you can provide the ultimate optimization by using assembler (in which case you are almost certainly mistaken) you don't need any cobbled-on scaffolding. You use proper assembly code, not some mish-mash of macros trying to force a one size fits all solution.

Re: Mixing 32-bit and 64-bit code?

Posted: Mon Jun 27, 2016 6:00 am
by ~
iansjack wrote:
~ wrote: If you see that you need to use Assembly or make important and abundant optimizations for a set of functions, the right thing to do would be to write them fully in portable assembly for all CPUs and architectures, and then just call them from clean, fully portable, regular application code.
Adding an extra layer of complexity is hardly optimizing functions. If you think that you can provide the ultimate optimization by using assembler (in which case you are almost certainly mistaken) you don't need any cobbled-on scaffolding. You use proper assembly code, not some mish-mash of macros trying to force a one size fits all solution.
It's at the same nesting level as the rest of the application, as it's embedded.

It's just like putting every function that needs assembly or machine details in the "arch" directory and categorizing them inside an embedded API. Most of the interesting tricks will then live there, and they need to be easy to call so that the application stays cleanly written. The application then just drives the lower-level algorithms and hardware-manipulating functions, simply providing an interface that makes them usable by a user.

It's less complex actually, cleaner.