OSDev.org

Posted: **Sun Aug 15, 2021 4:24 pm**

Hello, I have an issue with GCC: for some reason it generates FPU instructions for 64-bit integer math when building for i686.

Example:

c0171197:	df 2d 70 21 1a c0    	fild   QWORD PTR ds:0xc01a2170
c017119d:	df 7d e0             	fistp  QWORD PTR [ebp-0x20]

This is the piece of code generated when calling my sleep function.

inline void for_seconds(u64 time)
{
    until(Timer::nanoseconds_since_boot() + (time * Time::nanoseconds_in_second));
}

When I tried adding -mgeneral-regs-only, the linker reported a bunch of undefined references to __atomic_load_8.
Why does this happen?

....

I think I got it. Unfortunately I was trying to use a 64-bit atomic counter in my kernel, and I think that on 32-bit targets GCC implements it via an FPU QWORD load and store.

... Is that true? Can anyone link any resources that talk about this?
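
For illustration, here is a minimal sketch of the kind of code that leads to this. The names `g_nanoseconds` and `timer_tick`, and the free-function form of `nanoseconds_since_boot`, are hypothetical stand-ins, not the actual kernel code from this post. When built for 32-bit x86, GCC may lower the 64-bit atomic load to a `fild`/`fistp` pair like the one shown above; other targets and compilers will produce different code.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for the kernel's 64-bit nanosecond counter.
static std::atomic<std::uint64_t> g_nanoseconds{0};

// In this sketch the timer interrupt handler would call this.
void timer_tick(std::uint64_t elapsed_ns) {
    g_nanoseconds.fetch_add(elapsed_ns, std::memory_order_relaxed);
}

// A plain atomic 64-bit load. This is the operation that a 32-bit x86
// build may implement with an x87 load/store pair.
std::uint64_t nanoseconds_since_boot() {
    return g_nanoseconds.load(std::memory_order_relaxed);
}
```

Disassembling the object file (for example with `objdump -d`) is the reliable way to see which instructions your compiler actually chose.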

Posted: **Sun Aug 15, 2021 4:28 pm**

8infy wrote: When I tried adding -mgeneral-regs-only, the linker reported a bunch of undefined references to __atomic_load_8.

The GCC docs refer to those as builtins. Are you linking with libgcc?

Posted: **Sun Aug 15, 2021 4:33 pm**

nexos wrote:
8infy wrote: When I tried adding -mgeneral-regs-only, the linker reported a bunch of undefined references to __atomic_load_8.
The GCC docs refer to those as builtins. Are you linking with libgcc?

I do. I don't get those errors if I don't pass the -mgeneral-regs-only flag. I think the reason is that GCC doesn't have an implementation of those functions that avoids FPU instructions.

Posted: **Sun Aug 15, 2021 5:29 pm**

8infy wrote:I think the reason is because GCC doesn't have an implementation for those functions that doesn't include FPU instructions.

Sounds reasonable, that probably is why. Sounds kind of like a bug maybe.

Posted: **Sun Aug 15, 2021 10:01 pm**

8infy wrote:Can anyone link any resources that talk about this?

The Intel SDM (volume 3A section 8.1.1) says aligned 64-bit loads and stores are atomic on Pentium and later processors. When you don't need the compare or exchange parts of CMPXCHG8B, it's pretty reasonable to choose some other instruction, and the only other options for i686 are floating-point.

Incidentally, Clang is able to use CMPXCHG8B when floating-point instructions are not available. (But if you try this with your own copy of Clang you'll notice that support for -mgeneral-regs-only was added very recently.)
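
As a rough sketch of the CMPXCHG8B idea: a 64-bit atomic load can be expressed as a compare-exchange that never changes the stored value. The helper name `load_via_cmpxchg` is hypothetical, and whether a given compiler turns this into a `LOCK CMPXCHG8B` on i686 (rather than a libatomic call) is an assumption you would need to confirm in the disassembly.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper: read a 64-bit atomic using only compare-exchange.
// If the stored value is 0, the exchange succeeds and writes 0 over 0,
// which changes nothing. Otherwise it fails and `expected` is updated
// with the current value. Either way `expected` ends up holding the
// value that was observed atomically.
inline std::uint64_t load_via_cmpxchg(std::atomic<std::uint64_t>& value) {
    std::uint64_t expected = 0;
    value.compare_exchange_strong(expected, 0,
                                  std::memory_order_seq_cst,
                                  std::memory_order_seq_cst);
    return expected;
}
```

The trade-off is that a compare-exchange is a read-modify-write operation, so it needs writable memory and is typically more expensive than a plain load.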

Posted: **Mon Aug 16, 2021 4:39 am**

Octocontrabass wrote:
8infy wrote:Can anyone link any resources that talk about this?
The Intel SDM (volume 3A section 8.1.1) says aligned 64-bit loads and stores are atomic on Pentium and later processors. When you don't need the compare or exchange parts of CMPXCHG8B, it's pretty reasonable to choose some other instruction, and the only other options for i686 are floating-point.

Incidentally, Clang is able to use CMPXCHG8B when floating-point instructions are not available. (But if you try this with your own copy of Clang you'll notice that support for -mgeneral-regs-only was added very recently.)

Thanks, that was a big TIL moment for me.

Posted: **Mon Aug 16, 2021 11:12 am**

If you wanna be sure that no compiler-emitted FPU instructions will be part of your kernel, you can compile it with:

-mno-80387 -mno-mmx -mno-sse -mno-avx

Using FPU instructions, with some exceptions, is dangerous in the kernel because it will corrupt the state of the FPU registers used by the current user process, unless you always save and restore all the FPU registers, even when you do just a mode switch (jump into the kernel and back). However, saving and restoring the FPU registers is a very expensive operation, so you don't have much of a choice if you care about performance. Modern compilers emit FPU and SIMD instructions all the time at higher optimization levels, even with -ffreestanding, unless you explicitly tell them not to.
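
One way to catch a missing flag early is a compile-time guard, sketched below. It relies on the `__MMX__`, `__SSE__` and `__AVX__` macros that GCC and Clang predefine when those instruction sets are enabled. The `KERNEL_BUILD` macro and the function name are hypothetical, and the sketch does not cover x87, so checking the disassembly is still worthwhile.

```cpp
// True when the compiler was allowed to emit MMX/SSE/AVX instructions
// for this translation unit, judged by the predefined feature macros.
constexpr bool simd_codegen_enabled() {
#if defined(__MMX__) || defined(__SSE__) || defined(__AVX__)
    return true;
#else
    return false;
#endif
}

// KERNEL_BUILD is a hypothetical macro that the kernel's build system
// would define (e.g. -DKERNEL_BUILD) for kernel translation units only.
#ifdef KERNEL_BUILD
static_assert(!simd_codegen_enabled(),
              "kernel code must be built with -mno-mmx -mno-sse -mno-avx");
#endif
```

Including such a guard from a common kernel header makes the build fail immediately if one of the flags is dropped from the build configuration.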

Posted: **Tue Aug 17, 2021 12:48 am**

vvaltchev wrote:If you wanna be sure that no compiler-emitted FPU instructions will be part of your kernel, you can compile it with:
-mno-80387 -mno-mmx -mno-sse -mno-avx
Using FPU instructions, with some exceptions, is dangerous in the kernel because it will corrupt the state of the FPU registers used by the current user process, unless you always save and restore all the FPU registers, even when you do just a mode switch (jump into the kernel and back). However, saving and restoring the FPU registers is a very expensive operation, so you don't have much of a choice if you care about performance. Modern compilers emit FPU and SIMD instructions all the time at higher optimization levels, even with -ffreestanding, unless you explicitly tell them not to.

This is good advice. I’ve not really thought about the FPU, and I’m definitely not saving the FPU state during a context switch! I’ve not had any unexplained crashes yet, but that's probably down to luck more than anything.

GCC generates FPU instructions for 64 bit integers