Enabling compiler optimizations ruins the kernel
The code generated by the compiler without optimizations is generally a piece of trash; it contains a lot of bloat. Enabling compiler optimizations in other apps removes this bloat and makes the app faster, but in the kernel it just results in undefined behaviour. Is this normal? And why?
Re: Enabling compiler optimizations ruins the kernel
Check for:
1. Uninitialised variables.
2. Potential array overruns.
3. Variables that should be declared “volatile”.
4. Variables being accessed after they have gone out of scope.
The short answer is - there are bugs in your code.
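To make the checklist concrete, here is a minimal sketch of items 1, 2 and 4. The function name make_label and the "kernel" string are hypothetical, invented only for illustration. The buggy shape is kept in a comment, and the safer shape lets the caller own the buffer and pass its size.

```c
#include <stddef.h>

/* BUGGY shape, kept as a comment on purpose:
 *
 *   const char *make_label_bad(void) {
 *       char buf[16];   // never initialised (item 1)
 *       return buf;     // buf is out of scope once we return (item 4)
 *   }
 *
 * With optimizations off this often appears to work, because the stack
 * slot happens to survive. With optimizations on, anything can happen.
 */

/* Safer shape (hypothetical helper): the caller owns the buffer. */
size_t make_label(char *out, size_t out_size)
{
    static const char text[] = "kernel";
    size_t n = sizeof(text) - 1;        /* length without the terminator */
    size_t i;

    if (out == NULL || out_size == 0)
        return 0;
    if (n > out_size - 1)
        n = out_size - 1;               /* truncate instead of overrunning (item 2) */
    for (i = 0; i < n; i++)
        out[i] = text[i];
    out[n] = '\0';
    return n;                           /* number of characters written */
}
```

The return value lets the caller detect truncation by comparing it with the length it expected.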
Re: Enabling compiler optimizations ruins the kernel
devc1 wrote:The code generated by the compiler without optimizations is generally a piece of trash; it contains a lot of bloat. Enabling compiler optimizations in other apps removes this bloat and makes the app faster, but in the kernel it just results in undefined behaviour. Is this normal? And why?
The reasons for that are a long, complex and controversial story. I'd prefer to avoid restarting that controversial topic here.
The only thing I can say to help you practically is to learn what UB is and how to avoid it, especially with modern compilers, which brutally take advantage of it. With the right combination of compiler options, "syntactic approaches" towards certain things and runtime testing using UBSan, you could potentially write kernel code and compile it with -O3 without having problems. My operating system compiles and runs with all the optimization levels, from -O0 -fno-inline-functions to -O3.
The list of what is actually UB in C is very long. Even if plenty of people mention UB cases here, there could still be non-trivial ones left out. The general idea for avoiding UB in your code is to think that it should run correctly on the "abstract C virtual machine" described by the standard, NOT on your target architecture. For example, if signed integers wrap around on your architecture (e.g. x86), don't assume that's the case for the "abstract C virtual machine". If you assume that, you're introducing UB, so the compiler reserves the right to do whatever it likes. Specifically:
Code: Select all
int a = INT_MAX;
a++; // this is undefined behavior. Don't make assumptions about the assembly instructions that the compiler will generate here.
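One way to stay inside the abstract machine is to test for the limit before incrementing, or to use unsigned arithmetic where wrapping is what you actually want. This is only a sketch; the helper names checked_increment and wrapping_increment are made up for illustration.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: increments *value only if the result is representable.
 * Returns false and leaves *value untouched when it would overflow. */
bool checked_increment(int *value)
{
    if (*value == INT_MAX)
        return false;       /* incrementing here would be signed overflow, i.e. UB */
    *value += 1;
    return true;
}

/* Unsigned arithmetic is defined by the standard to wrap modulo 2^N,
 * so this is well-defined for every input. */
unsigned wrapping_increment(unsigned value)
{
    return value + 1u;
}
```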
Another example: dereferencing a null pointer. With -O0, the compiler just does what you asked it to do, so it will try to dereference the pointer and, typically, that will cause some kind of CPU exception. With -O3, the compiler will assume that a pointer `ptr` being dereferenced cannot be NULL (even if it can prove statically that it could be NULL!!). Therefore, it might generate code that does not really dereference the pointer in case it's NULL, but does something else completely (which is obviously useless, but could be faster for some reason in the base case where the pointer is not NULL).
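A common way this bites in practice is a NULL check placed after the dereference: since the dereference already "proves" the pointer is non-NULL, the optimizer is allowed to drop the check. A minimal sketch, with a hypothetical struct device and device_status function:

```c
#include <stddef.h>

/* Hypothetical type, used only for this illustration. */
struct device {
    int status;
};

/* BUGGY ordering, kept as a comment:
 *
 *   int s = dev->status;          // dereference first...
 *   if (dev == NULL) return -1;   // ...so this check may be optimized away
 *   return s;
 */

/* Check first, dereference afterwards. */
int device_status(const struct device *dev)
{
    if (dev == NULL)
        return -1;
    return dev->status;
}
```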
In summary, when you turn on optimizations, the compiler feels entitled to assume that some things will NEVER happen, and it generates code accordingly. You have to forget the concept of "I told you exactly what to do". When you really need to force the compiler to do something, as I said above, compiler options, qualifiers and attributes (e.g. volatile), extensions or inline assembly are required. Note that even inline assembly can be moved around unless it's marked as volatile inline assembly.
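As a rough sketch of "forcing" the compiler: accesses through a volatile-qualified pointer must actually be performed, and a volatile inline assembly statement with a "memory" clobber acts as a compiler-level barrier on GCC and Clang. Everything here is an assumption for illustration: fake_register stands in for a real memory-mapped register, and the barrier macro is empty on compilers other than GCC/Clang, which need their own equivalent.

```c
#include <stdint.h>

/* Stand-in for a memory-mapped register (assumption: real code would use
 * a fixed hardware address instead of an ordinary variable). */
static uint32_t fake_register;

/* Every call performs a real store/load because the pointer is volatile. */
static inline void reg_write(volatile uint32_t *reg, uint32_t value)
{
    *reg = value;
}

static inline uint32_t reg_read(volatile uint32_t *reg)
{
    return *reg;
}

#if defined(__GNUC__)
/* Stops the compiler from moving memory accesses across this point.
 * It is not a CPU memory fence. */
#define compiler_barrier() __asm__ volatile("" ::: "memory")
#else
/* Placeholder only: this does nothing on other compilers. */
#define compiler_barrier() ((void)0)
#endif

/* Hypothetical sequence: write an acknowledge value, then read it back. */
uint32_t ack_then_read(uint32_t ack)
{
    reg_write(&fake_register, ack);
    compiler_barrier();
    return reg_read(&fake_register);
}
```

Without the volatile qualifier, an optimizing compiler would be free to return ack directly and never touch the register at all.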
Be prepared with a lot of patience to learn how to avoid UB. Good luck!
Tilck, a Tiny Linux-Compatible Kernel: https://github.com/vvaltchev/tilck
Re: Enabling compiler optimizations ruins the kernel
Yes, no variable in my kernel is declared volatile. I guess that without optimizations every variable is effectively treated as volatile, so there are no problems. I will try this suggestion.
Re: Enabling compiler optimizations ruins the kernel
Now calling memcpy triple faults, even when doing return NULL at the start of the function : )))
Re: Enabling compiler optimizations ruins the kernel
Your compiler can output an assembly listing so that you can verify if the generated code does what you think it does. It is also essential to have some way of stepping through code, even if it is just the Bochs debugger which I still revert to when I manage to break things too much. As for it crashing when you call memcpy, it sounds like the stack pointer was corrupted. A likely cause could be assembly functions not adhering to the correct calling convention, for example not saving and restoring required registers or adjusting the stack pointer by a wrong amount, or incorrect inline assembly.
Re: Enabling compiler optimizations ruins the kernel
In MSVC, /O2 is the equivalent of all optimizations. Enabling all optimizations causes the same exception, but removing /Og (global optimizations) fixes it. The memcpy function does not even get called (I know because I put a while loop in it).
Re: Enabling compiler optimizations ruins the kernel
Just to show how slow it is to call a function without optimizations: drawing the Bezier curve, which is called from assembly, ran at almost 50000 draws per second; with optimizations it ran at 300000 draws per second.
Re: Enabling compiler optimizations ruins the kernel
Without seeing your code, it's impossible to say what the problem is.
Did you try looking at the generated code to figure out where the compiled code doesn't work the way you want? Did you try debugging it to see where the problem might be?
Re: Enabling compiler optimizations ruins the kernel
devc1 wrote:in MSVC, /O2 is the equivalent of all optimizations. Enabling all optimizations does the same exception, but removing the /g (Global optimizations) fixes it. The memcpy function does not get even called (because I set a while loop on it)
If you've implemented your own memcpy, it might be that the optimizer has looked at your code, thought "hey, this looks like a memcpy", and replaced the code with a call to memcpy.
The now recursive call blows up your stack. Check the assembler output; it will be pretty obvious if this is the case.
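If that turns out to be the cause, one portable workaround is to write the copy loop so the optimizer cannot treat it as a plain memory copy. The sketch below writes through a volatile destination pointer, which compilers generally will not merge into a library call; the price is a slower, strictly byte-by-byte copy. The name kernel_copy_bytes is hypothetical, and whether this is needed at all depends on the compiler and its options, which usually provide their own switches for this.

```c
#include <stddef.h>

/* Hypothetical byte-copy routine. In a freestanding kernel a body like this
 * could sit inside the kernel's own memcpy. */
void *kernel_copy_bytes(void *dest, const void *src, size_t size)
{
    /* volatile destination: each store must be emitted individually, so the
     * loop is not a candidate for replacement by a call to memcpy. */
    volatile unsigned char *d = dest;
    const unsigned char *s = src;
    size_t i;

    for (i = 0; i < size; i++)
        d[i] = s[i];
    return dest;
}
```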
Re: Enabling compiler optimizations ruins the kernel
/Og is deprecated, and it is enabled by /O2 (maximum optimizations, equivalent to /Otyb2g /GF /Gy); it enables some kind of optimizations. However, this is how the code triple faults without even calling my memcpy. I guess the call to the relocation pointer thing at 0x1A is not set by the linker?
The memcpy is called in the GetBezierPoint() function, shown here in C and then as assembly output:
Code: Select all
UINT64 GetBezierPoint(float* cordinates, float* beta, UINT8 NumCordinates, float percent){
    memcpy(beta, cordinates, NumCordinates << 2);
    while(1);
    ...
}
LPVOID memcpy(LPVOID dest, LPCVOID src, size_t size){
    while(1); // memcpy does not get called with maximum optimizations (even when intrinsics are disabled)
}
Assembly output :
Code: Select all
0000000000000000 <GetBezierPoint>:
0: 48 83 ec 28 sub $0x28,%rsp
4: 4c 8b ca mov %rdx,%r9
7: 45 0f b6 c0 movzbl %r8b,%r8d
b: 48 8b d1 mov %rcx,%rdx
e: 41 c1 e0 02 shl $0x2,%r8d
12: 49 8b c9 mov %r9,%rcx
15: e8 00 00 00 00 callq 1a <GetBezierPoint+0x1a>
1a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
20: eb fe jmp 20 <GetBezierPoint+0x20>
0000000000000000 <memcpy>:
0: 66 90 xchg %ax,%ax
2: eb fe jmp 2 <memcpy+0x2>
Re: Enabling compiler optimizations ruins the kernel
This is how the function gets called, shown in C and then as the compiler's output:
Code: Select all
void TripleFaultingFunction() {
    UINT XOff = 200;
    UINT YOff = 300;
    float XCords[] = {0, 50, 100, 150};
    float YCords[] = {0, 50, -50, 0};
    float betabuffer[0x10] = {0};
    float IncValue = 0.1;
    float X0 = GetBezierPoint(XCords, betabuffer, 4, 0.1), X1 = GetBezierPoint(XCords, betabuffer, 4, 0.2), Y0 = GetBezierPoint(YCords, betabuffer, 4, 0.1), Y1 = GetBezierPoint(YCords, betabuffer, 4, 0.2);
    while(1);
    double Distance = __sqrt(pow(X1 - X0, 2) + pow(Y1-Y0, 2));
    ....
}
Output :
Code: Select all
0000000000000000 <TripleFaultingFunction>:
0: 48 8b c4 mov %rsp,%rax
3: 48 81 ec 88 00 00 00 sub $0x88,%rsp
a: 0f 28 05 00 00 00 00 movaps 0x0(%rip),%xmm0 # 11 <TripleFaultingFunction+0x11>
11: 48 8d 50 b8 lea -0x48(%rax),%rdx
15: 0f 28 0d 00 00 00 00 movaps 0x0(%rip),%xmm1 # 1c <TripleFaultingFunction+0x1c>
1c: 48 8d 48 98 lea -0x68(%rax),%rcx
20: f3 0f 10 1d 00 00 00 movss 0x0(%rip),%xmm3 # 28 <TripleFaultingFunction+0x28>
27: 00
28: 41 b0 04 mov $0x4,%r8b
2b: 0f 11 40 98 movups %xmm0,-0x68(%rax)
2f: 0f 57 c0 xorps %xmm0,%xmm0
32: 0f 11 40 b8 movups %xmm0,-0x48(%rax)
36: 0f 11 40 c8 movups %xmm0,-0x38(%rax)
3a: 0f 11 40 d8 movups %xmm0,-0x28(%rax)
3e: 0f 11 40 e8 movups %xmm0,-0x18(%rax)
42: 0f 11 48 a8 movups %xmm1,-0x58(%rax)
46: e8 00 00 00 00 callq 4b <TripleFaultingFunction+0x4b>
4b: f3 0f 10 1d 00 00 00 movss 0x0(%rip),%xmm3 # 53 <TripleFaultingFunction+0x53>
52: 00
53: 48 8d 54 24 40 lea 0x40(%rsp),%rdx
58: 41 b0 04 mov $0x4,%r8b
5b: 48 8d 4c 24 20 lea 0x20(%rsp),%rcx
60: e8 00 00 00 00 callq 65 <TripleFaultingFunction+0x65>
65: f3 0f 10 1d 00 00 00 movss 0x0(%rip),%xmm3 # 6d <TripleFaultingFunction+0x6d>
6c: 00
6d: 48 8d 54 24 40 lea 0x40(%rsp),%rdx
72: 41 b0 04 mov $0x4,%r8b
75: 48 8d 4c 24 30 lea 0x30(%rsp),%rcx
7a: e8 00 00 00 00 callq 7f <TripleFaultingFunction+0x7f>
7f: f3 0f 10 1d 00 00 00 movss 0x0(%rip),%xmm3 # 87 <TripleFaultingFunction+0x87>
86: 00
87: 48 8d 54 24 40 lea 0x40(%rsp),%rdx
8c: 41 b0 04 mov $0x4,%r8b
8f: 48 8d 4c 24 30 lea 0x30(%rsp),%rcx
94: e8 00 00 00 00 callq 99 <TripleFaultingFunction+0x99>
99: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
a0: eb fe jmp a0 <TripleFaultingFunction+0xa0>
Re: Enabling compiler optimizations ruins the kernel
devc1 wrote:I guess the call to the relocation pointer thing at 0x1A is not set by the linker ?
Are you disassembling one of your intermediate object files or the final kernel binary?
Re: Enabling compiler optimizations ruins the kernel
Intermediate object files.
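That matters for reading the listings. In an object file that has not been linked yet, a call to an external function is normally emitted as e8 followed by a placeholder displacement that the linker patches later. A displacement of zero decodes as "call the next instruction", which is why the disassembler prints callq 1a for the call at 0x15. A small sketch of that arithmetic, using a hypothetical helper call_rel32_target:

```c
#include <stdint.h>

/* Hypothetical helper: computes the target of an x86-64 `call rel32`
 * (opcode 0xE8). insn points at the 5 instruction bytes and address is
 * the offset at which the instruction starts. */
uint64_t call_rel32_target(const uint8_t insn[5], uint64_t address)
{
    /* The 32-bit displacement is stored little-endian after the opcode. */
    uint32_t raw = (uint32_t)insn[1]
                 | ((uint32_t)insn[2] << 8)
                 | ((uint32_t)insn[3] << 16)
                 | ((uint32_t)insn[4] << 24);
    int32_t disp = (int32_t)raw;

    /* The displacement is relative to the end of the 5-byte instruction. */
    return address + 5 + (uint64_t)(int64_t)disp;
}
```

For the bytes e8 00 00 00 00 at offset 0x15 this gives 0x1a, matching the listing of GetBezierPoint. The real destination only becomes visible once the final linked binary is disassembled.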
Re: Enabling compiler optimizations ruins the kernel
This is weird: removing while(1) from TripleFaultingFunction(), but not from memcpy or GetBezierPoint(), causes a triple fault.
But now I have enabled /Ox (full speed optimization), which generates very optimized assembly output.
With or without global optimizations, there are no visible differences.
However, enabling /O2 (maximize speed, and better than /Ox) along with disabling global optimizations (/Og-) works fine.
I also disable intrinsics, because enabling them and removing my own functions produces errors like: "unresolved reference to symbol : memset"
Maybe disabling intrinsics is the reason?
These are my compile flags:
Code: Select all
set CFLAGS= /GS- /O2g-i-