Since I have finally enabled SSE in my bootloader, I played around a bit with compiler optimizers. Here are my findings:
First, I used code that is known to be correct and works perfectly (compiled with "-O0") as a baseline. Then I recompiled it with "-O2" plus various optimizer flags and "-fsanitize=*" options, turning some features off separately. I have to say I wasn't pleased with the results, but maybe someone here knows the solution to one or more of my problems.
Incorrect resolving of struct field addresses
I map the first page as supervisor-only, so that any user code that tries to dereference NULL causes a page fault. This works great. But I didn't want to waste 4k, so I put some process-related, kernel-only variables there. Unfortunately both gcc and Clang miscompile the following struct reference (where pid is not the first field in the struct, hence the accessed memory address is definitely not 0):
Code: Select all
p = *((proc_struct*)0)->pid
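For what it's worth, here is a minimal sketch of one way such an access is sometimes written so that the C source never dereferences a null pointer constant: build the field address as an integer with offsetof and read it through a volatile pointer. The struct layout and the helper name read_pid are assumptions for illustration only, not the real proc_struct. GCC also documents "-fno-delete-null-pointer-checks" for environments where address 0 is valid memory. Whether either approach is enough for this case is untested here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: the real proc_struct is not shown in the post. */
typedef struct {
    uint64_t magic;  /* assumed first field, so pid is not at offset 0 */
    uint32_t pid;
} proc_struct;

/* Read pid from a proc_struct located at an integer base address.
 * In the kernel the base would be 0; the address is built with integer
 * arithmetic, so no null pointer is dereferenced in the C source. */
static inline uint32_t read_pid(uintptr_t base)
{
    volatile const uint32_t *field =
        (volatile const uint32_t *)(base + offsetof(proc_struct, pid));
    return *field;
}
```

In the kernel this would be called as read_pid(0); in a hosted test any valid struct address can be passed instead.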
Bad code generation
Something I cannot understand is that gcc generated bad, misaligned code. I called a C function, and it crashed in its function prologue (before the first instruction compiled from the first expression in the C code). Debugging revealed that the faulting instruction was a "movaps [rsp], xmm0". Checking rsp, it was 0xfffffffffffa8, which is not 16-byte aligned. The SysV ABI expects 16-byte stack alignment at function calls, and movaps requires a 16-byte aligned memory operand as well. The problem is that the programmer cannot influence the stack pointer from C directly, nor can he tell the compiler to use movups, so it is definitely the compiler's responsibility.
The only solution I could come up with was to add "-mno-sse", which rather defeats the whole purpose of my SSE optimization experiment. And this is the better case, as at least with this kind of error I could see at run-time that the generated code was wrong.
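A small sketch of the alignment arithmetic involved, in case it helps when checking where rsp is first set up (for example in assembly entry code or when a task's stack is created). The helper names are made up for illustration. On the compiler side, GCC documents "-mstackrealign" and the x86 function attribute force_align_arg_pointer for realigning the stack in the prologue; whether they cure this particular crash is an assumption.

```c
#include <stdint.h>

/* movaps needs its memory operand on a 16-byte boundary. */
static inline int is_aligned16(uint64_t addr)
{
    return (addr & 0xFULL) == 0;
}

/* Round a stack top down to the nearest 16-byte boundary. */
static inline uint64_t align_down16(uint64_t addr)
{
    return addr & ~0xFULL;
}
```

With the value from the crash, is_aligned16(0xfffffffffffa8) is false because the low four bits are 0x8.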
Changed semantics
The most annoying thing is that the optimizers silently changed the code's behaviour. No errors, no run-time faults; the code just does not do what the algorithm in the C source says. This should not happen under any circumstances IMHO. One problematic piece of code was:
Code: Select all
void kpanic(char *reason, ...) {
    va_list args;
    va_start(args, reason);
    char strbuf[128];
    vsprintf(strbuf, reason, args);
    kprintf("%s", strbuf);      /* <--- this worked */
    if (debugger_enabled) {
        debugger(strbuf);
    } else {
        kprintf("Panic: ");
        kprintf("%s", strbuf);  /* <--- this doesn't */
...
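Independent of the optimizer question, here is a hosted sketch of the same varargs pattern with a bounded vsnprintf and a matching va_end. format_reason is a hypothetical stand-in for the formatting part of kpanic and assumes a standard C library; it is not the kernel's own vsprintf.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Format a panic reason into buf, never writing more than size bytes.
 * Returns the length the full message would have had. */
static int format_reason(char *buf, size_t size, const char *reason, ...)
{
    va_list args;
    int len;

    va_start(args, reason);  /* va_list first, last named parameter second */
    len = vsnprintf(buf, size, reason, args);
    va_end(args);            /* every va_start needs a matching va_end */
    return len;
}
```

The bounded call matters with a fixed 128-byte buffer: a long reason string is truncated instead of overwriting the stack.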
Another problematic piece of code was in the logger:
Code: Select all
char *old = ptr;
/* ...here are a bunch of sprintf()s to concatenate the formatted date and the message to ptr... */
/* if debug console enabled, print the log message */
if (debugconsole_enabled) {
    while (old < ptr)
        debugconsole_putc(*old++);
}
Code: Select all
for (;old<ptr;ptr++)
Code: Select all
while(*old)
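One thing that may be worth ruling out (purely an assumption, since debugconsole_putc is not shown): if the output routine stores to a device register through a non-volatile pointer, an optimizer is allowed to drop or merge those stores, which can look like the loop never ran. A hosted sketch with a fake register, all names hypothetical:

```c
#include <stdint.h>

/* Stand-in for a memory-mapped data register; in a kernel this would be
 * a fixed device address rather than an ordinary variable. */
static uint8_t fake_uart_data;
static unsigned write_count;

/* volatile makes each store an observable side effect that the
 * compiler must keep. */
static void putc_sketch(char c)
{
    volatile uint8_t *reg = &fake_uart_data;
    *reg = (uint8_t)c;
    write_count++;
}

/* Emit the bytes in [old, ptr), like the logger loop above. */
static void put_range(const char *old, const char *ptr)
{
    while (old < ptr)
        putc_sketch(*old++);
}
```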
Conclusion: I don't want to guess which part of my code's semantics will be silently changed next time, so for now I went back to "-O0" and manual optimization. Maybe it's just me, but I want my generated code to do what the C source says. I just don't want any "if(ptr!=NULL)" silently removed. And I obviously don't want broken, faulty code to be generated either.
What is your experience with the gcc and Clang optimizers regarding kernel development?
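A middle ground between "-O0" everywhere and "-O2" everywhere might be to switch optimization off per function while tracking down which one changes behaviour. The sketch below assumes GCC's optimize attribute and Clang's optnone attribute; the NO_OPT macro and checked_read are made-up names.

```c
#include <stddef.h>

#if defined(__clang__)
#define NO_OPT __attribute__((optnone))
#elif defined(__GNUC__)
#define NO_OPT __attribute__((optimize("O0")))
#else
#define NO_OPT /* unknown compiler: no effect */
#endif

/* Compiled without optimization even when the rest of the file uses -O2. */
NO_OPT static int checked_read(const int *ptr, int fallback)
{
    if (ptr != NULL)
        return *ptr;
    return fallback;
}
```

Marking suspect functions one at a time and rebuilding with "-O2" narrows down where the behaviour diverges without giving up optimization for the whole kernel.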
bzt