Keeping up compiler optimizations
- NickJohnson
- Member
- Posts: 1249
- Joined: Tue Mar 24, 2009 8:11 pm
- Location: Sunnyvale, California
Keeping up compiler optimizations
So far in writing my system (mostly but not exclusively the kernel), I've been trying to make sure everything is safe for GCC to optimize with "-fomit-frame-pointer -Os" flags. Until recently, it's been working, but now it breaks randomly unless I set it to "-fomit-frame-pointer -O0". The optimizations at least make bootup faster, even though it's still fractions of a second on a 1MHz emulated i586. I believe I can fix it with some significant work; if I stop trying to keep it optimization safe, it will probably never be optimization safe again. Do you think it's worth it to try and get optimizations working, or will it cause more trouble than it's worth in the end?
Re: Keeping up compiler optimizations
If optimization causes your kernel to crash then you are using some feature incorrectly. I think you should find it important to ensure that you are using GCC correctly. I know it can be frustrating to debug these problems, but let me tell you, -O0 can hide a lot of problems with inline asm. I ran into a lot of trouble myself because of incorrectly specified clobber-lists. This kind of problem doesn't show when -O0 is enabled because there is no function inlining and therefore the incorrect asm is often protected by register saving due to the x86 C calling conventions. Also it does not help that many of the examples of GCC inline asm available on the Internet are out-of-date and do not work correctly anymore. I had to learn this the hard way.
Don't be afraid to disassemble your functions and find out if GCC is really doing what you think it should be doing.
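To make the clobber-list point concrete, here is a minimal sketch (not code from anyone's kernel in this thread). The function name add_via_scratch is made up for illustration. The asm uses ECX as a scratch register, so ECX is named in the clobber list; without that, GCC is free to keep a live value in ECX across the statement once the function is inlined. The `__asm__` spelling is used only so the sketch also builds in strict ISO modes, and the non-x86 branch is a plain C fallback.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: adds two values, deliberately going through ECX. */
static inline uint32_t add_via_scratch(uint32_t a, uint32_t b)
{
#if defined(__i386__) || defined(__x86_64__)
    uint32_t result;
    __asm__ volatile ("movl %1, %%ecx\n\t"
                      "addl %2, %%ecx\n\t"
                      "movl %%ecx, %0"
                      : "=r" (result)     /* written only after both inputs are read */
                      : "r" (a), "r" (b)
                      : "ecx", "cc");     /* ECX and the flags are modified */
    return result;
#else
    return a + b;                         /* fallback for non-x86 targets */
#endif
}

/* Quick sanity check for trying the sketch out in a hosted build. */
static void check_add_via_scratch(void)
{
    assert(add_via_scratch(2, 3) == 5);
}
```

Dropping "ecx" from the clobber list will often still appear to work at -O0, which is exactly why this class of bug tends to show up only after turning optimization on.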
-
- Member
- Posts: 65
- Joined: Sat Jul 04, 2009 9:39 pm
Re: Keeping up compiler optimizations
This article: http://codingrelic.geekhold.com/2008/03 ... n-and.html.
It discusses how the optimizer causes bugs to become more noticeable. If turning up optimization causes problems, you need to fix them instead of just turning off optimization. The bugs are still there, just harder to notice.
I personally use -O2 when compiling. It caused a few annoying bugs when I first tried it, but fixing them makes the code more reliable.
NickJohnson wrote:Do you think it's worth it to try and get optimizations working, or will it cause more trouble than it's worth in the end?
Just the opposite, it's more trouble to ignore. As I've already said, the bugs are still there, you just don't notice until the optimizer starts taking shortcuts (as described in the article I linked to).
- NickJohnson
- Member
- Posts: 1249
- Joined: Tue Mar 24, 2009 8:11 pm
- Location: Sunnyvale, California
Re: Keeping up compiler optimizations
Interesting, I guess I'll keep trying to fix it with optimizations enabled. About the inline asm difficulties - is it safe to not specify a clobber list if no general purpose registers are modified? This is the extent of the inline asm I have:
Code: Select all
asm volatile ("cli");
asm volatile ("hlt");
asm volatile ("outb %1, %0" : : "dN" (port), "a" (val));
asm volatile("inb %1, %0" : "=a" (ret) : "dN" (port));
asm volatile ("mov %0, %%cr3" :: "r" (map));
asm volatile ("invlpg %0" :: "m" (target));
asm volatile ("invlpg %0" :: "m" (page));
asm volatile("mov %%cr3, %0" : "=r" (cr3));
asm volatile("mov %0, %%cr3" : : "r" (cr3));
asm volatile ("sti");
asm volatile ("hlt");
asm volatile ("movl %%cr2, %0" : "=r" (cr2));
asm volatile ("sti");
asm volatile ("hlt");
Additionally, is it a problem that I don't use enter/leave or equivalent in assembly functions that are called from C? Mostly those functions just do some simple exclusively in-register transformation of arguments and then return.
-
- Member
- Posts: 368
- Joined: Sun Sep 23, 2007 4:52 am
Re: Keeping up compiler optimizations
enter/leave are useless instructions. They are not needed, ever.
- NickJohnson
- Member
- Posts: 1249
- Joined: Tue Mar 24, 2009 8:11 pm
- Location: Sunnyvale, California
Re: Keeping up compiler optimizations
Craze Frog wrote:enter/leave are useless instructions. They are not needed, ever.
I mean the saving of the stack frame in general.
-
- Member
- Posts: 65
- Joined: Sat Jul 04, 2009 9:39 pm
Re: Keeping up compiler optimizations
Saving the stack frame shouldn't be necessary for a simple function. However, the C calling convention requires that certain registers are preserved within a function call. They are EBP, EBX, ESI, EDI, and ESP. So if you modify EBX, that could be a problem.
As far as the inline assembly, I sure hope there isn't a problem with it because that's pretty much how mine looks.
-
- Member
- Posts: 368
- Joined: Sun Sep 23, 2007 4:52 am
Re: Keeping up compiler optimizations
Craze Frog wrote:enter/leave are useless instructions. They are not needed, ever.
NickJohnson wrote:I mean the saving of the stack frame in general.
Stack frame pointers in a register are totally useless except for some cases of debugging. (PS. You can find out which function was executing when the program crashed without doing a stack trace.)
You should really know this before doing OSdev, it is basic knowledge of assembly and calling conventions.
- NickJohnson
- Member
- Posts: 1249
- Joined: Tue Mar 24, 2009 8:11 pm
- Location: Sunnyvale, California
Re: Keeping up compiler optimizations
manonthemoon wrote:However, the C calling convention requires that certain registers are preserved within a function call. They are EBP, EBX, ESI, EDI, and ESP. So if you modify EBX, that could be a problem.
Facepalm. Turns out that was actually the entire optimization problem to begin with. I replaced my memcpy() and memset() functions with assembly ones that use rep movsb, but didn't preserve ESI or EDI in either. All of the functions using memcpy() or memset(), which is a ton of them, had a couple of their variables randomly modified to large addresses. I thought the calling convention only preserved ESP, EBP, and EBX. Should've looked that one up - I was just about to write some usermode system call stubs without preserving EDI/ESI!
I guess my C code is flawless then, because it works at -O3 now - that's a big relief.
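For anyone hitting the same thing, the fix described above might look roughly like the sketch below. This is not NickJohnson's actual code. It assumes a 32-bit x86 ELF target with the cdecl convention, and copy_bytes is a made-up name standing in for a hand-written memcpy. The portable branch exists only so the sketch builds on other targets.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical name; stands in for a hand-written memcpy replacement. */
void *copy_bytes(void *dest, const void *src, size_t n);

#if defined(__i386__) && defined(__ELF__)
/* cdecl on i386: EAX, ECX and EDX are scratch. EBX, ESI, EDI, EBP and ESP
   must hold their original values when the function returns. */
__asm__ (
    ".text\n"
    ".globl copy_bytes\n"
    ".type copy_bytes, @function\n"
    "copy_bytes:\n\t"
    "pushl %esi\n\t"             /* save the callee-saved registers we use */
    "pushl %edi\n\t"
    "movl 12(%esp), %edi\n\t"    /* dest: two pushes + return address = 12 */
    "movl 16(%esp), %esi\n\t"    /* src */
    "movl 20(%esp), %ecx\n\t"    /* n */
    "cld\n\t"
    "rep movsb\n\t"
    "movl 12(%esp), %eax\n\t"    /* return the original dest */
    "popl %edi\n\t"              /* restore in reverse order */
    "popl %esi\n\t"
    "ret\n"
);
#else
/* Portable fallback so the sketch builds on other targets. */
void *copy_bytes(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dest;
}
#endif

/* Quick sanity check for a hosted build. */
static void check_copy_bytes(void)
{
    char out[4] = {0};
    assert(copy_bytes(out, "abc", 4) == out);
    assert(out[0] == 'a' && out[2] == 'c' && out[3] == '\0');
}
```

No frame pointer is set up here, which matches the earlier point that enter/leave are optional. Only the register preservation is mandatory.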
Re: Keeping up compiler optimizations
Sounds like exactly the same issue I came across. These are my working versions of memset/memcpy:
Modern GCC does not allow you to specify a register on both the inputs and the clobber list. For whatever reason, you have to treat the register as an in-out operand. That is why, for example, you see "=c" (cb) as an output, and "0" (cb) as an input referring to the output parameter.
Though I should point out that while REP STOS is always supposed to be the fastest way to memset, the Intel docs do indicate various trade-offs (startup costs, etc) when using REP MOVS that may make it less efficient for some cases.
Personally, I relegated that bit of knowledge to the "really premature optimization" bin.
Clobbers on my inline asm syscall stubs were the other place I ran into trouble -- because there was no logic to enforce the C calling convention, there was nothing stopping a syscall from clobbering EBX, and therefore when the call was inlined, I would get page faults on seemingly arbitrary addresses.
Putting a correct clobber list on every syscall stub cured that ill.
Code: Select all
static inline void *
memset (void *p, int ch, uint32 cb)
{
    void *start = p;  /* rep stosb advances EDI, so keep the original pointer */
    asm volatile ("cld; rep stosb"
                  :"=D" (p), "=a" (ch), "=c" (cb)
                  :"0" (p), "1" (ch), "2" (cb)
                  :"memory","flags");
    return start;
}
static inline void *
memcpy (void *pDest, const void *pSrc, uint32 cb)
{
    void *start = pDest;  /* rep movsb advances EDI and ESI */
    asm volatile ("cld; rep movsb"
                  :"=c" (cb), "=D" (pDest), "=S" (pSrc)
                  :"0" (cb), "1" (pDest), "2" (pSrc)
                  :"memory","flags");
    return start;
}
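As a side note on the in-out operand idea, GCC also accepts the "+" constraint modifier, which marks a single operand as both read and written, so the paired "=c" / "0" operands can be collapsed. The sketch below is an alternative spelling under that assumption, not the poster's code. The name fill_bytes is made up to avoid clashing with the real memset, and size_t replaces the thread's uint32 typedef so it builds in a hosted environment on both 32-bit and 64-bit x86.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical memset-like helper using "+" read-write operands. */
static inline void *fill_bytes(void *p, int ch, size_t cb)
{
    void *start = p;   /* rep stosb advances the destination register */
    assert(p != NULL || cb == 0);
#if defined(__i386__) || defined(__x86_64__)
    __asm__ volatile ("cld; rep stosb"
                      : "+D" (p), "+c" (cb)   /* both read and written by the asm */
                      : "a" (ch)              /* only read: the low byte is stored */
                      : "memory", "cc");
#else
    unsigned char *d = p;                     /* fallback for non-x86 targets */
    while (cb--)
        *d++ = (unsigned char) ch;
#endif
    return start;
}
```

Because the registers appear as operands rather than clobbers, the compiler knows their values change and will not assume p or cb still hold their old contents afterwards.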
- NickJohnson
- Member
- Posts: 1249
- Joined: Tue Mar 24, 2009 8:11 pm
- Location: Sunnyvale, California
Re: Keeping up compiler optimizations
Hmm... now I am getting a couple of problems that appear only *without* optimizations. Here's a more interesting question - if I work to support -O3 and -Os, should I also work to support -O0? Obviously there is a bug (and probably a small one), and I want my code to be right, so I'm going to fix it, but what would good general advice be? Because once you support -Os or -O3, you'll never want to go back to -O0, right?
Edit: actually, it's not just without optimizations, but more specifically without optimizations while using a beta version of TinyCC, but the point still holds.
Re: Keeping up compiler optimizations
You should ensure that your kernel compiles and runs correctly under all safe optimization settings -- otherwise you are violating some invariant that will come back to bite you later. -O0 through -O3 and -Os should all be "safe", i.e. they are semantics-preserving transformations of the program. Of course, the more optimizations you enable, the more likely it is that compiler bugs will manifest. You shouldn't assume compiler bugs are at fault when you have trouble ("beta versions of TinyCC" notwithstanding), but it can happen.
-O0 is meant to make debugging easier. If you are having trouble at this level of optimization, get your debugger out and go at it.
Re: Keeping up compiler optimizations
Don't think of the different optimization settings as "problems" or "settings to support". Correct code should compile no matter what. Code that doesn't compile that way is buggy, no matter if you'd actually want to use that particular compiler option or not.
The optimization settings are merely what's needed to expose the bugs that are already there.
"Correct code" is not a matter of "it works for all my test cases" or "unless I use XYZ". It's a pure, unrelated-to-environment, feature of a piece of code to be correct. Difficult to attain in non-trivial projects, but the highest order of achievement in coding.
Every good solution is obvious once you've found it.