[SOLVED] A problem in GCC Optimization and Bochs
Today I changed the -O3 in my OS's compiler options to -O0, and a lot of strange problems appeared. First, the keyboard and mouse triggered the reserved interrupt handler; then, when I used the Bochs debugger, Bochs printed ">>PANIC<< keyboard error " and exited. None of these problems occur under -O3.
-O0:
-O3:
If needed, I can also post my source code.
Last edited by nbdd0121 on Wed Aug 21, 2013 5:40 pm, edited 1 time in total.
Re: A problem in GCC Optimization and Bochs
Well, I later found that the optimization level used for the following function affects the OS. I wonder what is wrong in it.
------------------------------------------
FASTCALL void init_apic(u_addr io_apic_addr){
    /* Initialize the page table for APICs */
    u32* apic_ptes=ALLOC_PAGE();
    memset(apic_ptes, 0, PAGE_SIZE);

    /* Map I/O APIC */
    assert(io_apic_addr>0xFC000000,":-( I/O APIC Address Out of Range.");
    apic_ptes[va2pte(IO_APIC_ADDR)]=io_apic_addr+0b11011;

    /* Map Local APIC */
    u64 msr=asm_rdmsr(IA32_APIC_BASE); /* Read IA32_APIC_BASE MSR */
    asm_wrmsr(IA32_APIC_BASE, BIT_SET(BIT_SET(APIC_REG_ADDR,8),11));
    /* Enable APIC if not */
    apic_ptes[va2pte(APIC_REG_ADDR)]=APIC_REG_ADDR+0b11011;

    /* Map the page table */
    kernel_pde[va2pde(APIC_REG_ADDR)]=va2pa(apic_ptes)+0b11011;
    asm_setCR3(va2pa(kernel_pde));

    u8 apic_id=apic_read(APIC_ID_REG)>>24;
    u32 id;
    for(id=0;id<16;id++){
        io_apic_write(IO_APIC_RED(id)+1,apic_id<<(56-32));
        io_apic_write(IO_APIC_RED(id),0x10000+0x20+id);
    }
    for(;id<24;id++){
        io_apic_write(IO_APIC_RED(id)+1,apic_id<<(56-32));
        io_apic_write(IO_APIC_RED(id),0x10000);
    }
    for(id=0;id<16;id++){
        if(IRQOverride[id]!=id)
            io_apic_write(IO_APIC_RED(IRQOverride[id]),0x10000+0x20+id);
    }

    /* Display Info */
    puts("Local APIC ID:");
    dispByte(apic_id);
    puts("\r\n");
    puts("I / O APIC ID:");
    dispByte(io_apic_read(0)>>24);
    puts("\r\n");
}
----------------------------------------
Re: A problem in GCC Optimization and Bochs
I read somewhere that O3 does optimizations that aren't always suitable for a kernel. However, O0, O1 and O2 should always give the same result, so you may try these. If these give different results, you have a bug somewhere in your code (it could be anywhere, really) that just happens to surface with that particular option. If you don't fix it now, it will surface again later. The typical problem is writing past the end of a memory buffer.
For instance, ALLOC_PAGE() may fail or give a smaller buffer than PAGE_SIZE. Or memset() may write more bytes than PAGE_SIZE due to a fencepost error. This is a typical kind of bug that comes and goes mysteriously. However, just because your program crashes in that function doesn't mean that's where the error is.
u32* apic_ptes=ALLOC_PAGE();
memset(apic_ptes, 0, PAGE_SIZE);
- Combuster
Re: A problem in GCC Optimization and Bochs
Quote: I read somewhere that O3 does optimizations that aren't always suitable for a kernel. However, O0, O1 and O2 should always give the same result
Nope, nope, nope.
O0 gives the dumb equivalent of the source code - typically resulting in something that's overly verbose and relatively hard to follow.
O1 applies some basic optimisation patterns that can be done relatively quickly. In my experience it has proven capable of breaking things if you're not aware of what "volatile" does. It generally comes with the advantage of producing more condensed and readable assembly.
O2 is often considered the norm, and it is already very capable of breaking your code if you didn't write it properly according to the C standard. Typically the reason why nobody was using O3 is that the programmer couldn't write standards-compliant C code, and for a while O2 actually deliberately contained safeguards against optimising certain known broken patterns from stupid programmers. If you're using GCC 4.x you'd better write your code right the first time or suffer the grave consequences at this stage.
O3 optimises for speed over pretty much everything else, mostly size, but it does far less in situations that weren't already aggravated by what the earlier optimisations have done to your code. It again boils down to the fact that programmer errors allow optimisations to break code.
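To make the "volatile" remark concrete, here is a minimal sketch of I/O APIC register access through volatile-qualified pointers. The I/O APIC is reached through an index register (IOREGSEL, offset 0x00) and a data window (IOWIN, offset 0x10); without volatile, an optimising compiler is allowed to merge, reorder or drop the two accesses. The function names and the explicit base parameter are hypothetical and are not the io_apic_read/io_apic_write helpers from the code posted above.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offsets of the I/O APIC index register and data window. */
enum { IOREGSEL_OFF = 0x00, IOWIN_OFF = 0x10 };

static_assert(IOWIN_OFF % sizeof(uint32_t) == 0,
              "IOWIN must be 32-bit aligned");

/* Hypothetical helper: select a register, then write it through IOWIN.
 * Both stores go through volatile lvalues, so the compiler has to emit
 * them in this order and cannot discard either one. */
static inline void ioapic_write_reg(volatile void *base, uint32_t reg,
                                    uint32_t val)
{
    volatile uint8_t *p = (volatile uint8_t *)base;
    volatile uint32_t *sel = (volatile uint32_t *)(p + IOREGSEL_OFF);
    volatile uint32_t *win = (volatile uint32_t *)(p + IOWIN_OFF);

    *sel = reg;
    *win = val;
}

/* Hypothetical helper: select a register, then read it back via IOWIN. */
static inline uint32_t ioapic_read_reg(volatile void *base, uint32_t reg)
{
    volatile uint8_t *p = (volatile uint8_t *)base;
    volatile uint32_t *sel = (volatile uint32_t *)(p + IOREGSEL_OFF);
    volatile uint32_t *win = (volatile uint32_t *)(p + IOWIN_OFF);

    *sel = reg;
    return *win;
}
```

In a kernel, base would be the virtual address the I/O APIC page is mapped at; for a quick check on a hosted system it can point at an ordinary array of eight u32 values.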
Re: A problem in GCC Optimization and Bochs
Thank you for your opinions. However, I found the line that causes the error:
asm_wrmsr(IA32_APIC_BASE, BIT_SET(BIT_SET(APIC_REG_ADDR,8),11));
I read the disassembled kernel, and it looks like GCC didn't compile it correctly.
My definition of asm_wrmsr is:
#define asm_wrmsr(reg,val) do{asm volatile("wrmsr"::"A"(val),"c"(reg));}while(0)
However, I found that GCC did not set the value of %edx. The error doesn't occur under -O3 only because %edx luckily happens to be 0 there. I wonder whether the "A" constraint is not working.
Re: A problem in GCC Optimization and Bochs
Now, the issue was fixed.
After I rewrote the macros that used "A" or "=A", the bugs no longer appeared.
#define asm_rdmsr(reg) ({u32 low,high;asm volatile("rdmsr":"=a"(low),"=d"(high):"c"(reg));low|((u64)high<<32);})
#define asm_wrmsr(reg,val) do{asm volatile("wrmsr"::"a"((u32)val),"d"((u32)((u64)val>>32)),"c"(reg));}while(0)
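One thing to watch in macros like these: a cast binds tighter than any binary operator, so (u32)val only truncates the first operand when val is an unparenthesised expression. The sketch below shows the difference with hypothetical LOW32/HIGH32 helper macros (they are not part of the code above); whether it bites in practice depends on how the arguments, such as BIT_SET(...), are themselves parenthesised.

```c
#include <assert.h>
#include <stdint.h>

/* Unsafe: with val = a | b this expands to ((uint32_t)a | b), so only
 * 'a' is truncated and the result keeps the type and high bits of 'b'. */
#define LOW32_UNSAFE(val) ((uint32_t)val)

/* Safe: the whole argument is evaluated first, then truncated. */
#define LOW32(val)  ((uint32_t)(val))
#define HIGH32(val) ((uint32_t)((uint64_t)(val) >> 32))

static_assert(sizeof(LOW32(0ull)) == sizeof(uint32_t),
              "LOW32 always yields a 32-bit value");
```

With a = 0x100000000 and b = 0x200000001 (both 64-bit), LOW32(a | b) is 1 and HIGH32(a | b) is 3, while LOW32_UNSAFE(a | b) is still a 64-bit value, 0x200000001.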
Re: A problem in GCC Optimization and Bochs
Combuster wrote:
Quote: I read somewhere that O3 does optimizations that aren't always suitable for a kernel. However, O0, O1 and O2 should always give the same result
Nope, nope, nope.

Now you just chopped out part of what I said out of context, making it appear wrong. O0, O1 and O2 should always give the same result, provided the input code is correct, including statements such as volatile. That's why I wrote, "If these give different results you have a bug somewhere in your code".
O3, on the other hand, has been known for sure to break correct code. Sure, it's not supposed to, and those particular bugs are probably fixed, but it has happened, and because O3 is far less tested than O2 chances are it can happen again.
- Owen
Re: A problem in GCC Optimization and Bochs
Your code was wrong. The constant "11" is of type int. The "A" specifier has very well-defined behavior: it allocates EAX:EDX as appropriate for the passed data type. Since int is 32-bit, it only allocated EAX.
Perhaps declare the type involved correctly. Say, use an inline function rather than a mess of macros...
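A minimal sketch of what the inline-function approach could look like, assuming GCC-style inline assembly on x86. The names msr_low, msr_high, msr_join, wrmsr64 and rdmsr64 are hypothetical. Because the parameter is declared as a 64-bit integer, a 32-bit argument is widened before it is split, so EDX always receives a defined value. The static_assert illustrates the point about types: a mask built from plain unsigned constants is narrower than 64 bits (on the usual platforms where unsigned is 32 bits), which is why "A" had only one register's worth of data to load.

```c
#include <assert.h>
#include <stdint.h>

/* A bit mask built from unsigned constants is only 32 bits wide here. */
static_assert(sizeof(0xFEE00000u | (1u << 8) | (1u << 11)) < sizeof(uint64_t),
              "mask expression is narrower than a 64-bit MSR value");

/* Split a 64-bit MSR value into the EAX (low) and EDX (high) halves. */
static inline uint32_t msr_low(uint64_t val)  { return (uint32_t)val; }
static inline uint32_t msr_high(uint64_t val) { return (uint32_t)(val >> 32); }

/* Recombine the halves returned by rdmsr in EAX and EDX. */
static inline uint64_t msr_join(uint32_t low, uint32_t high)
{
    return ((uint64_t)high << 32) | low;
}

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
/* Privileged instructions: these only work in ring 0. */
static inline void wrmsr64(uint32_t reg, uint64_t val)
{
    __asm__ volatile("wrmsr"
                     : /* no outputs */
                     : "a"(msr_low(val)), "d"(msr_high(val)), "c"(reg));
}

static inline uint64_t rdmsr64(uint32_t reg)
{
    uint32_t low, high;
    __asm__ volatile("rdmsr" : "=a"(low), "=d"(high) : "c"(reg));
    return msr_join(low, high);
}
#endif
```

The call from init_apic would then read something like wrmsr64(IA32_APIC_BASE, BIT_SET(BIT_SET(APIC_REG_ADDR,8),11)); the conversion to a 64-bit value happens at the call, whatever type the macro expression has.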
Re: A problem in GCC Optimization and Bochs
Owen wrote: Your code was wrong. The constant "11" is of type int. The "A" specifier has very well defined behavior: it allocates EAX:EDX as appropriate for the passed data type. Being as int is 32-bit, it only allocated EAX.
Perhaps declare the type involved correctly. Say, use an inline function rather than a mess of macros...

Thank you, I will change my macros to inline functions.