GCC/AS bug when compiling for x86-64?

Zenith
Member
Posts: 224
Joined: Tue Apr 10, 2007 4:42 pm

Post by Zenith »

I'm not sure if OS Development is the right forum to put this in, but it is pretty OSdev related...

I've been porting my kernel to x86-64 from x86 (better earlier than later), and I've run into a strange problem with my scrolling text function. Once scrolling is triggered, in the x86-64 kernel only, an invalid opcode occurs. In x86, it scrolls normally.

Here's the non-working part of the scrolling code, which shifts each line to the previous one:

Code: Select all

	if (text_ypos >= TEXT_ROWS)
	{
		// Non-working code:
		for (i = 0; i < TEXT_NUMOFCHARS - TEXT_COLUMNS; i++)
		{
			text_vidmem[i] = text_vidmem[i + TEXT_COLUMNS];
		}
		(more working code)
Remember that this code does work in the x86 kernel, that text_vidmem is an array of uint16_t values starting at 0xB8000, and that the macros have proper values (TEXT_NUMOFCHARS = TEXT_COLUMNS * TEXT_ROWS, TEXT_COLUMNS = 80, TEXT_ROWS = 25).
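
For anyone who wants to try the shift outside a kernel, here is a minimal, self-contained sketch of the same loop running on an ordinary buffer. The function name text_scroll_up, the blank parameter, and the clearing of the last row are assumptions for illustration (the original post elides that part). The volatile qualifier is also an assumption: it asks the compiler to perform each 16-bit access individually, which should keep it from merging the copies into wider moves.

```c
#include <stddef.h>
#include <stdint.h>

#define TEXT_COLUMNS    80u
#define TEXT_ROWS       25u
#define TEXT_NUMOFCHARS (TEXT_COLUMNS * TEXT_ROWS)

/* Hypothetical helper: move every row up by one and blank the last row.
 * In the kernel, vidmem would point at 0xB8000; here it can be any buffer
 * of TEXT_NUMOFCHARS cells. */
static void text_scroll_up(volatile uint16_t *vidmem, uint16_t blank)
{
    size_t i;

    /* Same loop as in the post: each cell takes the value one row below. */
    for (i = 0; i < TEXT_NUMOFCHARS - TEXT_COLUMNS; i++)
        vidmem[i] = vidmem[i + TEXT_COLUMNS];

    /* Assumption: the elided code clears the freed bottom row. */
    for (; i < TEXT_NUMOFCHARS; i++)
        vidmem[i] = blank;
}
```

In a kernel this would be called as something like text_scroll_up((volatile uint16_t *)0xB8000, 0x0720), where 0x0720 is a grey-on-black space.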

Stranger still, when I finally gave up and used inline assembly instead, it actually worked (for both kernels)!

Code: Select all

	if (text_ypos >= TEXT_ROWS)
	{
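		asm volatile (
				"mov %0, %%esi;"	// source: start of the second row
				"mov %1, %%edi;"	// destination: start of the first row
				"mov %2, %%ecx;"	// count, in 16-bit characters
				"rep movsw;"		// assumes the direction flag is clear
				:
				: "n"(TEXT_MEMADDR + (TEXT_COLUMNS * 2)), "n"(TEXT_MEMADDR), "n"(TEXT_NUMOFCHARS - TEXT_COLUMNS)
				: "esi", "edi", "ecx", "memory"	// "memory": the copy writes video memory the compiler cannot see
		);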
		(working fine with this)
I think the weird part is that the exception caused by the original code is an 'invalid opcode' raised somewhere in my kernel's virtual address range (0xFFFFFFFFC01*****), inside this for statement, rather than something more common such as a GPF or a page fault.

Is it my code, or is something in my GNU toolchain messing up?
"Sufficiently advanced stupidity is indistinguishable from malice."
JamesM
Member
Posts: 2935
Joined: Tue Jul 10, 2007 5:27 am
Location: York, United Kingdom

Post by JamesM »

objdump it and have a look?
zaleschiemilgabriel
Member
Posts: 232
Joined: Mon Feb 04, 2008 3:58 am

Post by zaleschiemilgabriel »

What do you mean by "invalid opcode"? An invalid instruction opcode in the binary? If so, I think it's obvious: A compiler should NEVER generate an invalid opcode, unless you expect it to (if you use inline assembly and instruct it to generate that opcode).
Combuster
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance

Post by Combuster »

Are you actually using a 64-bit compiler?
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
Zenith
Member
Posts: 224
Joined: Tue Apr 10, 2007 4:42 pm

Post by Zenith »

Never mind, it doesn't seem like a bug: it's just GCC optimizing a little too much...

Combuster: Yes, I am using an x86_64-pc-elf toolchain ( GCC 4.3.0, binutils 2.18 ).

zaleschiemilgabriel: Well, the opcode itself is valid; it's just that the processor raises a #UD when executing the instruction.

It's interesting though, what x86_64-pc-elf-objdump -dS shows.

This is the line that generates the #UD exception, according to my fault handler:

Code: Select all

ffffffffc0107d00:	66 41 0f 6f 04 11    	movdqa (%r9,%rdx,1),%xmm0
According to the Intel manuals, MOVDQA raises a #UD in 64-bit mode when CR0.EM = 1, CR4.OSFXSR = 0, CPUID.01H:EDX.SSE2 = 0, or the LOCK prefix is used. Looking into those now...
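
The register conditions listed there can be written down as a small predicate. This is only a sketch: the function name movdqa_would_ud is made up, the caller is assumed to have already read CR0, CR4 and the EDX output of CPUID leaf 1, and the LOCK-prefix case is not modelled because it depends on the instruction bytes rather than on machine state.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_EM          (1u << 2)   /* CR0 bit 2: x87 emulation */
#define CR4_OSFXSR      (1u << 9)   /* CR4 bit 9: OS supports FXSAVE/FXRSTOR */
#define CPUID1_EDX_SSE2 (1u << 26)  /* CPUID.01H:EDX bit 26: SSE2 present */

/* Hypothetical helper: true if the given state would make MOVDQA raise #UD
 * (ignoring the LOCK-prefix case). */
static bool movdqa_would_ud(uint64_t cr0, uint64_t cr4, uint32_t cpuid1_edx)
{
    return (cr0 & CR0_EM) != 0
        || (cr4 & CR4_OSFXSR) == 0
        || (cpuid1_edx & CPUID1_EDX_SSE2) == 0;
}
```

A fault handler could log the result of such a check next to the faulting address to narrow down which condition is responsible.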

Well, I think GCC/AS considers the instruction fine to use because it assumes the environment has already been set up for SSE, so it applies this optimization.

So who should we blame? Myself, for writing such 'horrible' code, or GCC/AS for making such assumptions?

(The question's pretty one-sided :wink:)
bluecode
Member
Posts: 202
Joined: Wed Nov 17, 2004 12:00 am
Location: Germany

Post by bluecode »

karekare0 wrote:So who should we blame? Myself, for writing such 'horrible' code, or GCC/AS for making such assumptions?
Yourself for not using the correct gcc switches: -mno-sse is what you seek. :)
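
One way to confirm that the switch actually took effect is to test the macros GCC predefines when SSE code generation is enabled. A minimal sketch follows; built_with_sse is a made-up name, and a kernel build would more likely turn the same check into an #error so that a misconfigured build fails early.

```c
/* GCC predefines __SSE__ and __SSE2__ when those instruction sets are
 * enabled for code generation; with -mno-sse neither should be defined. */
static int built_with_sse(void)
{
#if defined(__SSE__) || defined(__SSE2__)
    return 1;   /* the compiler may emit instructions such as MOVDQA */
#else
    return 0;   /* SSE code generation is disabled for this build */
#endif
}
```

Keep in mind that on x86-64 floating-point arithmetic normally goes through SSE registers, so code built with -mno-sse has to avoid float and double.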
Zenith
Member
Posts: 224
Joined: Tue Apr 10, 2007 4:42 pm

Post by Zenith »

Yeah, you're right - but you have to admit, all x86-64 processors should support at least SSE2, so I'd think GCC is safe making this assumption.

Doing CR4.OSFXSR = 1 solved the problem!
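
For reference, a rough sketch of what that change can look like in C. The helper names are hypothetical. Only setting CR4.OSFXSR comes from this thread; clearing CR0.EM and setting CR0.MP are added here as an assumption, based on the usual SSE setup steps. The control-register accessors are privileged, so they only work in ring 0.

```c
#include <stdint.h>

#define CR0_MP     (1u << 1)  /* CR0 bit 1: monitor coprocessor */
#define CR0_EM     (1u << 2)  /* CR0 bit 2: x87 emulation, must be 0 for SSE */
#define CR4_OSFXSR (1u << 9)  /* CR4 bit 9: OS supports FXSAVE/FXRSTOR */

/* Pure helpers: compute the new register values from the old ones. */
static uint64_t cr0_for_sse(uint64_t cr0)
{
    return (cr0 & ~(uint64_t)CR0_EM) | CR0_MP;
}

static uint64_t cr4_for_sse(uint64_t cr4)
{
    return cr4 | CR4_OSFXSR;
}

#if defined(__x86_64__)
/* Privileged accessors: calling these outside ring 0 faults. */
static inline uint64_t read_cr4(void)
{
    uint64_t value;
    __asm__ volatile("mov %%cr4, %0" : "=r"(value));
    return value;
}

static inline void write_cr4(uint64_t value)
{
    __asm__ volatile("mov %0, %%cr4" : : "r"(value) : "memory");
}
#endif
```

In the kernel's early setup this would be used as write_cr4(cr4_for_sse(read_cr4())), with the matching pair of accessors for CR0.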
bluecode
Member
Posts: 202
Joined: Wed Nov 17, 2004 12:00 am
Location: Germany

Post by bluecode »

karekare0 wrote:Doing CR4.OSFXSR = 1 solved the problem!
But remember to save/restore/initialise the state of the SSE registers (which might not be what you want within your kernel), or sometimes things silently go b00m.
Zenith
Member
Posts: 224
Joined: Tue Apr 10, 2007 4:42 pm

Post by Zenith »

Already done :) . It's FXSAVE and FXRSTOR with a 512-byte memory area, right? I had to add the 'q' suffix so that GAS assembles them with the promoted (64-bit) operand size.
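
A minimal sketch of such a save area and the two wrappers, under some assumptions: the names fxsave_area, fpu_save and fpu_restore are invented for illustration, and the mnemonics are spelled fxsave64/fxrstor64, which is the form current assemblers accept for the 64-bit variants (the toolchain mentioned in this thread may want the 'q' suffix instead). The area has to be 16-byte aligned in addition to being 512 bytes long.

```c
#include <stdint.h>

/* 512-byte FXSAVE image; the instruction requires 16-byte alignment. */
struct fxsave_area {
    uint8_t bytes[512];
} __attribute__((aligned(16)));

_Static_assert(sizeof(struct fxsave_area) == 512, "FXSAVE area must be 512 bytes");
_Static_assert(_Alignof(struct fxsave_area) == 16, "FXSAVE area must be 16-byte aligned");

#if defined(__x86_64__)
/* Save the x87/MMX/SSE state into *area (64-bit operand-size form). */
static inline void fpu_save(struct fxsave_area *area)
{
    __asm__ volatile("fxsave64 %0" : "=m"(*area));
}

/* Restore state previously written by fpu_save. */
static inline void fpu_restore(const struct fxsave_area *area)
{
    __asm__ volatile("fxrstor64 %0" : : "m"(*area));
}
#endif
```

A task switch would typically call fpu_save on the outgoing task's area and fpu_restore on the incoming one; restoring an area that was never initialised by a save is likely to fault or load garbage.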

Thanks for the help!