Strange error after moving to 64-bit

zity
Member
Posts: 99
Joined: Mon Jul 13, 2009 5:52 am
Location: Denmark

Strange error after moving to 64-bit

Post by zity »

Hello again :)

I have finally found some time to play around with 64-bit OS development. I've decided to keep using GRUB as my bootloader, and I switched to 64-bit ELF using AOUT_KLUDGE. I've now managed to get the computer into long mode, and I can jump to my C kernel. It all seems to work very well, but I've encountered a strange error while trying to implement a simple text mode video driver. My clear screen function uses memsetw() to clear the screen buffer. Below are two different implementations of this function: one that uses pointers (1) and one that indexes the screen buffer as an array (2). In my 32-bit OS I use the one with pointers (1) and it works very well. In my new 64-bit test system, (1) causes a reboot in both QEMU and Bochs, but (2) works fine.

Number 1, with pointers

Code: Select all

uint16_t *memsetw(uint16_t *dest, uint16_t val, int len)
{
   uint16_t *dp = (uint16_t *)dest;
   while(len-- > 0) *dp++ = val;
   return dest;
}
Number 2, as an array

Code: Select all

uint16_t *memsetw(uint16_t *dest, uint16_t val, int len)
{
   while(len-- > 0) dest[len] = val;
   return dest;
}
Does anybody have an idea why it does not work with pointers, but does work as an array? Bochs reports the following:

Code: Select all

00055164694e[CPU0 ] interrupt(long mode): IDT entry extended attributes DWORD4 TYPE != 0  
00055164694e[CPU0 ] interrupt(long mode): IDT entry extended attributes DWORD4 TYPE != 0  
00055164694e[CPU0 ] interrupt(long mode): IDT entry extended attributes DWORD4 TYPE != 0  
00055164694i[CPU0 ] CPU is in long mode (active)                                          
00055164694i[CPU0 ] CS.d_b = 16 bit                                                       
00055164694i[CPU0 ] SS.d_b = 32 bit                                                       
00055164694i[CPU0 ] EFER   = 0x00000501
My linker script and assembler stub can be downloaded here:
http://pub.mitsted.dk/link.ld
http://pub.mitsted.dk/boot.asm
zity
Member
Posts: 99
Joined: Mon Jul 13, 2009 5:52 am
Location: Denmark

Re: Strange error after moving to 64-bit

Post by zity »

I've been playing around with my source for a while now, and I simply can't figure out the problem. It doesn't seem to have anything to do with the long mode enabling code (mapping, GDT and so on), because even if I jump to the main function in my C kernel immediately after the start label in my assembly stub (while I'm still in protected mode), it does not work.

I think it has something to do with the way GRUB loads my kernel or the way it's linked, but I don't know what to try next. I've changed my assembly stub a little so the stack is set up properly (or at least I think so), but it did not change anything. I changed my linker script too, without any difference :(
NickJohnson
Member
Posts: 1249
Joined: Tue Mar 24, 2009 8:11 pm
Location: Sunnyvale, California

Re: Strange error after moving to 64-bit

Post by NickJohnson »

Did you try looking at the assembly output for both functions? As far as I can tell, both should produce the same result, but they use very different mechanisms. The first one doesn't use offsets and fills from low to high addresses, and the second uses offsets and fills from high to low addresses. Both should have short assembly representations, so you could post them.
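To check the "same result" part outside the kernel, here is a minimal sketch that can be built as an ordinary hosted program. The two loops are copied from the first post; the names memsetw_ptr, memsetw_idx, variants_agree and CELLS are made up for this example. It only compares the final buffer contents, so it says nothing about what -O3 emits for a freestanding 64-bit build.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CELLS (80 * 25) /* assumed 80x25 text mode screen */

/* Variant 1 from the first post: walks a pointer from low to high addresses. */
uint16_t *memsetw_ptr(uint16_t *dest, uint16_t val, int len)
{
   uint16_t *dp = dest;
   while (len-- > 0) *dp++ = val;
   return dest;
}

/* Variant 2 from the first post: indexes from the last element down to the first. */
uint16_t *memsetw_idx(uint16_t *dest, uint16_t val, int len)
{
   while (len-- > 0) dest[len] = val;
   return dest;
}

/* Hypothetical helper: returns 1 if both variants leave identical buffers. */
int variants_agree(uint16_t val, int len)
{
   uint16_t a[CELLS] = {0};
   uint16_t b[CELLS] = {0};

   assert(len >= 0 && len <= CELLS);
   memsetw_ptr(a, val, len);
   memsetw_idx(b, val, len);
   return memcmp(a, b, sizeof a) == 0;
}
```

Calling variants_agree(0x0720, CELLS) from a small main() compares a full-screen fill. If the variants agree here but only one survives in the kernel, the difference is more likely in the generated code or the environment than in the C logic.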
zity
Member
Posts: 99
Joined: Mon Jul 13, 2009 5:52 am
Location: Denmark

Re: Strange error after moving to 64-bit

Post by zity »

I went to wash the dishes, and began to wonder whether my parameters for gcc could be causing all this trouble. Guess what, THEY DID. I have always used the -O3 optimization flag, but if I use -O0 (or anything below 3) instead, my code works just fine. So the optimization must have broken my code. Guess I'll stop using -O3 then :)
Creature
Member
Posts: 548
Joined: Sat Dec 27, 2008 2:34 pm
Location: Belgium

Re: Strange error after moving to 64-bit

Post by Creature »

The only difference I can see at first glance is that pointers in 64-bit are usually 64-bit and in 32-bit they are 32-bit. But that shouldn't change anything in the functions themselves, unless you are doing some crazy operations on the pointers that assume they are 32-bit or something.

I'm also not sure (haven't tried long mode yet), but isn't your CS supposed to be 32-bit (unless you're in 16-bit protected mode or real mode or something)?

EDIT: Never mind :P, optimizations tended to break my memcpy and memset too, until I put volatile all over the place to stop the bastard from optimizing it away.
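A minimal sketch of what the volatile approach could look like for the memsetw() from the first post. The names memsetw_volatile and clear_screen are made up for this example, and so is the 0x0720 fill value (a space with a light grey on black attribute). In a kernel the screen pointer would typically be the VGA text buffer at physical address 0xB8000, assuming it is mapped there. Here it is a parameter so the code can also be tried on a plain array.

```c
#include <stdint.h>

/* Same loop as variant 1, but every store goes through a volatile-qualified
   pointer, so the compiler has to perform each 16-bit write as written and
   cannot drop the stores. */
volatile uint16_t *memsetw_volatile(volatile uint16_t *dest, uint16_t val, int len)
{
   volatile uint16_t *dp = dest;
   while (len-- > 0) *dp++ = val;
   return dest;
}

/* Hypothetical clear screen helper: fills cols * rows cells with blanks. */
void clear_screen(volatile uint16_t *screen, int cols, int rows)
{
   memsetw_volatile(screen, 0x0720, cols * rows);
}
```

Marking memory-mapped hardware such as the screen buffer volatile is reasonable on its own, but it does not explain why the -O3 build crashed. It may only hide the real cause.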
zity
Member
Posts: 99
Joined: Mon Jul 13, 2009 5:52 am
Location: Denmark

Re: Strange error after moving to 64-bit

Post by zity »

Nick: I actually thought about getting the assembly output, but luckily I checked gcc first :)

Creature: I'm wondering about CS being 16-bit too, but it seems to work anyway. Maybe someone with a little more knowledge of 64-bit can answer this question? :)
NickJohnson
Member
Posts: 1249
Joined: Tue Mar 24, 2009 8:11 pm
Location: Sunnyvale, California

Re: Strange error after moving to 64-bit

Post by NickJohnson »

zity wrote:I went to wash the dishes, and began to wonder whether my parameters for gcc could be causing all this trouble. Guess what, THEY DID. I have always used the -O3 optimization flag, but if I use -O0 (or anything below 3) instead, my code works just fine. So the optimization must have broken my code. Guess I'll stop using -O3 then :)
The thing is, optimizations don't break correctly written code. If your function works with -O0, it may still be technically incorrect, just not enough to break at that level. I would really look at the assembly and try to figure out what -O3 did to break it, and try to fix that in C. Just a suggestion though - you can still have a functioning kernel with -O0 of course, but it may break with other compilers and architectures.
dosfan
Member
Posts: 65
Joined: Tue Oct 14, 2008 1:18 pm
Location: Scotland

Re: Strange error after moving to 64-bit

Post by dosfan »

Those messages look like a page fault occurring without a valid IDT, caused by the screwy compiler output.

BTW, Bochs reports a 16-bit CS for me in long mode too. Haven't looked into it yet.
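For context on the "valid IDT" part: in long mode each IDT entry is 16 bytes, and the last 4 bytes are reserved and must be zero. As far as I can tell, that is what the Bochs "DWORD4 TYPE != 0" message refers to. Without a usable entry the fault cannot be delivered, and the CPU escalates until it resets. The sketch below shows the layout. The names idt_gate64 and make_interrupt_gate are made up for this example and are not taken from zity's code.

```c
#include <stdint.h>

/* One long mode IDT entry (interrupt or trap gate), 16 bytes. */
struct idt_gate64 {
   uint16_t offset_low;   /* handler address bits 0..15 */
   uint16_t selector;     /* code segment selector in the GDT */
   uint8_t  ist;          /* bits 0..2: IST index, other bits zero */
   uint8_t  type_attr;    /* present bit, DPL and gate type */
   uint16_t offset_mid;   /* handler address bits 16..31 */
   uint32_t offset_high;  /* handler address bits 32..63 */
   uint32_t reserved;     /* must be zero */
};

_Static_assert(sizeof(struct idt_gate64) == 16, "long mode IDT entries are 16 bytes");

/* Builds a present, DPL 0, 64-bit interrupt gate (type 0xE) for a handler. */
struct idt_gate64 make_interrupt_gate(uint64_t handler, uint16_t selector)
{
   struct idt_gate64 g;

   g.offset_low  = (uint16_t)(handler & 0xFFFF);
   g.selector    = selector;
   g.ist         = 0;      /* no interrupt stack table entry */
   g.type_attr   = 0x8E;   /* P=1, DPL=0, type=0xE */
   g.offset_mid  = (uint16_t)((handler >> 16) & 0xFFFF);
   g.offset_high = (uint32_t)(handler >> 32);
   g.reserved    = 0;
   return g;
}
```

Installing at least a page fault and a double fault handler this way, then loading the table with lidt, would turn the silent reboot into something that can report the faulting address.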
stlw
Member
Posts: 357
Joined: Fri Apr 04, 2008 6:43 am

Re: Strange error after moving to 64-bit

Post by stlw »

dosfan wrote:Those messages look like a page fault occurring without a valid IDT, caused by the screwy compiler output.

BTW, Bochs reports a 16-bit CS for me in long mode too. Haven't looked into it yet.
Bochs will always print the debug dump with a 16-bit CS in long mode.
"16-bit CS" here just means that CS.D=0. 64-bit mode is indicated by CS.L=1, CS.D=0; CS.L=1, CS.D=1 is an illegal combination.

Stanislav
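A rough sketch of how the CS.L/CS.D rule and the EFER value from the Bochs dump can be decoded. The bit positions are the architectural ones (L is bit 53 and D is bit 54 of a raw 8-byte code segment descriptor; EFER has SCE at bit 0, LME at bit 8 and LMA at bit 10). The enum and function names are made up for this example, and the descriptor values mentioned afterwards are common examples, not zity's actual GDT.

```c
#include <stdint.h>

enum cs_mode { CS_16BIT, CS_32BIT, CS_64BIT, CS_RESERVED };

/* Classifies a raw code segment descriptor as seen while long mode is active. */
enum cs_mode classify_cs(uint64_t desc)
{
   int l = (int)((desc >> 53) & 1); /* L: 64-bit code segment */
   int d = (int)((desc >> 54) & 1); /* D: default operand size */

   if (l) return d ? CS_RESERVED : CS_64BIT; /* L=1 requires D=0 */
   return d ? CS_32BIT : CS_16BIT;           /* compatibility mode */
}

#define EFER_SCE (1u << 0)   /* syscall/sysret enabled */
#define EFER_LME (1u << 8)   /* long mode enabled */
#define EFER_LMA (1u << 10)  /* long mode active */

/* Returns 1 if the EFER value shows long mode as active. */
int long_mode_active(uint32_t efer)
{
   return (efer & EFER_LMA) != 0;
}
```

With these definitions, a typical 64-bit code descriptor such as 0x00209A0000000000 classifies as CS_64BIT even though its D bit is 0, which matches the "16 bit" line in the dump. The EFER value 0x00000501 from the log has SCE, LME and LMA all set.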