Loading GDT causes triple-fault

KrnlHckr
Member
Member
Posts: 36
Joined: Tue Jul 17, 2007 9:16 am
Location: Washington, DC Metro Area
Contact:

Post by KrnlHckr »

UPDATES:

Still no dice with the GDT code. It continues to die at the mov ds, ax instruction in gdt_flush. I've tried various positions for the far jump and saw some interesting results (each crashing the kernel too). Since JamesM was able to get my image working and I cannot, I must assume that my machine is the culprit. I checked whether SELinux might be interfering, but it is indeed turned off. I'm going to begin the painful task of stepping the kernel code through the bochs debugger... :(

FWIW:

I took a copy of Bran's exact code and tried using it. The only changes made were converting the DOS-style line endings to UNIX style, removing the leading underscores from symbol names for my Linux environment, and editing the build script for UNIX syntax instead of DOS. The kernel first bombed with a GRUB error 13, which corresponded directly with the need to change the

Code: Select all

*(.*rodata)
in link.ld to

Code: Select all

*(.*rodata*)
. After fixing this, the kernel proceeded past the 13 error point, and still failed exactly like my version. :mad:

The plodding process continues. I'll get this licked soon enough... I hope! :)
"If your code won't run, verify that you are, indeed, using the STABLE branches of your toolchain!" -- KrnlHckr, 2007 :oops:
frank
Member
Member
Posts: 729
Joined: Sat Dec 30, 2006 2:31 pm
Location: East Coast, USA

Post by frank »

Just a simple question: are you doing something like assuming that memory will be filled with zeros? That's not always true, you know.
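
If zeroing does turn out to matter, the routine itself is tiny. A minimal sketch, assuming a freestanding kernel without a libc memset; `zero_range` is a hypothetical helper, and the start/end pointers would have to come from symbols exported by your own linker script:

```c
/* Hypothetical helper: clear every byte in the half-open range [start, end).
 * In a kernel, start and end would typically be addresses of linker-script
 * symbols that bracket the .bss section (the names are up to the script). */
void zero_range(void *start, void *end)
{
    /* volatile keeps the compiler from replacing the loop with a call
     * to memset, which may not exist in a freestanding build. */
    volatile unsigned char *p    = (volatile unsigned char *)start;
    volatile unsigned char *stop = (volatile unsigned char *)end;

    while (p < stop) {
        *p++ = 0;
    }
}
```

It would be called once, early in kernel entry and before any C code relies on zero-initialized globals.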
os64dev
Member
Member
Posts: 553
Joined: Sat Jan 27, 2007 3:21 pm
Location: Best, Netherlands

Post by os64dev »

funny code :

Code: Select all

#ifndef __GDT_H
#define __GDT_H

/* define gdt entry - packed prevents compiler from dorking it up */
struct gdt_entry
{
	unsigned short limit_lo;
	unsigned short base_lo;
	unsigned char base_mid;
	unsigned char access;
	unsigned char granularity;
	unsigned char base_hi;
} __attribute__ ((__packed__));

/* special pointer - max bytes taken by gdt - 1 */
struct gdt_ptr
{
	unsigned short limit;
	unsigned int base;
} __attribute__ ((__packed__));

/* a simple 3-entry gdt and pointer */
struct gdt_entry gdt[3];
struct gdt_ptr gp;

/* the extern function in loader.asm */
extern void gdt_flush();

void gdt_set_gate(int, unsigned long, unsigned long, unsigned char, unsigned char);
void gdt_install();

#endif
The problem here is the gdt and gp declaration. It should be:

Code: Select all

extern struct gdt_entry gdt[3];
extern struct gdt_ptr gp;
and the definitions themselves should be put in gdt.c. At this point gdt.c includes the gdt.h header file and kernel.c includes gdt.h as well. This means that the compiler will make two gp structs and two gdt[3] arrays.

EDIT: checked it and it seems that there is only one gdt and one gp though the address alignment is bad.

Code: Select all

00104000 <gp>:
  104000:	00 00                	add    %al,(%eax)
  104002:	00 00                	add    %al,(%eax)
	...

00104006 <gdt>:
Author of COBOS
KrnlHckr
Member
Member
Posts: 36
Joined: Tue Jul 17, 2007 9:16 am
Location: Washington, DC Metro Area
Contact:

Post by KrnlHckr »

frank wrote:Just a simple question: are you doing something like assuming that memory will be filled with zeros? That's not always true, you know.
Hi there, Frank!

It's possible that I -am- assuming as much. I looked over the code and didn't see anything that smelled like zeroing of the GDT. The only changes to the code in the past few days I've made were the addition of some #defines to get rid of magic numbers and a typedef here and there. The comments in Bran's code don't refer to any zeroing of the GDT and I suspect, if I may be so bold, that Bran might be assuming too.

I'm going to take a stab at writing up some zeroing code. And given my track record, when I fail at that I guess I'll have to be a tool and ask for guidance! :D
Brendan
Member
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!
Contact:

Re: Bochs 'info cpu'

Post by Brendan »

Hi,

Just a few quick questions...

Code: Select all

struct gdt_ptr
{
        uint16 limit;
        uint8  base;
} __attribute__ ((__packed__));
Why is "gp.base" an unsigned 8-bit value?

Code: Select all

        gp.base  = (uint16)&gdt;
Why are you type-casting it to an unsigned 16-bit data type?

It should be an unsigned 32-bit value, or perhaps even "gdt_entry *base;"...

Of course this might not help if your GDT limit is zero - not sure what's causing that.

Also, I don't think failing to fill your ".bss" section with zero would cause this specific bug - everything being relied on is set to something before being used.
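
To make the width issue concrete: in 32-bit protected mode, lgdt reads a 6-byte operand, a 16-bit limit followed by a 32-bit base. A small sketch, assuming GCC or Clang (for the packed attribute) and C11 (for `_Static_assert`); `gdt_ptr32`, `keep_low_8` and `keep_low_16` are illustrative names, not anything from the posted code:

```c
#include <stdint.h>

/* Operand layout lgdt expects in 32-bit mode:
 * a 16-bit limit followed by a 32-bit linear base address. */
struct gdt_ptr32 {
    uint16_t limit;
    uint32_t base;
} __attribute__((packed));

/* Without packing the compiler would pad after limit; lgdt needs 6 bytes. */
_Static_assert(sizeof(struct gdt_ptr32) == 6, "gdt_ptr must be 6 bytes");

/* What survives when an address is squeezed through a narrower type. */
uint32_t keep_low_8(uint32_t addr)  { return (uint8_t)addr; }
uint32_t keep_low_16(uint32_t addr) { return (uint16_t)addr; }
```

Taking 0x00104006 (the gdt address in os64dev's objdump output) as input, `keep_low_16` yields 0x4006 and `keep_low_8` yields 0x06, so either narrowing would hand lgdt a base nowhere near the real table.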


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
KrnlHckr
Member
Member
Posts: 36
Joined: Tue Jul 17, 2007 9:16 am
Location: Washington, DC Metro Area
Contact:

Post by KrnlHckr »

os64dev wrote: The problem here is the gdt and gp declaration. it should be:

Code: Select all

extern struct gdt_entry gdt[3];
extern struct gdt_ptr gp;
and the structs should be put in gdt.c. At this point the gdt.c included the gdt.h header file and kernel.c include the gdt.h files. this means that the compiler will make two struct gp and two struct gdt[3].
I surely profess no wizardry in my C coding, but the way I understand extern, putting the extern struct declarations in gdt.h and the struct implementation in gdt.c causes the compiler to see an external definition prior to any implementation details being provided:

Code: Select all

include/gdt.h:4: error: array type has incomplete element type
Of course, I could simply be hopelessly lost in a maze of twisty passages, all alike, and likely to be eaten by a grue. :lol:
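
For reference, "array type has incomplete element type" is what GCC reports when an array of a struct is declared before the struct's body has been seen, so the error at gdt.h:4 presumably means the extern lines landed above the struct definitions. A minimal single-file sketch of the ordering that compiles, assuming GCC or Clang for the packed attribute (the comments mark which part would live in which file):

```c
/* --- what would live in gdt.h --- */
struct gdt_entry {
    unsigned short limit_lo;
    unsigned short base_lo;
    unsigned char  base_mid;
    unsigned char  access;
    unsigned char  granularity;
    unsigned char  base_hi;
} __attribute__((packed));

struct gdt_ptr {
    unsigned short limit;
    unsigned int   base;
} __attribute__((packed));

/* Declarations only: no storage is reserved here. The struct types
 * above are already complete, so the array declaration is legal. */
extern struct gdt_entry gdt[3];
extern struct gdt_ptr   gp;

/* --- what would live in gdt.c --- */
struct gdt_entry gdt[3];   /* the one and only definition */
struct gdt_ptr   gp;
```

The struct types stay in the header because every file that touches gdt or gp needs them; only the two variable definitions move into gdt.c.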
EDIT: checked it and it seems that there is only one gdt and one gp though the address alignment is bad.

Code: Select all

00104000 <gp>:
  104000:	00 00                	add    %al,(%eax)
  104002:	00 00                	add    %al,(%eax)
	...

00104006 <gdt>:
As in mine. This really is perplexing and I am resolved to see this to the end. I'm sure it'll work out eventually, and I sure do appreciate any ideas, thoughts, comments, etc. Anything would be helpful at this point! :)
KrnlHckr
Member
Member
Posts: 36
Joined: Tue Jul 17, 2007 9:16 am
Location: Washington, DC Metro Area
Contact:

Re: Bochs 'info cpu'

Post by KrnlHckr »

Brendan wrote:

Code: Select all

struct gdt_ptr
{
        uint16 limit;
        uint8  base;
} __attribute__ ((__packed__));
Why is "gp.base" an unsigned 8-bit value?

Code: Select all

        gp.base  = (uint16)&gdt;
Why are you type-casting it to an unsigned 16-bit data type?

It should be an unsigned 32-bit value, or perhaps even "gdt_entry *base;"...

Of course this might not help if your GDT limit is zero - not sure what's causing that.

Also, I don't think failing to fill your ".bss" section with zero would cause this specific bug - everything being relied on is set to something before being used.
Hiya, Brendan! Thanks for your response.

I nearly panicked there for a moment when I saw the uint's! My initial posting's attachment has the code as I work on it now. I had attempted to add some "portability" into the code by setting up some defines, but I realized that it was cluttering up the mind. I set things back to using unsigned int, unsigned short, etc.

Here is what gdt.h and gdt.c look like now (using a bit of typedef shortcutting).

Code: Select all

/* gdt.h */

/* define gdt structure */
typedef struct _gdt_entry {
        unsigned short limit_low;
        unsigned short base_low;
        unsigned char  base_middle;
        unsigned char  access;
        unsigned char  granularity;
        unsigned char  base_high;
} __attribute__ ((__packed__)) gdt_entry;

/* pointer structure to gdt */
typedef struct _gdt_ptr {
        unsigned short limit;
        unsigned long  base;
} __attribute__ ((__packed__)) gdt_ptr;

/* gdt_flush is in start.asm */
extern void gdt_flush();

/* function prototypes */
void gdt_set_gate(int num, unsigned long base, unsigned long limit,
                  unsigned char access, unsigned char gran);

void gdt_install();

/* end gdt.h */

Code: Select all

/* gdt.c */

#include <gdt.h>

gdt_entry gdt[3];
gdt_ptr   gdtp;

/* set up a descriptor in the gdt */
void gdt_set_gate(int num, unsigned long base, unsigned long limit,
                  unsigned char access, unsigned char gran)
{
        /* setup descriptor base address */
        gdt[num].base_low    = (base & 0xFFFF);
        gdt[num].base_middle = (base >> 16) & 0xFF;
        gdt[num].base_high   = (base >> 24) & 0xFF;

        /* setup descriptor limits */
        gdt[num].limit_low   = (limit & 0xFFFF);
        gdt[num].granularity = ((limit >> 16) & 0x0F);

        /* set up granularity and access flags */
        gdt[num].granularity |= (gran & 0xF0);
        gdt[num].access      =  access;

} /* gdt_set_gate() */

/* initialize the gdt */
void gdt_install()
{
        /* set up gdt pointer and limit */
        gdtp.limit = (sizeof(gdt_entry) * 3) - 1;
        gdtp.base  = (unsigned long)&gdt;

        /* null descriptor */
        gdt_set_gate(0, 0, 0, 0, 0);

        /* code segment descriptor */
        gdt_set_gate(1, 0, 0xFFFFFFFF, 0x9A, 0xCF);

        /* data segment descriptor */
        gdt_set_gate(2, 0, 0xFFFFFFFF, 0x92, 0xCF);

        /* flush old gdt and install new one */
        gdt_flush();

} /* gdt_install() */

/* end gdt.c */
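
One way to sanity-check the bit packing in gdt_set_gate() without booting anything is to run the same arithmetic on the host and look at the resulting 8 bytes. A minimal sketch; `encode_descriptor` is a hypothetical helper that mirrors the field layout above and returns the descriptor as a single 64-bit value:

```c
#include <stdint.h>

/* Pack base/limit/access/gran into the 8-byte descriptor layout,
 * mirroring gdt_set_gate(), and return it as one 64-bit value
 * (byte 0 of the descriptor is the low byte of the result). */
uint64_t encode_descriptor(uint32_t base, uint32_t limit,
                           uint8_t access, uint8_t gran)
{
    uint64_t d = 0;

    d |= (uint64_t)(limit & 0xFFFF);              /* limit 15:0  -> bits 15:0  */
    d |= (uint64_t)(base & 0xFFFF) << 16;         /* base 15:0   -> bits 31:16 */
    d |= (uint64_t)((base >> 16) & 0xFF) << 32;   /* base 23:16  -> bits 39:32 */
    d |= (uint64_t)access << 40;                  /* access byte -> bits 47:40 */
    d |= (uint64_t)((limit >> 16) & 0x0F) << 48;  /* limit 19:16 -> bits 51:48 */
    d |= (uint64_t)(gran & 0xF0) << 48;           /* flags       -> bits 55:52 */
    d |= (uint64_t)((base >> 24) & 0xFF) << 56;   /* base 31:24  -> bits 63:56 */

    return d;
}
```

With the arguments used in gdt_install(), the code descriptor comes out as 0x00CF9A000000FFFF and the data descriptor as 0x00CF92000000FFFF. Only the low 20 bits of the limit are stored; the granularity flag in 0xCF scales them to cover 4 GiB.
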
The limit for the null segment is 0, and 0xFFFFFFFF for the code and data segments. What makes this so much fun is the inconsistencies I get. JamesM even reports that he was able to run the code.

Code: Select all

<bochs:1> c
Next at t=25605718
(0) [0x0010003e] 0008:10003e (unk. ctxt): mov ds, ax                ; 8ed8
<bochs:2> info cpu
eax:0x00000000, ebx:0x00000000, ecx:0x00000000, edx:0x00000543
ebp:0x00000000, esp:0x00000000, esi:0x00000000, edi:0x00000000
eip:0x0000fff0, eflags:0x00000002, inhibit_mask:0
cs:s=0xf000, dl=0x0000ffff, dh=0xff009bff, valid=1
ss:s=0x0000, dl=0x0000ffff, dh=0x00009300, valid=1
ds:s=0x0000, dl=0x0000ffff, dh=0x00009300, valid=1
es:s=0x0000, dl=0x0000ffff, dh=0x00009300, valid=1
fs:s=0x0000, dl=0x0000ffff, dh=0x00009300, valid=1
gs:s=0x0000, dl=0x0000ffff, dh=0x00009300, valid=1
ldtr:s=0x0000, dl=0x0000ffff, dh=0x00008200, valid=1
tr:s=0x0000, dl=0x0000ffff, dh=0x00008300, valid=1
gdtr:base=0x00000000, limit=0xffff
idtr:base=0x00000000, limit=0xffff
dr0:0x00000000, dr1:0x00000000, dr2:0x00000000
dr3:0x00000000, dr6:0xffff0ff0, dr7:0x00000400
cr0:0x00000010, cr1:0x00000000, cr2:0x00000000
cr3:0x00000000, cr4:0x00000000
done
<bochs:3>
This is despite the code for gdt_flush saying:

Code: Select all

global gdt_flush                ; allow c code to find gdt_flush()
extern gdtp                     ; gdtp is defined in gdt.c
gdt_flush:
        lgdt    [gdtp]          ; load gdt
        mov     ax, 0x10        ; offset in gdt to data segment
        mov     ds, ax
        mov     es, ax
        mov     fs, ax
        mov     gs, ax
        mov     ss, ax
        jmp     0x08:gdt_flush2 ; offset to code segment - far jmp
gdt_flush2:
        ret
I see that eax holds 0x0, yet I mov ax, 0x10. Bochs dies when eip gets to mov ds, ax. Ho hum... For giggles I've attached the latest tarball.
Attachments
kernel-0.0.1.tar.gz
(4.8 KiB)
Brendan
Member
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!
Contact:

Re: Bochs 'info cpu'

Post by Brendan »

Hi,
KrnlHckr wrote:I see that eax holds 0x0, yet I mov ax, 0x10.
I'm not so sure about that - the information you just posted says "eax = 0x00000000" but it also says CS:IP = 0xF000:0xFFF0 - it's from just after CPU reset, not from just before the exception.
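
The reset state can be read straight off the posted dump. A small sketch, assuming that Bochs' dl and dh are the low and high dwords of the cached segment descriptor (which is consistent with the values shown); `descriptor_base` and `protected_mode_enabled` are illustrative helper names:

```c
#include <stdint.h>

/* Rebuild the 32-bit segment base from the two descriptor dwords. */
uint32_t descriptor_base(uint32_t dl, uint32_t dh)
{
    return (dl >> 16)                   /* base 15:0  */
         | ((dh & 0x000000FFu) << 16)   /* base 23:16 */
         | (dh & 0xFF000000u);          /* base 31:24 */
}

/* Bit 0 of CR0 is PE (protection enable). */
int protected_mode_enabled(uint32_t cr0)
{
    return (int)(cr0 & 1u);
}
```

For the cs line (dl=0x0000ffff, dh=0xff009bff) the base works out to 0xFFFF0000; adding eip 0x0000fff0 gives 0xFFFFFFF0, the x86 reset vector. cr0=0x00000010 has PE clear, so the CPU is back in real mode, and gdtr shows base 0 with limit 0xffff rather than anything the kernel loaded.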

I took a look at the code and didn't see anything too wrong. Do you have a bootable floppy image or something so I can test it on Bochs?


Cheers,

Brendan
KrnlHckr
Member
Member
Posts: 36
Joined: Tue Jul 17, 2007 9:16 am
Location: Washington, DC Metro Area
Contact:

Re: Bochs 'info cpu'

Post by KrnlHckr »

Brendan wrote:Hi,
KrnlHckr wrote:I see that eax holds 0x0, yet I mov ax, 0x10.
I'm not so sure about that - the information you just posted says "eax = 0x00000000" but it also says CS:IP = 0xF000:0xFFF0 - it's from just after CPU reset, not from just before the exception.

I took a look at the code and didn't see anything too wrong. Do you have a bootable floppy image or something so I can test it on Bochs?
Whoops! You're right about that one! Submit before engaging brain. :roll:

(tried to attach images, but violated size restrictions. pm your email addy?)
pcmattman
Member
Member
Posts: 2566
Joined: Sun Jan 14, 2007 9:15 pm
Libera.chat IRC: miselin
Location: Sydney, Australia (I come from a land down under!)
Contact:

Post by pcmattman »

I think that's why you should load segment registers after loading the GDT (save some confusion).
KrnlHckr
Member
Member
Posts: 36
Joined: Tue Jul 17, 2007 9:16 am
Location: Washington, DC Metro Area
Contact:

Post by KrnlHckr »

What the deuce!?!

Code: Select all

; working kernel
0010002b <gdt_flush>:
  10002b:       0f 01 15 00 40 10 00    lgdtl  0x104000

<snip>

00104000 <gp>:
  104000:       00 00                   add    %al,(%eax)

Code: Select all

; triple-faulting kernel
00100033 <gdt_flush>:
  100033:       0f 01 15 e2 4f 00 00    lgdtl  0x4fe2

<snip>
0010501c <gdtp>:
        ...
(Paying no attention to the change from gp to the newer gdtp --- they serve the same purpose...)

In the working kernel, lgdt points at 0x00104000 and there is data there.
In the bad kernel, lgdt points at ??? 0x4fe2 while gdtp is actually at 0x0010501c (which seems to be empty...)
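
For anyone who wants to double-check the disassembler: the address is sitting right there in the instruction bytes. In both dumps the opcode is 0f 01 with ModRM byte 15, which selects lgdt with a plain 32-bit absolute address, stored little-endian in the last four bytes. A minimal sketch; `lgdt_operand` is a hypothetical helper name:

```c
#include <stdint.h>

/* Read the 32-bit little-endian absolute address that follows the
 * three bytes 0f 01 15 (lgdt with a disp32-only memory operand). */
uint32_t lgdt_operand(const uint8_t insn[7])
{
    return (uint32_t)insn[3]
         | (uint32_t)insn[4] << 8
         | (uint32_t)insn[5] << 16
         | (uint32_t)insn[6] << 24;
}
```

Fed the bytes from the working kernel (0f 01 15 00 40 10 00) this gives 0x00104000, and from the triple-faulting one (0f 01 15 e2 4f 00 00) it gives 0x00004fe2, so objdump is reporting exactly what was assembled and linked.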

The Fellowship of the Ring0 continues unthwarted! :P

EDIT: Ahha! Running 'nm -g kernel.bin' revealed that gdtp is in the uninitialized data section. Of course, I now know that this is what BSS is, and objdump said gdtp was in bss :oops:. Still don't know, though, why the address that lgdt wants to pull from is 0x4fe2.
Brendan
Member
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!
Contact:

Post by Brendan »

Hi,
KrnlHckr wrote:

Code: Select all

; working kernel
0010002b <gdt_flush>:
  10002b:       0f 01 15 00 40 10 00    lgdtl  0x104000

<snip>

00104000 <gp>:
  104000:       00 00                   add    %al,(%eax)

Code: Select all

; triple-faulting kernel
00100033 <gdt_flush>:
  100033:       0f 01 15 e2 4f 00 00    lgdtl  0x4fe2

<snip>
0010501c <gdtp>:
        ...
(Paying no attention to the change from gp to the newer gdtp --- they serve the same purpose...)
I'm not getting the same behaviour - with your previously uploaded source code and running "make" and "objdump -d", I get:

Code: Select all

00100033 <gdt_flush>:
  100033:       0f 01 15 1c 50 10 00    lgdtl  0x10501c
  10003a:       66 b8 10 00             mov    $0x10,%ax
  10003e:       8e d8                   movl   %eax,%ds
  100040:       8e c0                   movl   %eax,%es
  100042:       8e e0                   movl   %eax,%fs
  100044:       8e e8                   movl   %eax,%gs
  100046:       8e d0                   movl   %eax,%ss
  100048:       ea 4f 00 10 00 08 00    ljmp   $0x8,$0x10004f
  10004f:       c3                      ret
Doing "nm" confirms this:

Code: Select all

0010501c B gdtp
I'm wondering if there's a difference between the toolchain you're using and the toolchain other people are using. I'm using GCC 4.1.2 and LD 6.9 on a Gentoo/Linux system...


Cheers,

Brendan
Candy
Member
Member
Posts: 3882
Joined: Tue Oct 17, 2006 11:33 pm
Location: Eindhoven

Post by Candy »

KrnlHckr wrote:

Code: Select all

; triple-faulting kernel
00100033 <gdt_flush>:
  100033:       0f 01 15 e2 4f 00 00    lgdtl  0x4fe2

<snip>
0010501c <gdtp>:
        ...
So the code you run is at 0x100033 with a relocation at 0x100036, and you're exactly 0x10003a off the actual goal (0x10501c - 0x4fe2), which is the address of the instruction right after the lgdt... I would suggest you have to relocate this code properly to get the proper result.
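
Checking the arithmetic: the lgdt sits at 0x100033 and is 7 bytes long, so the next instruction starts at 0x10003a, and 0x10501c - 0x10003a is exactly the 0x4fe2 that ended up in the binary. That is the value a PC-relative fixup would produce. A small sketch of the calculation; `next_insn` and `pc_relative` are illustrative names, and this only describes the numbers, not what the assembler or linker did internally:

```c
#include <stdint.h>

/* Address of the first byte after an instruction. */
uint32_t next_insn(uint32_t insn_addr, uint32_t insn_len)
{
    return insn_addr + insn_len;
}

/* Value a PC-relative fixup would store for a given target:
 * the distance from the end of the instruction to the target. */
uint32_t pc_relative(uint32_t target, uint32_t insn_addr, uint32_t insn_len)
{
    return target - next_insn(insn_addr, insn_len);
}
```

An absolute fixup would have stored 0x0010501c itself, which is what the working builds in this thread show.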
KrnlHckr
Member
Member
Posts: 36
Joined: Tue Jul 17, 2007 9:16 am
Location: Washington, DC Metro Area
Contact:

SOLVED!!!

Post by KrnlHckr »

Brendan wrote:I'm wondering if there's a difference between the toolchain you're using and the toolchain other people are using. I'm using GCC 4.1.2 and LD 6.9 on a Gentoo/Linux system...
:D DING! Brendan, you've hit a home run!

At the start of this thread, I and others were thinking, "nah, toolchain should be fine, check your [my] code." And I learned a LOT about the tool chain: ld scripts, nm, and objdump especially. And then you mentioned tool chain again...

I said what the heck...

Code: Select all

$ nasm -v
nasm-0.99.01
Hit nasm's homepage to download an "older release" just in case. Found out that nasm-0.99 is a dev branch. Major DOH! I installed nasm-0.98 and BINGO! :D

The kernel builds and objdump shows the right location for my gdt:

Code: Select all

00100033 <gdt_flush>:
  100033:       0f 01 15 1c 50 10 00    lgdtl  0x10501c
THANK YOU THANK YOU THANK YOU to everyone!
SpooK
Member
Member
Posts: 260
Joined: Sun Jun 18, 2006 7:21 pm

Post by SpooK »

Nice catch.

It would be wise to stay away from the latest NASM dev branch, as it is a mess due to the major updates it is receiving for both x64 support and code clean-up.

NASM 0.98.39 (STABLE) works just fine :)