General protection fault - paging (solved)

Posted: Fri Feb 02, 2007 10:55 pm
by Android Mouse
I'm trying to write a higher half kernel and so far I can't figure out why it keeps generating a general protection fault. The fault occurs right after paging is enabled and right before the next instruction is run.

edit: Please see the code in my second response to this thread. The code contained in this post has issues that were fixed in the second one.

Below is the debug output of Bochs (it's not letting me copy the text, so I'm using a screenshot):
Image

My linker script:

Code: Select all

ENTRY (entry)
SECTIONS{
	. = 0xC0000000;
	.text :AT(ADDR(.text) - 0xBFF00000){*(.text)}
	.data :AT(ADDR(.data) - 0xBFF00000){*(.data)}
	.bss :AT(ADDR(.bss) - 0xBFF00000){*(.bss)}
} 
And my assembly:

Code: Select all

/* multiboot header */
.int 0x1BADB002
.int 1<<0 | 1<<1
.int -(0x1BADB002 + 1<<0 | 1<<1)

.equ VIRTUAL_OFFSET, 0xBFF00000 /* 0xC0000000 (virtual location) - 1mb (physical location) */

.comm page_table, 0x2000, 0x1000 /* Page table for 4mbs (1024 (number of ints) * 8 bytes) = 0x2000 */
.comm page_directory, 0x8, 0x1000 /* Page directory for 1 entry (1 int = 8 bytes) */

.comm page_data, 0x8 /* temporary page data is put here then moved out to page_table */
.comm counter, 0x8 /* counter for which entry page_table needs to be filled out */

.global entry /* set up paging, must be position independent */
.global higher_half /* first section to run in paging */
entry:
	/* set counter to 0 and load it into ecx for use */
	movl $0x0, (counter - VIRTUAL_OFFSET) /* set counter to 0 */
	mov (counter - VIRTUAL_OFFSET), %ecx
	fill_page_table: /* fill out page_table */
		/* calculate data for page_table entry */
		mov $0x1000, %eax /* 4k */
		mul %ecx /* multiply counter (ecx) to 4k (eax) */
		add $0xC0000003, %eax /* add 3 (flags) and 0xC0000000 (offset) to eax (counter * 4k) */
		mov %eax, (page_data - VIRTUAL_OFFSET) /* move out data to page_data, so eax can be used */
		/* calculate location of where to store the data in the page_table */
		mov $0x20, %eax /* 32 to eax (size of page_table entry) */
		mul %ecx /* 32 (size of page_table entry) * counter */
		add $(page_table - VIRTUAL_OFFSET), %eax /* add page_table location to eax (offset) */
		mov (page_data - VIRTUAL_OFFSET), %ecx /* move temporary page_data into ecx */
		mov %ecx, (%eax) /* write data to offset */
		/* increase counter */
		mov (counter - VIRTUAL_OFFSET), %ecx /* move counter to ecx in order to increase it */
		inc %ecx /* counter ++ */
		mov %ecx, (counter - VIRTUAL_OFFSET) /* write counter */
		/* continue? */
		cmp $0x401, %ecx /* compare counter to see if we need to loop continue or not */
		jne fill_page_table /* loop if not equal */
	
	/* set (single) page directory entry */ 
	mov $(page_table - VIRTUAL_OFFSET), %eax /* move location of page_table to eax */
	add $0x3, %eax /* add 3 (flags) */
	mov %eax, (page_directory - VIRTUAL_OFFSET) /* write entry */
	
	/* set cr3 (PDBR) */
	mov $(page_directory - VIRTUAL_OFFSET), %eax
	mov %eax, %cr3

	/* set cr0 */
	mov %cr0, %eax
	xor $0x80000000, %eax /* set PG bit */
	mov %eax, %cr0

	lea higher_half, %eax /* load address -- this is where the general protection fault occurs */
	jmp *%eax /* long jump */

higher_half:
	mov $0x1, %ecx /* set ecx to 1 in order to see if the code got this far */
	jmp 0xFFFFFFF /* crash, in order to make sure no other random instructions run */
Any ideas on what is going wrong?

Posted: Sat Feb 03, 2007 3:42 am
by AJ
Hi,

Where you would normally expect to see the faulting instruction is just "(invalid)". This says to me that one of the instructions to be executed was not paged in. I have not trawled through the source as I only have a few minutes but:

* When you enable paging, is the instruction immediately after this in a page marked present?

* Are your IDT and ISRs in a 'present' page?

* Is your stack in a 'present' page?

* Are all pages with code marked 'executable'?

HTH,
Adam

Posted: Sat Feb 03, 2007 4:32 am
by SpooK
Here is some code that may help, as I know it works... nothing too fancy, though. I've made it as close to your example as practically possible, but it is in NASM, so you will have to translate to AT&T syntax yourself... I was never much of a masochist ;)

Code: Select all

VIRTUAL_OFFSET equ 0xC0000000
PHYSICAL_OFFSET equ ??? ;Physical location of paged code/data
PAGE_DIRECTORY equ ??? ;Fill this in with the Physical location of the page directory
PAGE_TABLE equ ??? ;Fill this in with the Physical location of the page table
PAGE_COUNT equ ??? ;Number of Page Table entries to fill (e.g. size of the kernel divided by 4K and round-up for safety)

;Setup initial page directory
	mov	ebx,PAGE_DIRECTORY
	mov	edi,PAGE_TABLE+3
	mov	DWORD[ebx+(VIRTUAL_OFFSET >> 20)],edi
	mov	cr3,ebx

;Setup initial page table entries
	mov	ecx,PAGE_COUNT
	mov	eax,PHYSICAL_OFFSET
	add	eax,3
	sub	edi,3
	cld
.pt_fill_loop:
	stosd
	add	eax,0x1000
	dec	ecx
	cmp	ecx,0
	jnz	.pt_fill_loop

;Start paging!!!
	mov	eax,cr0
	or	 eax,0x80000000
	mov	cr0,eax
	jmp	STRAIGHT_OFF_A_CLIFF:AND_PRAY

AND_PRAY:
;print a message that we made it!!!

Assume that "STRAIGHT_OFF_A_CLIFF" selects a valid GDT code entry ;)

HtH :)
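
For reference, the `VIRTUAL_OFFSET >> 20` used in the directory setup above can be unpacked in C. This is only a minimal sketch, assuming 32-bit x86 paging without PAE and 4 KiB pages; the function names are made up for illustration and are not part of the posted code.

```c
#include <stdint.h>

/* Index of the page-directory entry that covers a virtual address
   (top 10 bits of the address). */
uint32_t pd_index(uint32_t vaddr)
{
    return vaddr >> 22;
}

/* Index of the page-table entry within that table (middle 10 bits). */
uint32_t pt_index(uint32_t vaddr)
{
    return (vaddr >> 12) & 0x3FFu;
}

/* Byte offset of the directory entry from the start of the directory:
   each entry is 4 bytes wide. */
uint32_t pd_byte_offset(uint32_t vaddr)
{
    return pd_index(vaddr) * 4u;
}
```

For 0xC0000000 the directory index is 768, so the entry sits 0xC00 bytes into the directory, which is the same value as 0xC0000000 >> 20. The shortcut only holds for addresses aligned to 4 MiB.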

Posted: Sun Feb 04, 2007 1:20 am
by Android Mouse
Hi again.

I tried the code SpooK posted but couldn't get it to work after converting it to AT&T syntax. Eventually I came up with the code below:

Code: Select all

/* multiboot header */
.int 0x1BADB002
.int 1<<0 | 1<<1
.int -(0x1BADB002 + 1<<0 | 1<<1)

.equ VIRTUAL_OFFSET, 0xBFF00000 /* 0xC0000000 (virtual location) - 1mb (physical location) */

.comm page_table, 0x2000, 0x1000 /* Page table for 4mbs (1024 (number of ints) * 8 bytes) = 0x2000 */
.comm page_directory, 0x10, 0x1000 /* Page directory for 1 entry (1 int = 8 bytes) */
.comm multiboot_header, 0x8 /* Multiboot header location */

.global entry
.global long_jmp
entry:
	mov %ebx, (multiboot_header - VIRTUAL_OFFSET)
	xor %eax, %eax /* counter */
	mov $0x3, %ebx /* data 0x3 are the flags */
	mov $(page_table - VIRTUAL_OFFSET), %ecx /* pointer */
	fill_table:
		mov %ebx, (%ecx)
		add $0x1000, %ebx /* increase 4k for next page */
		add $0x20, %ecx /* increase pointer 32 bits */
		inc %eax /* increase counter */
		cmp $0x401, %eax /* check */
		jnz fill_table /* repeat */
		
	/* fill page directory */
	mov $(page_table - VIRTUAL_OFFSET + 0x3), %eax /* load pointer + 3 (flags) */
	mov %eax, (page_directory - VIRTUAL_OFFSET)
	
	/* load PDBR */
	mov $(page_directory - VIRTUAL_OFFSET), %eax
	mov %eax, %cr3

	/* load the page table entry for the 1mb page into edx (for debugging).
	 * In order for it to identity map correctly it should equal 0x100003,
	 * which it does show in the debugger */
	mov (page_table - VIRTUAL_OFFSET + 0x2000), %edx /* offset (0x2000) = (1mb/4kb) * 32 */

	/* enable paging */
	mov %cr0, %ebx
	or $0x80000000, %ebx
	mov %ebx, %cr0
	/* the crash occurs right between these instructions */
	jmp $0x8, $(long_jmp - VIRTUAL_OFFSET)
	
long_jmp:
	hlt /* bochs will alert if interrupts aren't enabled, so we know it got here */
While it still crashes and causes a general protection fault, it fixes the problems I found in the code I first posted.

New debug output:
Image
AJ wrote: Where you would normally expect to see the faulting instruction is just "(invalid)". This says to me that one of the instructions to be executed was not paged in. I have not trawled through the source as I only have a few minutes but:
This issue is now fixed in the code I just posted, only now instead of showing an invalid instruction it shows a random one. I'm guessing this means the page isn't mapped correctly, so the CPU is probably reading data and interpreting it as code. If that is the case, though, I don't see why the page table entry for the 1mb boundary contains the correct value but apparently doesn't map to it.

My answers to the questions below apply to the code I just posted, not the original code.
AJ wrote:
* When you enable paging, is the instruction immediately after this in a page marked present?

* Are your IDT and ISRs in a 'present' page?

* Is your stack in a 'present' page?

* Are all pages with code marked 'executable'?
* Yes, all pages are identity mapped and marked read/write and present.

* I haven't set up the IDT yet.

* No stack set up at this point either. I'm assuming it isn't necessary yet, since the bare bones page doesn't set one up until after paging is enabled, and I don't use the stack.

* I'm not sure what you mean. There isn't an option in the page table entries or the page directory entry to do this. I am, however, accessing everything through segment 0x8, which is the code segment GRUB set up.
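
As a rough illustration of the flag bits discussed in these answers: in 32-bit paging without PAE, the low bits of an entry hold the present, read/write and user/supervisor flags, and there is no execute-permission bit (a no-execute bit exists only in the 64-bit entry formats used by PAE and long mode). A minimal C sketch, with hypothetical names that do not come from the posted code:

```c
#include <stdint.h>

/* Flag bits of a 32-bit (non-PAE) page table entry. */
#define PTE_PRESENT  0x1u  /* bit 0: page is present */
#define PTE_WRITABLE 0x2u  /* bit 1: read/write */
#define PTE_USER     0x4u  /* bit 2: user-mode access allowed */

/* Combine a 4 KiB-aligned physical frame address with flag bits.
   The low 12 bits of the address are discarded; only the low
   12 bits of flags are kept. */
uint32_t make_pte(uint32_t phys_addr, uint32_t flags)
{
    return (phys_addr & 0xFFFFF000u) | (flags & 0xFFFu);
}
```

With these definitions, `make_pte(0x100000, PTE_PRESENT | PTE_WRITABLE)` gives 0x100003, the value expected for the identity-mapped 1mb page.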

Thanks for the help so far.

Posted: Sun Feb 04, 2007 2:38 am
by SpooK
Noticed one thing off the bat...
Android Mouse wrote: add $0x20, %ecx /* increase pointer 32 bits */
That instruction increases ecx by 32 bytes, not 32 bits. You want to increase ecx by 4 bytes, as there are 8 bits in a byte and 32/8=4 ;)
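
To see why the stride matters, here is a small C simulation of the fill loop. It is a sketch under assumptions rather than a reconstruction of the actual run: the buffer is zeroed first (standing in for an untouched table), the loop covers 1024 entries, and the helper names are invented. Entries are written with a configurable byte stride and then read back 4 bytes apart, which is how the MMU indexes a 32-bit page table.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { ENTRIES = 1024, PAGE_BYTES = 0x1000, FLAGS = 0x3 };

/* Write one identity-mapped entry per page, advancing the write
   position by stride_bytes each time (0x20 in the post, 4 when fixed).
   Writes that would run past the buffer are skipped. */
void fill_with_stride(uint8_t *buf, size_t buf_len, size_t stride_bytes)
{
    memset(buf, 0, buf_len);
    for (uint32_t i = 0; i < ENTRIES; i++) {
        uint32_t entry = i * PAGE_BYTES + FLAGS;
        size_t off = (size_t)i * stride_bytes;
        if (off + sizeof entry > buf_len)
            break;
        memcpy(buf + off, &entry, sizeof entry);
    }
}

/* Read slot `index` the way the hardware does: 4 bytes per entry. */
uint32_t read_slot(const uint8_t *buf, uint32_t index)
{
    uint32_t entry;
    memcpy(&entry, buf + (size_t)index * 4, sizeof entry);
    return entry;
}
```

With a stride of 32, slot 256 (the one covering 0x100000) ends up holding 0x20003, so the 1mb page maps to physical 0x20000 and neighbouring slots are empty. With a stride of 4 the same slot holds 0x100003.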

Posted: Sun Feb 04, 2007 9:33 am
by frank
Well it looks like you are having a page fault at 0x00104000. I can only guess that your paging setup code is somehow flawed and that your code pages are not getting mapped in.

Posted: Sun Feb 04, 2007 1:58 pm
by Android Mouse
SpooK wrote:Noticed one thing off the bat...
Android Mouse wrote: add $0x20, %ecx /* increase pointer 32 bits */
That instruction increases ecx by 32 bytes, not 32 bits. You want to increase ecx by 4 bytes, as there are 8 bits in a byte and 32/8=4 ;)
Thanks for spotting that. After fixing that part, it now works :D

Another (non-fatal) problem was that I assumed an int was 8 bytes, not 4, so I was reserving twice the space needed for the page table and page directory. I guess I never bothered to think that 8*8 != 32.
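
The corrected sizes can be checked with a few lines of C. This is just an illustrative sketch assuming 4-byte entries and 4 KiB pages; the type and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum { PT_ENTRIES = 1024 };

/* A 32-bit (non-PAE) page table entry is 4 bytes wide. */
typedef uint32_t pte_t;

/* Bytes needed to hold one full page table. */
size_t page_table_bytes(void)
{
    return PT_ENTRIES * sizeof(pte_t);
}

/* Bytes of address space covered by one full page table. */
size_t mapped_bytes(void)
{
    return (size_t)PT_ENTRIES * 0x1000u;
}
```

One table therefore needs 0x1000 bytes rather than 0x2000, and it covers 4mb of address space.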

Thanks once again for the help.

Posted: Sun Feb 04, 2007 4:16 pm
by SpooK
NP ;)

AFAIK, INT was designed to be specific to the "current mode" that the processor is running in. When you compiled for 16-bit programs (i.e. MS-DOS), INT was SHORT (2 bytes). For 32-bit programs (i.e. Win32), INT defaults as a LONG (4 bytes). Perhaps with 64-bit programs, INT is defined as DOUBLE/LONG LONG (8 bytes)???

Anyhow, such use of INT helps minimize the work in porting software.

With exception to non-standard variations of INT, the C/C++ buffs here can feel free to correct me, I primarily program in ASM :P

Posted: Mon Feb 05, 2007 8:51 am
by JAAman
SpooK wrote: AFAIK, INT was designed to be specific to the "current mode" that the processor is running in. When you compiled for 16-bit programs (i.e. MS-DOS), INT was SHORT (2 bytes). For 32-bit programs (i.e. Win32), INT defaults as a LONG (4 bytes). Perhaps with 64-bit programs, INT is defined as DOUBLE/LONG LONG (8 bytes)???
(not a C/C++ buff)

not so much the 'current mode' (as most CPUs don't have 'modes') but whatever the native register size is for the target machine

in long mode, int is usually still 32 bits, since the default operand size in long mode is still 32 bits (64-bit operations require a REX prefix)
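
Since the width of `int` depends on the compiler and target, code that lays out hardware structures such as page table entries is usually safer with the fixed-width types from `<stdint.h>`. A minimal sketch with hypothetical helper names:

```c
#include <limits.h>
#include <stdint.h>

/* Width of plain int on whatever target this is compiled for.
   The C standard only guarantees at least 16 bits. */
unsigned int_bits(void)
{
    return (unsigned)(sizeof(int) * CHAR_BIT);
}

/* Width of uint32_t, which is exactly 32 bits wherever the type exists. */
unsigned fixed_width_bits(void)
{
    return (unsigned)(sizeof(uint32_t) * CHAR_BIT);
}
```

On the common x86 and x86-64 ABIs both functions return 32, but only the second is guaranteed to.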