Instruction page faults appear when more code is linked

NickJohnson
Member
Posts: 1249
Joined: Tue Mar 24, 2009 8:11 pm
Location: Sunnyvale, California

Post by NickJohnson »

After playing around for a while with various tutorial code, I've decided to truly start writing my OS. However, I ran into a really strange problem right off the bat. I have a ~70 line piece of NASM code that sets up segmentation and paging, and moves the kernel into the higher half. By itself, it works flawlessly. But when I link some other code to it (about 50 lines of C, just some string manipulation stuff), it page faults on an instruction fetch right after paging is enabled. *None of that C code is executed at all*, and the page fault happens while everything is still being fetched from 0x00000000 - 0x00400000. It uses the same 4 MB page trick as the higher half barebones guide on the wiki. If I don't link that code, it works again. I checked the binary, and no symbol lies above roughly 0xF810B000 (0x0010B000 physical). Any ideas about what could be causing the problem?

Here is the file (boot.s):

; Multiboot stuff
MODULEALIGN equ 1<<0
MEMINFO     equ 1<<1
FLAGS       equ MODULEALIGN | MEMINFO
MAGIC       equ 0x1BADB002
CHECKSUM    equ -(MAGIC + FLAGS)

section .data

; Initial kernel address space
global init_kmap
align 0x1000
init_kmap:
    dd 0x00000083       ; Identity map first 4 MB
    times 991 dd 0      ; Fill until 0xF8000000
    dd 0x00000083       ; Map first 4 MB again in higher mem
    times 31 dd 0       ; Fill remainder of map

STACKSIZE equ 0x2000
init_stack:
align 4
    times STACKSIZE dd 0

gdt:
align 4
    dd 0x00000000, 0x00000000
    dd 0x0000FFFF, 0x00CF9A00
    dd 0x0000FFFF, 0x00CF9200
    dd 0x0000FFFF, 0x00CFFA00
    dd 0x0000FFFF, 0x00CFF200

gdt_ptr:
align 4
    dw 0x0027               ; 39 bytes limit
    dd gdt - 0xF8000000     ; Points to physical GDT

section .text

; Multiboot header
MultiBootHeader:
align 4
    dd MAGIC
    dd FLAGS
    dd CHECKSUM

global start
start:
    mov ecx, gdt_ptr - 0xF8000000   ; Load (real) GDT pointer
    lgdt [ecx]                      ; Load new GDT
    mov ecx, 0x10                   ; Load all kernel data segments
    mov ds, cx
    mov es, cx
    mov fs, cx
    mov gs, cx
    mov ss, cx
    mov ecx, init_kmap - 0xF8000000 ; Get physical address of the kernel address space
    mov cr3, ecx                    ; Load address into CR3
    mov ecx, cr4
    mov edx, cr0
    or ecx, 0x00000010              ; Set 4MB page flag
    or edx, 0x80000000              ; Set paging flag
    mov cr4, ecx                    ; Return to CR4 (make 4MB pgs)
    mov cr0, edx                    ; Return to CR0 (start paging)
    jmp 0x08:.upper                 ; Jump to the higher half (in new code segment)

.upper:
    mov esp, (init_stack + STACKSIZE)   ; Setup init stack
    mov ebp, (init_stack + STACKSIZE)   ; and base pointer
    push eax                            ; Push multiboot magic number
    add ebx, 0xF8000000                 ; Make multiboot pointer virtual
    push ebx                            ; Push multiboot pointer
    hlt                                 ; Halt for now

By the way, the kernel is mapped so far up because it only really handles the allocation of page tables; the block caches will live in various usermode driver processes. There's enough space for the page tables of 10,000+ processes, which is plenty for me.
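For readers checking the `times 991` and `times 31` counts in init_kmap, here is a minimal host-side sketch in C of the same layout. It assumes 4 MB pages, so bits 31..22 of a virtual address pick the directory slot. The names `pde_index`, `build_init_kmap`, and the `PDE_*` macros are made up for illustration and do not appear in the original code.

```c
#include <assert.h>
#include <stdint.h>

#define KERNEL_BASE  0xF8000000u  /* virtual base used in boot.s */
#define PDE_PRESENT  0x001u       /* bit 0 */
#define PDE_WRITABLE 0x002u       /* bit 1 */
#define PDE_4MB      0x080u       /* bit 7: entry maps a 4 MB page */

/* With 4 MB pages, bits 31..22 of a virtual address select the entry. */
static inline uint32_t pde_index(uint32_t vaddr)
{
    return vaddr >> 22;
}

/* Fill dir[] the way init_kmap is laid out: both mappings point at
 * physical address 0. (Hypothetical helper, host-side only.) */
void build_init_kmap(uint32_t dir[1024])
{
    const uint32_t entry = 0x00000000u | PDE_PRESENT | PDE_WRITABLE | PDE_4MB;

    assert(entry == 0x83u);               /* same constant as in boot.s */

    for (uint32_t i = 0; i < 1024; i++)
        dir[i] = 0;
    dir[pde_index(0x00000000u)] = entry;  /* identity map, slot 0 */
    dir[pde_index(KERNEL_BASE)] = entry;  /* higher half, slot 992 */
}
```

Slot 992 (0x3E0) for 0xF8000000 means one entry, 991 zeros, one entry, and 31 zeros, which adds up to the 1024 entries of a page directory.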
xenos
Member
Posts: 1121
Joined: Thu Aug 11, 2005 11:00 pm
Libera.chat IRC: xenos1984
Location: Tartu, Estonia

Post by xenos »

It would probably help to see 1. your linker script (or the linker command line, in case you don't use a linker script), 2. the faulting address (i.e. the contents of CR2 after the page fault, which can be found in the Bochs log file), and maybe also 3. a linker map. It looks like linking with the C code somehow messes up your page directory...
Programmers' Hardware Database // GitHub user: xenos1984; OS project: NOS

Post by NickJohnson »

Here's the linker script:

ENTRY(start)
OUTPUT_FORMAT(elf32-i386)

SECTIONS {
    . = 0xF8100000;

    .text : AT(ADDR(.text) - 0xF8000000) {
        *(.text)
        *(.rodata*)
    }

    .data ALIGN (0x1000) : AT(ADDR(.data) - 0xF8000000) {
        *(.data)
    }

    .bss : AT(ADDR(.bss) - 0xF8000000) {
        _sbss = .;
        *(COMMON)
        *(.bss)
        _ebss = .;
    }

    end = .; _end = .; __end = .;
}

The value of CR2 after the fault was 0x00000040, but EIP was 0x00100333, which is where it should be. The output does say "(instruction unavailable) page not present", though.

And this is the output of readelf -s for the kernel binary:

with code linked:

Symbol table '.symtab' contains 44 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: f8100000     0 SECTION LOCAL  DEFAULT    1
     2: f8101000     0 SECTION LOCAL  DEFAULT    2
     3: f810a038     0 SECTION LOCAL  DEFAULT    3
     4: 00000000     0 SECTION LOCAL  DEFAULT    4
     5: 00000000     0 FILE    LOCAL  DEFAULT  ABS std.c
     6: 00000000     0 FILE    LOCAL  DEFAULT  ABS string.c
     7: 00000000     0 FILE    LOCAL  DEFAULT  ABS printk.c
     8: f8101000     4 OBJECT  LOCAL  DEFAULT    2 video_mem
     9: f810a038     2 OBJECT  LOCAL  DEFAULT    3 c_base
    10: f810a03a     2 OBJECT  LOCAL  DEFAULT    3 cursor
    11: f8101004     1 OBJECT  LOCAL  DEFAULT    2 attr
    12: f8100108   149 FUNC    LOCAL  DEFAULT    1 scroll
    13: f810019d   259 FUNC    LOCAL  DEFAULT    1 cwrite
    14: 00000000     0 FILE    LOCAL  DEFAULT  ABS boot.s
    15: 00000001     0 NOTYPE  LOCAL  DEFAULT  ABS MODULEALIGN
    16: 00000002     0 NOTYPE  LOCAL  DEFAULT  ABS MEMINFO
    17: 00000003     0 NOTYPE  LOCAL  DEFAULT  ABS FLAGS
    18: 1badb002     0 NOTYPE  LOCAL  DEFAULT  ABS MAGIC
    19: e4524ffb     0 NOTYPE  LOCAL  DEFAULT  ABS CHECKSUM
    20: f8000000     0 NOTYPE  LOCAL  DEFAULT  ABS KERNEL_VIRTUAL_BASE
    21: 000003e0     0 NOTYPE  LOCAL  DEFAULT  ABS KERNEL_PAGE_NUMBER
    22: 00002000     0 NOTYPE  LOCAL  DEFAULT  ABS STACKSIZE
    23: f8102008     0 NOTYPE  LOCAL  DEFAULT    2 init_stack
    24: f810a008     0 NOTYPE  LOCAL  DEFAULT    2 gdt
    25: f810a030     0 NOTYPE  LOCAL  DEFAULT    2 gdt_ptr
    26: f81002f0     0 NOTYPE  LOCAL  DEFAULT    1 MultiBootHeader
    27: f810033a     0 NOTYPE  LOCAL  DEFAULT    1 start.upper
    28: 00000000     0 FILE    LOCAL  DEFAULT  ABS main.c
    29: f81000a1    64 FUNC    GLOBAL DEFAULT    1 strcpy
    30: f810001d    26 FUNC    GLOBAL DEFAULT    1 inb
    31: f810a038     0 NOTYPE  GLOBAL DEFAULT    3 _sbss
    32: f8100038    54 FUNC    GLOBAL DEFAULT    1 memcpy
    33: f8101008     0 NOTYPE  GLOBAL DEFAULT    2 init_kmap
    34: f810a03c     0 NOTYPE  GLOBAL DEFAULT    3 _ebss
    35: f810a03c     0 NOTYPE  GLOBAL DEFAULT  ABS end
    36: f810a03c     0 NOTYPE  GLOBAL DEFAULT  ABS __end
    37: f8100350    13 FUNC    GLOBAL DEFAULT    1 init
    38: f8100000    29 FUNC    GLOBAL DEFAULT    1 outb
    39: f810006e    51 FUNC    GLOBAL DEFAULT    1 memset
    40: f810a03c     0 NOTYPE  GLOBAL DEFAULT  ABS _end
    41: f81002fc     0 NOTYPE  GLOBAL DEFAULT    1 start
    42: f81000e1    36 FUNC    GLOBAL DEFAULT    1 strlen
    43: f81002a0    67 FUNC    GLOBAL DEFAULT    1 cleark

without code linked:


Symbol table '.symtab' contains 25 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: f8100000     0 SECTION LOCAL  DEFAULT    1
     2: f8101000     0 SECTION LOCAL  DEFAULT    2
     3: 00000000     0 SECTION LOCAL  DEFAULT    3
     4: 00000000     0 FILE    LOCAL  DEFAULT  ABS boot.s
     5: 00000001     0 NOTYPE  LOCAL  DEFAULT  ABS MODULEALIGN
     6: 00000002     0 NOTYPE  LOCAL  DEFAULT  ABS MEMINFO
     7: 00000003     0 NOTYPE  LOCAL  DEFAULT  ABS FLAGS
     8: 1badb002     0 NOTYPE  LOCAL  DEFAULT  ABS MAGIC
     9: e4524ffb     0 NOTYPE  LOCAL  DEFAULT  ABS CHECKSUM
    10: 00002000     0 NOTYPE  LOCAL  DEFAULT  ABS STACKSIZE
    11: f8102000     0 NOTYPE  LOCAL  DEFAULT    2 init_stack
    12: f810a000     0 NOTYPE  LOCAL  DEFAULT    2 gdt
    13: f810a028     0 NOTYPE  LOCAL  DEFAULT    2 gdt_ptr
    14: f8100000     0 NOTYPE  LOCAL  DEFAULT    1 MultiBootHeader
    15: f810004a     0 NOTYPE  LOCAL  DEFAULT    1 start.upper
    16: 00000000     0 FILE    LOCAL  DEFAULT  ABS main.c
    17: f810a030     0 NOTYPE  GLOBAL DEFAULT    2 _sbss
    18: f8101000     0 NOTYPE  GLOBAL DEFAULT    2 init_kmap
    19: f810a030     0 NOTYPE  GLOBAL DEFAULT    2 _ebss
    20: f810a030     0 NOTYPE  GLOBAL DEFAULT  ABS end
    21: f810a030     0 NOTYPE  GLOBAL DEFAULT  ABS __end
    22: f8100060    13 FUNC    GLOBAL DEFAULT    1 init
    23: f810a030     0 NOTYPE  GLOBAL DEFAULT  ABS _end
    24: f810000c     0 NOTYPE  GLOBAL DEFAULT    1 start

Post by NickJohnson »

Aha! I noticed that the symbol "start" moves from 0xF810000C to 0xF81002FC once the C code is linked in (see the readelf output above). I switched the order of the arguments to the linker so that boot.o comes first, and that fixed the problem. However, I still don't know why the original problem happened at all. For future reference, it would be really nice to know...

Post by xenos »

Exactly what I expected ;) Look at these symbols:

     8: f8101000     4 OBJECT  LOCAL  DEFAULT    2 video_mem
    11: f8101004     1 OBJECT  LOCAL  DEFAULT    2 attr
    33: f8101008     0 NOTYPE  GLOBAL DEFAULT    2 init_kmap

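If you put your C file first, the global variables defined in that file are placed at the beginning of your page-aligned data section. This pushes the page directory 8 bytes forward, so it is no longer page aligned. CR3 only takes the upper 20 bits of the directory's physical address, so the CPU still reads the directory from 0x00101000 and every entry ends up two slots away from where you intended it. Changing the order of the object files brings init_kmap back to its page-aligned location.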
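
To make the effect of the 8-byte shift concrete, here is a small host-side sketch in C that uses only the addresses from the symbol table above. It relies on the fact that, with 32-bit paging, only bits 31..12 of CR3 form the directory base. The helper names (`cr3_directory_base`, `slot_seen_by_cpu`, `check_misaligned_case`) are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET_MASK 0xFFFu

/* Physical address the MMU uses as the page directory base:
 * the low 12 bits of CR3 are not part of the address. */
static inline uint32_t cr3_directory_base(uint32_t cr3)
{
    return cr3 & ~PAGE_OFFSET_MASK;
}

/* Slot at which the CPU sees the entry that the code stored as
 * element i of a table starting at physical address table_phys. */
static inline uint32_t slot_seen_by_cpu(uint32_t table_phys, uint32_t i)
{
    return (table_phys & PAGE_OFFSET_MASK) / 4u + i;
}

/* Numbers from the "with code linked" symbol table. */
void check_misaligned_case(void)
{
    const uint32_t init_kmap_phys = 0xF8101008u - 0xF8000000u;

    assert(init_kmap_phys == 0x00101008u);
    assert(cr3_directory_base(init_kmap_phys) == 0x00101000u);
    assert(slot_seen_by_cpu(init_kmap_phys, 0) == 2u);     /* identity map */
    assert(slot_seen_by_cpu(init_kmap_phys, 992) == 994u); /* higher half  */
}
```

Under these assumptions the identity mapping ends up in slot 2 (covering 0x00800000 upward) and the higher-half mapping in slot 994 (0xF8800000 upward). Slot 0, which is the one used for EIP 0x00100333, contains whatever bytes video_mem holds rather than the 0x83 entry.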

I suggest placing init_kmap into a separate section and page align this section in the linker script.
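
The suggestion above is a change to the assembly file and the linker script. As a complementary sketch only (a different technique, not the section-based fix itself), the alignment requirement can also be stated and checked in C. This assumes a C11 compiler; `init_kmap_c` and `is_page_aligned` are hypothetical names, and the array is a stand-in for the directory defined in boot.s.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Hypothetical C stand-in for init_kmap. The alignment request is
 * recorded in the object file, so the linker has to honour it no
 * matter which object file comes first on the command line. */
alignas(4096) uint32_t init_kmap_c[1024];

/* Returns 1 if p lies on a 4 KB boundary, 0 otherwise. */
int is_page_aligned(const void *p)
{
    return ((uintptr_t)p & (PAGE_SIZE - 1u)) == 0;
}

/* A check like this could run before the address is loaded into CR3. */
void check_page_directory(const void *dir)
{
    assert(is_page_aligned(dir));
}
```

In a freestanding kernel there is no `assert.h`, so the check would have to be replaced by whatever panic or halt routine the kernel provides.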