Weird page fault triggering bug...
Posted: Fri Mar 17, 2023 3:33 pm
by Matt1223
So I'm having this very strange bug that triggers a page fault. Apparently, these are the instructions that trigger the page fault:
Code: Select all
(0) [0x000000402f42] 0008:0000000000402f42 (unk. ctxt): add eax, 0x2345e064 ; 0564e04523
<bochs:8> s
Next at t=1451281218
(0) [0x000000402f47] 0008:0000000000402f47 (unk. ctxt): add dword ptr ds:[eax], eax ; 0100
It makes sense, as 0x2345e064 is not a valid address. The problem, however, is that when I look at my kernel.exe in IDA, it shows the following instructions at the same spot:
Code: Select all
.text:00402F42 add eax, offset word_40E064
.text:00402F47 movzx eax, word ptr [eax]
It looks like the code is being modified while the OS is running...
What's more interesting: I started having this issue after changing this line of code:
Code: Select all
terminal_print(term, "\tAvailable memory: %uMB\n", memory_get_available()/1024/1024);
to this one:
Code: Select all
terminal_print(term, "\tAvailable memory: %uMB\n", memory_get_available()/1000000);
in a function that has absolutely nothing to do with the function that triggers the page fault.
I have absolutely no idea what is going on here. Can you help me understand it?
Re: Weird page fault triggering bug...
Posted: Fri Mar 17, 2023 3:44 pm
by Octocontrabass
You've got a pointer somewhere that points to something it shouldn't, and at some point something uses that pointer to write the value 0x12345 into memory.
It's impossible to narrow it down further without more information.
Re: Weird page fault triggering bug...
Posted: Fri Mar 17, 2023 3:50 pm
by Matt1223
Octocontrabass wrote:You've got a pointer somewhere that points to something it shouldn't, and at some point something uses that pointer to write the value 0x12345 into memory.
It's impossible to narrow it down further without more information.
Is there a way to make bochsdbg inform me when memory at a particular address gets modified?
Re: Weird page fault triggering bug...
Posted: Fri Mar 17, 2023 3:54 pm
by Octocontrabass
Re: Weird page fault triggering bug...
Posted: Fri Mar 17, 2023 4:17 pm
by Matt1223
So it looks like this address is written to only once, when the kernel is being loaded by the bootloader, and it's already wrong. Maybe there is a bug in my bootloader's load_kernel function:
Code: Select all
load_kernel: ; ret eax = entry point , edx = ImageBase, edi = SizeOfImage
mov eax, [0x00010000+0x3C];PE offset
xor ecx, ecx
mov ecx, [0x00010000+eax+0x50] ;SizeOfImage
push ecx; push SizeOfImage
xor ecx, ecx
mov cx, [0x00010000+eax+6] ;number of section
mov edx, [0x00010000+eax+0x34] ;image base
push edx ; push ImageBase
xor ebx, ebx
mov bx, [0x00010000+eax+0x14];optional Header Size
add ebx, 0x00010000+0x18
add ebx, eax; ebx - section table
mov eax,[0x00010000+eax+0x28]
add eax, edx ;eax - AddressOfEntryPoint
.l1:
push ecx
mov ecx, [ebx+0x10];SizeOfRawData
mov edi, [ebx+0xC];VirtualAddress
mov esi, [ebx+0x14];PointerToRawData
add edi, edx
add esi, 0x00010000
cld
rep movsb
add ebx, 0x28
pop ecx
cmp ecx, 5
loop .l1
pop edx ; edx = ImageBase
pop edi ; edi = SizeOfImage
ret
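For readers who find the assembly hard to follow, here is a rough C sketch of the same section-copy loop, using the header offsets that appear in load_kernel. The names (`pe_info`, `load_sections`, `rd16`, `rd32`) are invented for this illustration. `image[0]` stands in for the address ImageBase. The bounds checks are an addition that the assembly does not have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Little-endian reads, so the sketch does not depend on host alignment. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Hypothetical result type: the three values load_kernel returns. */
struct pe_info {
    uint32_t entry_point;   /* ImageBase + AddressOfEntryPoint (eax) */
    uint32_t image_base;    /* ImageBase (edx) */
    uint32_t size_of_image; /* SizeOfImage (edi) */
};

/* Copy each section's raw data from the file buffer to image + VirtualAddress.
 * Returns 0 on success, -1 if any offset falls outside the buffers. */
static int load_sections(const uint8_t *file, size_t file_len,
                         uint8_t *image, size_t image_len,
                         struct pe_info *out)
{
    assert(file && image && out);
    if (file_len < 0x40)
        return -1;

    uint32_t pe = rd32(file + 0x3C);              /* offset of the PE header */
    if (pe > file_len || file_len - pe < 0x54)
        return -1;

    const uint8_t *hdr = file + pe;
    uint16_t nsect    = rd16(hdr + 0x06);         /* NumberOfSections */
    uint16_t opt_size = rd16(hdr + 0x14);         /* SizeOfOptionalHeader */
    out->image_base    = rd32(hdr + 0x34);
    out->size_of_image = rd32(hdr + 0x50);
    out->entry_point   = out->image_base + rd32(hdr + 0x28);

    size_t table = (size_t)pe + 0x18 + opt_size;  /* start of section table */
    for (uint16_t i = 0; i < nsect; i++) {
        size_t s = table + (size_t)i * 0x28;      /* each entry is 0x28 bytes */
        if (s + 0x28 > file_len)
            return -1;

        uint32_t va       = rd32(file + s + 0x0C); /* VirtualAddress */
        uint32_t raw_size = rd32(file + s + 0x10); /* SizeOfRawData */
        uint32_t raw_off  = rd32(file + s + 0x14); /* PointerToRawData */
        if (raw_off > file_len || raw_size > file_len - raw_off)
            return -1;
        if (va > image_len || raw_size > image_len - va)
            return -1;

        memcpy(image + va, file + raw_off, raw_size);
    }
    return 0;
}
```

Note that the loop only reads from the file buffer and writes to the image, so on its own it has no reason to change the bytes at 0x10000 and above.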
Re: Weird page fault triggering bug...
Posted: Fri Mar 17, 2023 4:29 pm
by Octocontrabass
That function is supposed to copy the kernel from one location to another, right?
Have you checked to see if it's already corrupt before the copy happens?
Re: Weird page fault triggering bug...
Posted: Fri Mar 17, 2023 5:03 pm
by Matt1223
Octocontrabass wrote:That function is supposed to copy the kernel from one location to another, right?
Right. The kernel.exe file is present at 0x10000.
Octocontrabass wrote:Have you checked to see if it's already corrupt before the copy happens?
I'm working on it right now. There are some interesting things happening. It looks like the load_kernel function corrupts the file it is supposed to be copying. Before the copy, the instruction in the file buffer is still correct:
Code: Select all
<bochs:14> disasm 0x00000012342
0000000000012342: ( ): add eax, 0x0040e064 ; 0564e04000
Then, after load_kernel is called, this happens:
Code: Select all
<bochs:17> disasm 0x00000012342
0000000000012342: ( ): add eax, 0x2345e064 ; 0564e04523
<bochs:18> disasm 0x000000402f42
0000000000402f42: ( ): add eax, 0x2345e064 ; 0564e04523
The code that was meant to be simply copied was changed, and the changed code was then copied to 0x402f42.
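The two byte dumps above can be compared directly to see exactly where the corruption starts. A small C sketch follows, with helper names (`imm32_le`, `first_diff`, `first_changed_address`) made up for this illustration. It assumes the opcode byte 0x05 sits at 0x12342, as the disasm output suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes shown by "disasm 0x12342" before and after load_kernel ran. */
static const uint8_t before_copy[5] = {0x05, 0x64, 0xe0, 0x40, 0x00};
static const uint8_t after_copy[5]  = {0x05, 0x64, 0xe0, 0x45, 0x23};

/* Decode the little-endian imm32 that follows opcode 0x05 (add eax, imm32). */
static uint32_t imm32_le(const uint8_t *insn)
{
    assert(insn[0] == 0x05);
    return (uint32_t)insn[1] | (uint32_t)insn[2] << 8 |
           (uint32_t)insn[3] << 16 | (uint32_t)insn[4] << 24;
}

/* Index of the first byte that differs, or -1 if the buffers are equal. */
static int first_diff(const uint8_t *a, const uint8_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i])
            return (int)i;
    return -1;
}

/* Address in the file buffer of the first corrupted byte. */
static uint32_t first_changed_address(void)
{
    int i = first_diff(before_copy, after_copy, sizeof before_copy);
    assert(i >= 0);
    return 0x12342u + (uint32_t)i;
}
```

The immediates decode to 0x0040e064 and 0x2345e064, and the first changed byte is at 0x12345, so whatever is writing to memory starts at that address and stores the bytes 45 23.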
Re: Weird page fault triggering bug...
Posted: Fri Mar 17, 2023 5:19 pm
by Matt1223
I've found the problem!
So before loading the kernel, I was enabling A20 with the following function:
Code: Select all
enable_a20:
;set a20 http://wiki.osdev.org/A20 (Fast A20 Gate)
pushad ;test a20
mov edi,0x112345 ;odd megabyte address.
mov esi,0x012345 ;even megabyte address.
mov [esi],esi ;making sure that both addresses contain different values.
mov [edi],edi ;(if A20 line is cleared the two pointers would point to the address 0x012345 that would contain 0x112345 (edi))
cmpsd ;compare addresses to see if they're equivalent.
popad
jne A20_on ;if not equivalent , A20 line is set.
in al, 0x92 ;enable a20
test al, 2
jnz A20_on
or al, 2
and al, 0xFE
out 0x92, al
A20_on:
ret
Code: Select all
mov edi,0x112345 ;odd megabyte address.
mov esi,0x012345 ;even megabyte address.
mov [esi],esi ;making sure that both addresses contain different values.
mov [edi],edi ;(if A20 line is cleared the two pointers would point to the address 0x012345 that would contain 0x112345 (edi))
This looks suspicious, right?
What a stupid bug to make!
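To spell out why those two stores are the culprit: the test writes 4 bytes at 0x012345, which lies inside the kernel.exe image sitting at 0x10000, so it lands in the middle of the `add eax, imm32` at 0x12342. Below is a minimal C simulation of that, plus one possible way to avoid it by saving and restoring the probed dwords. This is only a sketch, not necessarily the fix the poster applied. The `machine` struct and all function names are hypothetical, and A20 wrap-around is modelled simply by masking bit 20 of the address.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_SIZE  0x200000u   /* 2 MiB of simulated physical memory */
#define LOW_ADDR  0x012345u   /* "even megabyte" probe address */
#define HIGH_ADDR 0x112345u   /* "odd megabyte" probe address */

struct machine {
    uint8_t mem[MEM_SIZE];
    int a20_enabled;
};

/* With A20 disabled, address bit 20 is forced to 0 (1 MiB wrap-around). */
static uint32_t phys(const struct machine *m, uint32_t addr)
{
    if (!m->a20_enabled)
        addr &= ~(1u << 20);
    assert(addr <= MEM_SIZE - 4);
    return addr;
}

static void store32(struct machine *m, uint32_t addr, uint32_t v)
{
    uint32_t p = phys(m, addr);
    for (int i = 0; i < 4; i++)
        m->mem[p + i] = (uint8_t)(v >> (8 * i));   /* little-endian */
}

static uint32_t load32(const struct machine *m, uint32_t addr)
{
    uint32_t p = phys(m, addr), v = 0;
    for (int i = 0; i < 4; i++)
        v |= (uint32_t)m->mem[p + i] << (8 * i);
    return v;
}

/* Same idea as the test in enable_a20: clobbers both probe addresses.
 * Returns 1 if A20 appears to be on, 0 if the addresses alias. */
static int a20_test_destructive(struct machine *m)
{
    store32(m, LOW_ADDR, LOW_ADDR);
    store32(m, HIGH_ADDR, HIGH_ADDR);
    return load32(m, LOW_ADDR) != load32(m, HIGH_ADDR);
}

/* Variant that puts the original contents back after probing. */
static int a20_test_preserving(struct machine *m)
{
    uint32_t saved_low  = load32(m, LOW_ADDR);
    uint32_t saved_high = load32(m, HIGH_ADDR);
    int on = a20_test_destructive(m);
    store32(m, HIGH_ADDR, saved_high);
    store32(m, LOW_ADDR, saved_low);
    return on;
}
```

If the buffer holds 05 64 e0 40 00 at 0x12342 and A20 is already on, the destructive test leaves 05 64 e0 45 23 01 00 there. That is exactly the `add eax, 0x2345e064` followed by the `0100` (`add dword ptr ds:[eax], eax`) from the first post. The 01 byte also suggests A20 was already enabled when the test ran, since otherwise the second store would have left 0x112345 (45 23 11 00) at that address. Another option is to probe an address pair that is known not to hold loaded data.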