Invalid page fault, caused by what?

giszo
Member
Posts: 124
Joined: Tue Nov 06, 2007 2:37 pm
Location: Hungary

Invalid page fault, caused by what?

Post by giszo »

Hi,

I have a really strange invalid page fault. It happens when I start one of the applications in my OS. What makes it stranger is that it happens only about once in 15-20 tries, so it appears to be random.

The information I got from my page fault handler is the following:

Code:

Invalid page fault at 0x4000b160 (no region for address)
Error code: 4                                                         
EAX=0 EBX=0 ECX=0 EDX=0                                               
ESI=0 EDI=0                                                           
EBP=0                                                                 
CS:EIP=1b:40000150                                                    
SS:ESP=23:c0007fae                                                    
Process: taskbar thread: main                                         
Memory context dump:                                                  
  region count: 3                                                     
  region #0                                                           
    id: 70 name: ro                                                   
    start: 40000000 size: 20480                                       
    flags: 1 alloc method: 1                                          
  region #1                                                           
    id: 71 name: rw                                                   
    start: 40005000 size: 8192                                        
    flags: 3 alloc method: 1
  region #2
    id: 72 name: stack
    start: c0000000 size: 32768
    flags: 13 alloc method: 2
Data at EIP:
   e8 3b 35 0 0 eb fe 90 90 90 90 90 90 90 90 90

The information above comes from the stack frame that was pushed on entry to the page fault handler.

As you can see, the fault was raised because the thread tried to access the data at address 0x4000b160. The memory content at EIP is also dumped, and it matches what I see in the objdump output. The actual instruction there is the following:

Code:

40000150 <_start>:
40000150:       e8 3b 35 00 00          call   40003690 <__libc_start_main>

The strange thing is that this is a call instruction. The stack pointer is valid, and its address is mapped, as the memory context dump shows.

Now my question is the following: what the hell caused the page fault?

Thanks,
giszo
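
The addresses in the dump above can be checked against the three regions with a small lookup. This is only a minimal sketch in C: `struct region`, `find_region` and `check_dump_addresses` are hypothetical names chosen for illustration, and only the start addresses and sizes are taken from the dump. Under those assumptions, EIP and ESP each fall inside a region, while the reported fault address 0x4000b160 falls inside none, which matches the "no region for address" message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region descriptor; start and size are copied from the dump. */
struct region {
    const char *name;
    uint32_t start;
    uint32_t size;
};

static const struct region regions[] = {
    { "ro",    0x40000000u, 20480u },
    { "rw",    0x40005000u,  8192u },
    { "stack", 0xc0000000u, 32768u },
};

/* Returns the region that contains addr, or NULL if no region covers it. */
const struct region *find_region(uint32_t addr)
{
    for (size_t i = 0; i < sizeof regions / sizeof regions[0]; i++) {
        const struct region *r = &regions[i];
        /* addr >= start is tested first, so addr - start cannot wrap. */
        if (addr >= r->start && addr - r->start < r->size)
            return r;
    }
    return NULL;
}

/* Usage: the three addresses that appear in the dump. */
void check_dump_addresses(void)
{
    assert(find_region(0x40000150u) == &regions[0]); /* EIP: inside "ro" */
    assert(find_region(0xc0007faeu) == &regions[2]); /* ESP: inside "stack" */
    assert(find_region(0x4000b160u) == NULL);        /* fault address: unmapped */
}
```

The "ro" and "rw" regions together end at 0x40007000, so the fault address lies well past the end of the loaded image.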
AJ
Member
Posts: 2646
Joined: Sun Oct 22, 2006 7:01 am
Location: Devon, UK

Re: Invalid page fault, caused by what?

Post by AJ »

Hi,

From the error code, the PFE was raised by a user-mode read of a non-present page: error code 4 has only the user bit set, with the present and write bits clear. That ties in with your context dump, where no region covers the fault address. Oddly, this looks like a data read, not an instruction fetch. This would make sense if you use lazy loading for your processes and something is trying to load executable code into that region, perhaps?

Cheers,
Adam
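
For reference, the error code that the CPU pushes on a page fault is a bit field, and decoding it bit by bit is a quick sanity check. The following is a minimal sketch in C that covers only the three lowest bits as the x86 architecture defines them; the `PF_*` macros, `struct pf_info`, `pf_decode` and `check_dump_error_code` are hypothetical names, not taken from any kernel in this thread. For the error code 4 in the dump, it yields a user-mode read of a page that was not present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Low bits of the x86 page-fault error code. */
#define PF_PRESENT (1u << 0) /* 0: page not present, 1: protection violation */
#define PF_WRITE   (1u << 1) /* 0: read access,      1: write access */
#define PF_USER    (1u << 2) /* 0: supervisor mode,  1: user mode */

/* Hypothetical decoded form of the error code. */
struct pf_info {
    bool protection; /* true if the page was present (protection fault) */
    bool write;      /* true if the access was a write */
    bool user;       /* true if the access came from user mode */
};

/* Splits the raw error code into its three low flags. */
struct pf_info pf_decode(uint32_t error_code)
{
    struct pf_info info = {
        .protection = (error_code & PF_PRESENT) != 0,
        .write      = (error_code & PF_WRITE) != 0,
        .user       = (error_code & PF_USER) != 0,
    };
    return info;
}

/* Usage: the error code reported in the dump. */
void check_dump_error_code(void)
{
    struct pf_info info = pf_decode(4);
    assert(info.user);        /* access came from user mode */
    assert(!info.write);      /* it was a read */
    assert(!info.protection); /* the page was not present */
}
```

A write to a read-only page from user mode would instead show up as error code 7 (present, write and user bits all set).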

Re: Invalid page fault, caused by what?

Post by giszo »

I'm using lazy binary loading, yes. But two things make me think the problem is not in the lazy loader:
  • The fault address is not page aligned. If the lazy loader caused it, it would happen at the beginning of a page and not in the middle.
  • If the error were in the lazy loader, the EIP value would point somewhere into the kernel, wouldn't it? :)
giszo

Re: Invalid page fault, caused by what?

Post by giszo »

I think I found the problem :D

The problem was that I didn't save the value of the CR2 register across context switches. If a context switch happened between the entry of the PF handler and the point where I read CR2, another thread could fault in the meantime and overwrite CR2. When the interrupted thread was scheduled again, it read the stale CR2 value left by the other thread and was killed with that invalid page fault.

giszo
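
The race described above can be replayed in a small simulation. This is a sketch under stated assumptions, not the kernel's actual code: `cpu_cr2` stands in for the real CR2 register, and `struct thread`, `context_switch`, `replay_race` and `check_race` are hypothetical names. The first fault address is an illustrative value; only 0x4000b160 comes from the dump, used here as the other thread's fault address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated CR2: a single register shared by every thread on the CPU. */
static uint32_t cpu_cr2;

struct thread {
    uint32_t saved_cr2; /* CR2 value stored while the thread is switched out */
};

static struct thread *current;

/* The CPU writes the faulting address to CR2 before entering the handler. */
static void cpu_raise_page_fault(uint32_t addr)
{
    cpu_cr2 = addr;
}

/* Switches threads; with save_cr2 == false, CR2 leaks from one to the next. */
static void context_switch(struct thread *next, bool save_cr2)
{
    if (save_cr2) {
        current->saved_cr2 = cpu_cr2;
        cpu_cr2 = next->saved_cr2;
    }
    current = next;
}

/*
 * Thread A faults at fault_a and is preempted inside its handler before it
 * reads CR2. Thread B then faults at fault_b. A resumes and reads CR2.
 * Returns the address that A's handler sees.
 */
uint32_t replay_race(uint32_t fault_a, uint32_t fault_b, bool save_cr2)
{
    struct thread a = { 0 }, b = { 0 };
    uint32_t seen;

    current = &a;
    cpu_raise_page_fault(fault_a); /* A enters its page fault handler */
    context_switch(&b, save_cr2);  /* A is preempted before reading CR2 */
    cpu_raise_page_fault(fault_b); /* B takes a fault of its own */
    context_switch(&a, save_cr2);  /* A is scheduled again */
    seen = cpu_cr2;                /* A finally reads CR2 */
    current = NULL;                /* do not keep a pointer to a local */
    return seen;
}

/* Usage: without the save A sees B's address, with it A sees its own. */
void check_race(void)
{
    assert(replay_race(0x40001000u, 0x4000b160u, false) == 0x4000b160u);
    assert(replay_race(0x40001000u, 0x4000b160u, true) == 0x40001000u);
}
```

Saving and restoring CR2 on every switch is the fix described in the post. Another common approach, if the handler is entered through an interrupt gate, is to read CR2 into a local variable at the very start of the handler, before interrupts are re-enabled, so that no switch can happen in between.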