
Page Fault not Activating on Trapping Memory Reference

Posted: Sun Jan 27, 2013 4:27 pm
by zalman
Following the James Molloy OS development tutorial, I implemented paging and frame allocation for my kernel (just a simple placement allocator; no best-fit algorithm, or anything like that yet...)

The page fault handler appears to work: I tested it by firing interrupt 14 (int $0x0E) and saw a message printed on the screen. However, when I try to dereference an illegal address as follows:

Code:

unsigned int *ptr = (unsigned int *)0xA000000;
unsigned int trigger_page_fault = *ptr;    // dereference
This does not work. What could be preventing the page fault from firing?

The dereferenced value at that location is 0xFFFFFFFF (which is just garbage...) and, interestingly enough, the page table entry returned when 0xA0000000 is passed to my get_page() function is 0x0 (meaning that the page is not even present). So why is the pointer being followed and the page fault not triggered?

Re: Page Fault not Activating on Trapping Memory Reference

Posted: Sun Jan 27, 2013 4:37 pm
by iansjack
The dereferenced value at that location is 0xFFFFFFFF (which is just garbage...)
If you can read the data at that address, garbage or not, then it means that you have a page mapped to it (or you have not enabled paging). It's not a problem with your page fault handler; such a fault would likely lead to a triple fault.

Re: Page Fault not Activating on Trapping Memory Reference

Posted: Sun Jan 27, 2013 6:54 pm
by zalman
The problem I was having was that paging was not being enabled properly. The value of control register CR0 was 0x60000011, but after I rewrote the following:

Code:

int flag;
flag = 0x80000000;
__asm__ __volatile__ ("movl %%cr0, %%eax\n\t"
                      "orl %0, %%eax\n\t"
                      "movl %%eax, %%cr0"
                      :: "r" (flag) : "memory");
to:

Code:

__asm__ __volatile__ ("movl %%cr0, %%eax\n\t"
                      "orl $0x80000000, %%eax\n\t"
                      "movl %%eax, %%cr0" : );
I get a triple fault. I was also getting a triple fault with the following:

Code:

int cr0;
__asm__ __volatile__ ("movl %%cr0, %0" : "=r" (cr0));
cr0 |= 0x80000000;
__asm__ __volatile__ ("movl %0, %%cr0" :: "r" (cr0));
Note: I am enabling paging after loading CR3 with the page directory memory address.

What could be the problem?
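
For reference, the CR0 value quoted above can be decoded by hand. This is a minimal sketch that assumes only the architectural bit positions from the Intel manual. The helper names `cr0_paging_enabled` and `cr0_with_paging` are made up for illustration, and they work on plain integers in user space rather than touching the real register.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural CR0 bits that appear in the value 0x60000011. */
#define CR0_PE (1u << 0)   /* protected mode enable */
#define CR0_ET (1u << 4)   /* extension type */
#define CR0_NW (1u << 29)  /* not write-through */
#define CR0_CD (1u << 30)  /* cache disable */
#define CR0_PG (1u << 31)  /* paging enable */

/* Hypothetical helper: returns 1 if this CR0 value has paging turned on. */
static int cr0_paging_enabled(uint32_t cr0)
{
    return (cr0 & CR0_PG) != 0;
}

/* Hypothetical helper: computes the value that would be written back to CR0.
   Setting PG while PE is clear raises #GP on real hardware, so check it. */
static uint32_t cr0_with_paging(uint32_t cr0)
{
    assert(cr0 & CR0_PE);
    return cr0 | CR0_PG;
}
```

With 0x60000011 the PG bit (bit 31) is clear, so paging is off and the read at the unmapped address goes straight to physical memory. OR-ing in 0x80000000 gives 0xE0000011, which is the value CR0 should hold once paging is really enabled.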

Re: Page Fault not Activating on Trapping Memory Reference

Posted: Sun Jan 27, 2013 7:09 pm
by Griwes
What does Bochs triple fault output tell you?

Re: Page Fault not Activating on Trapping Memory Reference

Posted: Sun Jan 27, 2013 7:54 pm
by zalman
Solved:

It was a silly bug. I had PAGE_SIZE declared at the top as 1 << 12, but I needed to put parentheses around that: (1 << 12). So now it works...
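
A short illustration of why the parentheses matter. This is a sketch under assumptions: the thread does not say which expression in the kernel used PAGE_SIZE, so the division and modulo below, and the names `PAGE_SIZE_BAD`, `PAGE_SIZE_GOOD`, `frame_index_*` and `page_offset_*`, are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_BAD  1 << 12     /* unparenthesized, as in the buggy version */
#define PAGE_SIZE_GOOD (1 << 12)   /* parenthesized fix */

/* Expands to addr / 1 << 12, which parses as (addr / 1) << 12. */
static uint32_t frame_index_bad(uint32_t addr)  { return addr / PAGE_SIZE_BAD; }
static uint32_t frame_index_good(uint32_t addr) { return addr / PAGE_SIZE_GOOD; }

/* Expands to addr % 1 << 12, which parses as (addr % 1) << 12, always 0. */
static uint32_t page_offset_bad(uint32_t addr)  { return addr % PAGE_SIZE_BAD; }
static uint32_t page_offset_good(uint32_t addr) { return addr % PAGE_SIZE_GOOD; }

/* Shows the difference for one sample address. */
static void page_size_demo(void)
{
    assert(frame_index_good(0x5000u) == 5u);
    assert(frame_index_bad(0x5000u) == 0x5000000u);
    assert(page_offset_good(0x5123u) == 0x123u);
    assert(page_offset_bad(0x5123u) == 0u);
}
```

Because `/` and `%` bind tighter than `<<`, the unparenthesized macro silently turns a divide into a multiply by 4096, which is enough to corrupt any frame or table index computed from it.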

What's interesting is that a page fault also triggers a general protection fault. Is this even expected?

Re: Page Fault not Activating on Trapping Memory Reference

Posted: Sun Jan 27, 2013 11:50 pm
by Brendan
Hi,
zalman wrote:What's interesting is that a page fault also triggers a general protection fault. Is this even expected?
The general protection fault shouldn't happen.
zalman wrote:The page fault handler works because I tested it by firing interrupt 14 (int $0x0E) and seeing a message printed on the screen; ...
For some exceptions (including the page fault), the CPU pushes an extra error code on the stack before passing control to your interrupt handler. For IRQs and software interrupts there is no error code. If your exception handler works when you do "int $0x0E", then it doesn't take this error code into account (e.g. it doesn't remove the CPU's error code from the stack before doing IRET), and therefore your exception handler won't work for a real exception (in your case, it probably displays dodgy values for "return CS:EIP" and then crashes when it gets to the IRET).
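
The rule above can be written down as a small table. This is a minimal sketch, assuming the classic 32-bit protected mode vectors; the function names are made up for illustration, and newer CPUs define a few more error-code vectors (for example #CP, vector 21) that are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the CPU pushes an error code when this vector is raised
   as a real exception. */
static bool exception_pushes_error_code(uint8_t vector)
{
    switch (vector) {
    case 8:   /* #DF double fault */
    case 10:  /* #TS invalid TSS */
    case 11:  /* #NP segment not present */
    case 12:  /* #SS stack-segment fault */
    case 13:  /* #GP general protection */
    case 14:  /* #PF page fault */
    case 17:  /* #AC alignment check */
        return true;
    default:
        return false;
    }
}

/* Number of 32-bit words the CPU pushes when no privilege change occurs:
   EFLAGS, CS, EIP, plus the error code where applicable.
   "int n" never pushes an error code, whatever the vector. */
static unsigned cpu_pushed_dwords(uint8_t vector, bool is_software_int)
{
    assert(is_software_int || vector < 32);
    if (!is_software_int && exception_pushes_error_code(vector))
        return 4;
    return 3;
}
```

So `int $0x0E` leaves three words on the stack, while a genuine page fault leaves four. A handler that only ever saw the software interrupt will execute IRET one word too early on the real fault.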


Cheers,

Brendan

Re: Page Fault not Activating on Trapping Memory Reference

Posted: Mon Jan 28, 2013 8:22 pm
by zalman
My exception handler is working for real exceptions. For instance, my exception handler for division by zero (int $0x0) is being called when I try the following:

Code:

int i;
i = 500 / 0;
The problem is that all exceptions in the range 0x0 - 0x1F raise a GPF. This includes page fault, NMI, stack fault, double fault, etc. On the other hand, IRQs from 0x20 to 0x2F do not raise exceptions.

I get error code 0x10 with error flags 0x10046 for the general protection fault. I don't have a copy of the Intel manual with me at the moment, so I wouldn't know what these mean, but something is not right.
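
For what it's worth, the two values quoted above can be decoded without the manual at hand. This is a minimal sketch, assuming the standard selector error code layout and EFLAGS bit positions; the struct and function names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a selector-style error code, as pushed for #GP. */
struct selector_error {
    unsigned ext;    /* bit 0: event was external to the program */
    unsigned idt;    /* bit 1: index refers to the IDT */
    unsigned ti;     /* bit 2: 0 = GDT, 1 = LDT (meaningful when idt == 0) */
    unsigned index;  /* bits 3..15: descriptor index */
};

static struct selector_error decode_selector_error(uint32_t code)
{
    struct selector_error e;

    assert(code <= 0xFFFFu);  /* the upper 16 bits are reserved */
    e.ext = code & 1u;
    e.idt = (code >> 1) & 1u;
    e.ti = (code >> 2) & 1u;
    e.index = (code >> 3) & 0x1FFFu;
    return e;
}

/* EFLAGS bits present in, or notably absent from, 0x10046. */
#define EFLAGS_PF (1u << 2)   /* parity flag */
#define EFLAGS_ZF (1u << 6)   /* zero flag */
#define EFLAGS_IF (1u << 9)   /* interrupt enable flag */
#define EFLAGS_RF (1u << 16)  /* resume flag */

static int eflags_interrupts_enabled(uint32_t eflags)
{
    return (eflags & EFLAGS_IF) != 0;
}
```

Error code 0x10 decodes to GDT index 2 with the EXT and IDT bits clear, that is, selector 0x10. In a tutorial-style GDT that is usually the kernel data segment, though that is an assumption about this particular kernel. EFLAGS 0x10046 has RF, ZF and PF set and IF clear, so interrupts were disabled when the fault was taken.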

Re: Page Fault not Activating on Trapping Memory Reference

Posted: Tue Jan 29, 2013 1:33 am
by iansjack
You didn't answer Brendan's question as to whether you are popping the error code from the stack (when there is one) before the iret. It sounds as if you aren't.

Re: Page Fault not Activating on Trapping Memory Reference

Posted: Tue Jan 29, 2013 3:12 am
by Brendan
Hi,
zalman wrote:My exception handler is working for real exceptions. For instance, my exception handler for division by zero (int $0x0) is being called when I try the following:

Code:

int i;
i = 500 / 0;
For the divide error exception the CPU doesn't push an error code onto the exception handler's stack; so if none of your exception handlers remove the error code when they should, then division by zero would still work.
zalman wrote:The problem is that all exceptions in the range 0x0 - 0x1F raise a GPF. This includes page fault, NMI, stack fault, double fault, etc. On the other hand, IRQs from 0x20 to 0x2F do not raise exceptions.
If "all exceptions" don't work, then the divide error exception doesn't work, but you've said the divide error exception does work.

I find it extremely unlikely that you've actually tested half of the exceptions you've listed (especially NMI and stack fault). Instead I assume you're using "int n" software interrupts, but software interrupts are not exceptions and have different behaviour. This is like testing a vehicle to see if it will roll along a road, and then deciding "the vehicle does roll along a road, therefore the vehicle should make a good boat" (and then wondering why the vehicle sinks to the bottom of a river when you try to go fishing).
zalman wrote:I get error code 0x10 with error flags 0x10046 for the general protection fault. I don't have a copy of the Intel manual with me at the moment, so I wouldn't know what these mean, but something is not right.
If you ask the internet nicely, the internet might let you borrow its copy of Intel's manual. ;)


Cheers,

Brendan

Re: Page Fault not Activating on Trapping Memory Reference

Posted: Tue Jan 29, 2013 5:39 pm
by zalman
The general protection fault I was receiving was because I wasn't restoring the segment registers correctly before doing IRET. Okay, so I fixed that.

Now, about popping the error code from the runtime stack: I believe I was taking care of that. I would push the interrupt number (after pushing a dummy error code for interrupts that have none). Then, before doing IRET, I would discard both by adding 8 to ESP: 4 bytes for the error code and another 4 bytes for the interrupt number.
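
The layout described above can be pictured as a C struct. This is a hypothetical view of the top of the stack just before the final adjustment to ESP, assuming a 32-bit kernel with no privilege change; the struct and function names are not taken from the tutorial.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Lowest address first, i.e. the order in which the words would be popped. */
struct isr_stack_tail {
    uint32_t int_no;    /* pushed by the stub */
    uint32_t err_code;  /* pushed by the CPU, or a dummy 0 pushed by the stub */
    uint32_t eip;       /* pushed by the CPU; IRET expects ESP to point here */
    uint32_t cs;        /* pushed by the CPU */
    uint32_t eflags;    /* pushed by the CPU */
};

/* Bytes that must be skipped so that ESP points at the saved EIP. */
static size_t bytes_to_skip_before_iret(void)
{
    size_t skip = offsetof(struct isr_stack_tail, eip);

    assert(skip == 2 * sizeof(uint32_t));
    return skip;
}
```

Pushing a dummy error code for the vectors that have none keeps this layout identical for every handler, which is why a single `add $8, %esp` before IRET is enough.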

Anyway, I believe I had a misconception. "int $0x00" would be considered a software interrupt, but if instead I did the following...

Code:

int i;
i = 5 / 0;
...then this would trigger a hardware interrupt; correct?

Everything is working as expected for traps (they have been working all along, even before I fixed the issue with the segment registers); but, after I made the necessary changes to the assembler code, the hardware interrupts are now being triggered repeatedly. No GPF being raised; it just blocks, waiting for something to be done about the exception. In the tutorial, the author just calls a non-preemptive PANIC. I don't know how else exceptions are supposed to be handled. I guess I'll just read along. Thanks for the help.
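
On the repeated triggering: a divide error is a CPU exception of the fault class rather than a hardware interrupt (IRQ). For a fault, the saved EIP points at the faulting instruction itself, whereas `int n` saves the address of the next instruction. A handler that simply returns therefore re-runs the division and faults again. The toy model below only illustrates that difference; it is not real CPU or kernel code, and every name in it is hypothetical.

```c
#include <assert.h>

enum event_kind { EVENT_FAULT, EVENT_TRAP };

/* Toy model: instruction 0 raises the event and the handler just returns.
   Counts how often the handler is entered, capped at max_entries. */
static int handler_entries(enum event_kind kind, int max_entries)
{
    int eip = 0;      /* index of the instruction about to execute */
    int entries = 0;

    assert(max_entries > 0);
    while (eip == 0 && entries < max_entries) {
        entries++;    /* the handler runs once */
        /* "IRET": a fault resumes at the same instruction, a trap at the next. */
        eip = (kind == EVENT_FAULT) ? eip : eip + 1;
    }
    return entries;
}
```

In this model `handler_entries(EVENT_TRAP, 5)` returns 1, while `handler_entries(EVENT_FAULT, 5)` hits the cap of 5. That matches the endless re-entry seen here, and it is why a handler for an unrecoverable fault has to either fix the cause or stop (as the tutorial's PANIC does) rather than return.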