Posted: Sat Jun 07, 2008 6:10 am
by suthers
kmcguire wrote:suthers wrote:Thanks, I've managed to fix it.
@Candy: Thanks for the post, it allowed me to figure out what was wrong. I was still adding 4 to esp on top of pushing, because I add stuff to the stack in the ISR (as you probably figured out, that was the common part of the ISR that...). I was pushing a byte and for some reason adding 4 (yeah, I'm pretty stupid sometimes...), so changing it to 1 fixed it.
Thanks,
Jules
edit: Oh and fixing this uncovered a series of other bugs...
I am still lost as to how you were pushing a byte. You pushed sixty-four bits of data onto the stack inside the ISR stub as two zeros.
The pop eax and add esp, 4 looked correct. I cannot find an instruction in the 80386 instruction set that pushes an immediate eight-bit value while only decrementing the stack pointer by one. It decrements it by four even though you only push one byte.
Didn't know that. Even so, it's the only config that works, though it might explain the hundreds of errors that occur afterwards...
I'll try to debug it and post if I find anything interesting...
Thanks,
Jules
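kmcguire's point about "push byte" can be modelled in a few lines of C. This is only a toy sketch of the 32-bit behaviour, not emulator code: the Stack type and the push_imm8 and add_esp helpers are made-up names for illustration. The 8-bit immediate is sign-extended to 32 bits and the stack pointer drops by 4, never by 1.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical toy model of a 32-bit stack; all names are illustrative. */
typedef struct {
    uint8_t  mem[64];  /* backing memory for the stack */
    uint32_t esp;      /* offset into mem; the stack grows downwards */
} Stack;

/* Model of `push imm8` with a 32-bit operand size: the immediate is
   sign-extended to 32 bits and ESP is decremented by 4. */
static void push_imm8(Stack *s, int8_t imm)
{
    uint32_t value = (uint32_t)(int32_t)imm;  /* sign extension to a dword */
    s->esp -= 4;                              /* always a full dword */
    memcpy(&s->mem[s->esp], &value, sizeof value);
}

/* Model of `add esp, n`: discards n bytes from the top of the stack. */
static void add_esp(Stack *s, uint32_t n)
{
    s->esp += n;
}
```

In this model, two "push byte" instructions need add_esp(&s, 8) in total (or one pop plus add_esp(&s, 4)) to bring the stack back to where it started.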
Posted: Sat Jun 07, 2008 9:20 am
by Dex
I think kmcguire's second comment is right. I think your PROBLEM could be that some errors push an error code onto the stack and others do not, yet you seem to be pushing a dummy error code onto the stack for ALL errors.
Also, would you not be better off using a structure registers_t, which is a representation of all the registers you pushed?
NOTE: This is from a non-C coder's point of view.
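To make the error-code point concrete, here is a small C sketch of which exception vectors get an error code from the CPU itself. The function name cpu_pushes_error_code is made up for illustration, and the list only covers the classic IA-32 set; newer CPUs define further vectors with error codes that this sketch ignores.

```c
#include <stdbool.h>

/* Returns true if the CPU itself pushes an error code for this vector.
   Covers the classic IA-32 set only: #DF (8), #TS (10), #NP (11),
   #SS (12), #GP (13), #PF (14) and #AC (17). Anything else is treated
   as "no error code" by this sketch. */
static bool cpu_pushes_error_code(unsigned vector)
{
    switch (vector) {
    case 8:
    case 10:
    case 11:
    case 12:
    case 13:
    case 14:
    case 17:
        return true;
    default:
        return false;
    }
}
```

A stub should push a dummy error code only for the vectors where this returns false, so that every handler sees the same stack layout.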
Posted: Sat Jun 07, 2008 11:17 am
by suthers
I have only put dummy error codes where they are necessary...
I don't really see the point in passing all my registers to the function...
Still can't find anything that would cause the error, though I haven't been searching for long (just got back from school).
Thanks,
Jules
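For reference, the registers_t idea Dex mentions is often sketched along the following lines. This is an assumption about one possible stub, not the code in this thread: it supposes the stub pushes an error code (or dummy), then the vector number, then pusha, then ds, es, fs and gs, and finally passes ESP to the C handler as a pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, assuming the stub does:
     push <error code or dummy>; push <vector>; pusha;
     push ds; push es; push fs; push gs
   Fields are listed from the lowest address (pushed last) upwards. */
typedef struct {
    uint32_t gs, fs, es, ds;       /* segment pushes; low 16 bits are meaningful */
    uint32_t edi, esi, ebp, esp;   /* pusha stores EAX first, so EDI ends up lowest */
    uint32_t ebx, edx, ecx, eax;
    uint32_t int_no;               /* vector number pushed by the per-vector stub */
    uint32_t err_code;             /* CPU error code, or the dummy 0 */
} registers_t;

/* Example accessor: a handler that receives a pointer can read any saved value. */
static uint32_t saved_vector(const registers_t *r)
{
    return r->int_no;
}
```

The advantage over passing a single integer is that the handler can inspect (or print) the whole saved state. The values the CPU pushed (EIP, CS, EFLAGS) sit directly above this structure in memory.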
Posted: Mon Jun 09, 2008 7:17 am
by suthers
Ok, so I'm still trying to trace this error (I don't have much time because of exams, so it's taking ages...). Firstly, it turns out that adding 4 to esp works, but it still causes loads of errors, one after the other, directly after an interrupt. I think it's because it returns to the wrong address after the iretd.
So I wanted to ask, what's the best way to trace where the interrupt returns to?
Thanks in advance,
Jules
Posted: Mon Jun 09, 2008 6:19 pm
by suthers
It would be useful to know in what order an interrupt pushes SS, ESP, EFLAGS, CS and EIP (the return address) onto the stack...
It would allow me to see what the return address is and would help me to debug...
Does anybody know in what order this is done?
Thanks in advance,
Jules
Posted: Mon Jun 09, 2008 11:13 pm
by inx
I could tell you the order, but I think it would be better if I just told you it's in the Intel manuals.
Posted: Tue Jun 10, 2008 3:53 am
by suthers
Fair enough, I'll find it (I should have done that in the first place anyway, sorry for breaking forum rules...).
Thanks,
Jules
Posted: Tue Jun 10, 2008 4:26 am
by AJ
Hi,
You could look at the Intel Manuals, but for something like this, I always find it quicker to use Sandpile (look up x86->structures->stack frame).
Cheers,
Adam
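As a rough picture of the stack frame in question, here is a C sketch of what the CPU pushes for a 32-bit interrupt or exception. The type name iret_frame_t and the helper frame_at are made up for illustration. The CPU pushes (SS, ESP,) EFLAGS, CS, EIP and then the error code if the vector has one, so in memory, from the lowest address upwards, the order is reversed.

```c
#include <stddef.h>
#include <stdint.h>

/* What the CPU pushes, listed from the lowest address upwards.
   An error code, when present, sits just below eip. */
typedef struct {
    uint32_t eip;     /* return address used by iret */
    uint32_t cs;      /* only the low 16 bits are meaningful */
    uint32_t eflags;
    uint32_t esp;     /* present only on a privilege-level change */
    uint32_t ss;      /* present only on a privilege-level change */
} iret_frame_t;

/* Given the stack pointer at handler entry, locate the frame.
   has_error_code says whether the CPU pushed an error code for this vector. */
static const iret_frame_t *frame_at(const uint32_t *entry_esp, int has_error_code)
{
    return (const iret_frame_t *)(entry_esp + (has_error_code ? 1 : 0));
}
```

Printing frame_at(...)->eip from a handler is one simple way to see where iret is about to return to.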
Posted: Tue Jun 10, 2008 6:10 am
by suthers
@AJ: Thanks, sandpile is incredibly useful, I've bookmarked it.
I found that the IRET was returning to the exact same instruction that caused the div 0 exception, which explains the div 0 error loop I was getting, with an eventual GPF....
Am I supposed to increment the EIP saved on the stack before the iret? What should I do?
Thanks in advance,
Jules
Posted: Tue Jun 10, 2008 6:56 am
by suthers
Program flow continued normally once I added some code to increment the return address, but how would you deal with this otherwise in the kernel thread?
Thanks in advance,
Jules
Posted: Tue Jun 10, 2008 7:01 am
by AJ
Hi,
Personally, I would terminate a program on a Div 0 Exception. Why? Because the program has obviously got some data it's manipulating where something has gone wrong and you can't possibly know what the program was supposed to do. If you do anything other than terminating the program, you leave the system in an indeterminate state which is a Bad Thing(tm).
Cheers,
Adam
[edit]Going back to sandpile, the exception list there gives you some idea whether the exception is a fault or a trap and so whether EIP points to the erroneous instruction or the following instruction.[/edit]
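The fault/trap distinction can be written down as a small lookup. This is a deliberately partial sketch: the names exc_class_t, classify_exception and saved_eip_points_at_culprit are made up, and only a handful of well-known vectors are listed, with everything else reported as "not covered".

```c
/* Exception classes as used in the Intel manuals and on Sandpile. */
typedef enum { EXC_FAULT, EXC_TRAP, EXC_ABORT, EXC_OTHER } exc_class_t;

/* Partial classification of a few common vectors. */
static exc_class_t classify_exception(unsigned vector)
{
    switch (vector) {
    case 0:   /* #DE divide error */
    case 6:   /* #UD invalid opcode */
    case 13:  /* #GP general protection */
    case 14:  /* #PF page fault */
        return EXC_FAULT;
    case 3:   /* #BP breakpoint */
    case 4:   /* #OF overflow (INTO) */
        return EXC_TRAP;
    case 8:   /* #DF double fault */
    case 18:  /* #MC machine check */
        return EXC_ABORT;
    default:
        return EXC_OTHER;  /* not covered by this sketch */
    }
}

/* For a fault, the saved EIP points at the instruction that raised it,
   so a plain iret re-executes that instruction. For a trap, the saved EIP
   points at the following instruction. */
static int saved_eip_points_at_culprit(unsigned vector)
{
    return classify_exception(vector) == EXC_FAULT;
}
```

A divide error is a fault, which is why returning with an unmodified frame re-executes the division and raises the exception again.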
Posted: Tue Jun 10, 2008 7:37 am
by suthers
Thanks, that can be really useful (I assumed that none of them advanced EIP and that they all pointed to the instruction that caused the error....)
Thanks,
Jules
P.S. Yay, I've finally fixed it
Posted: Tue Jun 10, 2008 10:56 am
by Dex
Could you post the working code, so it may help others? I am also interested in seeing the working code, to see the fault.
Posted: Tue Jun 10, 2008 11:39 am
by suthers
No problem, I should have done so in the first place:
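Code:
; Per-vector stubs for exceptions where the CPU pushes no error code.
; Each pushes a dummy error code and its vector number, then jumps to the
; common handler. In 32-bit code every "push byte" takes 4 bytes of stack.
_isr0:
    cli
    push byte 0         ; dummy error code
    push byte 0         ; vector number
    jmp _isr
_isr1:
    cli
    push byte 0         ; dummy error code
    push byte 1         ; vector number
    jmp _isr
_isr2:
    cli
    push byte 0         ; dummy error code
    push byte 2         ; vector number
    jmp _isr
...
_isr31:
    cli
    push byte 0         ; dummy error code
    push byte 31        ; vector number
    jmp _isr
These are the blocks that I use to handle the interrupts, and they jump to: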
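Code:
_isr:
    mov [store1], eax       ; save EAX so it can be used as scratch
    pop eax                 ; EAX = vector number pushed by the stub
    mov [store2], eax
    mov eax, [store1]       ; restore the interrupted EAX
    pusha
    push es
    push ds
    push fs
    push gs
    mov ax, 0x10            ; load the data segment selector
    mov ds, ax
    mov es, ax
    mov fs, ax
    mov gs, ax
    mov eax, _int_handler
    push dword [store2]     ; argument: vector number
    call eax
    pop eax                 ; drop the argument
    pop gs
    pop fs
    pop ds
    pop es
    popa
    add esp, 4              ; skip the (dummy) error code
    inc dword [esp]         ; bump the saved EIP by 1 without clobbering EAX
    sti
    iret
...
SECTION .data
...
store1 dd 0
store2 dd 0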
int_handler:
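Code:
/* Called from _isr with the value pushed by the stub. The parameter is
   named err_code, but it is the vector number and is used to index
   exception_messages. */
void int_handler(int err_code)
{
    k_printf("\nKERNEL PANIC:", 0x04);
    k_printf(exception_messages[err_code], 0x04);
    return;
}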
(exception_messages is a char* array)
In the code I used:
Code:
volatile int test = 10;
....
test /= 0;
to cause a div 0 exception.
The first error was caused by the amount I added to esp at the end. Because I used "push byte", I assumed it only subtracted 1 from esp, whereas I now know that in 32-bit code it always subtracts 4. So adding 4 to esp at the end, to skip over the error code (or the 0 pushed to keep the stack frame balanced), repairs this error...
The second was because when the CPU calls the interrupt, the value of EIP that was pushed onto the stack was the address of the instruction that caused the error, so when I did an iret, it returned to that instruction, causing an exception loop...
So to work round this, I added code to increment the value of EIP that is on the stack by one (though of course this won't necessarily work, as instructions have different lengths).
So for now I've decided to halt the CPU when an exception occurs in the kernel. I'll change that when I get to implementing multithreading and user-mode threads....
So the only reason that increasing the value of EIP used by iret by 1 worked was because the compiler converted:
into (something like):
So, the only reason this works is because 'div [reg]' happens to be 2 bytes long, and the second byte was probably decoded as a 1-byte opcode that didn't do anything much...
(So actually it's a better idea to add 2 to the EIP used by iret....)
But doing this is a bit pointless, because if you skip an instruction, this might interfere with the working of your kernel, so it's better to hlt the CPU (which is what I now do after displaying the error message).
Thanks for all the help.
Jules
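The "second part was a 1-byte opcode" guess can be checked with a little arithmetic. The sketch below assumes the compiler emitted a signed idiv with a register operand (opcode F7 /7), which is plausible for an int variable but is not confirmed by the thread, and which register was used is unknown. The helper names are made up for illustration.

```c
#include <stdint.h>

/* Register numbers as encoded in the ModRM r/m field. */
enum { REG_EAX = 0, REG_ECX = 1, REG_EDX = 2, REG_EBX = 3 };

/* Second byte of `idiv r32` (opcode F7 /7): mod = 11, reg = 7, r/m = register. */
static uint8_t idiv_r32_modrm(unsigned reg)
{
    return (uint8_t)(0xC0u | (7u << 3) | (reg & 7u));
}

/* One-byte flag instructions that happen to share those byte values. */
static const char *one_byte_opcode_name(uint8_t byte)
{
    switch (byte) {
    case 0xF8: return "clc";
    case 0xF9: return "stc";
    case 0xFA: return "cli";
    case 0xFB: return "sti";
    case 0xFC: return "cld";
    case 0xFD: return "std";
    default:   return "(not a one-byte flag instruction)";
    }
}
```

Under that assumption, `idiv ecx` encodes as F7 F9, so resuming one byte into it executes F9, which is stc, and then carries on with the next real instruction. That would explain why adding 1 appeared to work, and why it cannot be relied on for other instructions or operands.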
Posted: Tue Jun 10, 2008 12:32 pm
by Dex
Thanks for the update