Odd stack problem on function call inside nested for-loops
Posted: Tue Dec 27, 2022 6:29 pm
Hello. A couple of days ago I was getting lots of odd bugs in a function. I found out that the function's logic wasn't the problem, because I could reproduce the bug with nothing more than a function call inside a particular nested for-loop.
I wrote 3 tests; two of them work fine and one fails.
My setup: I have paging and multitasking implemented.
I tested this with my GCC cross-compiler and with clang. It runs fine on QEMU but fails on Bochs.
Test 1:
(This one fails; it seems to overwrite the kernel stack)
Code:
DWORD fInternal(DWORD b){
    return(10+b);
}
DWORD test1(){
    DWORD a, b = fInternal(20);
    for(DWORD i = 0; i < 10; i++)
        for(DWORD j = 0; j < 10; j++)
            for(DWORD k = 0; k < 10; k++)
                for(DWORD l = 0; l < 10; l++)
                    a = b + i + j + k + l + fInternal(20);
    return(a);
}
Test 2:
This works well:
Code:
DWORD fInternal(DWORD b){
    return(10+b);
}
DWORD test2(){
    DWORD b = fInternal(20);
    for(DWORD i = 0; i < 10; i++)
        for(DWORD j = 0; j < 10; j++)
            for(DWORD k = 0; k < 10; k++)
                for(DWORD l = 0; l < 10; l++)
                    return(b + i + j + k + l + fInternal(20));
}
Test 3:
This works well also:
Code:
DWORD fInternal(DWORD b){
    return(10+b);
}
DWORD test3(){
    DWORD a, b = fInternal(20);
    for(DWORD i = 0; i < 10; i++)
        for(DWORD j = 0; j < 10; j++)
            for(DWORD k = 0; k < 10; k++)
                a = b + i + j + k + fInternal(20);
    return(a);
}
Things I tried:
I tried disabling interrupts while the function runs, using a bigger kernel stack (I'm using 16 KB), every optimization level on GCC and clang (including no optimization), building with and without function alignment (-fno-align-functions), and single-stepping through that piece of code to find the offending assembly instruction (it doesn't fail while single-stepping). I have also tried making the loop variables volatile.
As I said, it works on QEMU but fails on Bochs.
It always overwrites the same thing. After test1() executes, the EIP restored on the next task switch is 0x03 instead of the value saved for the next process, so iret jumps to 0x3 and an invalid opcode exception is raised. Maybe the stack is being overwritten, maybe memory elsewhere...
When multitasking is off it works. Multitasking works well with everything else thrown at it (multiple ring0 and ring3 processes, etc).
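One way to narrow down whether the kernel stack itself is being overrun, rather than memory elsewhere, is to plant a known value in the lowest word of each stack and check it on every task switch. This is only a minimal sketch, assuming a downward-growing stack; STACK_GUARD, stack_guard_init and stack_guard_ok are hypothetical names and not part of the kernel described above.

```c
#include <stdint.h>

/* Arbitrary marker value (an assumption, any unlikely pattern works). */
#define STACK_GUARD 0xDEADC0DEu

/* Plant the marker in the lowest word of a stack region.
   x86 stacks grow downward, so a stack that overflows typically
   writes over this word before it spills into whatever lies below. */
static void stack_guard_init(uint32_t *stack_base)
{
    stack_base[0] = STACK_GUARD;
}

/* Returns 1 if the marker is intact, 0 if something overwrote it. */
static int stack_guard_ok(const uint32_t *stack_base)
{
    return stack_base[0] == STACK_GUARD;
}
```

If the check still passes at the moment the saved EIP is already 0x03, the corruption is probably coming from somewhere other than a plain stack overflow.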
There's something I think could be causing this. I was having trouble implementing multiple ring0 tasks because, when an interrupt arrives without a privilege change, the CPU doesn't push SS and ESP, so iret doesn't restore the stack pointer either. I read someone stating that it was OK not to change the stack for ring0 tasks, and that has worked until now.
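To make the difference concrete, here is a rough sketch of the two frames the CPU pushes on 32-bit x86, plus a helper that builds the initial frame for a ring0 task on that task's own stack. The struct and function names are hypothetical, and a real switch routine would also have to save and restore the general-purpose and segment registers; this only shows the part iret touches.

```c
#include <stdint.h>

/* Pushed by the CPU when the interrupt hits ring0 code (no privilege
   change). Lowest address first, which is the order iret pops them. */
struct frame_same_ring {
    uint32_t eip;
    uint32_t cs;
    uint32_t eflags;
};

/* Pushed by the CPU when the interrupt hits ring3 code (switch to ring0).
   The two extra words are the interrupted task's stack pointer. */
struct frame_ring_change {
    uint32_t eip;
    uint32_t cs;
    uint32_t eflags;
    uint32_t esp;
    uint32_t ss;
};

/* Build the initial three-word frame at the top of a new ring0 task's
   own stack, in the order the CPU would push it (EFLAGS, CS, EIP).
   The returned pointer is what the switch code would load into ESP
   before executing iret; the scheduler has to store it per task,
   because iret will not restore ESP for a ring0 task. */
static uint32_t *ring0_task_init_stack(uint32_t *stack_top, uint32_t entry,
                                       uint32_t cs, uint32_t eflags)
{
    uint32_t *sp = stack_top;
    *--sp = eflags;
    *--sp = cs;
    *--sp = entry;
    return sp;
}
```

With a layout like this, every ring0 task runs on a separate stack and the switch code saves and reloads the stack pointer itself. A call might look like ring0_task_init_stack(stack + 64, entry, 0x08, 0x202), where 0x08 as the kernel code selector and 0x202 as EFLAGS (interrupts enabled) are assumed values.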
What's the proper way of implementing multiple ring0 tasks?
What else could be happening here?
How does just moving the "return()" inside the innermost loop (as in test2) prevent the bug from happening?