Hi,
I have had this problem for a while, and it is really frustrating me.
For some reason it does not seem to affect only the stack but the global variables as well. Maybe something corrupts the registers, but it is not the interrupts, since I save all registers used in each interrupt handler.
I have disabled compiler optimization, and I really don't know where I should even look.
I don't know what to do, and stepping through every instruction in the Bochs debugger would be insane.
I would really appreciate it if you could find something wrong.
Stack (and global variables) corruption in kernel
Attachments:
- Quartz-master.rar: entire repository (40.7 KiB)
Re: Stack (and global variables) corruption in kernel
If you use gdb in conjunction with qemu you can set watches on variables or memory locations. These will break into the program when the item being watched changes. Set a watch on one of the variables that is being corrupted; when it changes unexpectedly you have isolated the problem to a particular line of code.
Re: Stack (and global variables) corruption in kernel
I... have never used GDB before... I did find something here, but from what I can see it is for ELF files, and my kernel is a PE file. Are there any Windows (or at least PE) alternatives? Porting the code to GCC, adding ELF support to the bootloader, etc. would take a while. Thanks anyway.
Have you (the reader) had similar problems in the past? Our problems may well be similar, as the "tree" of OS development doesn't have many "branches" in the beginning.
Re: Stack (and global variables) corruption in kernel
I have no experience with PE files, but gdb does support them: http://www.delorie.com/gnu/docs/gdb/gdb_145.html
I suspect that everyone who develops an OS has problems with stack or variable corruption at some time. It could be caused by just about anything.
Re: Stack (and global variables) corruption in kernel
Judging from the symptoms, the memory holding your global variables (.data, .bss or their PE equivalents) may be overlapping with the stack.
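One way to test the overlap suspicion is to compare the address ranges directly. The sketch below is only an illustration: the `mem_range` type and both helper functions are made up for this example, and the actual load address and size of the kernel image have to come from your own linker map or bootloader, since they are not shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A half-open address range [start, end). Hypothetical helper type. */
typedef struct {
    uint32_t start;
    uint32_t end;
} mem_range;

/* Two half-open ranges overlap when each one starts before the other ends. */
static bool ranges_overlap(mem_range a, mem_range b)
{
    return a.start < b.end && b.start < a.end;
}

/* The x86 stack grows downwards from `top`, so a stack that may use up to
 * `size` bytes occupies [top - size, top). */
static mem_range stack_range(uint32_t top, uint32_t size)
{
    assert(size <= top); /* otherwise the subtraction would wrap around */
    mem_range r = { top - size, top };
    return r;
}
```

For example, with the stack top at 0x7c00 as in the code quoted below and an assumed 4 KiB of stack, `ranges_overlap(stack_range(0x7c00, 0x1000), image)` is true for any image range that reaches into 0x6c00..0x7c00. If it reports an overlap for your real numbers, globals and stack are sharing memory.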
I took a quick look at your code. The stack setup at the start of main(), quoted below, is a very fragile way to initialize the stack. I can't even begin to speculate what happens when you change both ESP and EBP in the middle of a C function. You should never do that. The kernel's entry point should be in assembly code: set things up there and then call the C code.
Code: Select all
void main()
{
    _asm
    {
        mov ax, 0x08
        mov ds, ax
        mov es, ax
        mov fs, ax
        mov gs, ax
        mov ss, ax
        mov esp, 0x7c00
        mov ebp, esp
    }
If something looks overcomplicated, most likely it is.
Re: Stack (and global variables) corruption in kernel
I was not able to work on Quartz for some time, but here I am now.
@Velko:
Thanks for the tip. I could just add the "naked" attribute to the function, but I didn't. Instead I moved the setup into an assembly function, "_entry".
Sadly, that did not fix the problem. What I noticed is that when I disable (i.e. mask) the keyboard interrupt, so that its handler never executes, no stack corruption occurs.
I'm not even sure whether this is really stack corruption, but I know it has something to do with the keyboard interrupt handler.
The fact that it works correctly in bochsdbg but not in bochs, and that each boot gives a different random corruption, makes me think this may be timing-related, as there is no randomness in computers. (Ironically, the current time is the key ingredient in random number generation on the computers we use at home today.)
...and the fact that I cannot debug the stack without bochsdbg!
Code: Select all
uint8 _val;
interrupt keyb_irqHandler()
{
    _asm pusha
    if (inb(KEYBC)&STATUS_READ)
    {
        _val = inb(KEYBE);
        switch (_val)
        {
        case EXTENDED_SCANCODE1:
            prev_ext= k_ext1; break;
        case EXTENDED_SCANCODE2:
            prev_ext= k_ext2; break;
        default:
            if (_val & 0x80)//Release
            {
                _val ^= 0x80;
                keyb_down[_val] ^= prev_ext;
                keyb_queue[kq_end].ext = prev_ext;
                keyb_queue[kq_end++].val = _val;
                if (kq_end == kq_max)
                {
                    kq_end ^= kq_end; //Clearing to zero (This way looks familiar, doesn't it?)
                }
                kq_count++;
                if (kq_count == kq_max)
                {
                    //Buffer overflow
                }
            }
            else
            {
                keyb_down[_val] |= prev_ext;
                // TODO : LED update
                // Forced
            }
            prev_ext &= 0;
        }
    }
    intend(1);
    _asm
    {
        popa
        iretd
    }
}
"interrupt" at the beginning of the function declaration is just a "#define" for "__declspec(naked)" (i.e. the "naked" function attribute), so that no compiler-generated code executes between the start of the function body and the first assembly block (and likewise at the end of the function).
Although I could write this handler in assembly, that's not really a solution... What am I doing wrong (or what is the compiler doing wrong, with its optimization) that I can fix?
Re: Stack (and global variables) corruption in kernel
Don't do this. ASM snippets that manipulate the stack are liable to confuse the compiler into generating broken code. Instead of surrounding your code with ASM snippets, use an ASM stub function to do the register saving and the "iretd", and just have it call a normal C function. If you don't want to get an assembler involved, try using ASM at file level to generate the stub. This approach also works with other compilers. And if you ever want to go multi-platform, it is easier to separate out the arch-dependent code this way. As in:
Code: Select all
_asm {
keyb_irqHandler:
    pusha
    call _keyb_irqHandler_c
    popa
    iretd
}
void keyb_irqHandler_c(void) {
    /* foo */
}
Carpe diem!
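Once the register saving and the "iretd" live in the stub, the C side can be an ordinary function with no inline assembly at all. Below is a minimal sketch of what the scancode handling might look like in that form. It is not the Quartz code: `keyb_state`, `key_event`, `keyb_handle_scancode`, the queue size and the constant names are all invented for this example, and the key-down tracking is simplified to one flag per key. It follows the original handler in queueing an event on key release, but drops the event when the queue is full instead of overrunning it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define KQ_MAX        16u    /* hypothetical queue capacity */
#define SCAN_EXTENDED 0xE0u  /* scancode set 1 prefix byte for extended keys */
#define SCAN_RELEASE  0x80u  /* set 1 release (break) codes have bit 7 set */

typedef struct {
    uint8_t ext;  /* 1 if the key was preceded by the 0xE0 prefix */
    uint8_t val;  /* scancode with the release bit cleared */
} key_event;

typedef struct {
    key_event queue[KQ_MAX];
    unsigned  tail;       /* next slot to write */
    unsigned  count;      /* events currently queued */
    uint8_t   prev_ext;   /* set while a prefix byte is pending */
    uint8_t   down[128];  /* 1 while the key is held (ignores the prefix) */
} keyb_state;

/* Processes one byte read from the keyboard data port.
 * Returns true if a release event was appended to the queue. */
static bool keyb_handle_scancode(keyb_state *s, uint8_t code)
{
    assert(s != 0);

    if (code == SCAN_EXTENDED) {
        s->prev_ext = 1;          /* remember the prefix for the next byte */
        return false;
    }

    bool queued = false;
    uint8_t key = code & 0x7Fu;

    if (code & SCAN_RELEASE) {
        s->down[key] = 0;
        if (s->count < KQ_MAX) {  /* drop the event instead of overrunning */
            s->queue[s->tail].ext = s->prev_ext;
            s->queue[s->tail].val = key;
            s->tail = (s->tail + 1u) % KQ_MAX;
            s->count++;
            queued = true;
        }
    } else {
        s->down[key] = 1;
    }

    s->prev_ext = 0;              /* the prefix applies to one key only */
    return queued;
}
```

The function called from the stub would then only read the port and pass the byte on, using the names from the original handler: check inb(KEYBC) against STATUS_READ, call keyb_handle_scancode with inb(KEYBE), then intend(1). Because the logic takes the byte as a parameter, it can also be compiled and exercised on the host, outside the emulator.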
Re: Stack (and global variables) corruption in kernel
And the stack is corrupted no more, thanks to @nullplan! Thanks, brother!
Thanks to Velko for showing me a better and less fragile way of entering the kernel from the bootloader, and thanks to Iansjack for showing me other (and probably better) ways of debugging C in an emulator.
Now back to the FDC...