Strange Bugs in my OS

LIC
Member
Posts: 44
Joined: Mon Jun 04, 2018 8:10 am
Libera.chat IRC: lic

Post by LIC »

Hi all,

I am trying to develop a small kernel, but I have run into several very strange bugs that I really don't understand.
The first main one is with the ISR handler, which is supposed to print the exception error message but instead prints the memory address 0x00000000 (I checked the asm code).
Then there is some strange behavior with the keyboard: when a key is pressed, it prints the corresponding character (everything normal up to here) but then runs into a GP fault, except for the 'L' key!
And the strangest thing is that when I add/remove some files, or when I comment/uncomment some lines of code (even if these lines are executed after the buggy one), it turns the bugs I mentioned on or off...

Here is a link to my complete kernel code: https://github.com/leonard-limon/osdev

Does anyone have an explanation, or even a clue, for these strange bugs?

Regards
lkurusa
Member
Posts: 42
Joined: Wed Aug 08, 2012 6:39 am
Libera.chat IRC: Levex
Location: New York, NY

Post by lkurusa »

It looks like your code is overwriting something, either via a stack overflow or a bad/unchecked pointer. Good luck finding the bug!
Cheers,

Lev
LIC
Member
Posts: 44
Joined: Mon Jun 04, 2018 8:10 am
Libera.chat IRC: lic

Post by LIC »

Hi, and thanks for your reply.
I'm afraid this is not a stack overflow issue, because the stack pointer when the bug occurs is 0x1ff7c and the kernel code only goes up to roughly 0x8000...
Tell me if you think I am wrong.
eryjus
Member
Posts: 286
Joined: Fri Oct 21, 2011 9:47 pm
Libera.chat IRC: eryjus
Location: Tustin, CA USA

Post by eryjus »

I tend to agree with lkurusa. This sounds like a stack problem to me as well, such as misalignment, overwriting, or a structure packing discrepancy between asm and C. For example, are you pushing your segment registers in asm and expecting them to be 16 bits in a C structure?
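
To make the packing point concrete, here is a minimal sketch. It assumes a stub that runs pusha and then pushes ds, es, fs and gs in 32-bit code, where every push takes 4 bytes. The type names and every field name other than int_no are hypothetical, not taken from the kernel in question.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame matching: pusha; push ds; push es; push fs; push gs.
 * In 32-bit code each segment register push occupies 4 bytes on the stack. */
typedef struct {
    uint32_t gs, fs, es, ds;                          /* pushed last, lowest addresses */
    uint32_t edi, esi, ebp, esp, ebx, edx, ecx, eax;  /* stored by pusha */
    uint32_t int_no, err_code;                        /* the two dwords dropped by "add esp, 8" */
    uint32_t eip, cs, eflags;                         /* pushed by the CPU (no privilege change) */
} frame_ok_t;

/* Same frame with the segment registers wrongly declared as 16-bit:
 * every field after them sits 8 bytes lower than where the stub put it. */
typedef struct {
    uint16_t gs, fs, es, ds;
    uint32_t edi, esi, ebp, esp, ebx, edx, ecx, eax;
    uint32_t int_no, err_code;
    uint32_t eip, cs, eflags;
} frame_bad_t;

/* Build-time checks that the C layout agrees with what the stub pushes. */
_Static_assert(offsetof(frame_ok_t, int_no) == 48, "int_no must follow 12 dwords");
_Static_assert(sizeof(frame_ok_t) == 68, "frame must be 17 dwords");
```

With the 16-bit version, C code reading int_no would actually be reading the saved ecx and eax slots, which is the kind of mismatch that produces "random" behavior.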

I would get a copy of Bochs and use the internal debugger to step-check your code. If you still need help, some more specifics would help.
Adam
PoisonNinja
Posts: 5
Joined: Thu Jun 02, 2016 9:04 pm
Libera.chat IRC: PoisonNinja

Post by PoisonNinja »

Hey, it looks like you're passing the registers by value to the interrupt handler, so the values are corrupted by the time you return, since functions are allowed to modify their parameters.

Maybe try using a pointer to the registers you pushed onto the stack.
MichaelPetch
Member
Posts: 799
Joined: Fri Aug 26, 2016 1:41 pm
Libera.chat IRC: mpetch

Post by MichaelPetch »

I'd start earlier in the process. Given that adding and removing files can make the bugs appear or disappear, I'd make sure the bootloader is actually reading the entire kernel into memory. One likely candidate is that your kernel image is larger than the number of 512-byte sectors you load.
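
As a quick sanity check, the arithmetic can be sketched like this. The function names are made up for illustration; you would feed in your kernel binary's size in bytes and the sector count your loader passes to the disk read.

```c
#include <stdint.h>

#define SECTOR_SIZE 512u

/* Number of 512-byte sectors needed to hold image_bytes, rounding up. */
uint32_t sectors_needed(uint32_t image_bytes)
{
    return image_bytes / SECTOR_SIZE + (image_bytes % SECTOR_SIZE != 0u);
}

/* Nonzero when the loader's sector count covers the whole kernel image. */
int loader_covers_image(uint32_t image_bytes, uint32_t sectors_loaded)
{
    return sectors_loaded >= sectors_needed(image_bytes);
}
```

For example, a kernel of 0x8000 bytes needs 64 sectors; one extra byte pushes it to 65, and a loader hard-coded to 64 would silently drop the tail of the image.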

This would explain why in your question you say the error message doesn't print. The .data section is placed after .text. If the .data section isn't fully loaded into memory it is probably reading 0x00 from memory which makes the strings appear to have nothing in them and thus not displayed. Printing numbers would work because those are likely printed without references to the .data section. Of course it is also possible that not all of your .text section is loaded into memory so that could cause functions to fail if the instructions aren't loaded.

This is a very common problem in the questions asked on Stack Overflow when someone has developed their own bootloader instead of using something like GRUB/multiboot.

There may well be other serious issues in your code, but I think you need to start by eliminating the large-scale problems before tackling the smaller bugs (like interrupt handling, etc.).

Likely not related to your issues, but still a problem: if you write your own bootloader, you should also zero out the .bss section, since memory isn't guaranteed to be filled with zeroes already. Usually you create a linker script that defines symbols at the beginning and end of the .bss section; your code can then iterate over that range and zero it. If you used GRUB/multiboot, this is done for you when the ELF executable is loaded into memory.
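
A minimal sketch of the zeroing loop, under the assumption that your linker script exports two symbols around .bss. The names __bss_start and __bss_end are only a common convention here, not something taken from your repository.

```c
#include <stdint.h>

/* Zero every byte in the half-open range [start, end). */
void zero_range(uint8_t *start, uint8_t *end)
{
    for (uint8_t *p = start; p < end; ++p) {
        *p = 0;
    }
}

/* In the kernel, assuming the linker script defines the hypothetical symbols
 * __bss_start and __bss_end at the two ends of .bss, early startup code
 * could call this before any C code relies on zero-initialized globals:
 *
 *     extern uint8_t __bss_start[], __bss_end[];
 *     zero_range(__bss_start, __bss_end);
 */
```

The call has to happen before anything reads an uninitialized global or static, so it usually sits at the very start of the kernel entry point.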
MichaelPetch
Member
Posts: 799
Joined: Fri Aug 26, 2016 1:41 pm
Libera.chat IRC: mpetch

Post by MichaelPetch »

Your interrupt routines don't seem to be re-entrant so I think you should for the time being be sending the EOIs after you call them in your irq handler (not before). Your keyboard handler shouldn't be polling port 0x64 in a loop. When you get a keyboard interrupt you can read the keyboard byte right from port 0x60.
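
A rough sketch of what I mean for the keyboard part. It assumes the controller delivers scancode set 1 (where bit 7 marks a key release) and ignores 0xE0-prefixed extended keys. The port read is passed in as a function pointer so the logic can be tried outside the kernel; in the kernel you would pass your own inb. All names here are hypothetical.

```c
#include <stdint.h>

#define KBD_DATA_PORT 0x60u   /* PS/2 controller data port */

/* Signature of a port-read routine such as the kernel's inb(). */
typedef uint8_t (*port_read_fn)(uint16_t port);

typedef struct {
    uint8_t code;      /* scancode with the release bit cleared */
    int     released;  /* nonzero on key release (scancode set 1) */
} key_event_t;

/* Called from the keyboard IRQ: the byte is already waiting at port 0x60,
 * so a single read is enough and no polling of port 0x64 is needed. */
key_event_t keyboard_irq_read(port_read_fn read_port)
{
    uint8_t raw = read_port(KBD_DATA_PORT);
    key_event_t ev = { (uint8_t)(raw & 0x7Fu), (raw & 0x80u) != 0 };
    return ev;
}

/* Host-side stand-in for inb(), only for trying the sketch out. */
uint8_t fake_data_byte;
uint8_t fake_inb(uint16_t port)
{
    return port == KBD_DATA_PORT ? fake_data_byte : 0xFF;
}
```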
nullplan
Member
Posts: 1801
Joined: Wed Aug 30, 2017 8:24 am

Post by nullplan »

PoisonNinja wrote:Hey, it looks like you’re passing the registers by value to the interrupt handler so when you return the values are corrupted since functions are allowed to modify their parameters.

Maybe try using a pointer to the registers you pushed onto the stack.
I don't know the 32-bit ABI well enough to know how structure passing works, but in the 64-bit ABI, large structures are passed as their pointer. That is, the caller allocates space for a temporary copy, copies the argument there, then passes in a pointer to the copy. And cleans it up afterwards. Which doesn't happen here, so I assume the parameters are already passed in wrong. Your advice is good, though!
MichaelPetch wrote:Your interrupt routines don't seem to be re-entrant so I think you should for the time being be sending the EOIs after you call them in your irq handler (not before). Your keyboard handler shouldn't be polling port 0x64 in a loop. When you get a keyboard interrupt you can read the keyboard byte right from port 0x60.
Not a problem, as the IF remains at 0 the entire time. PIC can re-issue as many interrupts as it likes, the CPU won't recognize them until the IRET instruction.
LIC
Member
Posts: 44
Joined: Mon Jun 04, 2018 8:10 am
Libera.chat IRC: lic

Post by LIC »

Thank you for all your replies!
Indeed, my loader was not loading enough blocks to hold the whole kernel... I feel a bit dumb right now. Now that the kernel is fully loaded, the exception message shows up perfectly!
MichaelPetch wrote:If you make your own bootloader you should also create a mechanism where you can zero the .bss section out as memory isn't necessarily guaranteed to be filled with zeroes already
I am not sure what you mean by the .bss section. Where is it located in memory?

I still have the keyboard issue, though: depending on which character I type, it triggers a General Protection Fault or not...
LIC
Member
Posts: 44
Joined: Mon Jun 04, 2018 8:10 am
Libera.chat IRC: lic

Post by LIC »

Ok I looked at the assembler code of my kernel and here's what happens when I call the print or putc function inside my interrupt handler ...

Code: Select all

extern void irq_handler(const registers_t r) {

    // if interrupt was raised by slave PIC send EOI to slave
    if (r.int_no >= 40) {
        outb(0xa0, 0x20);
    }

    // send EOI to master
    outb(0x20, 0x20);

    // if interrupt handler exists, run it
    //if (interrupt_handlers[r.int_no]) {
    //    interrupt_handlers[r.int_no](r);
    //}

    print("clk\n");

}

Code: Select all

000000B5  60                pusha
000000B6  1E                push ds
000000B7  06                push es
000000B8  0FA0              push fs
000000BA  0FA8              push gs
000000BC  66B81000          mov ax,0x10
000000C0  8ED8              mov ds,ax
000000C2  8EC0              mov es,ax
000000C4  8EE0              mov fs,ax
000000C6  8EE8              mov gs,ax
000000C8  8925E2100000      mov [0x10e2],esp
000000CE  E8E1140000        call 0x15b4
000000D3  0FA9              pop gs
000000D5  0FA1              pop fs
000000D7  07                pop es
000000D8  1F                pop ds
000000D9  61                popa
000000DA  81C408000000      add esp,0x8
000000E0  FB                sti
000000E1  CF                iret

Code: Select all

000015B4  83EC0C            sub esp,byte +0xc
000015B7  837C244027        cmp dword [esp+0x40],byte +0x27
000015BC  7612              jna 0x15d0
000015BE  83EC08            sub esp,byte +0x8
000015C1  6A20              push byte +0x20
000015C3  68A0000000        push dword 0xa0
000015C8  E822EDFFFF        call 0x1000002ef
000015CD  83C410            add esp,byte +0x10
000015D0  83EC08            sub esp,byte +0x8
000015D3  6A20              push byte +0x20
000015D5  6A20              push byte +0x20
000015D7  E813EDFFFF        call 0x1000002ef
000015DC  C7442420CB2D0000  mov dword [esp+0x20],0x2dcb
000015E4  83C41C            add esp,byte +0x1c
000015E7  E9F0010000        jmp 0x17dc

The last instruction (jmp 0x17dc) is a tail call into the print function, but the compiler first stores the argument with mov dword [esp+0x20], 0x2dcb. The slot at esp+0x20 holds the saved GS value, which is popped just before the iret instruction. So when 0x2dcb is popped into GS, I obviously get a General Protection Fault.

Do you know how to tell the compiler to avoid that?
Octocontrabass
Member
Posts: 5586
Joined: Mon Mar 25, 2013 7:01 pm

Post by Octocontrabass »

Pass the registers_t struct by pointer instead of by value.

Even though you've defined it as const, that just means you can't write any code that changes its value; the compiler is still free to reuse that stack space for something else. ("Why?" Because the System V ABI says so.)
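
Roughly, the change could look like the sketch below. The registers_t layout is assumed from the stub in your disassembly (only int_no is known from your code), the dispatch table mirrors the commented-out lines in your handler, and record_irq is just a made-up example handler. EOI handling is left out to keep it short.

```c
#include <stdint.h>

/* Assumed layout: gs/fs/es/ds pushes, then pusha, then int_no/err_code,
 * then the CPU-pushed return state. Only int_no is confirmed by the thread. */
typedef struct {
    uint32_t gs, fs, es, ds;
    uint32_t edi, esi, ebp, esp, ebx, edx, ecx, eax;
    uint32_t int_no, err_code;
    uint32_t eip, cs, eflags;
} registers_t;

typedef void (*isr_t)(const registers_t *r);
isr_t interrupt_handlers[256];

/* The asm stub passes the address of the saved frame instead of the frame:
 *
 *     push esp            ; argument = pointer to the saved registers
 *     call irq_handler
 *     add  esp, 4         ; drop the pointer argument
 *
 * Now the only argument slot the callee may reuse is that 4-byte pointer,
 * so the saved registers are still intact when the stub pops them. */
void irq_handler(const registers_t *r)
{
    if (r->int_no < 256 && interrupt_handlers[r->int_no]) {
        interrupt_handlers[r->int_no](r);
    }
}

/* Made-up example handler that records which interrupt fired. */
uint32_t last_irq;
void record_irq(const registers_t *r)
{
    last_irq = r->int_no;
}
```

Individual handlers then read fields through the pointer (r->int_no, r->err_code) and never receive their own copy of the frame.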
LIC
Member
Posts: 44
Joined: Mon Jun 04, 2018 8:10 am
Libera.chat IRC: lic

Post by LIC »

OK, that works now! Thank you for all your replies.