[SOLVED] Exception 13 thrown when initializing my kernel

Kaius
Posts: 20
Joined: Sat Dec 31, 2022 10:59 am
Libera.chat IRC: Kaius

Post by Kaius »

Hello all.

I'm currently working on interrupts for my small OS, and I've been at it for quite a while, to no avail. It seems that whenever I start up my OS it causes a General Protection Fault (interrupt 13). The full code is available at the link in my signature.

I've scoured the internet searching for a solution. My code is pretty sloppy at the moment because I'm just trying to make it work, and I always end up on the same webpages telling me the same things.

This is my kernel's main function:

Code:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Header files
#include <lib/idt.h>
#include <drivers/io.h>
#include <drivers/pic.h>

#include <drivers/vga.h>
#include <drivers/keyboard.h>

void main(void) {
  idt_init();
  pic_init();

  vga_init();
  // even with all of this commented out, I get exception 13 thrown at me.
//  keyboard_init();

//  vga_prints("Welcome to Nox!");

//  while (true) {
//    keyboard_input();
//  }


  return;
}
As provided above, it throws a General Protection Fault. If I don't initialize the PIC (commenting out pic_init()), I get a double fault, and if I don't initialize the IDT I get a reboot loop (presumably a triple fault, since nothing can be handled at all).

Once my main function returns, I don't disable interrupts since that's how I intend to get keyboard input down the line, which explains the reboot loop.


I'm very new when it comes to interrupts, exceptions, and the IDT in general, and I'm having a bit of trouble wrapping my head around it all. I've read all of the pages on the OSDev Wiki on these subjects, but it doesn't help that I'm having problems getting everything set up.

Does anybody know why I'm getting these exceptions?
Last edited by Kaius on Thu May 16, 2024 7:36 am, edited 1 time in total.
Octocontrabass
Member
Posts: 5494
Joined: Mon Mar 25, 2013 7:01 pm

Re: Exception 13 thrown when initializing my kernel

Post by Octocontrabass »

All you have is the interrupt number? That's not enough information to figure out what's wrong. Shouldn't your exception handlers tell you more than that?

Since you're using QEMU, you can run it with "-d int" (and maybe also "-no-reboot") to log detailed information about every interrupt.

Here are some other problems I saw in your code:
Kaius
Posts: 20
Joined: Sat Dec 31, 2022 10:59 am
Libera.chat IRC: Kaius

Re: Exception 13 thrown when initializing my kernel

Post by Kaius »

Octocontrabass wrote:All you have is the interrupt number? That's not enough information to figure out what's wrong. Shouldn't your exception handlers tell you more than that?

Since you're using QEMU, you can run it with "-d int" (and maybe also "-no-reboot") to log detailed information about every interrupt.

Here are some other problems I saw in your code:
That's all super helpful, and mostly things I haven't considered. I'll try to get more info from Qemu about interrupts, and I definitely need to clean all of the IDT and PIC stuff once I get it working.

As for the -Wno-return-type, I passed that flag since I tend to perform an operation and return from a function in a single line (e.g. in a void function, using `return x = 25;` instead of `x = 25; \n return;`), and I wanted to hide compiler warnings about that so that other warnings would catch my eye more easily. I didn't see anything online about this being good or bad practice, so I decided to stick with it for now.
Octocontrabass
Member
Posts: 5494
Joined: Mon Mar 25, 2013 7:01 pm

Re: Exception 13 thrown when initializing my kernel

Post by Octocontrabass »

Kaius wrote:I didn't see anything online about this being good or bad practice, so I decided to stick with it for now.
Returning a value from a function that's supposed to return void is forbidden by all versions of the C standard, so I'd say it's bad practice.
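For reference, a minimal sketch of the two conforming ways to write the pattern quoted above. The names `x`, `set_x` and `set_x_and_report` are hypothetical, modelled on the `return x = 25;` example and not taken from the kernel source.

```c
#include <assert.h>

int x;

/* Conforming form for a void function: assign first, then return
 * with no expression. */
void set_x(void) {
    x = 25;
    return;
}

/* If the one-line style is wanted, give the function a non-void
 * return type so the returned value is legal. */
int set_x_and_report(void) {
    return x = 25;
}
```

With either form there is no need for -Wno-return-type, so the remaining return-type warnings stay visible.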
iansjack
Member
Posts: 4683
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Exception 13 thrown when initializing my kernel

Post by iansjack »

You would learn a lot by single-stepping through your code in a debugger. Gdb works well with qemu.
Kaius
Posts: 20
Joined: Sat Dec 31, 2022 10:59 am
Libera.chat IRC: Kaius

Re: Exception 13 thrown when initializing my kernel

Post by Kaius »

Octocontrabass wrote:All you have is the interrupt number? That's not enough information to figure out what's wrong. Shouldn't your exception handlers tell you more than that?

Since you're using QEMU, you can run it with "-d int" (and maybe also "-no-reboot") to log detailed information about every interrupt.

Here are some other problems I saw in your code:
I fixed a few of these issues, and as per your suggestion I ran qemu with "-d int". The last exception I got gave the following:

Code:

check_exception old: 0xffffffff new 0xd
     1: v=0d e=0102 i=0 cpl=0 IP=0008:00200057 pc=00200057 SP=0010:00207000 env->regs[R_EAX]=00000000
EAX=00000000 EBX=00009500 ECX=000003d5 EDX=000003d5
ESI=00000000 EDI=00002000 EBP=00000000 ESP=00207000
EIP=00200057 EFL=00000202 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
CS =0008 00000000 ffffffff 00cf9a00 DPL=0 CS32 [-R-]
SS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
DS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
FS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
GS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT=     002016c7 00000017
IDT=     00207010 00003fff
CR0=00000011 CR2=00000000 CR3=00000000 CR4=00000000
DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000 
DR6=ffff0ff0 DR7=00000400
CCS=00000000 CCD=00206fe0 CCO=EFLAGS
EFER=0000000000000000
I'm in the process of deciphering this all, but if you spot anything immediately wrong with it, that would be super helpful. If I'm correct, it looks like the error code (e=0x0102 -> 0000100000010) indicates that it's referencing an index in the GDT and originated from within the processor?

I'm working on disassembling the OS image, and looking at the addresses around where the exception occurred (0x00200057 + 0x0008), but I'm kind of at a loss at this point.

Edit: I've done some digging, and I traced the problem back to where I call the `lgdt` command:

Code:

Dump of assembler code from 0x200028 to 0x200060:
   0x00200028:  sbb    %eax,(%eax)
   0x0020002a:  add    %al,(%eax)
   0x0020002c:  add    %al,(%eax)
   0x0020002e:  add    %al,(%eax)
   0x00200030 <_start+0>:       mov    $0x207000,%esp
=> 0x00200035 <_start+5>:       lgdtl  0x2016df
   0x0020003c <_start+12>:      mov    $0x10,%ax
   0x00200040 <_start+16>:      mov    %eax,%ds
   0x00200042 <_start+18>:      mov    %eax,%es
   0x00200044 <_start+20>:      mov    %eax,%fs
   0x00200046 <_start+22>:      mov    %eax,%gs
   0x00200048 <_start+24>:      mov    %eax,%ss
   0x0020004a <_start+26>:      ljmp   $0x8,$0x200051
   0x00200051 <_start+33>:      call   0x2016b0 <main>
   0x00200056 <_start+38>:      hlt    
   0x00200057 <_start+39>:      jmp    0x200056 <_start+38>
   0x00200059 <_start.end+0>:   xchg   %ax,%ax
   0x0020005b <_start.end+2>:   xchg   %ax,%ax
   0x0020005d <_start.end+4>:   xchg   %ax,%ax
   0x0020005f <_start.end+6>:   nop
End of assembler dump.
I find this weird, since before I added interrupt / exception handling, my program ran completely fine, and the GDT looks like it's been set up correctly (although that might just be from GRUB). I thought my GDT was fine, so I don't know what's wrong with it.
Kaius
Posts: 20
Joined: Sat Dec 31, 2022 10:59 am
Libera.chat IRC: Kaius

Re: Exception 13 thrown when initializing my kernel

Post by Kaius »

In addition to all of that, when I remove the GDT and IDT initialization (so just leaving both as GRUB sets them) I get tons of Segmentation Faults. Removing pieces of code results in more segfaults down the line, usually with "outb" operations. I think the root of the issue lies in the GDT not being set properly.
nullplan
Member
Posts: 1760
Joined: Wed Aug 30, 2017 8:24 am

Re: Exception 13 thrown when initializing my kernel

Post by nullplan »

Kaius wrote:

Code:

 lgdtl  0x2016df
That does not look like a valid GDTR to me. And neither do the actual contents in the interrupt dump. Because as far as I know the GDT must be 8-byte aligned. I'm guessing your .gdt section is linked with alignment 1, which I would suggest increasing to 8.

But that is not your issue. The error code is indeed 0x102, which the AMD APM informs me means that there was an error accessing the selector at offset 0x100 in the IDT. Your IDT actually is long enough for that, but the selectors aren't set up that far back. So that's why you get the error. Something is sending you an interrupt 128.
Carpe diem!
MichaelPetch
Member
Posts: 772
Joined: Fri Aug 26, 2016 1:41 pm
Libera.chat IRC: mpetch

Re: Exception 13 thrown when initializing my kernel

Post by MichaelPetch »

nullplan wrote:The error code is indeed 0x102, which the AMD APM informs me means that there was an error accessing the selector at offset 0x100 in the IDT. Your IDT actually is long enough for that, but the selectors aren't set up that far back. So that's why you get the error. Something is sending you an interrupt 128.
0x102 shifted right by 3 bits gives index 0x20, which would be right if they had remapped the master PIC to 0x20 and received a timer interrupt.
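To make the decoding concrete, here is a small C sketch that splits a selector-style error code into its fields (bit 0 = external event, bit 1 = IDT, bit 2 = LDT when bit 1 is clear, bits 3..15 = index). The struct and function names are hypothetical helpers, not part of the kernel in question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Fields of the selector-style error code pushed for #GP, #TS, #NP and #SS. */
struct selector_error {
    bool     external;  /* bit 0: EXT flag */
    bool     idt;       /* bit 1: index refers to a gate in the IDT */
    bool     ldt;       /* bit 2: only meaningful if idt is false; 1 = LDT, 0 = GDT */
    uint16_t index;     /* bits 3..15: descriptor or vector index */
};

struct selector_error decode_selector_error(uint32_t code) {
    struct selector_error e;
    e.external = (code & 0x1u) != 0;
    e.idt      = (code & 0x2u) != 0;
    e.ldt      = (code & 0x4u) != 0;
    e.index    = (uint16_t)((code >> 3) & 0x1FFFu);
    return e;
}
```

Feeding it the value from the QEMU log, `decode_selector_error(0x102)`, yields `idt` set and `index` equal to 0x20, i.e. IDT vector 0x20 and not a GDT entry.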

The IDT limit of 0x3fff is a bit unusual. Your code computes it as 64 * 256 - 1. I'd recommend just using idtr.limit = (uint16_t) sizeof(idt) - 1. You only initialize the first 32 exception handlers. The rest are set to 0. When you got the first interrupt (in this case timer interrupt 0x20) it failed because all the entries are 0 for interrupts >= 0x20. You will want to set the IRQ handlers before you allow those interrupts to be received.
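A rough sketch of what that could look like in C, assuming 32-bit protected mode, where each gate is 8 bytes (64 bits), so a 256-entry table gives a limit of 0x7ff. The names `idt_entry`, `idt_set_gate` and `idt_limit` are hypothetical and not taken from the Nox source; the packed attribute is a GCC/Clang extension; the handler address in the usage note is a made-up value.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit protected-mode gate descriptor: 8 bytes per entry. */
struct idt_entry {
    uint16_t offset_low;   /* handler address bits 0..15 */
    uint16_t selector;     /* code segment selector, e.g. 0x08 */
    uint8_t  zero;         /* reserved, must be 0 */
    uint8_t  type_attr;    /* 0x8E = present, DPL 0, 32-bit interrupt gate */
    uint16_t offset_high;  /* handler address bits 16..31 */
} __attribute__((packed));

struct idt_entry idt[256];

/* Fill one gate. Entries that are never filled stay all-zero (not present). */
void idt_set_gate(uint8_t vector, uint32_t handler, uint16_t selector) {
    idt[vector].offset_low  = (uint16_t)(handler & 0xFFFFu);
    idt[vector].selector    = selector;
    idt[vector].zero        = 0;
    idt[vector].type_attr   = 0x8E;
    idt[vector].offset_high = (uint16_t)(handler >> 16);
}

/* The limit is the table size in bytes minus one, derived from the array itself. */
uint16_t idt_limit(void) {
    return (uint16_t)(sizeof(idt) - 1);
}
```

For example, `idt_set_gate(0x20, 0x00201234, 0x08)` would install a timer handler at a hypothetical address; any vector left untouched, such as 0x21 here, remains zero and will fault if that interrupt arrives.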
Last edited by MichaelPetch on Wed May 15, 2024 9:06 am, edited 2 times in total.
Octocontrabass
Member
Posts: 5494
Joined: Mon Mar 25, 2013 7:01 pm

Re: Exception 13 thrown when initializing my kernel

Post by Octocontrabass »

Kaius wrote:If I'm correct, it looks like the error code (e=0x0102 -> 0000100000010) indicates that it's referencing an index in the GDT and originated from within the processor?
No, it's referencing the descriptor for interrupt 0x20 in your IDT. This sounds like one of the bugs I already mentioned: you've enabled a bunch of interrupt sources even though you haven't set up handlers for them.
Kaius wrote:I've done some digging, and I traced the problem back to where I call the `lgdt` command:
But the exception is occurring around 0x00200057, which is the infinite HLT/JMP loop that runs after main() has returned. The fact that the code in question is near a LGDT instruction doesn't mean the GDT is the problem.
nullplan wrote:That does not look like a valid GDTR to me. And neither do the actual contents in the interrupt dump. Because as far as I know the GDT must be 8-byte aligned.
It's a good idea to align the GDT, but it's not required. The GDTR is only used by the LGDT instruction, so there's no reason to bother aligning it.
Kaius
Posts: 20
Joined: Sat Dec 31, 2022 10:59 am
Libera.chat IRC: Kaius

Re: Exception 13 thrown when initializing my kernel

Post by Kaius »

Octocontrabass wrote:
Kaius wrote:I've done some digging, and I traced the problem back to where I call the `lgdt` command:
But the exception is occurring around 0x00200057, which is the infinite HLT/JMP loop that runs after main() has returned. The fact that the code in question is near a LGDT instruction doesn't mean the GDT is the problem.
I don't think that's accurate, actually. I think that what happened was that I made one post on this chain (containing the segfault address), modified the code a little, and made another containing the disassembly. The result was an inaccurate exception address.

After changing things up a little bit, here's a complete text showing where the segfault occurs:

Code:

Reading symbols from output/os.bin...
(gdb) start
Temporary breakpoint 1 at 0x201666: file ./kernel/kernel.c, line 17.
Starting program: /root/nox/output/os.bin

Program received signal SIGSEGV, Segmentation fault.
0x00200035 in _start ()
(gdb) disas
Dump of assembler code for function _start:
   0x00200030 <+0>:     mov    $0x207000,%esp
=> 0x00200035 <+5>:     lgdtl  0x201688
   0x0020003c <+12>:    mov    $0x10,%ax
   0x00200040 <+16>:    mov    %eax,%ds
   0x00200042 <+18>:    mov    %eax,%es
   0x00200044 <+20>:    mov    %eax,%fs
   0x00200046 <+22>:    mov    %eax,%gs
   0x00200048 <+24>:    mov    %eax,%ss
   0x0020004a <+26>:    ljmp   $0x8,$0x200051
   0x00200051 <+33>:    call   0x201660 <main>
   0x00200056 <+38>:    hlt
   0x00200057 <+39>:    jmp    0x200056 <_start+38>
End of assembler dump.
(gdb)
I've also aligned the GDT to 8 bytes, but that didn't seem to make any difference.
iansjack
Member
Posts: 4683
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Exception 13 thrown when initializing my kernel

Post by iansjack »

An interrupt can occur at any time (if interrupts are enabled).
MichaelPetch
Member
Posts: 772
Joined: Fri Aug 26, 2016 1:41 pm
Libera.chat IRC: mpetch

Re: Exception 13 thrown when initializing my kernel

Post by MichaelPetch »

Did you add an STI before you did the `lgdt` instruction? See my previous comment (there was an edit about the external IRQs not being set up in the IDT before receiving an external interrupt like the timer). You shouldn't be running with interrupts enabled until the LIDT is done AND you have proper interrupt handlers in the IDT for the external interrupts you are handling.

There are very few reasons you'd get a #GP exception on the LGDT instruction but if interrupts were on at that point it would make sense. I think that you have interrupts on and when running in the debugger it happens to be triggered on the LGDT rather than in the HLT loop when run outside the debugger.
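One way to follow the advice above is to mask, at the PIC, every IRQ line that has no handler yet, and only then enable interrupts. The sketch below just computes the two mask bytes; it is a hypothetical helper, not code from the kernel in question. In the 8259 mask register a set bit disables the line, and IRQ 2 is the cascade input from the slave PIC.

```c
#include <assert.h>
#include <stdint.h>

struct pic_masks {
    uint8_t master;  /* value for the master PIC data port (0x21) */
    uint8_t slave;   /* value for the slave PIC data port (0xA1) */
};

/* handled_irqs: bit n set means IRQ n already has a handler in the IDT.
 * Returns mask bytes in which every other line is disabled. */
struct pic_masks pic_masks_for(uint16_t handled_irqs) {
    struct pic_masks m;
    /* The cascade line (IRQ 2) must be open if any of IRQ 8..15 is used. */
    if (handled_irqs & 0xFF00u) {
        handled_irqs |= (uint16_t)(1u << 2);
    }
    m.master = (uint8_t)~(handled_irqs & 0xFFu);
    m.slave  = (uint8_t)~(handled_irqs >> 8);
    return m;
}
```

In a kernel, the two bytes would be written to ports 0x21 and 0xA1 with whatever port-output helper is available (the thread mentions `outb`, but its argument order is an assumption here) after the IDT is loaded and before executing STI. With only the keyboard (IRQ 1) handled, the master mask comes out as 0xFD, which keeps the timer on IRQ 0 masked.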
Kaius
Posts: 20
Joined: Sat Dec 31, 2022 10:59 am
Libera.chat IRC: Kaius

Re: Exception 13 thrown when initializing my kernel

Post by Kaius »

MichaelPetch wrote:Did you add an STI before you did the `lgdt` instruction? See my previous comment (there was an edit about the external IRQs not being set up in the IDT before receiving an external interrupt like the timer). You shouldn't be running with interrupts enabled until the LIDT is done AND you have proper interrupt handlers in the IDT for the external interrupts.

There are very few reasons you'd get a #GP exception on the LGDT instruction but if interrupts were on at that point it would make sense. I think that you have interrupts on and when running in the debugger it happens to be triggered on the LGDT rather than in the HLT loop when run outside the debugger.
To test, I just added `cli` right before `lgdt`, and the segfault actually triggered for the `cli`.

I thought it was a timing-based thing, so I made a ton of dummy instructions at the beginning of `_start` but it didn't throw an exception for any of them.

Additionally, it occurred to me that GDB might not put the binary into a proper multiboot environment, so I might be getting different results with it. I'll try to use GDB with Qemu and see if I get different results.

In case anyone is interested, here's my `boot.asm` file:

Code:

; Multiboot header
FLAGS    equ 0b00000111
MAGIC    equ 0x1BADB002
CHECKSUM equ -(MAGIC + FLAGS) ; checksum + (flags + magic) should be 0

section .multiboot
align 4
  dd MAGIC
  dd FLAGS
  dd CHECKSUM

  ; address fields (header_addr through entry_addr), unused because flag bit 16 is clear
  times 5 dd 0

  ; video output flags
  dd 1  ; type (1 = text mode)
  dd 80 ; width
  dd 25 ; height
  dd 0  ; depth, always 0 in text mode


section .bss
stack_bottom:
resb 16384 ; 16 KiB stack
stack_top:

section .gdt
align 8 ; must come after the section directive to apply to .gdt
%include "./boot/gdt.asm"

; this is unused currently for debugging
%include "./boot/idt.asm"

section .text
global _start:function (_start.end - _start)
_start:
  times 16 mov esp, stack_top ; these are dummy instructions to see if it was a timing thing, but the GPF tripped after all of these.

  cli ; if I remove this, the exception is thrown at lgdt

  ; load GDT
  lgdt [gdt]

  ; set up segments
  mov ax, DATA_SEG
  mov ds, ax
  mov es, ax
  mov fs, ax
  mov gs, ax
  mov ss, ax
  ; set CS
  jmp CODE_SEG:.call_main



.call_main:
  extern main

  call main

  ; halt
  cli
.hang:
  hlt
  jmp .hang
.end:
And my `gdt.asm`:

Code:

gdt_start:
  ; null entry
  dd 0
  dd 0

gdt_code:
  dw 0xffff    ; 00-15 limit
  dw 0         ; 16-31 base
  db 0         ; 32-39 base
  db 10011010b ; 40-47 access
  db 11001111b ; 48-55 limit (48-51) and flags (52-55)
  db 0         ; 56-63 base

gdt_data:
  dw 0xffff    ; 00-15 limit
  dw 0         ; 16-31 base
  db 0         ; 32-39 base
  db 10010010b ; 40-47 access
  db 11001111b ; 48-55 limit (48-51) and flags (52-55)
  db 0         ; 56-63 base

gdt_end:


gdt:
  dw gdt_end - gdt_start - 1 ; length
  dd gdt_start               ; address

CODE_SEG equ gdt_code - gdt_start
DATA_SEG equ gdt_data - gdt_start
MichaelPetch
Member
Posts: 772
Joined: Fri Aug 26, 2016 1:41 pm
Libera.chat IRC: mpetch

Re: Exception 13 thrown when initializing my kernel

Post by MichaelPetch »

Can you just update your github with all the changes you have made since your last update?