flexible or hard coded kernel data structures?

yemista · Post by **yemista** » Fri Jan 09, 2009 1:34 pm

I am just curious what others do, as I have now come to this design decision. I need a GDT, an IDT and some ISRs, but when I code the IDT, I need to know the offsets of the ISRs.
Now I'm wondering: it seems to be a good programming principle to write flexible code, but in this case a hard-coded solution would be easier. I would have to calculate offhand where I expect the ISRs to start, where the IDT will be and how big everything will be. If I make 256 descriptors and 256 empty ISRs, then even if I decide to add more interrupts in the future, I won't have to recompute the addresses of these structures. The worst-case scenario would be adding another entry to the GDT and having to push everything down 8 bytes. So it doesn't seem too bad, especially if I use macros so that I would only have to change a few hard-coded addresses. My only counterargument is that in principle it's good to write flexible code. How do you guys deal with this issue?

Craze Frog · Post by **Craze Frog** » Fri Jan 09, 2009 4:25 pm

I define these things statically. It uses less memory, less CPU and fewer lines of code, and it is just as easy to expand on.

yemista wrote: I would have to calculate off hand where I expect the ISR's to start, where the IDT will be, how big everything will be...

No, why would you do such a thing? We have symbolic assemblers for a reason.

Generate the ISRs with macros. You can add any number of them without recalculating anything:

Code:

; ------------------------------------------
; Exception/Interrupt Service Routine Stubs
; ------------------------------------------
section '.code'
macro ISR_ZEROERR [NUM] {
  isr#NUM:
        cli
        push 0
        push NUM
        jmp  isr_common_stub
}
macro ISR_CPUERR [NUM] {
  isr#NUM:
        cli
        push NUM
        jmp  isr_common_stub
}
; For the exceptions (the CPU pushes its own error code for 8, 10-14 and 17):
ISR_ZEROERR 0, 1, 2, 3, 4, 5, 6, 7, 9, 15, 16, 18
ISR_CPUERR  8, 10, 11, 12, 13, 14, 17
ISR_ZEROERR 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31
; For the interrupts:
ISR_ZEROERR 32, 33, 34, 35, 36, 37, 38, 39, 40
ISR_ZEROERR 41, 42, 43, 44, 45, 46, 47
; For the system call interface:
ISR_ZEROERR 48, 49, 50
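
Each stub above leaves two values on the stack before jumping to `isr_common_stub`: an error code (real or dummy) and the vector number. The thread does not show `isr_common_stub`, so the following is only a sketch of what a C-side dispatcher behind it could look like. The names `isr_register`, `isr_dispatch` and `record_handler` are hypothetical, and the table size of 51 is taken from the vectors 0 to 50 generated above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VECTORS 51  /* vectors 0..50, matching the stubs generated above */

typedef void (*isr_handler)(uint32_t vector, uint32_t error_code);

static isr_handler handlers[VECTORS];

/* Hypothetical helper: install a handler for one vector. */
static void isr_register(uint32_t vector, isr_handler fn)
{
    assert(vector < VECTORS);
    handlers[vector] = fn;
}

/* Assumed to be called from isr_common_stub with the two values the
 * stub pushed. Returns 1 if a handler ran, 0 otherwise. */
static int isr_dispatch(uint32_t vector, uint32_t error_code)
{
    if (vector >= VECTORS || handlers[vector] == NULL)
        return 0;
    handlers[vector](vector, error_code);
    return 1;
}

/* Illustration only: a handler that records what it was called with. */
static uint32_t last_vector, last_error;

static void record_handler(uint32_t vector, uint32_t error_code)
{
    last_vector = vector;
    last_error  = error_code;
}
```

Because every stub pushes the same two values in the same order, the dispatcher does not need to know which macro generated a given stub.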

Then you make your IDT. Make sure to repeat the entry the correct number of times to cover all the ISRs. Notice the symbol KERNEL_CODE_SEL: it will be computed later, so you don't have to compute it by hand and fill in the number here.

Code:

; ------------------------------------------
; Interrupt Descriptor Table
; ------------------------------------------
; idt_entry is
;     0 base_low.w
;     2 code_segment_selector.w
;     4 always0.b
;     5 flags.b
;     6 base_high.w
; end idt_entry = 8 bytes
section '.data' align 8
idtr:   dw idt_end - idt - 1    ; IDT limit
        dd idt                  ; address of IDT
; ------------------------------------------
macro make_idt_entry {
        dw 0 ; base_low: isr#num and 0xFFFF, filled in at run-time
        dw KERNEL_CODE_SEL
        db 0
        db 0x8E ; Access flags: Running in kernel mode (| 0x60 for usermode)
        dw 0 ; base_high: (isr#num shr 16) and 0xFFFF, filled in at run-time
}
align 8
idt:    repeat 32+16+3 ; exceptions+interrupts+syscall
                make_idt_entry
        end repeat
idt_end:

Because the ELF object format doesn't support shifted symbols (in expressions), the entries have to be filled in at run-time. That applies if you want to assemble to an object file and link it with other object files; if you assemble directly to the final executable, you can probably fill them in statically. Still, setting the entries is almost automatic with the macro:

Code:

; IDT -> ISR installation
macro set_idt_entry [num] {
        ; base_lo = isr#num and 0xFFFF
        mov     eax, isr#num
        mov     [idt+num*8], ax
        ; base_hi = (isr#num shr 16) and 0xFFFF
        shr     eax, 16
        mov     [idt+num*8+6], ax
}
section '.code'
idt_install:
public idt_install
        ; remap irq here

        ; connect idt entries to isrs
        set_idt_entry 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
        set_idt_entry 11, 12, 13, 14, 15, 16, 17, 18, 19
        set_idt_entry 20, 21, 22, 23, 24, 25, 26, 27, 28, 29
        set_idt_entry 30, 31, 32, 33, 34, 35, 36, 37, 38, 39
        set_idt_entry 40, 41, 42, 43, 44, 45, 46, 47, 48, 49
        set_idt_entry 50
        
        ; Change access rights on int 50 to ring 3 so we can communicate
        ; with the kernel through this interrupt
        or byte [idt+50*8+5], $60

        lidt    [idtr]
ret
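
For readers who do this step from C rather than with FASM macros, here is a minimal sketch of what `set_idt_entry` and the final `or byte [...], $60` amount to. The struct mirrors the `idt_entry` layout described in the comments above. The names `idt_gate`, `idt_set_gate`, `idt_allow_ring3` and `idt_limit` are made up for this illustration, and the table size follows the `repeat 32+16+3`.

```c
#include <assert.h>
#include <stdint.h>

/* Same layout as the idt_entry comment above: 8 bytes per gate. */
struct idt_gate {
    uint16_t base_low;   /* handler address, bits 0..15  */
    uint16_t selector;   /* kernel code segment selector */
    uint8_t  always0;
    uint8_t  flags;      /* 0x8E = present, ring 0, 32-bit interrupt gate */
    uint16_t base_high;  /* handler address, bits 16..31 */
};
static_assert(sizeof(struct idt_gate) == 8, "an IDT gate must be 8 bytes");

#define IDT_GATES (32 + 16 + 3)  /* exceptions + interrupts + syscall */
static struct idt_gate idt[IDT_GATES];

/* Hypothetical helper: split a 32-bit handler address across both halves. */
static void idt_set_gate(unsigned num, uint32_t handler, uint16_t selector)
{
    assert(num < IDT_GATES);
    idt[num].base_low  = (uint16_t)(handler & 0xFFFF);
    idt[num].selector  = selector;
    idt[num].always0   = 0;
    idt[num].flags     = 0x8E;
    idt[num].base_high = (uint16_t)(handler >> 16);
}

/* Hypothetical helper: raise the gate's DPL to 3 so user code may use it. */
static void idt_allow_ring3(unsigned num)
{
    assert(num < IDT_GATES);
    idt[num].flags |= 0x60;
}

/* Value for the limit word of idtr: table size in bytes minus one. */
static uint16_t idt_limit(void)
{
    return (uint16_t)(sizeof(idt) - 1);
}
```

Calling `idt_set_gate(50, handler, selector)` followed by `idt_allow_ring3(50)` leaves the flags byte of gate 50 at 0xEE, the same result as the `or` instruction in the listing.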

The GDT is simple. If things are moved around, no problem: the selector constants are updated automatically. The +3 sets the selector's requested privilege level (RPL) to 3, which means user mode.

Code:

; ------------------------------------------
; Global Descriptor Table
; ------------------------------------------
section '.code'
public gdt_install
gdt_install:
        lgdt [gdtr]
        mov  ax, KERNEL_DATA_SEL
        mov  ds, ax
        mov  es, ax
        mov  fs, ax
        mov  gs, ax
        mov  ss, ax
        jmp  KERNEL_CODE_SEL:@f
@@:
        call tss_install
        ret
section '.data' align 8
; gdt_entry is
;       limit_low.w
;       base_low.w
;       base_middle.b
;       access.b
;       granularity.b
;       base_high.b
; end gdt_entry = 8 bytes
gdtr:   dw gdt_end - gdt - 1    ; GDT limit
        dd gdt                  ; address of GDT
; ------------------------------------------
align 8
; null descriptor
gdt:    dw 0
        dw 0
        db 0
        db 0
        db 0
        db 0
; kernel code
KERNEL_CODE_SEL    =  $-gdt   ; Segment selector
gdt2:   dw 0xFFFF               ; limit 0xFFFFF
        dw 0                    ;
        db 0                    ;
        db 0x9A                 ; present, ring 0, code, readable
        db 0xCF                 ; page-granular, 32-bit
        db 0                    ;
; kernel data
KERNEL_DATA_SEL    =  $-gdt     ; Segment selector
gdt3:   dw 0xFFFF               ; limit 0xFFFFF
        dw 0                    ; base 0
        db 0
        db 0x92                 ; present, ring 0, data, writable
        db 0xCF                 ; page-granular, 32-bit
        db 0
; user code
USER_CODE_SEL      =  $-gdt+3   ; Segment selector
gdt4:   dw 0xFFFF
        dw 0
        db 0
        db 0xFA                 ; present, ring 3, code, readable
        db 0xCF
        db 0
; user data
USER_DATA_SEL      =  $-gdt+3
gdt5:   dw 0xFFFF
        dw 0
        db 0
        db 0xF2                 ; present, ring 3, data, writable
        db 0xCF
        db 0
; user TSS
USER_TSS_SEL       =  $-gdt
gdt6:   dw 103
        dw 0                    ; set to tss
        db 0
        db 0xE9                 ; present, ring 3, 32-bit available TSS
        db 0
        db 0
gdt_end:
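
As a cross-check of the bytes in the listing, here is a small C sketch that packs a descriptor from its base, limit, access byte and flag nibble, and computes a selector from a GDT index and a privilege level. The struct follows the `gdt_entry` comment above. `gdt_make` and `gdt_selector` are hypothetical helper names, not part of the code in this post.

```c
#include <assert.h>
#include <stdint.h>

/* Same field order as the gdt_entry comment above. */
struct gdt_entry {
    uint16_t limit_low;
    uint16_t base_low;
    uint8_t  base_middle;
    uint8_t  access;
    uint8_t  granularity;  /* high nibble: flags, low nibble: limit bits 16..19 */
    uint8_t  base_high;
};
static_assert(sizeof(struct gdt_entry) == 8, "a GDT descriptor must be 8 bytes");

/* Hypothetical helper: pack base, 20-bit limit, access byte and flag nibble. */
static struct gdt_entry gdt_make(uint32_t base, uint32_t limit,
                                 uint8_t access, uint8_t flags)
{
    struct gdt_entry e;
    e.limit_low   = (uint16_t)(limit & 0xFFFF);
    e.base_low    = (uint16_t)(base & 0xFFFF);
    e.base_middle = (uint8_t)((base >> 16) & 0xFF);
    e.access      = access;
    e.granularity = (uint8_t)(((flags & 0x0F) << 4) | ((limit >> 16) & 0x0F));
    e.base_high   = (uint8_t)(base >> 24);
    return e;
}

/* Selector = byte offset of the descriptor in the GDT, with the RPL in
 * bits 0..1 (the table-indicator bit stays 0 for the GDT). */
static uint16_t gdt_selector(unsigned index, unsigned rpl)
{
    return (uint16_t)(index * sizeof(struct gdt_entry) + (rpl & 3));
}
```

With the layout above, `gdt_selector(1, 0)` gives 0x08 for KERNEL_CODE_SEL and `gdt_selector(3, 3)` gives 0x1B for USER_CODE_SEL, which is what `$-gdt` and `$-gdt+3` evaluate to at those positions.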

yemista · Post by **yemista** » Tue Jan 20, 2009 8:52 am

Hello, I tried using some of your code for setting up the IDT and I noticed I am getting a segmentation fault, but only when interrupts are enabled. I am still trying to figure out how to debug interrupts, but my thought is: does it have anything to do with overwriting the Intel-reserved interrupts? It seems that your code does that.

neon · Post by **neon** » Tue Jan 20, 2009 9:28 am

My real system generates the tables dynamically at runtime. I personally feel that this improves the code in several ways, but it comes with some additional complexity. I'm not a fan of hard coding.

Combuster · Post by **Combuster** » Tue Jan 20, 2009 11:28 am

I don't need to hardcode IDTs either, even though my kernel is in assembly. And it's not like my setup code is any dark magic.

kasper · Post by **kasper** » Wed Jan 21, 2009 10:40 am

Hi,

Neon, Combuster, do the contents of your (initial) GDT and IDT change depending on runtime conditions? Or what's the reason that you choose to build those tables at runtime?

With kind regards,
Kasper

Revelation · Post by **Revelation** » Wed Jan 21, 2009 10:57 am

Craze Frog, that's a very nice macro!

If you create a loop, it can be even simpler:

Code:

%macro create_common_isr 1-*
  %rep  %0

  global isr%1
  isr%1:
        cli
        push 0                          ; push a dummy error code
        push %1
        jmp  isr_common_stub

  %rotate 1
  %endrep
%endmacro

%macro create_isr_with_errcode 1-*
  %rep  %0

  global isr%1
  isr%1:
        cli
        nop                             ; pad so both stub kinds have the same size
        nop                             ; (assumes `push 0` assembles to 5 bytes)
        nop
        nop
        nop
        push %1
        jmp  isr_common_stub

  %rotate 1
  %endrep
%endmacro

create_common_isr 0, 1, 2, 3, 4, 5, 6, 7
create_isr_with_errcode 8
create_common_isr 9
create_isr_with_errcode 10, 11, 12, 13, 14

%macro fill_idt 0
%assign i 15
%rep    60 - 15                         ; stubs for vectors 15..59
  %if i = 17                            ; alignment check pushes its own error code
        create_isr_with_errcode i
  %else
        create_common_isr i
  %endif
%assign i i+1
%endrep
%endmacro

fill_idt

This is just (untested) example code, so it doesn't consider error codes. Also, check out my code to efficiently register the interrupts in C.

EDIT: I rewrote the code. This is the working NASM version. It registers the first 60 ISRs.
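
The post does not say why both stub kinds are padded to the same size. One plausible use, sketched here under that assumption, is that the address of any stub can then be computed from the address of `isr0` instead of being looked up by name. `STUB_SIZE` is a placeholder: the real value depends on how the assembler encodes the instructions and would have to be measured (for example as `isr1 - isr0`). `stub_address` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

#define STUB_SIZE  16u  /* ASSUMPTION: bytes per stub, must be measured */
#define STUB_COUNT 60u  /* "the first 60 ISRs" */

/* Address of the stub for `vector`, given the address of isr0.
 * Only valid if all stubs are emitted back to back, in vector order,
 * and every stub occupies exactly STUB_SIZE bytes. */
static uint32_t stub_address(uint32_t isr0_address, unsigned vector)
{
    assert(vector < STUB_COUNT);
    return isr0_address + vector * STUB_SIZE;
}
```

A loop over `stub_address(isr0, n)` could then fill all IDT entries without naming each `isrN` symbol individually.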

Combuster · Post by **Combuster** » Wed Jan 21, 2009 4:10 pm

kasper wrote: Neon, Combuster, do the contents of your (initial) GDT and IDT change depending on runtime conditions? Or what's the reason that you choose to build those tables at runtime?

In my case, several. Mainly since neither table is constant, even after initial startup.

IDT:
- Generating 256 entries takes less space in the binary than hardcoding them in.
- It potentially needs specific alignment (F00F bug-related issue)
- I want to support both PIC and IOAPIC modes.
- I want to allow userspace applications to handle a subset of exceptions (the benign ones are safe to point anywhere as long as the kernel doesn't need them.) That means a new instance of the IDT has to be created for that address space.

GDT:
- Each processor has its own TSS. That means the size of the GDT isn't constant (or it would have to be huge).
- I want to support segmentation, for small address spaces, and for allowing unpaged segmented memory if the application so desires.
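
To illustrate the point about one TSS per processor, here is a rough sketch of how the GDT size and the TSS selectors could be derived from the CPU count at runtime. This is not Combuster's code. It assumes a layout with five fixed descriptors (null, kernel code/data, user code/data, as in Craze Frog's GDT) followed by one TSS descriptor per CPU, and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define FIXED_DESCRIPTORS 5u  /* ASSUMED: null, kernel code/data, user code/data */
#define DESCRIPTOR_SIZE   8u

/* Bytes needed for a GDT holding the fixed descriptors plus one TSS per CPU. */
static uint32_t gdt_bytes(unsigned cpu_count)
{
    return (FIXED_DESCRIPTORS + cpu_count) * DESCRIPTOR_SIZE;
}

/* Value for the limit word of gdtr: table size in bytes minus one. */
static uint16_t gdt_limit(unsigned cpu_count)
{
    assert(gdt_bytes(cpu_count) <= 65536u);  /* a GDT holds at most 8192 descriptors */
    return (uint16_t)(gdt_bytes(cpu_count) - 1);
}

/* Selector of the TSS descriptor that belongs to a given CPU. */
static uint16_t tss_selector(unsigned cpu)
{
    return (uint16_t)((FIXED_DESCRIPTORS + cpu) * DESCRIPTOR_SIZE);
}
```

On a four-CPU machine this gives a 72-byte table, whereas a static table would have to reserve room for the largest CPU count the kernel ever expects to see.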

Hyperdrive · Post by **Hyperdrive** » Thu Jan 22, 2009 4:13 am

Hm...

I personally don't like the code duplication when you create 256 ISR stubs, though each one is very small. I (again personally) find it better to have only one function that all IDT descriptors refer to.

The only question is how you know which vector invoked your ISR handler. A possible solution is to create a copy of the kernel code segment descriptor for every IDT descriptor you have and let each IDT descriptor select its own copy. When the ISR routine is called, you can very easily derive the interrupt vector from the value in the CS register.

The setup is not too complicated.

Put your kernel/user descriptors into the GDT.
Append 256 copies of your kernel code segment descriptor.
Generate the 256 entry IDT...

Code:

#   | GDT                        | IDT  
=============================================================
0   | null descriptor            | segment=6,  offset=funcptr
1   | kernel code segment        | segment=7,  offset=funcptr
2   | kernel data segment        | segment=8,  offset=funcptr
3   | user code segment          | segment=9,  offset=funcptr
4   | user data segment          | segment=10, offset=funcptr
5   | tss descriptor             | segment=11, offset=funcptr
6   | kernel code segment copy   | segment=12, offset=funcptr
7   | kernel code segment copy   | segment=13, offset=funcptr
... | ...                        | ...
255 | kernel code segment copy   | segment=261, offset=funcptr
... | ...                        |============================
261 | kernel code segment copy   |
==================================

Those are simple loops, and there is (as I said) only one ISR handler function (maybe two: one with an error code, one without). No code duplication. The dispatching is now more like a jump table, which I think is nicer.
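
The arithmetic behind this scheme can be sketched in C as follows. It assumes the layout from the table above, where the code segment copy for vector 0 sits in GDT slot 6, and it relies on a selector being the GDT index shifted left by 3 with the privilege and table-indicator bits in the low 3 bits. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define FIRST_COPY_INDEX 6u    /* GDT slot of the first code segment copy */
#define VECTOR_COUNT     256u

/* Selector that the IDT gate for `vector` would carry: GDT index
 * 6 + vector, RPL 0, table indicator 0. */
static uint16_t selector_for_vector(unsigned vector)
{
    assert(vector < VECTOR_COUNT);
    return (uint16_t)((FIRST_COPY_INDEX + vector) << 3);
}

/* Recover the vector from the CS value seen inside the handler:
 * drop the low 3 bits to get the GDT index, then subtract the
 * index of the first copy. */
static unsigned vector_from_cs(uint16_t cs)
{
    return (unsigned)(cs >> 3) - FIRST_COPY_INDEX;
}
```

So the shared handler needs one shift and one subtraction to find out which vector fired, at the cost of 256 extra GDT descriptors (2 KiB).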

One possible downside is that this is not as portable, which may or may not be an issue.

I would like to hear your opinions about this technique. What do you think? Better or worse in speed? Better or worse in memory size? Is it worth thinking about at all?

Regards,
Thilo

P.S.: Mods, please feel free to split/move this thread, as this could be a stand-alone discussion thread and/or may fit better in the design forum.

OSDev.org
