i386-specific single-step debugging.

Posted: Sun Jan 30, 2011 11:06 pm
by Sergio
Hello.
I want to try building a simple debugging system into my kernel.
According to Intel's manuals, setting a specific EFLAGS flag (the trap flag, TF) makes the CPU generate a debug exception (int 1) after each instruction. So I went off to do some testing. I installed a handler at vector 1; it only prints the values pushed on the stack when the exception is generated, and then returns. The EIP value pushed on the stack is that of the instruction following the one that generated the exception. I assume that the single-step flag is turned off when the exception is generated and is set again by the iret instruction. But after the iret, the CPU apparently does not execute the following instruction, because every EIP that I print is exactly 2 bytes greater than the last one. I know that x86 instructions do not have a fixed length, so I don't see why every EIP printed is 2 bytes greater. Sometimes I get an invalid-opcode exception, so maybe iret is not returning where it should?
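
To make the flag concrete, here is a minimal sketch in C of the trap flag handling, assuming the handler can read and modify the EFLAGS image that the CPU pushed on the stack. TF is bit 8 of EFLAGS; the helper names are hypothetical and are not part of the kernel discussed in this thread.

```c
#include <stdint.h>

/* EFLAGS.TF (Trap Flag) is bit 8: when set, the CPU raises a debug
 * exception (vector 1) after the next instruction completes. */
#define EFLAGS_TF 0x00000100u

/* Hypothetical helpers: they operate on the EFLAGS image saved on the
 * stack, which IRET later loads back into the real EFLAGS register. */
static inline uint32_t eflags_enable_step(uint32_t saved_eflags)
{
    return saved_eflags | EFLAGS_TF;
}

static inline uint32_t eflags_disable_step(uint32_t saved_eflags)
{
    return saved_eflags & ~EFLAGS_TF;
}

static inline int eflags_is_stepping(uint32_t saved_eflags)
{
    return (saved_eflags & EFLAGS_TF) != 0;
}
```

Leaving TF set in the saved image keeps single-stepping going after the iret; clearing it there stops the stepping.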
The handler looks like:

Code:

excp_serv:
	movl %dr6, %eax
	shrl $14, %eax
	andl $1, %eax
	testl %eax, %eax
	jz excp_ret
	call sstep
excp_ret:
	movb int_num, %al # load interrupt number
	decb %al # decrement
	movb %al, int_num # save interrupt number
	sti
	iret # return from interrupt
So it tests the single-step bit in %dr6, calls sstep (a C function that prints the values on the stack), and then does an iret.

I think I need some hints for this specific problem.
Thanks in advance.

Re: i386-specific single-step debugging.

Posted: Mon Jan 31, 2011 12:30 am
by Brendan
Hi,
Sergio wrote:The handler looks like:

Code:

excp_serv:
	movl %dr6, %eax
	shrl $14, %eax
	andl $1, %eax
	testl %eax, %eax
	jz excp_ret
	call sstep
excp_ret:
	movb int_num, %al # load interrupt number
	decb %al # decrement
	movb %al, int_num # save interrupt number
	sti
	iret # return from interrupt
As an interrupt handler, that's borked. It doesn't save/restore registers it uses/trashes (including the registers that the "sstep" function trashes) and there are no parameters passed (correctly) to the "sstep" function. Apart from that, it does STI for no reason, and you could use "testl $0x00004000, %eax" and "decb int_num" to remove other instructions.
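
As a rough illustration of why the single "testl" is enough, here is a small C sketch comparing the shift-and-mask test from the handler with a direct mask of DR6 bit 14 (the BS, single-step, bit). The function names are made up for this example.

```c
#include <stdint.h>

/* DR6.BS (bit 14) is set when the debug exception was caused by
 * single-stepping (EFLAGS.TF). */
#define DR6_BS 0x00004000u

/* Shift-and-mask form, mirroring shrl $14 / andl $1 / testl. */
static inline int dr6_bs_shifted(uint32_t dr6)
{
    return (int)((dr6 >> 14) & 1u);
}

/* Single-mask form, mirroring testl $0x00004000, %eax. */
static inline int dr6_bs_masked(uint32_t dr6)
{
    return (dr6 & DR6_BS) != 0;
}
```

Both functions give the same answer for every DR6 value, so the shorter form loses nothing.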

None of these mistakes explain why every EIP printed is 2 bytes greater. However...

That works out to about 5 mistakes in 13 lines of code. The code you're interrupting is probably another 10 lines and the "sstep" function is probably another 10 lines. Then there's the code that "sstep" calls to display stuff on the screen, the boot/initialisation code and anything else you're relying on (which may include other interrupt handlers, exception handlers, a scheduler, etc). If I estimate that there's probably 1300 lines of code you're relying on, then (based on the statistics for the code I've seen) I can estimate there's probably 500 other mistakes elsewhere in code that I haven't seen. At this point my only recommendation is to forget about the "every EIP printed is 2 bytes greater" problem (and single-stepping in general) and start a massive code audit in an attempt to find the other 499 mistakes you haven't noticed yet. ;)


Cheers,

Brendan

Re: i386-specific single-step debugging.

Posted: Mon Jan 31, 2011 12:44 am
by Sergio
Sorry about just throwing that interrupt handler...
Here is the complete interrupt/exception handling mechanism:
I thought many of the issues involved in interrupt/exception handling were implied in the little segment I copied before.
The main idea is that the interrupt/exception will land at the correct offset from exptn_handls or int_handls, thus storing the vector in handl_arg.

Code:

/*
 * This code IS NOT ACCESSIBLE FROM THIS FILE/PROGRAM.
 * We will land here when receiving an exception or interrupt while in
 * protected-mode. This code is used by the kernel to do IO from BIOS
 * while device-drivers are not loaded.
 * We use the stack to save the machine state when receiving a signal, but
 * because stack space is limited, only a finite number of nested interrupts
 * are allowed.
 */
.code32
.set PG_ENABLED, ~0x7fffffff
.set PG_DISABLE, 0x7fffffff
.set PROT_ENABLED, 0x1
.set PROT_DISABLE, ~0x1
.text
/*
 * Generic handler for exceptions.
 * Chaos start!
 */
exptn_handls:
	cli # This must be the first instruction executed
	movb $0, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $1, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $2, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $3, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $4, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $5, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $6, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $7, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $8, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $9, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $10, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $11, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $12, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $13, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $14, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $15, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $16, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $17, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $18, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $19, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $20, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $21, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $22, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $23, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $24, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $25, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $26, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $27, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $28, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $29, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $30, handl_arg
	jmp go16excps
	cli # This must be the first instruction executed
	movb $31, handl_arg
	jmp go16excps
/*
 * Generic handlers for interrupts.
 * Chaos start!
 */
int_handls:
	cli # This must be the first instruction executed
	movb $32, handl_arg
	jmp go16ints
	cli # This must be the first instruction executed
	movb $33, handl_arg
	jmp go16ints
	cli # This must be the first instruction executed
	movb $34, handl_arg
	jmp go16ints
	cli # This must be the first instruction executed
	movb $35, handl_arg
	jmp go16ints
	cli # This must be the first instruction executed
	movb $36, handl_arg
	jmp go16ints
	cli # This must be the first instruction executed
	movb $37, handl_arg
	jmp go16ints
	cli # This must be the first instruction executed
	movb $38, handl_arg
	jmp go16ints
	cli # This must be the first instruction executed
	movb $39, handl_arg
	jmp go16ints
	cli # This must be the first instruction executed
	movb $40, handl_arg
	jmp go16ints
	cli # This must be the first instruction executed
	movb $41, handl_arg
	jmp go16ints
	cli # This must be the first instruction executed
	movb $42, handl_arg
	jmp go16ints
	cli # This must be the first instruction executed
	movb $43, handl_arg
	jmp go16ints
	cli # This must be the first instruction executed
	movb $44, handl_arg
	jmp go16ints
	cli # This must be the first instruction executed
	movb $45, handl_arg
	jmp go16ints
	cli # This must be the first instruction executed
	movb $46, handl_arg
	jmp go16ints
	cli # This must be the first instruction executed
	movb $47, handl_arg
	jmp go16ints

/*
 * Save machine state before interrupt.
 * Set processor in 16-bit real-mode and set the idtr to our original,
 * BIOS-oriented interrupt vector.
 */
go16ints:
	movb $1, int_or_excpn # signal is an interrupt
	jmp 2f
go16excps:
	movb $0, int_or_excpn # signal is an exception
1:
	cmpb $0x01, handl_arg
	je 2f
	cmpb $0x10, handl_arg
	je 2f
	cmpb $0x13, handl_arg
	je 2f
	cmpb $0x15, handl_arg
	je 2f
	cmpb $25, handl_arg
	je 2f
	cmpb $14, handl_arg // page fault?
	je 2f
	popl %eax
	popl %ebx
	popl %ecx
	pushl $excpdump_buf
	pushl handl_arg
	pushl %eax
	pushl %ebx
	pushl %ecx
	call excpdump32
	pushl $excpdump_buf
	call putstr_low
	jmp .
2:
	cmpb $NUM_NEST_INTS, int_num # max number of nested interrupts reached?
	ja end_caos # yes; ignore interrupt
	incb int_num # increment interrupt number
	cmpb $0x01, handl_arg # is it the debug exception?
	je excp_serv # yes
	cmpb $0x15, handl_arg # is it an extended (int 0x15) service?
	je ext_serv # yes
	cmpb $0x10, handl_arg # is it video?
	je vid_serv # yes
	cmpb $0x13, handl_arg # is it disk?
	je disk_serv # yes
	cmpb $0x21, handl_arg # is it a key press/release?
	je key_serv # yes
	cmpb $25, handl_arg # temporary handler for memory copy
	je temp_serv # yes
	cmpb $14, handl_arg # is it a page fault?
	je page_serv # yes
	jmp 2f
page_serv:
	movl %cr2, %eax # faulting linear address
	pushl %eax # as argument
	call page_serv_high
	addl $8, %esp
	jmp no_pag2
ext_serv:
	movl $ext_bios_regs, %esi # load new stack
	movl %eax, (%esi) # save registers buffer
	movl %ebx, 4(%esi) # save registers buffer
	movl %ecx, 8(%esi) # save registers buffer
	movl %edx, 12(%esi) # save registers buffer
	pushl %ebp # form
	movl %esp, %ebp #  stack frame
	movl $ext_mem_ents, addr_copy # pointer to buffer
	movl $addr_copy, %esi
	movl $(ext_mem_gdt_real+18), %edi # point to source address in struct
	movl $3, %ecx
	rep
	movsb
	movl $(ext_mem_gdt_real+26), %edi # point to destination address in struct
	movl 20(%ebp), %eax
	movl %eax, addr_copy # destination address
	movl $addr_copy, %esi
	movl $3, %ecx
	rep
	movsb
	popl %ebp
	jmp 2f

/*
 * Stack if we get here:
 * 4(%ebp): EIPlow, 8(%ebp): CS, 12(%ebp): EFLAGS, 16(%ebp): EIPtemp, 
 * 20(%ebp): pointer buffer, 24(%ebp): struct size.
 */
temp_serv:
	pushl %ebp # form
	movl %esp, %ebp #  stack frame
	pushl 24(%ebp) # save...
	popl kargs_size #  ...and restore size of structure
	movl $kern_args, %eax
	movl %eax, addr_copy
	movl $addr_copy, %esi # pointer to buffer
	movl $(ext_mem_gdt_real+18), %edi # point to source address in struct
	movl $3, %ecx
	rep
	movsb
	movl $(ext_mem_gdt_real+26), %edi # point to destination address in struct
	movl 20(%ebp), %eax
	movl %eax, addr_copy
	movl $addr_copy, %esi # destination address
	movl $3, %ecx
	rep
	movsb
	popl %ebp
	jmp 2f
/*
 * Protected-mode passes an extended disk packet through the stack and the
 * extended function through %ah.
 */
disk_serv:
	movl $ext_bios_regs, %esi # load new stack
	movl %eax, (%esi) # save registers buffer
	movl %ebx, 4(%esi) # save registers buffer
	movl %ecx, 8(%esi) # save registers buffer
	movl %edx, 12(%esi) # save registers buffer
	pushl %ebp # form
	movl %esp, %ebp #  stack frame
	movl 28(%ebp), %esi # pointer to extended disk packet
	movl $ext_das_struct, %edi # copy it to
	movl $16, %ecx # copy 16 bytes
	rep
	movsb
	movl $ext_das_struct, %edi # pointer to extended disk packet
	movl $ext_rw_buf, 4(%edi) # copy segment:offset
	movl $(ext_mem_gdt_real+18), %edi # point to source address in struct
	movl $ext_rw_buf, addr_copy
	movl $addr_copy, %esi
	movl $3, %ecx
	rep
	movsb
	movl $0, addr_copy
	movl $(ext_mem_gdt_real+26), %edi # point to destination address in struct
	movl 24(%ebp), %eax
	movl %eax, addr_copy # destination address
	movl $addr_copy, %esi # destination address
	movl $3, %ecx
	rep
	movsb
	popl %ebp
	jmp 2f
/*
 */
vid_serv:
	pushl %ebp # temporary
	movl %esp, %ebp #  stack frame
	cmpl $0, 20(%ebp) # is argument 0?
	je no_str
	movl 20(%ebp), %eax
	movl %eax, ptr_str_term # save pointer to string
	jmp str
no_str:
	movl $0, ptr_str_term # pointer is NULL
str:
	popl %ebp # restore frame
	popl %eax # pop EIP
	popl %ebx # pop CS AND don't push it again
	popl %ecx # pop EFLAGS AND don't push it again
	pushl %eax # push EIP
	pushl %ecx # push EFLAGS
	pushl %ebx # push CS
	pushl $term_handl_scr # push EIP
	jmp no_pag2
/*
 * Replace the return address to jump to the terminal handler.
 */
key_serv:
	popl %eax # pop EIP
	popl %ebx # pop CS AND don't push it again
	popl %ecx # pop EFLAGS AND don't push it again
	pushl %eax # push EIP
	pushl %ecx # push EFLAGS
	pushl %ebx # push CS
	pushl $term_handl_kbd # push EIP
	call new_int09_32
	jmp no_pag2
excp_serv:
	movl %dr6, %eax
	shrl $14, %eax
	andl $1, %eax
	testl %eax, %eax
	jz excp_ret
	call sstep
excp_ret:
	movb int_num, %al # load interrupt number
	decb %al # decrement
	movb %al, int_num # save interrupt number
	sti
	iret # return from interrupt
2:
	pusha # save 
	pushf # save 
	pushl %ds # save 
	pushl %es # save 
	pushl %ss # save 
	pushl %gs # save 
	pushl %fs # save 
	movl $MEM_TSS, %edi # pointer to task segment
	movl %cr3, %eax # load page directory address register
	movl %esp, 56(%edi) # save stack pointer to TSS
	movl %eax, 32(%edi) # save page directory address to TSS
	movl %cr0, %eax # load status register
	testl $PG_ENABLED, %eax
	jz no_pag
	movb $1, page_on
	andl $PG_DISABLE, %eax
	movl %eax, %cr0
no_pag:
	ljmp $RCODE_SEGSEL, $segtion # to 16-bit segment
.code16
segtion:
	movl $RDATA_SEGSEL, %edx
	movl %edx, %ss # data segment
	movl %edx, %ds # data segment
	movl %edx, %es # data segment
	movl %edx, %fs # data segment
	movl %edx, %gs # data segment
	lidt ivtreg # load the real-mode Interrupt Vector Table
	movl %cr0, %eax
	andl $PROT_DISABLE, %eax # clear PE to leave protected mode
	movl %eax, %cr0
	ljmp $0, $end_go16
end_go16:
	xorw %ax, %ax # reset segment registers
	movw %ax, %ss
	movw %ax, %ds
	movw %ax, %es
	movw $STACK, %sp # load new stack pointer
	sti # enable interrupts once again
	movb int_or_excpn, %al # load variable
	testb %al, %al # is it an interrupt?
	jnz 2f # yes
end_go16.1:
	jmp int16_excpns # it is an exception
2:
end_go16.2:
	jmp int16_ints # it is an interrupt
.code32
reentry_eip:
	nop # just in case...
	movl $SDATA_SEGSEL, %eax # protected-mode segments
	movl %eax, %ss
	movl %eax, %ds
	movl %eax, %es
	movl %eax, %gs
	movl %eax, %fs
	movl $MEM_TSS, %edi # point to TSS
	movl 56(%edi), %esp # restore stack pointer
	movl 32(%edi), %eax # load page directory address
	movl %eax, %cr3 # restore page directory address
	movb page_on, %al
	testb %al, %al
	jz no_pag1
	movl %cr0, %eax # load register
	orl $PG_ENABLED, %eax # PG on
	movl %eax, %cr0 # restore paging
no_pag1:
	movb $0, page_on # assume no paging for next interrupt
	popl %fs # restore
	popl %gs # restore
	popl %ss # restore
	popl %es # restore
	popl %ds # restore
	popf # restore
	popa # restore
no_pag2:
	movb int_num, %al # load interrupt number
	decb %al # decrement
	movb %al, int_num # save interrupt number
end_caos:
	sti
	iret # return from interrupt
There are some interrupts that still need to use the BIOS (for example for reading/writing to the disk). In that case, fall back to real-mode and disable paging, and jump to here:

Code:

/*
 * When an interrupt or exception is generated in protected-mode, control
 * will ultimately lead here. We are already in real-mode and have the vector
 * number in the handl_arg variable.
 */
int16_excpns:
	movb handl_arg, %al # test if not really an exception
	cmpb $0x10, %al
	je 1f
	cmpb $0x13, %al
	je 2f
	cmpb $0x15, %al
	je 3f
	cmpb $25, %al
	je 4f
1:
	movw $REAL_STRING_BUF, %si
	int $0x10
	jmp end_int16
2:
	movw $ext_bios_regs, %si
	movl (%si), %eax
	movl 4(%si), %ebx
	movl 8(%si), %ecx
	movl 12(%si), %edx
	pushw %es # save segment
	pushw $0 # new segment
	popw %es # load it
	movw $ext_das_struct, %si # das packet
	movb drive_num, %dl # drive number
	int $0x13 # extended read or write
	jc error
	movb $0x87, %ah
	movw $0x100, %cx
	movw $ext_mem_gdt_real, %si	
	int $0x15
	popw %es # restore segment
	jmp end_int16
3:
	movw $ext_bios_regs, %di
	movl (%di), %eax
	movl 4(%di), %ebx
	movl 8(%di), %ecx
	movl 12(%di), %edx
	pushw %es
	pushw $0
	popw %es
	xorl %ebx, %ebx
	movl $0x534d4150, %edx # magic number
	movl $0xe820, %eax # function number
	movl $24, %ecx # number of bytes
	movw $ext_mem_ents, %di # buffer to store entries
	movl $1, 20(%di) # make it ACPI 3.0 compatible
mem_call:
	int $0x15 # return extended memory
	movl $0xe820, %edx # value will be exchanged
	xchgl %eax, %edx # exchange values
	cmpl $0x534d4150, %edx # must be this magic value
	jne error
	movl $0, 24(%di) # NULL pointer
	addw $28, %di # next entry (acpi 3.0)
	movl $24, %ecx
	testl %ebx, %ebx # last entry?
	jnz mem_call # no
	movl $0x12345, 20(%di)
end_mem_call:
	movb $0x87, %ah
	xorb %al, %al
	movw ext_mem_ents, %cx
	movw $ext_mem_gdt_real, %si	
	int $0x15
	popw %es # restore segment
	jmp end_int16
4:
	pushw %es
	pushw $0
	popw %es
	movb $0x87, %ah
	movw kargs_size, %cx
	movw $ext_mem_gdt_real, %si	
	int $0x15
	popw %es # restore segment
	jmp end_int16
int16_ints:
	xorl %eax, %eax
	xorl %ebx, %ebx
	movb handl_arg, %al
	subb $32, %al
	movb $4, %bl
	mull %ebx
	addw $int16_init.1, %ax
	jmp *%ax
int16_init.1:
	int $32 # IRQ0: 18.2hz timer tick
	jmp end_int16
	int $33 # IRQ1: keypress interrupt
	jmp end_int16
	int $34 # IRQ2: cascade to slave 8259
	jmp end_int16
	int $35 # IRQ3: COM 2
	jmp end_int16
	int $36 # IRQ4: COM 1
	jmp end_int16
	int $37 # IRQ5: LPT 2
	jmp end_int16
	int $38 # IRQ6: floppy
	jmp end_int16
	int $39 # IRQ7: LPT 1
	jmp end_int16
	int $40 # IRQ8: Real-time clock (1khz)
	jmp end_int16
	int $41 # IRQ9
	jmp end_int16
	int $42 # IRQ10
	jmp end_int16
	int $43 # IRQ11
	jmp end_int16
	int $44 # IRQ12: Mouse
	jmp end_int16
	int $45 # IRQ13: Math coprocessor
	jmp end_int16
	int $46 # IRQ14: IDE services
	jmp end_int16
	int $47 # IRQ15: APM suspend
end_int16:
	cli # disable interrupts while restoring protected-mode
end_int16_nocli:
	lgdt gdtreg
	lidt idtreg
	movl %cr0, %eax # load cr0
	orl $1, %eax # enable protection bit
	movl %eax, %cr0 # return to protected-mode
	ljmp $SCODE_SEGSEL, $reentry_eip
The reason for not saving context is that I have no processes/threads yet. I haven't got to the point where processes are created and context switching occurs.
I hope it is a lot clearer now.
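
For readers following the mode switches in the listings, here is a minimal C sketch of the CR0 bit manipulation they rely on, assuming only the standard IA-32 bit positions (PE is bit 0, PG is bit 31). The helper names are hypothetical; real code would read and write CR0 with mov instructions rather than plain integers.

```c
#include <stdint.h>

#define CR0_PE 0x00000001u  /* bit 0: protection enable */
#define CR0_PG 0x80000000u  /* bit 31: paging enable */

/* Nonzero when paging is currently enabled in the given CR0 value. */
static inline int cr0_paging_on(uint32_t cr0)
{
    return (cr0 & CR0_PG) != 0;
}

/* Value to write back to CR0 to turn paging off. */
static inline uint32_t cr0_clear_pg(uint32_t cr0)
{
    return cr0 & ~CR0_PG;
}

/* Value to write back to CR0 to drop to real mode. */
static inline uint32_t cr0_clear_pe(uint32_t cr0)
{
    return cr0 & ~CR0_PE;
}

/* Value to write back to CR0 to re-enter protected mode. */
static inline uint32_t cr0_set_pe(uint32_t cr0)
{
    return cr0 | CR0_PE;
}
```

Note that ~0x7fffffff, the value used for PG_ENABLED in the listing, is the same 32-bit pattern as CR0_PG here.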

Re: i386-specific single-step debugging.

Posted: Mon Jan 31, 2011 12:50 am
by Sergio
When I call sstep, the arguments are already on the stack, and the C function receives them.
As I said earlier, the handler was installed just for testing; I can't go any further until I understand the exception.

Re: i386-specific single-step debugging.

Posted: Mon Jan 31, 2011 4:19 am
by Brendan
Hi,

My apologies - my previous "need a massive code audit" assumption was wrong.

For interrupt handling on 80x86 there are "trap gates" and "interrupt gates" (and "task gates" and other things that don't matter for the purpose of what I'm saying here). For trap gates the CPU leaves the interrupt enable/disable flag as is. For interrupt gates the CPU automatically disables IRQs before starting the interrupt handler and automatically restores the previous state of the interrupt enable/disable flag on IRET. In real mode, all interrupts behave like "interrupt gates". The reason Intel designed it like this is because there's no other sane way to do it. For example, if the CPU didn't have interrupt gates (or if an OS uses trap gates when it should use interrupt gates) it's possible for an IRQ to interrupt an interrupt handler before it has executed its first instruction (and before a CLI instruction can be executed) and therefore it'd be impossible to prevent IRQs in cases where it matters. Of course you also want to use trap gates for cases where you don't need to disable IRQs (and don't need to make IRQ latency worse for no reason). Basically, under no circumstances should any interrupt handler ever need to do a CLI to manually disable interrupts at the beginning of an interrupt handler.
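
As a rough sketch of the difference at the descriptor level, the following C builds 8-byte protected-mode IDT entries. The type/attribute byte 0x8E marks a present, DPL 0, 32-bit interrupt gate and 0x8F the matching trap gate; the struct and function names, and the selector value used in any call, are assumptions for illustration only.

```c
#include <stdint.h>

/* One 8-byte protected-mode IDT gate descriptor (32-bit gates). */
struct idt_gate {
    uint16_t offset_low;   /* handler offset bits 0..15 */
    uint16_t selector;     /* code segment selector */
    uint8_t  reserved;     /* must be zero */
    uint8_t  type_attr;    /* present bit, DPL and gate type */
    uint16_t offset_high;  /* handler offset bits 16..31 */
};

#define GATE_INTR32 0x8Eu  /* interrupt gate: IF is cleared on entry */
#define GATE_TRAP32 0x8Fu  /* trap gate: IF is left unchanged */

/* Build a gate for the given handler address, selector and type. */
static inline struct idt_gate make_gate(uint32_t handler, uint16_t selector,
                                        uint8_t type_attr)
{
    struct idt_gate g;
    g.offset_low  = (uint16_t)(handler & 0xFFFFu);
    g.selector    = selector;
    g.reserved    = 0;
    g.type_attr   = type_attr;
    g.offset_high = (uint16_t)(handler >> 16);
    return g;
}
```

With this layout the only difference between the two gate kinds is the low bit of the type field, which is what decides whether the handler starts with interrupts disabled.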

The STI instruction is one of those instructions that should probably never be used. Consider this code:

Code:

stuff:
    cli
    * do stuff *
    sti
    ret
What happens if other code that needs to have interrupts disabled calls this routine? That's right, the caller gets screwed because this code enables IRQs when they should've stayed disabled. To avoid that you'd want to do this instead:

Code:

stuff:
    pushf
    cli
    * do stuff *
    popf
    ret
In this case, it stores the state of the interrupt enable/disable flag, then disables interrupts, then restores the previous state of the interrupt enable/disable flag. If other code that needs to have interrupts disabled calls this routine, then everything works fine because interrupts wouldn't become enabled.
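
The two patterns can be compared with a small user-space model in C. This is only a simulation: a boolean stands in for the interrupt enable flag, and the function names are invented for the example.

```c
#include <stdbool.h>

/* Simulated interrupt-enable flag; stands in for EFLAGS.IF. */
static bool sim_if = true;

/* Models: cli; do stuff; sti; ret */
static inline void stuff_with_sti(void)
{
    sim_if = false;          /* cli */
    /* ... do stuff ... */
    sim_if = true;           /* sti: unconditionally re-enables */
}

/* Models: pushf; cli; do stuff; popf; ret */
static inline void stuff_with_popf(void)
{
    bool saved = sim_if;     /* pushf */
    sim_if = false;          /* cli */
    /* ... do stuff ... */
    sim_if = saved;          /* popf: restores the caller's state */
}
```

If a caller sets sim_if to false before calling, stuff_with_sti() hands back true while stuff_with_popf() hands back false, which is the behavior the caller wanted.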

Interrupt handlers work the same. The CPU stores the previous value of FLAGS/EFLAGS on the stack before it starts any of your interrupt handler code; and when you do IRET the CPU restores the previous value of FLAGS/EFLAGS from the stack. This means that (for e.g.) if code that must have interrupts disabled causes an exception or uses a software interrupt, then that interrupt/exception handler never needs to screw things up by enabling interrupts.

Ok. Now imagine you are a taxi driver. Every time you pick up a customer the customer tells you where they want to go, and you write their destination on a piece of paper and drive all the way back to your house. As soon as you get back to your house you look at the piece of paper and only then start going to the correct destination. Does that sound like an incredibly stupid idea? I hope so. Now imagine you are a CPU. Every time an interrupt occurs it tells you where you need to go, and you write this down in "handl_arg" and then go to "go16ints:" or "go16excps:"; and when you get there you look at what value you wrote to "handl_arg" and only then go to where you should've gone to start with. Is this any less stupid than the incredibly stupid taxi driver example?
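
A loose C model of the alternative: the vector number itself selects the handler, the way each IDT entry can point straight at its own routine. The handler names, the table size of 48 and the return tags are assumptions made for this sketch, not part of the code in this thread.

```c
#include <stddef.h>

#define NUM_VECTORS 48   /* assumed: 32 exceptions + 16 remapped IRQs */

typedef int (*handler_fn)(void);

/* Hypothetical handlers; each returns a tag so dispatch can be checked. */
static int debug_handler(void)    { return 1; }
static int keyboard_handler(void) { return 33; }
static int default_handler(void)  { return -1; }

/* Model of the IDT: one slot per vector, each pointing at its handler. */
static handler_fn vector_table[NUM_VECTORS];

void init_vector_table(void)
{
    for (size_t i = 0; i < NUM_VECTORS; i++)
        vector_table[i] = default_handler;
    vector_table[1]  = debug_handler;     /* vector 1: debug exception */
    vector_table[33] = keyboard_handler;  /* vector 0x21: IRQ1 after remap */
}

/* Go straight to the handler for this vector; no compare chain needed. */
int dispatch(size_t vector)
{
    return (vector < NUM_VECTORS) ? vector_table[vector]() : -1;
}
```

On real hardware the CPU performs this lookup through the IDT, so no global variable or chain of comparisons is needed to find the right routine.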
Sergio wrote:There are some interrupts that still need to use the BIOS (for example for reading/writing to the disk). In that case, fall back to real-mode and disable paging, and jump to here:
No. The BIOS is worse than camel puke for anything more than getting an OS started. It has 3 basic problems:
  • It's real mode code
  • It's only useful for bad single-tasking code (note: good single-tasking code would provide asynchronous I/O to avoid some of the performance problems caused by single-tasking)
  • It relies on a lot of hardware state (e.g. PIC and APIC configuration, power management controls, etc) that a sane OS would need to change
There's several ways to fix the first problem properly (e.g. virtual8086 mode). You can't fix the other 2 problems.
Sergio wrote:The reason for not saving context is that I have no processes/threads yet. I haven't got to the point where processes are created and context switching occurs.
I hope it is a lot clearer now.

No. Imagine something simple like a "rep stosd" instruction. Now imagine an IRQ that interrupts just before that "rep stosd" instruction. Now imagine a stupid interrupt handler that trashes EAX, ECX, ESI, EDI, DS and/or ES before returning to that "rep stosd" instruction. Now imagine some potentially large area of memory at an unknown address getting overwritten at unpredictable times. Can you see how that might not be good (regardless of whether or not you've got processes/threads)?
Sergio wrote:When I call sstep, the arguments are already in the stack, and received at the C function.
As I said earlier, the handler was installed just for testing; I can't go on doing anymore until I understand the exception.
As soon as your interrupt handlers are fixed (and save registers properly, like they must), the contents of the stack will include all the registers you had to save on the stack, and what you thought were parameters to the C function will be something entirely different. You might be silly enough to adjust the C function to suit, but then you'll change something else and need to re-adjust it again, and again and again and again. Maybe after all that you'll realise that you need to pass parameters to the C function properly to avoid a code maintenance nightmare.
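
One common way to pass parameters properly is to hand the C function a pointer to a structure describing the saved state. The sketch below assumes an assembly stub that does PUSHA on entry and then pushes a pointer to the resulting frame, for an exception with no error code and no privilege change. The names trap_frame and sstep_format are hypothetical, and snprintf stands in for whatever print routine the kernel has.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Saved state as seen from the C side, listed from the lowest address up. */
struct trap_frame {
    /* Pushed by PUSHA (EDI ends up at the lowest address). */
    uint32_t edi, esi, ebp, esp_at_pusha, ebx, edx, ecx, eax;
    /* Pushed by the CPU on entry (no error code for vector 1). */
    uint32_t eip, cs, eflags;
};

/* Hypothetical replacement for sstep(): formats the interrupted EIP and
 * EFLAGS into buf and returns the number of characters written. */
static inline int sstep_format(const struct trap_frame *tf, char *buf, size_t len)
{
    return snprintf(buf, len, "eip=%08lx eflags=%08lx",
                    (unsigned long)tf->eip, (unsigned long)tf->eflags);
}
```

If the stub later saves more state, only the structure definition changes; the C function keeps taking a single pointer.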


Basically, my previous "need a massive code audit" assumption was wrong. You need to delete the entire mess and start again from scratch.


Cheers,

Brendan

Re: i386-specific single-step debugging.

Posted: Mon Jan 31, 2011 12:48 pm
by Sergio
Brendan wrote:For example, if the CPU didn't have interrupt gates (or if an OS uses trap gates when it should use interrupt gates) it's possible for an IRQ to interrupt an interrupt handler before it has executed its first instruction (and before a CLI instruction can be executed) and therefore it'd be impossible to prevent IRQs in cases where it matters.
Interrupt gates, trap gates... fundamental information. Thanks.
Brendan wrote:Basically, under no circumstances should any interrupt handler ever need to do a CLI to manually disable interrupts at the beginning of an interrupt handler.
Understood. :)
Brendan wrote:Is this any less stupid than the incredibly stupid taxi driver example?
No, I think it is just as stupid... When I was starting with the code I had no real interrupt/exception handlers for each case, so I just sent them all to a generic function. The stupidity came about basically because I counted on the set of instructions

Code:

	cli # This must be the first instruction executed
	movb $NUMBER, handl_arg
	jmp go16[excps/ints]
all being the same byte length. Maybe jumping elsewhere would require more bytes and then something would start to fail... Well, maybe it is just an equally stupid excuse, but for now I am ignoring this kind of redundant/unnecessary/stupid coding because I know I'll revisit it later and fix all that stuff.
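
To spell out the byte counting, here is a small C sketch based on the standard IA-32 encodings in 32-bit code: cli is 1 byte, movb with an 8-bit immediate to an absolute 32-bit address is 7 bytes, and jmp is 5 bytes with a 32-bit displacement or 2 bytes with an 8-bit one. The function names are made up for the example, and whether an assembler picks the short jmp for stubs close to the target is an assumption to verify against the actual object code.

```c
#include <stdint.h>

/* Encoded sizes in 32-bit code. */
enum {
    SIZE_CLI        = 1,  /* FA */
    SIZE_MOVB_IMM_M = 7,  /* C6 05 disp32 imm8 */
    SIZE_JMP_REL32  = 5,  /* E9 rel32 */
    SIZE_JMP_REL8   = 2   /* EB rel8 */
};

/* Size of one stub, depending on which jmp encoding was emitted. */
static inline unsigned stub_size(int short_jump)
{
    return SIZE_CLI + SIZE_MOVB_IMM_M +
           (short_jump ? SIZE_JMP_REL8 : SIZE_JMP_REL32);
}

/* Address of the stub at a given index, assuming every stub has the
 * same size (the property that index scaling relies on). */
static inline uint32_t stub_address(uint32_t base, unsigned index, unsigned size)
{
    return base + index * size;
}
```

A stub with the long jmp comes to 13 bytes and one with the short jmp to 10, so scaling by a fixed stub size only works if every stub really uses the same jmp encoding.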
Brendan wrote:No. Imagine something simple like a "rep stosd" instruction. Now imagine an IRQ that interrupts just before that "rep stosd" instruction. Now imagine a stupid interrupt handler that trashes EAX, ECX, ESI, EDI, DS and/or ES before returning to that "rep stosd" instruction. Now imagine some potentially large area of memory at an unknown address getting overwritten at unpredictable times. Can you see how that might not be good (regardless of whether or not you've got processes/threads)?
Yes. As a matter of fact, after posting my replies I gave it a second thought and I totally agree. My context-saving mechanism is quite simple (push everything to the stack), so maybe a simple pusha at the beginning of the handler might do it. There is some information I save, but only when switching to real mode, and quite late, when a lot of registers are already trashed and, worse, I don't even save the general-purpose registers!
Brendan wrote:As soon as your interrupt handlers are fixed (and save registers properly, like they must), the contents of the stack will include all the registers you had to save on the stack, and what you thought were parameters to the C function will be something entirely different. You might be silly enough to adjust the C function to suit, but then you'll change something else and need to re-adjust it again, and again and again and again. Maybe after all that you'll realise that you need to pass parameters to the C function properly to avoid a code maintenance nightmare.
Yes. The context saving indeed implies a new mechanism for passing the arguments to the C function.
Brendan wrote:Basically, my previous "need a massive code audit" assumption was wrong. You need to delete the entire mess and start again from scratch.
Only here I do not agree. I think that the mess was due to some bad assumptions I made. For example, it would be trivial to make the taxi driver do his job and stop being stupid. Next, it would be equally trivial to push the entire context to the stack; if a particular handler needs to pass the values pushed to the stack by the CPU, just pop them to memory, push the context and push them again, call C and then pop the context before returning.
Although my mistakes are pretty much fundamental, I think the loop for constructing the interrupt vector, which is this one:

Code:

setidt:
	movw $NUM_HANDLS, %cx # loop number of handlers times
	movw $NUM_EXCNS, %dx # loop number of exception handlers times
	movw $handls_ctrl, %si # control string
	movw $MEM_IDT, %di # IDT pointer
	xorw %bx, %bx # scale control register
next_handler:
	testw %dx, %dx # end of exceptions?
	jnz 2f # no
	movw end_handls, %ax # load end of handlers bool variable
	testw %ax, %ax # end of handlers?
	jnz end_setidt # yes
	incw %ax
	movw %ax, end_handls
	xorw %bx, %bx # reset scale control
	addw $8, %si # now point to interrupts
	movw $NUM_INTS, %dx # loop number of interrupt handlers times
2:
	movw (%si), %ax # load handler
	pushw %bx # save scale factor
	pushw %ax # save it
	movw $13, %ax # handler size
	mulw %bx # scale
	xchgw %bx, %ax # exchange values
	popw %ax # restore handler
	addw %bx, %ax # add offset
	popw %bx # restore scale factor
	movw %ax, (%di) # copy handler to IDT
	movw 2(%si), %ax # load segment selector
	movw %ax, 2(%di) # copy it to IDT
	movw 4(%si), %ax # load control information
	movw %ax, 4(%di) # copy it to IDT
	movw 6(%si), %ax # load offset continuation
	movw %ax, 6(%di) # copy it to IDT
	addw $8, %di # point to next entry in IDT
	decw %dx # one exception/interrupt less
	incw %bx # one scale factor more
	loop next_handler
end_setidt:
	ret
Here is the control string:

Code:

handls_ctrl:
	.word exptn_handls
	.word SCODE_SEGSEL
	.byte 0x00, 0x8e, 0x00, 0x00
	.word int_handls
	.word SCODE_SEGSEL
	.byte 0x00, 0x8e, 0x00, 0x00
It would be totally unnecessary to rethink it.
Maybe just updating exptn_handls/int_handls and fixing the context-saving issue would make the code less messy?

By the way, this table is constructed very early, so it is a temporary solution. I will do something a lot more flexible right after being able to not depend on the BIOS in any aspect.

Anyway, thanks for the i386 trap/interrupt gate lesson.

Re: i386-specific single-step debugging.

Posted: Tue Feb 08, 2011 10:18 pm
by Sergio
It has been a long week.
I finally managed to single-step through instructions very early during initialization.
I rewrote part of the handling mechanism; it currently uses the heap to allocate space for saving registers. I also had to rewrite a lot of the terminal handler because it was far too complex and had a lot of unnecessary stuff.
Here are a couple of pics (sorry about the mess!)
Image
Image

Although I didn't receive any DIRECT help with this here, I appreciate the comments that helped me rethink the interrupt/exception handling strategy. Thanks.