stack operations after sti

Posted: Mon Apr 10, 2017 12:32 am
by Js2xxx
I get a #DF in Bochs whenever my code pushes or pops after sti: any stack operation after sti raises a #DF.
However, there's no exception in VMware. It's confusing.

Is it a bug in Bochs, or is it my code's fault?

(My OS runs in 64-bit mode with paging enabled (not identity-mapped). The canonical linear address of the stack is properly mapped to present memory.)
(I checked, and the first exception does not appear to be #GP or #SS.)
(I've enabled the APIC LVT timer.)

Re: stack operations after sti

Posted: Mon Apr 10, 2017 12:46 am
by iansjack
It is your code's fault (not that I have seen your code). Such a catastrophic bug in Bochs would have been found long ago.

I would imagine that there is a pending interrupt which is triggered when you enable interrupts, and some fault in your interrupt-handling code. An invalid stack?

Re: stack operations after sti

Posted: Mon Apr 10, 2017 12:58 am
by alexfru
Are you sure you aren't getting an interrupt on vector 8 (this would be timer/IRQ0 by default / until you reconfigure interrupts)? Your dump says EXT=1, implying an external source for the event. That would make complete sense right after enabling interrupts with the STI instruction.
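
To make the vector 8 collision concrete: with the BIOS-default programming, the master 8259 delivers IRQ0..7 on vectors 0x08..0x0F and the slave delivers IRQ8..15 on vectors 0x70..0x77, so an unmasked timer tick (IRQ0) arrives on the same vector as #DF. Below is a minimal C sketch of that mapping. The function name `pic_irq_to_vector` and the macro names are illustrative assumptions, not something from the posts in this thread.

```c
#include <stdint.h>

/* BIOS-default vector bases of the two legacy 8259 PICs. */
#define PIC_MASTER_DEFAULT_BASE 0x08u  /* IRQ0..7  -> vectors 0x08..0x0F */
#define PIC_SLAVE_DEFAULT_BASE  0x70u  /* IRQ8..15 -> vectors 0x70..0x77 */
#define VECTOR_DOUBLE_FAULT     0x08u  /* #DF is CPU exception 8 */

/* Hypothetical helper: the vector on which legacy IRQ 0..15 is delivered,
 * given the vector bases programmed into the master and slave PIC. */
uint8_t pic_irq_to_vector(uint8_t irq, uint8_t master_base, uint8_t slave_base)
{
    return (irq < 8) ? (uint8_t)(master_base + irq)
                     : (uint8_t)(slave_base + (irq - 8));
}
```

With the default bases, `pic_irq_to_vector(0, PIC_MASTER_DEFAULT_BASE, PIC_SLAVE_DEFAULT_BASE)` equals `VECTOR_DOUBLE_FAULT`, which is why a stray IRQ0 can look like a double fault.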

Re: stack operations after sti

Posted: Mon Apr 10, 2017 1:02 am
by Js2xxx
iansjack wrote:(not that I have seen your code).
So here's my code:

Code:

	mov	ecx, IA32_X2APIC_LVT_LINT0
	rdmsr
	bts	eax, 16
	wrmsr                       ; disable 8259 PIC
	mov	rax, 0
	mov	cr8, rax             ; set TPR = 0
	call	EnableTimer      ; enable timer

	mov	rdi, PreparLock
	call	LeaveSpinLock   ; leave spinlock

	sti
	jmp	kernel_main
kernel_main() is a C function. The #DF occurs at the "push rbp" that GCC generates at the start of it.

Re: stack operations after sti

Posted: Mon Apr 10, 2017 1:05 am
by Js2xxx
alexfru wrote:timer/IRQ0 by default
Well, I use the APIC LVT timer (interrupt vector 0x20), not the 8259 PIC. For external interrupts I use the I/O APIC, and I've masked all of them (interrupt vectors 0x21 to 0x37).

And here's my code for I/O APIC:

Code:

SegSelector sel;
sel.RPL = 0;
sel.TI = 0;
sel.Index = 4;
SetIntGate(0x21, KeyboardHandler, sel, INT_GATE | DESC_P, 0);  // For example, the keyboard handler (the others are set up the same way)
...
SetIOAPICVector(0x12, 0x21);
...

Re: stack operations after sti

Posted: Mon Apr 10, 2017 1:16 am
by alexfru
Js2xxx wrote:
alexfru wrote:timer/IRQ0 by default
Well, I use the APIC LVT timer (interrupt vector 0x20), not the 8259 PIC. For external interrupts I use the I/O APIC, and I've masked all of them (interrupt vectors 0x21 to 0x37).
It would be interesting to see the PIC state (pending and in service flags for IRQ0) and/or the code that disables the PIC interrupts. It could be a different kind of bug, though.

Re: stack operations after sti

Posted: Mon Apr 10, 2017 1:22 am
by Js2xxx
alexfru wrote:see the code that disables the PIC interrupts.
So here's the code:

Code:

	mov	ecx, IA32_X2APIC_LVT_LINT0 ;0x835
	rdmsr
	bts	eax, 16
	wrmsr
I mask LINT0 so that my kernel is disconnected from the 8259 PIC.

Re: stack operations after sti

Posted: Mon Apr 10, 2017 1:37 am
by Brendan
Hi,
Js2xxx wrote:
alexfru wrote:see the code that disables the PIC interrupts.
So here's the code:

Code:

	mov	ecx, IA32_X2APIC_LVT_LINT0 ;0x835
	rdmsr
	bts	eax, 16
	wrmsr
I mask LINT0 so that my kernel is disconnected from the 8259 PIC.
This is completely broken - there's no guarantee that the PIC chip/s are connected to LINT0. For example, the PIC chips can be connected to an IO APIC input (configured as "extInt").


Cheers,

Brendan

Re: stack operations after sti

Posted: Mon Apr 10, 2017 1:47 am
by Js2xxx
Brendan wrote: This is completely broken - there's no guarantee that the PIC chip/s are connected to LINT0. For example, the PIC chips can be connected to an IO APIC input (configured as "extInt").
So I added this to that code:

Code:

	mov	rax, 10h
	mov	ebx, 0FEC00000h
	mov	[ebx], eax
	mov	rax, 0
	bts	rax, 16
	mov	ebx, 0FEC00010h
	mov	[ebx], eax
However, it makes little sense.
Thanks for any advice.

Re: stack operations after sti

Posted: Mon Apr 10, 2017 1:59 am
by Brendan
Js2xxx wrote:
Brendan wrote: This is completely broken - there's no guarantee that the PIC chip/s are connected to LINT0. For example, the PIC chips can be connected to an IO APIC input (configured as "extInt").
So I added this to that code:

Code:

	mov	rax, 10h
	mov	ebx, 0FEC00000h
	mov	[ebx], eax
	mov	rax, 0
	bts	rax, 16
	mov	ebx, 0FEC00010h
	mov	[ebx], eax
However, it makes little sense.
This would make a little more sense:

Code:

	mov	dword [0FEC00000h], 10h
	mov	dword [0FEC00010h], (1 << 16)
However, there is no guarantee that the PIC chips are connected to "IO APIC input #0" (and not another IO APIC input), no guarantee that there aren't multiple separate IO APICs, and no guarantee that any of the IO APICs that exist are at that specific address.
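
For readers unfamiliar with what the two stores above do: the IO APIC is accessed indirectly, by writing a register index to IOREGSEL (offset 0x00) and then reading or writing the data through IOWIN (offset 0x10). Index 0x10 is the low dword of redirection entry 0, and bit 16 of that dword is the mask bit. Below is a rough C sketch of the same access pattern with the base address and input number as parameters, since neither can be assumed. All function and macro names are hypothetical, and `base` is assumed to be a valid mapping of an IO APIC whose address was discovered from firmware tables such as the ACPI MADT.

```c
#include <stdint.h>

#define IOAPIC_IOREGSEL 0x00u        /* byte offset: register index window */
#define IOAPIC_IOWIN    0x10u        /* byte offset: register data window */
#define IOAPIC_REDTBL   0x10u        /* index of redirection entry 0, low dword */
#define IOAPIC_MASKED   (1u << 16)   /* mask bit in the low dword of an entry */

/* Read one 32-bit IO APIC register through the index/data window. */
uint32_t ioapic_read(volatile uint32_t *base, uint8_t reg)
{
    base[IOAPIC_IOREGSEL / 4] = reg;
    return base[IOAPIC_IOWIN / 4];
}

/* Write one 32-bit IO APIC register through the index/data window. */
void ioapic_write(volatile uint32_t *base, uint8_t reg, uint32_t value)
{
    base[IOAPIC_IOREGSEL / 4] = reg;
    base[IOAPIC_IOWIN / 4] = value;
}

/* Set the mask bit of redirection entry `input`, preserving its other bits.
 * Each entry occupies two consecutive 32-bit registers. */
void ioapic_mask_input(volatile uint32_t *base, uint8_t input)
{
    uint8_t reg = (uint8_t)(IOAPIC_REDTBL + 2u * input);
    ioapic_write(base, reg, ioapic_read(base, reg) | IOAPIC_MASKED);
}
```

Unlike the two-instruction version, this uses a read-modify-write so that the vector and delivery mode already programmed in the entry are kept; whether that is what you want depends on how the entry was set up.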

My advice is:
  • Mask the PIC's IRQs in the PIC chips themselves
  • Do "STI" to allow BIOS/firmware/whatever to handle any IRQs that were buffered/postponed before you masked them
  • Don't bother using CLI after this because there is no point (all IRQs masked)
  • Configure the PICs regardless of whether you use them or not, to ensure that spurious IRQs don't cause "interrupt 0x0F" (or "interrupt 0x77") but use something nicer instead (like "interrupt 0x37" and "interrupt 0x3F").

Cheers,

Brendan
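
The PIC-related steps of the advice above (configure the PICs with nicer vector bases, then mask every IRQ in the chips themselves) can be sketched in C roughly as follows. This is only an illustration under stated assumptions: it uses the conventional 8259 initialisation sequence (ICW1 to ICW4 followed by an OCW1 mask write), it builds the list of port writes instead of performing them, and the names `port_write`, `PIC_SEQ_LEN` and `pic_build_init_sequence` are hypothetical. The kernel is assumed to replay the list with its own `out` wrapper, and the STI step from the list is not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* One byte-wide I/O port write: out(port, value). */
struct port_write {
    uint16_t port;
    uint8_t  value;
};

#define PIC_SEQ_LEN 10

/* Fill `seq` with the writes that re-initialise both 8259 PICs with the
 * given vector bases (multiples of 8) and then mask all 16 IRQs.
 * Returns the number of entries written. */
size_t pic_build_init_sequence(uint8_t master_base, uint8_t slave_base,
                               struct port_write seq[PIC_SEQ_LEN])
{
    const struct port_write tmpl[PIC_SEQ_LEN] = {
        { 0x20, 0x11 },         /* master ICW1: begin init, ICW4 will follow */
        { 0xA0, 0x11 },         /* slave  ICW1 */
        { 0x21, master_base },  /* master ICW2: vector base for IRQ0..7 */
        { 0xA1, slave_base },   /* slave  ICW2: vector base for IRQ8..15 */
        { 0x21, 0x04 },         /* master ICW3: slave attached to IRQ2 */
        { 0xA1, 0x02 },         /* slave  ICW3: cascade identity 2 */
        { 0x21, 0x01 },         /* master ICW4: 8086 mode */
        { 0xA1, 0x01 },         /* slave  ICW4: 8086 mode */
        { 0x21, 0xFF },         /* master OCW1: mask IRQ0..7 */
        { 0xA1, 0xFF },         /* slave  OCW1: mask IRQ8..15 */
    };
    for (size_t i = 0; i < PIC_SEQ_LEN; i++)
        seq[i] = tmpl[i];
    return PIC_SEQ_LEN;
}
```

Calling it with bases 0x30 and 0x38 gives the layout in which spurious IRQ7 and IRQ15 land on vectors 0x37 and 0x3F, matching the example vectors mentioned in the list.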

Re: stack operations after sti

Posted: Mon Apr 10, 2017 2:33 am
by Js2xxx
Brendan wrote:

Code:

	mov	dword [0FEC00000h], 10h
	mov	dword [0FEC00010h], (1 << 16)
Your advice helped. However, the address was sign-extended, which caused a #PF.
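
The sign extension mentioned here comes from how 64-bit mode forms addresses: an absolute memory operand is encoded as a 32-bit displacement, and the CPU sign-extends that displacement to 64 bits. Because bit 31 of 0FEC00000h is set, the access goes to 0xFFFFFFFFFEC00000 rather than 0x00000000FEC00000. Loading the value into a 32-bit register first zero-extends it instead. The small C model below only illustrates the arithmetic; both function names are made up for this sketch, and it assumes the assembler emitted the plain 32-bit displacement form.

```c
#include <stdint.h>

/* Address formed from a 32-bit displacement in 64-bit mode: the CPU
 * sign-extends it. The cast through int32_t relies on the usual
 * two's-complement conversion that mainstream compilers implement. */
uint64_t disp32_effective_address(uint32_t disp32)
{
    return (uint64_t)(int64_t)(int32_t)disp32;
}

/* Address formed when the same value is first written to a 32-bit
 * register: the upper 32 bits of the 64-bit register are cleared. */
uint64_t zero_extended_address(uint32_t addr32)
{
    return (uint64_t)addr32;
}
```

For 0xFEC00000 the first function yields 0xFFFFFFFFFEC00000 and the second yields 0x00000000FEC00000. This suggests going through a register (or a 64-bit absolute address) for MMIO above 2 GiB, provided that address is actually mapped in the page tables.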