Stuck with stack-segment fault on interrupt in ring 3

jbreu
Posts: 2
Joined: Mon Jan 01, 2024 9:27 am

Post by jbreu »

Hello,

Long-time lurker here. I am working on an x86-64 long-mode OS and have been stuck for a week trying to get interrupts to work while in user land. Interrupts work fine in kernel land; the only things I added were the switch to user land and a rudimentary syscall mechanism (the latter commented out for now). At the moment, the user-land function consists only of an infinite loop.

To keep everything simple for now, I don't have multiple tasks yet. I just switch the whole OS to user land, which seems to work, judging by the CPL=3 in the log when the interrupt (keyboard activity) happens.

Code:

Servicing hardware INT=0x21
     1: v=21 e=0000 i=0 cpl=3 IP=0013:000000000010f411 pc=000000000010f411 SP=000b:000000000014cf80 env->regs[R_EAX]=2000002000000005
RAX=2000002000000005 RBX=0000000000000000 RCX=0000000000111750 RDX=0000000000000000
R12=0000000000000000 R13=0000000000000000 R14=0000000000000000 R15=0000000000000000
RIP=000000000010f411 RFL=00000202 [-------] CPL=3 II=0 A20=1 SMM=0 HLT=0
ES =0010 0000000000000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
CS =0013 0000000000000000 ffffffff 00a0fb00 DPL=3 CS64 [-RA]
SS =000b 0000000000000000 ffffffff 00c0f300 DPL=3 DS   [-WA]
DS =0010 0000000000000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
FS =0010 0000000000000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
GS =0010 0000000000000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
LDT=0000 0000000000000000 0000ffff 00008200 DPL=0 LDT
TR =0000 0000000000000000 0000ffff 00008b00 DPL=0 TSS64-busy
GDT=     000000000014e010 0000003b
IDT=     000000000014d010 00007fff
CR0=80000011 CR2=0000000000000000 CR3=0000000000146000 CR4=00000020
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
CCS=0000000000000000 CCD=0000000000000501 CCO=EFLAGS
EFER=0000000000000501
However, immediately after, I get a triple fault:

Code:

check_exception old: 0xffffffff new 0xc
     2: v=0c e=0000 i=0 cpl=3 IP=0013:000000000010f411 pc=000000000010f411 SP=000b:000000000014cf80 env->regs[R_EAX]=2000002000000005
RAX=2000002000000005 RBX=0000000000000000 RCX=0000000000111750 RDX=0000000000000000
RSI=0000000000142b28 RDI=000000000000000a RBP=0000000000000000 RSP=000000000014cf80
R8 =0000000000000000 R9 =0000000000000001 R10=0000000000000030 R11=0000000000000202
R12=0000000000000000 R13=0000000000000000 R14=0000000000000000 R15=0000000000000000
RIP=000000000010f411 RFL=00000202 [-------] CPL=3 II=0 A20=1 SMM=0 HLT=0
ES =0010 0000000000000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
CS =0013 0000000000000000 ffffffff 00a0fb00 DPL=3 CS64 [-RA]
SS =000b 0000000000000000 ffffffff 00c0f300 DPL=3 DS   [-WA]
DS =0010 0000000000000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
FS =0010 0000000000000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
GS =0010 0000000000000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
LDT=0000 0000000000000000 0000ffff 00008200 DPL=0 LDT
TR =0000 0000000000000000 0000ffff 00008b00 DPL=0 TSS64-busy
GDT=     000000000014e010 0000003b
IDT=     000000000014d010 00007fff
CR0=80000011 CR2=0000000000000000 CR3=0000000000146000 CR4=00000020
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
CCS=0000000000000000 CCD=0000000000000501 CCO=EFLAGS
EFER=0000000000000501
check_exception old: 0xc new 0xc
     3: v=08 e=0000 i=0 cpl=3 IP=0013:000000000010f411 pc=000000000010f411 SP=000b:000000000014cf80 env->regs[R_EAX]=2000002000000005
RAX=2000002000000005 RBX=0000000000000000 RCX=0000000000111750 RDX=0000000000000000
RSI=0000000000142b28 RDI=000000000000000a RBP=0000000000000000 RSP=000000000014cf80
R8 =0000000000000000 R9 =0000000000000001 R10=0000000000000030 R11=0000000000000202
R12=0000000000000000 R13=0000000000000000 R14=0000000000000000 R15=0000000000000000
RIP=000000000010f411 RFL=00000202 [-------] CPL=3 II=0 A20=1 SMM=0 HLT=0
ES =0010 0000000000000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
CS =0013 0000000000000000 ffffffff 00a0fb00 DPL=3 CS64 [-RA]
SS =000b 0000000000000000 ffffffff 00c0f300 DPL=3 DS   [-WA]
DS =0010 0000000000000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
FS =0010 0000000000000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
GS =0010 0000000000000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
LDT=0000 0000000000000000 0000ffff 00008200 DPL=0 LDT
TR =0000 0000000000000000 0000ffff 00008b00 DPL=0 TSS64-busy
GDT=     000000000014e010 0000003b
IDT=     000000000014d010 00007fff
CR0=80000011 CR2=0000000000000000 CR3=0000000000146000 CR4=00000020
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
CCS=0000000000000000 CCD=0000000000000501 CCO=EFLAGS
EFER=0000000000000501
check_exception old: 0x8 new 0xc
Triple fault
I have set up a TSS with RSP0 hardcoded to the value the stack pointer has just before the switch to user land (the stack pointer happens not to change afterwards in user land):

Code:

static TSS_ENTRY: Tss = Tss {
    reserved1: 0x0,
    rsp0: 0x14cf80, // TODO: replace with a derived value, don't use a static value
    rsp1: 0x0,
    rsp2: 0x0,
    reserved2: 0x0,
    ist1: 0x0,
    ist2: 0x0,
    ist3: 0x0,
    ist4: 0x0,
    ist5: 0x0,
    ist6: 0x0,
    ist7: 0x0,
    reserved3: 0x0,
    reserved4: 0x0,
    iopb: 0x0,
};
At this point, after double-checking all flags, bits and so on, I have no idea what else to check. I would be grateful for any hints about what could be going wrong here.

My expectation is that when the interrupt fires, the CPU first loads tss.rsp0 as the new RSP, switches to ring 0 and then proceeds with interrupt handling as before. But it doesn't even enter the interrupt handler; it fails before that.
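For reference, here is a minimal sketch of the structure that expectation relies on. The field names follow the initializer above; the field widths are an assumption taken from the architectural 64-bit TSS layout (104 bytes), since the actual `Tss` definition is not shown in this post. `KernelStack`, `stack_top` and `tss_for` are hypothetical helpers that illustrate one way to address the TODO by deriving rsp0 from a dedicated stack instead of hardcoding it.

```rust
/// 64-bit TSS. Field names follow the initializer in the post; the
/// widths are assumed from the architectural layout (104 bytes total).
#[repr(C, packed)]
pub struct Tss {
    pub reserved1: u32,
    pub rsp0: u64, // stack pointer loaded on a ring 3 -> ring 0 transition
    pub rsp1: u64,
    pub rsp2: u64,
    pub reserved2: u64,
    pub ist1: u64,
    pub ist2: u64,
    pub ist3: u64,
    pub ist4: u64,
    pub ist5: u64,
    pub ist6: u64,
    pub ist7: u64,
    pub reserved3: u64,
    pub reserved4: u16,
    pub iopb: u16,
}

/// Hypothetical dedicated kernel stack, 16-byte aligned.
#[repr(C, align(16))]
pub struct KernelStack(pub [u8; 4096]);

/// The stack grows downwards, so rsp0 should hold the address just past
/// the end of the stack memory.
pub fn stack_top(stack: &KernelStack) -> u64 {
    stack.0.as_ptr() as u64 + stack.0.len() as u64
}

/// Builds a TSS whose rsp0 is derived from `stack`; all other fields
/// are zero, as in the post.
pub fn tss_for(stack: &KernelStack) -> Tss {
    Tss {
        reserved1: 0,
        rsp0: stack_top(stack),
        rsp1: 0,
        rsp2: 0,
        reserved2: 0,
        ist1: 0,
        ist2: 0,
        ist3: 0,
        ist4: 0,
        ist5: 0,
        ist6: 0,
        ist7: 0,
        reserved3: 0,
        reserved4: 0,
        iopb: 0,
    }
}
```

A quick sanity check for a definition like this is that `core::mem::size_of::<Tss>()` evaluates to 104; any other value suggests a wrong field width or missing `packed`.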

You can find the current code (Rust) here: https://github.com/jbreu/os-series/tree/userland
Octocontrabass
Member
Posts: 5560
Joined: Mon Mar 25, 2013 7:01 pm

Post by Octocontrabass »

jbreu wrote:

Code:

TR =0000 0000000000000000 0000ffff 00008b00 DPL=0 TSS64-busy
I have set up a TSS
According to the log you posted, the TSS isn't set up. But maybe you already fixed it?
jbreu
Posts: 2
Joined: Mon Jan 01, 2024 9:27 am

Post by jbreu »

Octocontrabass wrote:
jbreu wrote:

Code:

TR =0000 0000000000000000 0000ffff 00008b00 DPL=0 TSS64-busy
I have set up a TSS
According to the log you posted, the TSS isn't set up. But maybe you already fixed it?
Yes, you are correct; in the meantime I have got it working. IIRC, two essential things were missing:
  • The TSS was not loaded via LTR. My earlier, wrong understanding was that it is loaded as part of LGDT (because the TSS descriptor is part of the GDT). Somehow I missed all the LTR references in the wiki pages. When I found the solution, I suddenly saw all of them ;)
  • Only the first half of the TSS descriptor was defined in the GDT; the second (empty) part was missing: https://github.com/jbreu/os-series/comm ... 3e68c1R115
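
The two fixes above can be sketched roughly as follows. This is not the code from the linked commit; `tss_descriptor` and `load_task_register` are hypothetical names, and the selector passed to LTR depends on where the descriptor sits in your own GDT. In long mode a TSS descriptor occupies two consecutive 8-byte GDT slots, and the task register has to be loaded explicitly after LGDT.

```rust
/// Encodes a 64-bit TSS descriptor as the two consecutive 8-byte GDT
/// entries it occupies. `limit` is the TSS size in bytes minus one
/// (0x67 for a plain 104-byte TSS).
pub fn tss_descriptor(base: u64, limit: u32) -> (u64, u64) {
    let mut low: u64 = 0;
    low |= (limit as u64) & 0xFFFF; // limit[15:0]
    low |= (base & 0xFF_FFFF) << 16; // base[23:0]
    low |= 0x89u64 << 40; // present, DPL=0, type 0x9 (available 64-bit TSS)
    low |= ((limit as u64 >> 16) & 0xF) << 48; // limit[19:16]
    low |= ((base >> 24) & 0xFF) << 56; // base[31:24]
    let high = base >> 32; // base[63:32]; the upper 32 bits stay zero
    (low, high)
}

/// Loads the task register with the GDT selector of the TSS descriptor.
/// Must run in ring 0 after LGDT; executing it anywhere else faults.
#[cfg(target_arch = "x86_64")]
pub unsafe fn load_task_register(selector: u16) {
    unsafe {
        core::arch::asm!(
            "ltr {0:x}",
            in(reg) selector,
            options(nostack, preserves_flags)
        );
    }
}
```

For a TSS at a hypothetical address 0x14f000, `tss_descriptor(0x14f000, 0x67)` yields the pair (0x0000_8914_F000_0067, 0). Both values go into the GDT back to back, and the byte offset of the first one is the selector to pass to `load_task_register`.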
Octocontrabass
Member
Posts: 5560
Joined: Mon Mar 25, 2013 7:01 pm

Post by Octocontrabass »

jbreu wrote:second (empty) part was missing
The second part isn't empty, it contains the upper 32 bits of the 64-bit base address. That'll be important if you ever need to move the TSS to a higher address (for example, if you make a higher-half kernel).
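
To illustrate, here is a small sketch of how the base address is spread across the two 8-byte halves of the descriptor. `tss_base` is a hypothetical helper and the addresses are made-up examples, not values from your kernel.

```rust
/// Reassembles the 64-bit TSS base address from the two 8-byte halves
/// of a long-mode TSS descriptor.
pub fn tss_base(low: u64, high: u64) -> u64 {
    let base_23_0 = (low >> 16) & 0xFF_FFFF; // bits 16..39 of the first half
    let base_31_24 = (low >> 56) & 0xFF; // bits 56..63 of the first half
    let base_63_32 = high & 0xFFFF_FFFF; // low 32 bits of the second half
    base_23_0 | (base_31_24 << 24) | (base_63_32 << 32)
}
```

With a first half of 0x8000_8914_F000_0067, a second half of 0xFFFF_FFFF gives the higher-half base 0xFFFF_FFFF_8014_F000, while a zeroed second half gives 0x8014_F000, which is a different address entirely.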