Weird errors trying to get back from user -> kernel mode

Post by midir »

Hi!

I've had kernel-mode multithreading working for a long time, but now I'm trying to get user mode working. I've finally figured out how to set up the stack in the task switch so that I can iret into user mode and run code there successfully. However, I have to keep interrupts disabled, because the first interrupt that tries to get back into kernel mode sends the machine absolutely haywire.

First, here's the layout of my GDT:

Code: Select all

static seg_descriptor gdt[] __attribute__ ((aligned (8))) =
{
	{ /* Null descriptor */
		0, 0, 0, 0, 0, 0, 0,
	},
	
	{ /* Kernel code descriptor */
		0xFFFF,
		0, 0,
		SEG_EXECUTABLE | CSEG_READ_ENABLE | DESCR_PRESENT | DESCR_DATA_CODE_SEG | DESCR_PL_0,
		0xF,
		SEG_FLAGS_SIZE | SEG_FLAGS_GRANULARITY,
		0,
	},
	
	{ /* Kernel data descriptor */
		0xFFFF,
		0, 0,
		DSEG_WRITE_ENABLE | DESCR_PRESENT | DESCR_DATA_CODE_SEG | DESCR_PL_0,
		0xF,
		SEG_FLAGS_SIZE | SEG_FLAGS_GRANULARITY,
		0,
	},
	
	{ /* User code descriptor */
		0xFFFF,
		0, 0,
		SEG_EXECUTABLE | CSEG_READ_ENABLE | DESCR_PRESENT | DESCR_DATA_CODE_SEG | DESCR_PL_3,
		0xF,
		SEG_FLAGS_SIZE | SEG_FLAGS_GRANULARITY,
		0,
	},
	
	{ /* User data descriptor */
		0xFFFF,
		0, 0,
		DSEG_WRITE_ENABLE | DESCR_PRESENT | DESCR_DATA_CODE_SEG | DESCR_PL_3,
		0xF,
		SEG_FLAGS_SIZE | SEG_FLAGS_GRANULARITY,
		0,
	},
	
	{ /* TSS for user to kernel mode transitions */
		0x67,
		(((uint32)&tss0) & 0x0000FFFF) >> 0,
		(((uint32)&tss0) & 0x00FF0000) >> 16,
		DESCR_PRESENT | DESCR_TSS,
		0,
		0,
		(((uint32)&tss0) & 0xFF000000) >> 24,
	},
};
And my IDT is initialized as follows:

Code: Select all

for (uint i = 0; i < num_idt_entries; i++) {
	if (int_handlers[i]) {
		idt[i].seg_selector = KERNEL_MODE_CODE_SEGMENT;
		idt[i].offset_low = LOW_WORD(int_handlers[i]);
		idt[i].offset_high = HIGH_WORD(int_handlers[i]);
		idt[i].type = IDESCR_INT_GATE | IDESCR_X1 | IDESCR_32_BIT | DESCR_PL_3 | DESCR_PRESENT;
	}
}
(KERNEL_MODE_CODE_SEGMENT is 8.)
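
For anyone following along: a selector value such as 8 is not an address. The low two bits are the requested privilege level, bit 2 picks the GDT or the LDT, and the remaining bits are the table index. Here is a minimal sketch of that decoding, in C++ to match the kernel. `SelectorParts`, `decode_selector`, and `make_selector` are hypothetical names for illustration and are not part of my kernel:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type (illustration only).
struct SelectorParts {
    std::uint16_t index; // entry number within the descriptor table
    bool          ldt;   // table indicator: false = GDT, true = LDT
    std::uint8_t  rpl;   // requested privilege level, 0..3
};

// Split a 16-bit segment selector into its three fields.
inline SelectorParts decode_selector(std::uint16_t sel) {
    SelectorParts p;
    p.index = static_cast<std::uint16_t>(sel >> 3);
    p.ldt   = (sel & 0x4) != 0;
    p.rpl   = static_cast<std::uint8_t>(sel & 0x3);
    return p;
}

// Build a selector from its fields (the inverse of decode_selector).
inline std::uint16_t make_selector(std::uint16_t index, bool ldt, std::uint8_t rpl) {
    assert(index < 8192 && rpl <= 3); // 13-bit index, 2-bit RPL
    return static_cast<std::uint16_t>((index << 3) | (ldt ? 0x4 : 0x0) | rpl);
}
```

With this, 8 decodes to GDT entry 1 at RPL 0 (the kernel code descriptor above), while the 0x1b and 0x23 in the Bochs register dump decode to GDT entries 3 and 4 at RPL 3.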

This is what I get in the Bochs debug log as soon as I try to transition back to kernel mode:

Code: Select all

00045057213i[CPU0 ] [45057213] Stopped on MAGIC BREAKPOINT
00045057233d[CPU0 ] page walk for address 0x00000000f00006cd
00045057245d[CPU0 ] interrupt(): vector = 20, TYPE = 0, EXT = 1 ; this is the instruction that should send us back
00045057245d[CPU0 ] interrupt(): INTERRUPT TO INNER PRIVILEGE ; yep
00045057245d[CPU0 ] page walk for address 0x0000000000000008 ; what? 8? there's nothing down there
00045057245d[CPU0 ] page walk for address 0x0000000000001230 ; ...except for the 0x1234's I filled low mem with
00045057245e[CPU0 ] interrupt(): SS is not writable data segment ; no kidding you stupid machine -- you read the wrong bit of memory
00045057245d[CPU0 ] exception(0x0a): error_code=1234 ; "invalid TSS" -- because of a dodgy segment selector pulled from the wrong place! (?)
00045057245d[CPU0 ] interrupt(): vector = 0a, TYPE = 3, EXT = 1
00045057245d[CPU0 ] page walk for address 0x00000000f00006b0 ; yes, THAT'S byte offset 8 into the global descriptor table!
00045057245d[CPU0 ] interrupt(): INTERRUPT TO INNER PRIVILEGE
00045057245d[CPU0 ] page walk for address 0x0000000000000008 ; argh no what are you doing down there again!?
00045057245e[CPU0 ] interrupt(): SS is not writable data segment ; you moronic machine
00045057245d[CPU0 ] exception(0x0a): error_code=1234
00045057245d[CPU0 ] exception(0x08): error_code=0000
00045057245d[CPU0 ] interrupt(): vector = 08, TYPE = 3, EXT = 1
00045057245d[CPU0 ] page walk for address 0x00000000f00006b0
00045057245d[CPU0 ] interrupt(): INTERRUPT TO INNER PRIVILEGE
00045057245d[CPU0 ] page walk for address 0x0000000000000008 ; why oh why?
00045057245e[CPU0 ] interrupt(): SS is not writable data segment
00045057245d[CPU0 ] exception(0x0a): error_code=1234
00045057245i[CPU0 ] CPU is in protected mode (active)
00045057245i[CPU0 ] CS.d_b = 32 bit
00045057245i[CPU0 ] SS.d_b = 32 bit
00045057245i[CPU0 ] EFER   = 0x00000000
00045057245i[CPU0 ] | RAX=0000000000000000  RBX=0000000000000000
00045057245i[CPU0 ] | RCX=0000000000000000  RDX=0000000000000000
00045057245i[CPU0 ] | RSP=00000000f0080fec  RBP=0000000000000000
00045057245i[CPU0 ] | RSI=0000000000000000  RDI=0000000000000000
00045057245i[CPU0 ] |  R8=0000000000000000   R9=0000000000000000
00045057245i[CPU0 ] | R10=0000000000000000  R11=0000000000000000
00045057245i[CPU0 ] | R12=0000000000000000  R13=0000000000000000
00045057245i[CPU0 ] | R14=0000000000000000  R15=0000000000000000
00045057245i[CPU0 ] | IOPL=3 id vip vif ac vm RF nt of df IF tf sf zf af pf cf
00045057245i[CPU0 ] | SEG selector     base    limit G D
00045057245i[CPU0 ] | SEG sltr(index|ti|rpl)     base    limit G D
00045057245i[CPU0 ] |  CS:001b( 0003| 0|  3) 00000000 ffffffff 1 1
00045057245i[CPU0 ] |  DS:0023( 0004| 0|  3) 00000000 ffffffff 1 1
00045057245i[CPU0 ] |  SS:0023( 0004| 0|  3) 00000000 ffffffff 1 1
00045057245i[CPU0 ] |  ES:0023( 0004| 0|  3) 00000000 ffffffff 1 1
00045057245i[CPU0 ] |  FS:0000( 0000| 0|  0) 00000000 00000000 0 0
00045057245i[CPU0 ] |  GS:0000( 0000| 0|  0) 00000000 00000000 0 0
00045057245i[CPU0 ] |  MSR_FS_BASE:0000000000000000
00045057245i[CPU0 ] |  MSR_GS_BASE:0000000000000000
00045057245i[CPU0 ] | RIP=00000000f00059ac (00000000f00059ac)
00045057245i[CPU0 ] | CR0=0xe0000011 CR2=0x0000000000000000
00045057245i[CPU0 ] | CR3=0x00010000 CR4=0x00000000
00045057245d[CTRL ] searching for component 'cpu' in list 'bochs'
00045057245d[CTRL ] searching for component 'reset_on_triple_fault' in list 'cpu'
00045057245p[CPU0 ] >>PANIC<< exception(): 3rd (10) exception with no resolution
Maybe I've just misunderstood how this is supposed to work, but for an int gate, the seg selector in the IDT entry tells it the code segment to switch to... right? I mean it works superbly for interrupts when I'm already in kernel mode. But in user mode, it's as if it thinks that 8 is a reference to low memory and goes and pulls stuff from there. Or maybe that's not what it's doing at all, maybe those extra Bochs "page walk for address" messages are spurious, but it's still throwing an invalid TSS exception.

In Virtual PC, it's not crashing, but I'm not sure what it's doing. My kernel just stops switching threads and hangs, consuming 100% CPU.

Please help me to understand this. If I can't fix this Mr. Computer is going to go on a little holiday vertically towards the ground outside. :(

EDIT: Here's a picture of things before it dies:
Image
It's printing the context of every task switch. As you can see, it doesn't get very far in that regard. The first switch is into thread ID#2 (the user mode thread; eflags 0x3202) but it can't get back out again.

Re: Weird errors trying to get back from user -> kernel mode

Post by Combuster »

It looks like you didn't load a valid TSS, and its base still points to the start of memory... the kernel SS (the SS0 field) is read from TSS base + 8, which here is 0 + 8.
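
To make the offset concrete: in the 32-bit TSS layout documented in the Intel manuals, ESP0 sits at byte offset 4 and SS0 at byte offset 8, and the whole structure is 104 bytes (hence the 0x67 limit in the GDT entry). The sketch below spells that out. `Tss32`, `ss0_address`, and `set_kernel_stack` are hypothetical names, and the selector 0x10 in the usage note assumes the kernel data descriptor is GDT entry 2, as in the table posted above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical struct mirroring the architectural 32-bit TSS layout.
struct Tss32 {
    std::uint32_t prev_task_link;                         // 0x00
    std::uint32_t esp0;                                   // 0x04: ring-0 stack pointer
    std::uint32_t ss0;                                    // 0x08: ring-0 stack segment
    std::uint32_t esp1, ss1, esp2, ss2;                   // ring-1 and ring-2 stacks
    std::uint32_t cr3, eip, eflags;
    std::uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
    std::uint32_t es, cs, ss, ds, fs, gs;
    std::uint32_t ldt_selector;
    std::uint16_t trap;
    std::uint16_t iomap_base;
};

static_assert(sizeof(Tss32) == 104, "TSS is 0x68 bytes, so the limit is 0x67");
static_assert(offsetof(Tss32, esp0) == 4, "ESP0 lives at TSS base + 4");
static_assert(offsetof(Tss32, ss0) == 8, "SS0 lives at TSS base + 8");

// Linear address the CPU reads SS0 from on a ring 3 -> ring 0 transition.
inline std::uint32_t ss0_address(std::uint32_t tss_base) {
    return tss_base + static_cast<std::uint32_t>(offsetof(Tss32, ss0));
}

// Record the kernel stack to switch to when an interrupt arrives in ring 3.
inline void set_kernel_stack(Tss32& tss, std::uint16_t kernel_ss, std::uint32_t kernel_esp) {
    assert((kernel_ss & 0x3) == 0); // the ring-0 stack selector must have RPL 0
    tss.ss0  = kernel_ss;
    tss.esp0 = kernel_esp;
}
```

With a TSS base of 0, `ss0_address(0)` is 8, which matches the `page walk for address 0x0000000000000008` lines in the log. Something like `set_kernel_stack(tss0, 0x10, kernel_stack_top)` would normally run before the first switch to ring 3, where `kernel_stack_top` stands for whatever stack the kernel reserves for that thread.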

Re: Weird errors trying to get back from user -> kernel mode

Post by midir »

Combuster, THANK YOU SO MUCH! :D

You were absolutely right of course, it was complaining about the stack segment (offset 8 in TSS), not the code segment (offset 8 in GDT).

The task register was loaded with 40 (0x28) which was correct, but the three base fields in the TSS descriptor in the GDT were blank. It should have worked -- but I guess the compiler/linker couldn't initialize those fields (with the shifts and all) statically. It probably set up some constructor somewhere to init them at runtime just before main (this is C++), but my asm loader script isn't calling those.

Of all the things I expected to be to blame, it wasn't the compiler! But initing the three base fields of that descriptor at runtime works:

Code: Select all

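// gdt[5] is the TSS descriptor (selector 0x28); fill in its base at runtime,
// since the static initializer left these three fields zero.
gdt[5].base_1 = (((uint32)&tss0) & 0x0000FFFF) >> 0;
gdt[5].base_2 = (((uint32)&tss0) & 0x00FF0000) >> 16;
gdt[5].base_3 = (((uint32)&tss0) & 0xFF000000) >> 24;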
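
The same three assignments can be wrapped in a small helper so that the split of the base address across the descriptor lives in one place. This is only a sketch: `SegDescriptor` is a hypothetical stand-in for `seg_descriptor` (the real definition isn't shown in this thread and may use bit-fields instead), and the base is passed as a plain 32-bit value so the snippet also compiles on a 64-bit host.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 8-byte descriptor layout (illustration only).
struct SegDescriptor {
    std::uint16_t limit_low;   // limit bits 15:0
    std::uint16_t base_1;      // base bits 15:0
    std::uint8_t  base_2;      // base bits 23:16
    std::uint8_t  access;      // type, S, DPL, present
    std::uint8_t  limit_flags; // limit bits 19:16 (low nibble), flags (high nibble)
    std::uint8_t  base_3;      // base bits 31:24
};

static_assert(sizeof(SegDescriptor) == 8, "a GDT entry is 8 bytes");

// Scatter a 32-bit base address into the three base fields.
inline void set_base(SegDescriptor& d, std::uint32_t base) {
    d.base_1 = static_cast<std::uint16_t>(base & 0xFFFFu);
    d.base_2 = static_cast<std::uint8_t>((base >> 16) & 0xFFu);
    d.base_3 = static_cast<std::uint8_t>((base >> 24) & 0xFFu);
}

// Gather the three base fields back into one 32-bit address.
inline std::uint32_t get_base(const SegDescriptor& d) {
    return static_cast<std::uint32_t>(d.base_1)
         | (static_cast<std::uint32_t>(d.base_2) << 16)
         | (static_cast<std::uint32_t>(d.base_3) << 24);
}
```

In the kernel, the call would look something like `set_base(gdt[5], (uint32)&tss0)`, made before the `ltr` that loads the task register.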
Now I can get in and out of user mode at will. The red counters on the right are in user threads; the command prompt on the left is still a kernel mode thread, and they all run simultaneously. :D

Image

Thank you again! I never expected to get this far!