
GP fault when flushing TSS

Posted: Thu Nov 07, 2019 2:58 pm
by ryukoposting
Hello, all:

I'm slowly but surely getting my plucky little kernel to ring 3. I've just started trying to load a TSS into the GDT, and I'm getting a general protection fault at ltr in this bit of code:

Code: Select all

tss_flush:
  mov %ax, 0x2B
  ltr %ax
  ret
I had declared my TSS as a global static variable like this:

Code: Select all

static Tss tss;
I tried setting up virtual memory/paging before and after setting up the GDT, and neither arrangement seemed to have an effect. Then, instead of statically allocating the TSS, I changed my declaration above to a pointer, and used my page allocator to point it to a page-aligned address. Doing that magically made it work. I have other global static declarations elsewhere, and none of them cause GP faults when reading or writing to them. Only the TSS was doing this, and the only time it happened was at that ltr instruction. Does anyone have any guesses for why this might happen?

Re: Fixed a TSS-related GP fault, but why did this (not) wor

Posted: Thu Nov 07, 2019 3:41 pm
by ryukoposting
Wait a minute, it's back again. General protection fault at ltr. I have paging enabled, with the first 4MiB of RAM identity-mapped. All pages are set to allow ring 3 access, and all are set to be writable.

When I do this, I get a general protection fault:

Code: Select all

tss_flush:
  mov %ax, 0x2B
  ltr %ax
  int $0x3
  ret
When I do this, I get a breakpoint exception:

Code: Select all

tss_flush:
  mov %ax, 0x2B
  int $0x3
  ltr %ax
  ret
So the fault is definitely happening right at ltr.

Re: GP fault when flushing TSS

Posted: Thu Nov 07, 2019 8:48 pm
by zhiayang
erm... we haven't actually seen your TSS *descriptor* in the GDT...
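For readers following along, here is a minimal sketch of what a 32-bit TSS descriptor could look like when packed in C. The struct and helper names (`gdt_entry`, `gdt_make_entry`, `gdt_make_tss_entry`) are hypothetical and are not taken from the original poster's kernel. The access byte 0x89 (present, DPL 0, system segment, type 9 = available 32-bit TSS) and the byte-granular limit of size minus one are the usual choices; `ltr` raises a general protection fault if the selected descriptor is not an available TSS or lies outside the GDT limit.

```c
#include <assert.h>
#include <stdint.h>

/* One 8-byte GDT descriptor, laid out as the CPU expects it. */
struct gdt_entry {
    uint16_t limit_low;    /* limit bits 0..15 */
    uint16_t base_low;     /* base bits 0..15 */
    uint8_t  base_mid;     /* base bits 16..23 */
    uint8_t  access;       /* P, DPL, S and type bits */
    uint8_t  flags_limit;  /* flags nibble (high) | limit bits 16..19 (low) */
    uint8_t  base_high;    /* base bits 24..31 */
} __attribute__((packed));

static_assert(sizeof(struct gdt_entry) == 8, "GDT descriptors are 8 bytes");

/* Hypothetical helper: pack base, limit, access byte and flags nibble. */
static struct gdt_entry gdt_make_entry(uint32_t base, uint32_t limit,
                                       uint8_t access, uint8_t flags)
{
    struct gdt_entry e;
    e.limit_low   = (uint16_t)(limit & 0xFFFF);
    e.base_low    = (uint16_t)(base & 0xFFFF);
    e.base_mid    = (uint8_t)((base >> 16) & 0xFF);
    e.access      = access;
    e.flags_limit = (uint8_t)(((flags & 0x0F) << 4) | ((limit >> 16) & 0x0F));
    e.base_high   = (uint8_t)((base >> 24) & 0xFF);
    return e;
}

/* Hypothetical helper: descriptor for an available 32-bit TSS.
 * Access 0x89 = present, DPL 0, system segment, type 9.
 * Flags 0x0 = byte granularity, so the limit is the TSS size minus one. */
static struct gdt_entry gdt_make_tss_entry(uint32_t tss_base, uint32_t tss_size)
{
    return gdt_make_entry(tss_base, tss_size - 1, 0x89, 0x0);
}
```

With a layout like this, a TSS at a given address would be installed as `gdt[5] = gdt_make_tss_entry((uint32_t)&tss, sizeof(tss));`, assuming the TSS occupies index 5 as the selector 0x2B implies.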

Re: GP fault when flushing TSS

Posted: Thu Nov 07, 2019 9:44 pm
by MichaelPetch
If you are using standard AT&T syntax with the GNU assembler, the source operand is listed first and the destination operand second. You appear to have them reversed. I will also assume that 0x2B was meant to be $0x2B, since you want to load ax with the immediate value 0x2B rather than access memory address DS:0x2B. I would expect you meant:

Code: Select all

mov $0x2B, %ax

Re: GP fault when flushing TSS

Posted: Fri Nov 08, 2019 11:58 am
by ryukoposting
Yup, realized that about 20 minutes after my second post. I then decided it was time to take a break :mrgreen:

I can now set up the VGA buffer, parse the multiboot header, set up virtual memory with the first 4 MiB identity-mapped (setting all pages as writable and ring 3 accessible, for now), create an IDT (which appears to work, seeing as I can get the divide-by-zero, breakpoint, and page fault ones to fire at will), and set up a GDT like so:
  - zero descriptor
  - ring 0 code, base=0, limit=FFFFF, flags nibble set to 0xC (32-bit, 4 KiB granularity)
  - ring 0 data, base=0, limit=FFFFF, flags nibble set to 0xC
  - ring 3 code, base=0, limit=FFFFF, flags nibble set to 0xC
  - ring 3 data, base=0, limit=FFFFF, flags nibble set to 0xC
  - TSS
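
As a small aside, the selector values used throughout this thread (0x1B, 0x23, 0x2B) follow directly from that GDT order. The sketch below shows the arithmetic, assuming the six entries sit at indices 0 through 5 in the order listed; the helper and constant names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* A selector is (GDT index << 3) | table indicator | requested privilege level.
 * The table indicator bit (bit 2) is left at 0 here, meaning "GDT". */
static uint16_t make_selector(unsigned index, unsigned rpl)
{
    return (uint16_t)((index << 3) | (rpl & 0x3));
}

/* Split a selector back into its GDT index and RPL. */
static unsigned selector_index(uint16_t sel) { return sel >> 3; }
static unsigned selector_rpl(uint16_t sel)   { return sel & 0x3; }

/* Hypothetical names for the entries listed in the post. */
enum {
    KERNEL_CODE_SEL = (1 << 3) | 0,  /* 0x08 */
    KERNEL_DATA_SEL = (2 << 3) | 0,  /* 0x10 */
    USER_CODE_SEL   = (3 << 3) | 3,  /* 0x1B */
    USER_DATA_SEL   = (4 << 3) | 3,  /* 0x23 */
    TSS_SEL         = (5 << 3) | 3   /* 0x2B */
};

static_assert(USER_CODE_SEL == 0x1B, "ring 3 code selector");
static_assert(USER_DATA_SEL == 0x23, "ring 3 data selector");
static_assert(TSS_SEL == 0x2B, "TSS selector as used with ltr");
```
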
At this point, I'd love to just be able to switch into ring 3, even if I can't do anything with it yet. I know I can get interrupts to work, so that shouldn't be too painful a process.

I jacked this code from here http://www.jamesmolloy.co.uk/tutorial_h ... 0Mode.html

Code: Select all

void switch_to_user_mode()
{
   // Set up a stack structure for switching to user mode.
   asm volatile("  \
     cli; \
     mov $0x23, %ax; \
     mov %ax, %ds; \
     mov %ax, %es; \
     mov %ax, %fs; \
     mov %ax, %gs; \
                   \
     mov %esp, %eax; \
     pushl $0x23; \
     pushl %eax; \
     pushf; \
     pushl $0x1B; \
     push $1f; \
     iret; \
     1:");
}
As I understand it, this loads selector 0x23 (GDT offset 0x20, the ring 3 data segment, with RPL 3) into the data segment register (ds), as well as into the other segment registers that hypothetically don't get used. Then it pushes the same selector (for ss), the stack pointer, EFLAGS, the ring 3 code selector, and a label to return to (which is just the bottom of the function). iret then pops all of that and sees it as a direction to change to the ring 3 GDT selectors. When I call this function (after doing everything described previously), I get an infinite loop of restarts.

If I throw in an int $0x3 right before that iret, I get my normal breakpoint exception. If I put int $0x3 immediately after iret, I get the triple fault again. Any suggestions?
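
For reference, the five values pushed before `iret` form the frame sketched below, with `eip` at the lowest address because it is pushed last. This is only an illustration under a few assumptions: the struct and function names are hypothetical, the selectors are the 0x1B and 0x23 values from the code above, and setting the IF bit is my own addition. In the code above, `cli` runs before `pushf`, so the saved EFLAGS has interrupts disabled once ring 3 is entered unless bit 9 is set in the pushed value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stack frame popped by iret when returning to a less privileged ring.
 * Fields are listed from the lowest address (top of stack) upwards. */
struct iret_frame {
    uint32_t eip;     /* pushed last: where ring 3 execution resumes */
    uint32_t cs;      /* ring 3 code selector */
    uint32_t eflags;  /* flags restored in ring 3 */
    uint32_t esp;     /* ring 3 stack pointer */
    uint32_t ss;      /* pushed first: ring 3 stack selector */
};

static_assert(sizeof(struct iret_frame) == 20, "five 32-bit slots");

#define EFLAGS_RESERVED1 (1u << 1)  /* bit 1 always reads as 1 */
#define EFLAGS_IF        (1u << 9)  /* interrupt enable flag */

/* Hypothetical helper: build the frame for a first jump to user mode.
 * Assumes interrupts should be enabled once ring 3 code is running. */
static struct iret_frame make_user_frame(uint32_t entry, uint32_t user_stack,
                                         uint32_t eflags)
{
    struct iret_frame f;
    f.eip    = entry;
    f.cs     = 0x1B;  /* GDT index 3, RPL 3 */
    f.eflags = eflags | EFLAGS_RESERVED1 | EFLAGS_IF;
    f.esp    = user_stack;
    f.ss     = 0x23;  /* GDT index 4, RPL 3 */
    return f;
}
```
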

Re: GP fault when flushing TSS

Posted: Fri Nov 08, 2019 1:13 pm
by ryukoposting
I should stop posting in the middle of debugging, but hopefully another individual will find my ramblings useful. Obviously, even if things were working fine, putting an int $0x3 AFTER a return instruction shouldn't do anything, so I turned to GDB. I found that I was actually returning from the function just fine- things only blew up when the code tried to return from the function that called switch_to_user_mode in the first place (which happens to be the first and only C function called from _start). If I just throw a while(1) at the end of this function, I no longer get the fault.

My first guess was that switch_to_user_mode screws up the stack alignment, but this doesn't seem to be the case:

Code: Select all

  volatile int k = 3;
  switch_to_user_mode();
  putnum(k);
prints 3 a-okay, so it seems unlikely that the stack is getting screwed up (also, since all vmem pages are set up to be writable by ring 3, it makes sense that printing anything at all is possible).

The assembly code (that calls the C code (that calls switch_to_user_mode)) has an infinite loop that should just sit there and spin at the end. But, I based it on the osdev barebones assembly code, which has this at the end:

Code: Select all

cli
hlt
which can't be executed in ring 3 and should cause a general protection fault, but instead it's triple faulting and restarting as mentioned before. Is it safe to assume that this is because interrupts were disabled while still in ring 0?

Re: GP fault when flushing TSS

Posted: Fri Nov 08, 2019 2:11 pm
by ryukoposting
It looks like I may have been setting up esp0 in my TSS wrong- I was blindly following James Molloy's guide (http://www.jamesmolloy.co.uk/tutorial_h ... 0Mode.html) and was just setting esp0 to zero. Instead, I'm now setting esp0 to the initial value of esp when _start is called. Now, I'm getting a general protection fault when I try to do int $0x3, which I would expect since I'm in user mode.
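
To make the esp0 point concrete: when an interrupt or exception arrives in ring 3 and the handler runs in ring 0, the CPU loads its stack from the ss0 and esp0 fields of the TSS. With esp0 left at zero, the pushes for the exception frame most likely fault themselves, which would explain the triple fault seen earlier. Below is a minimal sketch of a 32-bit TSS and its initialisation, assuming the ring 0 data selector is 0x10 as in the GDT above. The names `tss32` and `tss_init` are hypothetical, not the poster's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 32-bit Task State Segment. Only esp0, ss0 and iomap_base matter when the
 * TSS is used purely for ring 3 -> ring 0 stack switching. */
struct tss32 {
    uint32_t prev_tss;
    uint32_t esp0;   /* stack pointer loaded on entry to ring 0 */
    uint32_t ss0;    /* stack segment loaded on entry to ring 0 */
    uint32_t esp1, ss1, esp2, ss2;
    uint32_t cr3, eip, eflags;
    uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
    uint32_t es, cs, ss, ds, fs, gs;
    uint32_t ldt;
    uint16_t trap;
    uint16_t iomap_base;
} __attribute__((packed));

static_assert(sizeof(struct tss32) == 104, "32-bit TSS is 104 bytes");

/* Hypothetical initialiser: kernel_stack_top must point at the top of a
 * valid, mapped ring 0 stack, never at zero. */
static void tss_init(struct tss32 *tss, uint32_t kernel_stack_top)
{
    memset(tss, 0, sizeof(*tss));
    tss->ss0  = 0x10;              /* ring 0 data selector (GDT index 2) */
    tss->esp0 = kernel_stack_top;
    /* An I/O map base at or beyond the TSS limit means "no I/O bitmap". */
    tss->iomap_base = (uint16_t)sizeof(*tss);
}
```
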

Re: GP fault when flushing TSS

Posted: Fri Nov 08, 2019 8:29 pm
by Ethin
As a cautionary note, I would strongly recommend you read this Wiki article on the tutorial you're following -- it has well-documented, well-known issues that (in my eyes anyway) make it a very bad idea to follow. (I wonder if James even consulted the OSDev wiki/forums or not when writing that...) (Note: I did *not* write this wiki article, just to ensure that confusion does not arise. :))

Re: GP fault when flushing TSS

Posted: Sat Nov 09, 2019 11:39 am
by ryukoposting
Holy cow! I hadn't seen that wiki page before, and I've been perusing the wiki on and off for a couple years now (wrote a little ARM embedded kernel last year). Maybe it'd be helpful to link to that page on some of the other pages that have info overlapping with Molloy's tutorial.

As for whether or not he consulted the OsDev wiki, I strongly doubt it. His guide's approach to the linker script, boot.s, and memory allocator (among other things) is very different from some of the stuff on the wiki. I think that's a good thing, though- clearly there's no one "correct" way of doing this stuff. imo, seeing two different ways of doing the same thing makes it easier to understand what's really going on.

Re: GP fault when flushing TSS

Posted: Sat Nov 09, 2019 3:39 pm
by Ethin
Hear, hear! Seeing varying (and sometimes completely opposite) implementations is, if anything, always fun; it's an excellent indicator that the universe still has a lot of creative juices it needs to dish out to us mortals. :)