Can't jump to entry point after switching page directory

Post by codyd51 »

Hello,

Up until now, my exec() hasn't removed kernel code+data mappings from the page mappings. This is something I want to do now. I've created a new directory, mapped in the executable's segments, and set up a small stack space.

The page mappings look like this:
08048000-08049000 1000 -rw
10000000-10001000 1000 -rw

My issue is, in my task switch handler, I load esp, ebp, and cr3 with values from the structure containing info about the process. The routine in question looks like this:

Code:

task_switch_real:
	cli
	mov ecx, [esp + 4]	; eip: entry point to jump to
	mov eax, [esp + 8]	; physical address of the page directory to load
	mov ebp, [esp + 12]	; ebp
	mov esp, [esp + 16]	; esp (loaded last, since the arguments are read relative to the old esp)
	mov cr3, eax		; set paging directory; every fetch after this uses the new mappings
	mov eax, 0xDEADBEEF	; magic value to detect task switch
	sti
	jmp ecx

This routine, and therefore the instruction pointer while it executes, lives in kernel code. Once I switch cr3 to the new process's directory, kernel code is no longer mapped, so I can't execute the final jump to the new process's entry point: the instruction pointer is invalid and I get a triple fault.

My initial idea was to turn off paging while I switch cr3, but that has the exact same problem: as soon as I turn paging back on to jump to the process's entry point, eip will be invalid again.

How can I work around this? Thanks!

Post by Ch4ozz »

What I did was map the kernel code read-only (execute only) into all my context clones.
This way I can still execute the code there :)
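A minimal sketch of that idea in C, under assumptions the thread does not confirm: 32-bit non-PAE paging, a higher-half kernel starting at 0xC0000000, and hypothetical names (`page_directory_t`, `clone_kernel_mappings`, the `PDE_*` flags). It copies the kernel's page-directory entries into a new directory and clears the user bit, so the kernel stays mapped but ring 3 can't touch it. It does not attempt execute-only protection.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PDE_COUNT        1024u                 /* entries in a 32-bit page directory */
#define PDE_PRESENT      0x001u
#define PDE_WRITABLE     0x002u
#define PDE_USER         0x004u                /* set = accessible from ring 3 */
#define KERNEL_BASE      0xC0000000u           /* assumption: higher-half kernel */
#define KERNEL_PDE_START (KERNEL_BASE >> 22)   /* each entry covers 4 MiB */

/* Hypothetical in-memory view of one page directory. */
typedef struct {
    uint32_t entries[PDE_COUNT];
} page_directory_t;

/* Prepare dst for a new process: empty user half, shared kernel half. */
static void clone_kernel_mappings(page_directory_t *dst,
                                  const page_directory_t *kernel_dir)
{
    assert(dst != NULL && kernel_dir != NULL);

    /* User half starts out unmapped; exec() fills it in later. */
    for (uint32_t i = 0; i < KERNEL_PDE_START; i++)
        dst->entries[i] = 0;

    /* Kernel half: reuse the kernel's page tables, supervisor-only. */
    for (uint32_t i = KERNEL_PDE_START; i < PDE_COUNT; i++)
        dst->entries[i] = kernel_dir->entries[i] & ~PDE_USER;
}
```

Because the entries point at the same page tables, a later change inside one of those kernel page tables shows up in every process without touching the individual directories.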

Post by LtG »

Why do you want to do that? AFAIK, "nobody" does that. That's why you have "higher-half kernels" in the first place.

If you want, you can minimize the amount of stuff in the VAS (virtual address space), but you can't really eliminate it. For instance, how would you handle interrupts? If there's no IDT (which is usually considered part of the kernel) in the VAS, what happens when you get an interrupt?

So this brings me back to my original question, why? If there's some specific reason then maybe we can suggest a better solution.

If you absolutely must do it, even though it's bad and you "shouldn't", I guess you could use hardware task switching (TSS). It's not supported on x86_64 though.

Post by bzt »

codyd51 wrote:When I switch cr3 to that of the new process, kernel code is no longer mapped in, and I'm unable to execute the last instruction to jump to the new process's entry point as the instruction pointer is invalid, and I get a triple fault.
No, the instruction pointer is not invalid, your code is. The CPU fetches the code it is executing through the MMU, so that code must be mapped in memory. If you remove the mapping, your code disappears, and it's no wonder the CPU can't execute it any longer. When you change cr3, you have to take precautions to include the executing code in the new map. This means at least one page (the one with the "mov cr3" instruction) must be mapped at the same location in both the old and the new map. But when you start to use system calls and ISRs (as LtG suggested), you'll realize that you need the other kernel pages as well. So make peace with it and have your kernel mapped in all memory maps.
I understand that you're trying to find new ways (and I think that's the whole point of writing an OS, not copy'n'pasting tutorials), but the hardware you're using has certain limitations that cannot be eliminated. Having your kernel code (or at least part of it) mapped in memory at all times is one of them.
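The "same page in both maps" condition can be checked in software before loading cr3. The sketch below is only an illustration: it models a two-level 32-bit page structure with tables reached by pointer rather than by physical address, and every name in it (`address_space_t`, `translate`, `safe_to_switch`) is hypothetical. It requires the page holding the switch code to resolve to the same frame in both address spaces, which is slightly stricter than necessary, since an identical copy at the same virtual address would also work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ENTRIES     1024u
#define PTE_PRESENT 0x1u
#define FRAME_MASK  0xFFFFF000u

/* Simplified model: a real kernel would follow physical addresses instead. */
typedef struct { uint32_t pte[ENTRIES]; } page_table_t;
typedef struct { page_table_t *tables[ENTRIES]; } address_space_t;

/* Software page walk: true and *frame set if vaddr is mapped. */
static bool translate(const address_space_t *as, uint32_t vaddr, uint32_t *frame)
{
    assert(as != NULL && frame != NULL);

    const page_table_t *pt = as->tables[vaddr >> 22];      /* top 10 bits */
    if (pt == NULL)
        return false;

    uint32_t pte = pt->pte[(vaddr >> 12) & 0x3FFu];        /* middle 10 bits */
    if (!(pte & PTE_PRESENT))
        return false;

    *frame = pte & FRAME_MASK;
    return true;
}

/* True if code at eip stays fetchable across a switch from old_as to new_as. */
static bool safe_to_switch(const address_space_t *old_as,
                           const address_space_t *new_as, uint32_t eip)
{
    uint32_t old_frame, new_frame;
    return translate(old_as, eip, &old_frame)
        && translate(new_as, eip, &new_frame)
        && old_frame == new_frame;
}
```

With the mappings from the first post (only 0x08048000 and 0x10000000 present in the new directory), this check would fail for any eip inside the kernel, which matches the fault described there.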

Post by samiam95124 »

bzt wrote:But when you start to use system calls and ISRs (as LtG suggested), you'll realize that you need the other kernel pages as well. So make peace with it, have your kernel mapped in all memory maps.
I understand that you try to find new ways (and I think that's the whole point of writing an OS, not copy'n'pasting tutorials), but there are certain limitations on the hardware you're using which cannot be eliminated. Having your kernel code (or at least part of it) mapped in memory all times is one of them.
Yes, but that issue is specific to x86.

Cheers,

Scott Franco
San Jose

Post by iansjack »

That's incorrect. (Almost) all modern processors provide memory management and interrupt facilities, so the same consideration applies.

Post by samiam95124 »

Nooooo....

Osdev tends to be overly focused on the x86 series processors. The need to map kernel structures into user space comes from a design flaw in the x86 that goes back to the 80286 processor, namely hardware-based task switching. When interrupting or syscalling back to the kernel, processors with PMM outside of the x86 family swap a minimum set of registers and resources to accomplish the context change. Typically this means having a kernel IP pointer, kernel stack pointer, and kernel page root pointer in addition to the ones for user space, swapping them on interrupt/syscall, and clearing the caches.

x86 actually does this as well, but it was designed into an overly complex mechanism (the TSS). Thus it was deprecated and then removed in AMD64 mode. This kind of design flaw keeps rolling into further designs. There is no way to explain the need to user map the kernel in virtual machine designs, so the x86 needed a VMM "mode", which would have been handled by the regular processor semantics if the TSS issue didn't exist.

We agree that cross mapping "is just the way it is", on x86, but this design flaw does not apply to other processors. And unless the board is changed to OSdevx86 I don't think it makes sense to push the concept as universal.

Scott Franco
San Jose

Post by iansjack »

Nobody uses hardware-based task switching. The need to map kernel memory space into user processes is nothing to do with that or which registers are saved on a task switch. Actually, an interrupt or system call doesn't necessarily involve a task switch so that's a red herring.

Post by LtG »

By osdev, do you mean this site or osdev in general? This site is focused on osdev, not x86; however, most people are most familiar with, and mostly work with, x86 for practical reasons. So I wouldn't say it's overly focused, it's practically focused.

Even if you had a hardware system where there's a separate kernel memory and user memory, it's still the same thing really. If you want, you can consider the "bit" that toggles kernel vs user to be the most significant bit of the long mode virtual address space and presto, you have the same functionality on x86_64.
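To make the "most significant bit" remark concrete, here is a small sketch, assuming 48-bit virtual addresses (4-level paging); the helper names are hypothetical. In long mode an address is canonical when bits 63..47 are all equal, so the top bit alone splits the usable space into a lower (user) half and an upper (kernel) half.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 48 implemented virtual-address bits (4-level paging). */
#define VA_BITS 48u

static_assert(VA_BITS >= 2 && VA_BITS <= 64, "VA_BITS out of range");

/* Canonical: bits 63..(VA_BITS-1) are all zeros or all ones. */
static bool is_canonical(uint64_t va)
{
    uint64_t top  = va >> (VA_BITS - 1);                        /* sign-extension bits */
    uint64_t ones = (UINT64_C(1) << (64 - (VA_BITS - 1))) - 1;  /* 0x1FFFF for 48 bits */
    return top == 0 || top == ones;
}

/* Bit 63 acts as the kernel/user toggle in a higher-half layout. */
static bool is_kernel_half(uint64_t va)
{
    return (va >> 63) != 0;
}
```

Under this convention the hardware still walks one set of page tables; whether the upper half is reachable from user mode is decided by the page-level user/supervisor bit, not by the address bit itself.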

And it wasn't a "design flaw", there's really two reasons why the hardware task switching didn't catch on, monolithic kernels at best only benefit from it marginally (given that modern x86 can do multiple instructions per cycle) and in cases where a full task switch isn't necessary it would incur extra overhead.

When designing long mode, AMD would have had to essentially re-create hardware task switching, at least because of the increased number of registers, but likely for other reasons as well. Given that nobody used it and the extra effort (in silicon, complexity, energy use, etc.) is relatively big, they decided to drop it.

Btw, what's so "complex" in the TSS, on a logical level?

Post by samiam95124 »

"And it wasn't a "design flaw", there's really two reasons why the hardware task switching didn't catch on, monolithic kernels at best only benefit from it marginally (given that modern x86 can do multiple instructions per cycle) and in cases where a full task switch isn't necessary it would incur extra overhead."

Read what I said again please. I was referring to the need to map kernel space into user addresses as a design flaw.

"Even if you had a hardware system where there's a separate kernel memory and user memory, it's still the same thing really. If you want, you can consider the "bit" that toggles kernel vs user to be the most significant bit of the long mode virtual address space and presto, you have the same functionality on x86_64."

We'll agree to disagree. In fact, being forced to map the kernel into user space breaks the virtual machine model, since in a virtual machine you are supposed to have clean hardware to yourself. Having a block of memory in the VMM mapped to the kernel breaks that. VT-x and AMD-V both have support to fix that, but it would never have needed fixing if it had not been broken in the first place.

"Btw, what's so "complex" in the TSS, on a logical level?"

I was referring to CISC vs. RISC. And indeed, that is why AMD got rid of it.

Cheers,

Scott Franco
San Jose, CA

Post by bzt »

samiam95124 wrote:"Btw, what's so "complex" in the TSS, on a logical level?"

I was referring to CISC vs. RISC. And indeed, that is why AMD got rid of it.
That's not true. ARM has exactly the same functionality; the only difference is that the fields of the TSS (or fields for equivalent functionality) are hardwired in the chip or stored in registers. The Intel way differs in only one respect: it keeps a structure in memory so that the programmer can override its values more easily. The same applies to the IDT. On Intel you can place it anywhere in memory; on ARM the vector branch table must be at the beginning of memory (like in Intel's real mode). Regardless, for both architectures the table must be mapped at the same location in all memory mappings, otherwise the ISR can't be located when an interrupt fires.

Post by iansjack »

Switching memory maps for every interrupt and system call would be horrendously inefficient. Which is why OSs don't do that.