Set 'em and Forget 'em


Post by CelestialMechanic »

In days of old (when knights were bold ...) the Linux kernel GDT had four main selectors (among others). These selectors were for data and code for ring 0, and data and code for ring 3. The user selectors only covered the first 3 gigabytes of address space, so when a transition was made to ring 0 any data selectors such as DS and ES had to be replaced with their ring 0 counterparts for the duration of the system call or interrupt and restored on return. (I am referring to 32-bit systems here.)

Since then the preference has been for all four of these selectors to cover all 4 gigabytes of address space, and the paging system provides the protection for the upper 1 gigabyte. It has occurred to me that if DS, ES, FS, and GS are all set to the ring 3 data selector (and not used for special purposes such as thread local storage) that there will be no need to change these selectors again, ever. Just set 'em and forget 'em.
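
As a rough illustration of the flat layout described above, the four main descriptors could be encoded as follows. This is a minimal sketch, not the Linux kernel's actual code: the helper names (`gdt_entry`, `build_flat_gdt`, `selector`) and the choice of GDT indices are assumptions made for the example; only the descriptor bit layout comes from the x86 architecture.

```c
#include <assert.h>
#include <stdint.h>

/* Access-byte bits of a code/data segment descriptor. */
#define SEG_PRESENT   0x80u                     /* P: segment is present        */
#define SEG_DPL(n)    ((uint8_t)((n) << 5))     /* descriptor privilege level   */
#define SEG_CODEDATA  0x10u                     /* S: code/data, not system     */
#define SEG_EXEC      0x08u                     /* executable (code segment)    */
#define SEG_RW        0x02u                     /* readable code/writable data  */

/* Flag nibble: G = 4 KiB granularity, D/B = 32-bit segment. */
#define SEG_FLAGS_FLAT32 0xCu

/* Pack base, 20-bit limit, access byte and flag nibble into one descriptor. */
static uint64_t gdt_entry(uint32_t base, uint32_t limit,
                          uint8_t access, uint8_t flags)
{
    assert(limit <= 0xFFFFFu);                      /* limit is a 20-bit field */
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);               /* limit 15:0   */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;        /* base 23:0    */
    d |= (uint64_t)access << 40;                    /* access byte  */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;    /* limit 19:16  */
    d |= (uint64_t)(flags & 0xFu) << 52;            /* G, D/B, L, AVL */
    d |= (uint64_t)(base >> 24) << 56;              /* base 31:24   */
    return d;
}

/* Hypothetical index assignment; index 0 must be the null descriptor. */
enum { GDT_NULL, GDT_KCODE, GDT_KDATA, GDT_UCODE, GDT_UDATA, GDT_COUNT };

/* All four segments have base 0 and limit 0xFFFFF pages, i.e. 4 GiB. */
static void build_flat_gdt(uint64_t gdt[GDT_COUNT])
{
    const uint8_t code = SEG_PRESENT | SEG_CODEDATA | SEG_EXEC | SEG_RW;
    const uint8_t data = SEG_PRESENT | SEG_CODEDATA | SEG_RW;

    gdt[GDT_NULL]  = 0;
    gdt[GDT_KCODE] = gdt_entry(0, 0xFFFFFu, code | SEG_DPL(0), SEG_FLAGS_FLAT32);
    gdt[GDT_KDATA] = gdt_entry(0, 0xFFFFFu, data | SEG_DPL(0), SEG_FLAGS_FLAT32);
    gdt[GDT_UCODE] = gdt_entry(0, 0xFFFFFu, code | SEG_DPL(3), SEG_FLAGS_FLAT32);
    gdt[GDT_UDATA] = gdt_entry(0, 0xFFFFFu, data | SEG_DPL(3), SEG_FLAGS_FLAT32);
}

/* A selector is the GDT index shifted left by 3, ORed with the RPL. */
static uint16_t selector(unsigned index, unsigned rpl)
{
    return (uint16_t)((index << 3) | (rpl & 3u));
}
```

With this assumed index assignment, the ring 3 data selector that would be left in DS, ES, FS and GS is `selector(GDT_UDATA, 3)`.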

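Of course we still need the ring 0 data selector for SS, but otherwise ring 0 code can read all memory through the ring 3 data selector (and write to it regardless of the U/S and R/W flags in the page tables, provided CR0.WP is clear), because page-level protection is checked against the CPL rather than against the privilege level of the data selector. When in user mode these bits will keep ring 3 code from accessing protected memory.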
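
The reasoning in the paragraph above can be modelled in a few lines. This is a simplified sketch of the 32-bit page-level check, not a description of any particular kernel: the function name `page_access_ok` is made up for the example, the flags are taken as the already-combined page directory and page table bits, and NX, SMEP and SMAP are ignored. Note that the data selector does not appear anywhere in the check; only the CPL does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_PRESENT 0x1u   /* P:   page is mapped                    */
#define PTE_RW      0x2u   /* R/W: writes allowed                    */
#define PTE_US      0x4u   /* U/S: accessible from user mode (CPL 3) */

/* Simplified model: would this access pass the paging protection check? */
static bool page_access_ok(uint32_t pte, unsigned cpl,
                           bool is_write, bool cr0_wp)
{
    assert(cpl <= 3);

    if (!(pte & PTE_PRESENT))
        return false;                       /* not present: always faults */

    if (cpl == 3) {
        if (!(pte & PTE_US))
            return false;                   /* supervisor-only page */
        if (is_write && !(pte & PTE_RW))
            return false;                   /* read-only page */
        return true;
    }

    /* Supervisor access: U/S is ignored, and R/W only matters
       when CR0.WP is set. */
    if (is_write && cr0_wp && !(pte & PTE_RW))
        return false;
    return true;
}
```

For example, a present page with neither U/S nor R/W set is writable at CPL 0 when CR0.WP is clear, but not accessible at all at CPL 3.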

Has anyone tried this? Is there something I'm missing here? I'm about to have my microkernel start multithreading Real Soon Now, I will try this and no longer have my interrupts touch the DS, ES, FS, and GS registers.

Please forgive me if this topic was touched on in the past, but I could not find anything quite like this question.
Microsoft is over if you want it.

Post by alexfru »

SS:ESP will come from the TSS on transition into the kernel. However, the user can populate DS, ES, FS and GS with a null selector and you don't want a #GP in the kernel when accessing memory through null selectors.
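
For reference, a selector counts as "null" when it refers to GDT index 0, whatever its RPL bits are. A small sketch of the selector layout follows; the helper names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Selector layout: bits 15:3 = index, bit 2 = table indicator
   (0 = GDT, 1 = LDT), bits 1:0 = requested privilege level. */
static uint16_t make_selector(unsigned index, bool ldt, unsigned rpl)
{
    assert(index < 8192 && rpl <= 3);
    return (uint16_t)((index << 3) | (ldt ? 0x4u : 0u) | rpl);
}

static unsigned sel_index(uint16_t sel)  { return sel >> 3; }
static bool     sel_is_ldt(uint16_t sel) { return (sel & 0x4u) != 0; }
static unsigned sel_rpl(uint16_t sel)    { return sel & 0x3u; }

/* The null selector is GDT index 0 with any RPL: values 0x0000 to 0x0003.
   Loading it into DS, ES, FS or GS is allowed; using it for a memory
   access is what raises #GP. */
static bool sel_is_null(uint16_t sel)
{
    return (sel & 0xFFFCu) == 0;
}
```

Note that index 0 with the table indicator set (value 0x0004) refers to LDT entry 0 and is not a null selector.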

Post by CelestialMechanic »

alexfru wrote:SS:ESP will come from the TSS on transition into the kernel.
True, but the user should provide an initial ring 3 SS:ESP on the kernel stack above EFLAGS, CS:EIP at creation time for any thread meant to operate in user mode.
alexfru wrote:However, the user can populate DS, ES, FS and GS with a null selector and you don't want a #GP in the kernel when accessing memory through null selectors.
I've never heard of this. But I think I will limit myself to sensible values for selectors. I can't help thinking that somewhere down the road NULL selectors will cause problems. Indeed, the reason for the NULL selector was to provide a way for software to trap this error and (possibly) correct it.
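
The initial frame mentioned above (ring 3 SS:ESP sitting above EFLAGS and CS:EIP on the kernel stack) might be built roughly like this at thread creation. It is only a sketch: the selector values, the `iret_frame` and `push_user_frame` names, and the choice of initial EFLAGS are assumptions for illustration, not taken from any particular kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical selectors: GDT index 3 = ring 3 code, index 4 = ring 3 data,
   both with RPL 3. */
#define USER_CODE_SEL    0x1Bu
#define USER_DATA_SEL    0x23u

#define EFLAGS_RESERVED1 0x002u   /* bit 1 is always set        */
#define EFLAGS_IF        0x200u   /* interrupts enabled in user */

/* What IRET pops when returning to an outer privilege level,
   lowest address first. */
struct iret_frame {
    uint32_t eip;
    uint32_t cs;
    uint32_t eflags;
    uint32_t esp;
    uint32_t ss;
};

/* Build the frame just below the top of a new thread's kernel stack and
   return a pointer to it (the value ESP should hold before IRET). */
static struct iret_frame *push_user_frame(uint32_t *kstack_top,
                                          uint32_t entry, uint32_t user_esp)
{
    assert(kstack_top != 0);
    struct iret_frame *f = (struct iret_frame *)kstack_top - 1;

    f->eip    = entry;
    f->cs     = USER_CODE_SEL;
    f->eflags = EFLAGS_RESERVED1 | EFLAGS_IF;
    f->esp    = user_esp;
    f->ss     = USER_DATA_SEL;
    return f;
}
```

Under the scheme discussed in this thread, DS, ES, FS and GS would not need a slot in this frame, since they are never saved or restored on kernel entry and exit.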

Post by Brendan »

Hi,
CelestialMechanic wrote:Has anyone tried this? Is there something I'm missing here? I'm about to have my microkernel start multithreading Real Soon Now, I will try this and no longer have my interrupts touch the DS, ES, FS, and GS registers.
This is one of my old tricks!
CelestialMechanic wrote:
alexfru wrote:SS:ESP will come from the TSS on transition into the kernel.
True, but the user should provide an initial ring 3 SS:ESP on the kernel stack above EFLAGS, CS:EIP at creation time for any thread meant to operate in user mode.
alexfru wrote:However, the user can populate DS, ES, FS and GS with a null selector and you don't want a #GP in the kernel when accessing memory through null selectors.
I've never heard of this. But I think I will limit myself to sensible values for selectors. I can't help thinking that somewhere down the road NULL selectors will cause problems. Indeed, the reason for the NULL selector was to provide a way for software to trap this error and (possibly) correct it.
Typically the kernel also uses FS or GS for "per CPU" data.

This means:
  • Potentially malicious CPL=3 code can load NULL into DS, ES, FS or GS. When the kernel uses the segment register it causes a general protection fault; and the general protection fault handler can restore the correct value for that segment and return from the general protection fault (to re-try the instruction that caused the exception and continue running normally).
  • Potentially malicious CPL=3 code can load its code segment into DS, ES, FS or GS. When the kernel uses the segment register it causes a general protection fault; and the general protection fault handler can restore the correct value for that segment and return from the general protection fault (to re-try the instruction that caused the exception and continue running normally).
  • Potentially malicious CPL=3 code can load its code segment or its data segment into FS or GS. If the kernel makes sure that there's a "not present" area starting at virtual address 0x00000000 in every process, and also makes sure that all offsets in its "per CPU" data areas are smaller than the size of that "not present" area; then if CPL=3 code loads its data segment into FS or GS the kernel will get a page fault when trying to access the "per CPU" data; and the page fault handler can restore the correct value for FS or GS and return from the page fault (to re-try the instruction that caused the exception and continue running normally).
In this way you never need to load kernel data segments during IRQ handlers and the kernel API, and can just let the kernel "auto-correct" if malicious CPL=3 code tried to mess things up. Because segment register loads are slow (and because most user-space software isn't malicious) this can improve performance.
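
The "auto-correct" idea in the list above can be sketched as follows. This is a plain C model rather than real handler code: a real kernel would read and write the live segment registers with inline assembly, and would only attempt the fix-up for faults raised at CPL=0. The selector values, the guard size and all function names are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values; a real kernel would use its own GDT layout. */
#define USER_DATA_SEL   0x23u     /* expected in DS, ES and FS          */
#define PERCPU_SEL      0x30u     /* expected in GS for "per CPU" data  */
#define NULL_GUARD_SIZE 0x1000u   /* "not present" area at address 0    */

/* Stand-in for the live segment registers at the time of the fault. */
struct seg_regs {
    uint16_t ds, es, fs, gs;
};

/* Restore one register; report whether it had been tampered with. */
static bool restore_one(uint16_t *reg, uint16_t expected)
{
    if (*reg == expected)
        return false;
    *reg = expected;
    return true;
}

/* Called from the #GP or page fault handler. Returns true if a segment
   register was repaired, meaning the handler can simply return and let
   the faulting instruction be retried; false means the fault has some
   other cause and must be handled normally. */
static bool fixup_kernel_segments(struct seg_regs *r)
{
    bool fixed = false;
    if (restore_one(&r->ds, USER_DATA_SEL)) fixed = true;
    if (restore_one(&r->es, USER_DATA_SEL)) fixed = true;
    if (restore_one(&r->fs, USER_DATA_SEL)) fixed = true;
    if (restore_one(&r->gs, PERCPU_SEL))    fixed = true;
    return fixed;
}

/* Third case in the list: a "per CPU" access through a segment with the
   given base lands at base + offset. With a flat user segment (base 0)
   that address falls inside the "not present" area and page-faults. */
static bool percpu_access_faults(uint32_t seg_base, uint32_t offset)
{
    assert(offset < NULL_GUARD_SIZE);   /* the invariant the kernel must keep */
    return seg_base + offset < NULL_GUARD_SIZE;
}
```

The design choice here is that the common path (well-behaved user code) pays nothing, while the rare malicious case costs one extra exception before the kernel continues normally.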

Note that there are a few restrictions:
  • If you use a segment register for "task local storage" then potentially malicious CPL=3 code can also load its "task local storage" segment into DS, ES, FS or GS. I've always used paging to create "thread specific storage" instead, so I've never had to care about this.
  • If you use virtual 8086 mode; then it will break everything (if an IRQ handler interrupts a virtual 8086 mode task, then the kernel would end up using "nonsense real mode" values in segment registers). This is one of the reasons I started refusing to use virtual 8086 mode originally (back before UEFI and long mode existed). Fortunately (now that UEFI and long mode do exist) there's even less reason to want to use virtual 8086 mode.
For both of these cases; there's no sane way to avoid loading kernel data segments during IRQ handlers.


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.

Post by onlyonemac »

It's common these days to avoid using the segment registers for anything other than changing privilege level when required, and to use paging for memory protection. Paging makes the segmentation model redundant, and the only time you should ever have to change the segment registers while using paging is to get around a privilege level limitation imposed by the fact that the segmentation system still exists and is still active, even if you don't use it.

Of course, you can always design your OS in another way that uses paging and segmentation alongside each other. I can't quite think of a situation where that would be a good design choice, but it's possible that someone might find an interesting way to use both together.
When you start writing an OS you do the minimum possible to get the x86 processor in a usable state, then you try to get as far away from it as possible.

Syntax checkup:
Wrong: OS's, IRQ's, zero'ing
Right: OSes, IRQs, zeroing