Keeping thread TSS and SS0 stack in LDT

Owen
Member
Posts: 1700
Joined: Fri Jun 13, 2008 3:21 pm
Location: Cambridge, United Kingdom

Re: Keeping thread TSS and SS0 stack in LDT

Post by Owen »

The problems I see with most segmentation implementations
  1. Insufficient segment registers. x86 has three general-purpose segment registers (ES, FS, GS), but ES is used implicitly by some instructions, so only FS and GS are truly general purpose.
  2. Segmented addressing creates weirdly sized pointers.
  3. Segment indexes are disjoint (i.e. the segment following segment 0x10 is segment 0x18, not 0x11).
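To illustrate point 3: an x86 protected-mode selector keeps the table index in bits 15..3, the table indicator (GDT/LDT) in bit 2 and the requested privilege level in bits 1..0, which is why consecutive descriptors are 8 apart. A minimal sketch in C (the type and function names are made up for this example):

```c
#include <assert.h>
#include <stdint.h>

/* Fields of an x86 protected-mode segment selector. */
typedef struct {
    uint16_t index; /* bits 15..3: index into the descriptor table */
    uint8_t  ti;    /* bit 2: 0 = GDT, 1 = LDT */
    uint8_t  rpl;   /* bits 1..0: requested privilege level */
} x86_selector;

/* Split a raw 16-bit selector into its three fields. */
x86_selector x86_decode_selector(uint16_t sel)
{
    x86_selector s;
    s.index = (uint16_t)(sel >> 3);
    s.ti    = (uint8_t)((sel >> 2) & 1u);
    s.rpl   = (uint8_t)(sel & 3u);
    return s;
}

/* Selector of the next descriptor in the same table, keeping TI and RPL. */
uint16_t x86_next_selector(uint16_t sel)
{
    assert((sel >> 3) < 0x1FFFu); /* the index must not wrap around */
    return (uint16_t)(sel + 8u);  /* +1 in the index field is +8 in the raw value */
}
```

So 0x10 is index 2 in the GDT at RPL 0, and the next descriptor (index 3) gets selector 0x18.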
The way to correct this - in my opinion - is to integrate paging and segmentation. That is, the upper N bits of the address index the (potentially hierarchical) segment table, and this table defines each segment. Multiple modes could be offered:
  • Direct physical memory mode (i.e. the base contained in the descriptor is used as a physical memory address)
  • Various hierarchical paging modes
I would use the upper two bits of the address to select between various tables:
  • 11: System global segments; kernel administered
  • 10: Application segments; kernel administered
  • 0x: Application segments; application administered
(Application administered segments would function in a slightly different manner in order to maintain system security. You might require they be defined in terms of a kernel-administered segment, for example)
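A rough sketch of how such an address could be decoded, in C. Everything concrete here is an assumption made for illustration: 32-bit addresses, N = 8 (with the two table-select bits counted as part of the N-bit segment index), a descriptor holding only base and size, and all type and function names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SEG_BITS 8u                 /* hypothetical N: upper bits forming the segment index */
#define OFF_BITS (32u - SEG_BITS)   /* remaining low bits are the offset */

typedef enum {
    TABLE_APP_SELF,   /* 0x: application segments, application administered */
    TABLE_APP_KERNEL, /* 10: application segments, kernel administered */
    TABLE_SYSTEM      /* 11: system global segments, kernel administered */
} seg_table;

/* Pick the segment table from the upper two address bits. */
seg_table table_for(uint32_t addr)
{
    switch (addr >> 30) {
    case 3u: return TABLE_SYSTEM;
    case 2u: return TABLE_APP_KERNEL;
    default: return TABLE_APP_SELF;
    }
}

uint32_t segment_index(uint32_t addr)  { return addr >> OFF_BITS; }
uint32_t segment_offset(uint32_t addr) { return addr & ((1u << OFF_BITS) - 1u); }

/* Hypothetical descriptor for the direct physical memory mode. */
typedef struct {
    uint32_t base; /* physical base address */
    uint32_t size; /* segment size in bytes */
} seg_desc;

/* Direct mode: physical = base + offset; returns -1 if the offset is out of bounds. */
int translate_direct(const seg_desc *d, uint32_t addr, uint32_t *phys)
{
    assert(d != NULL && phys != NULL);
    uint32_t off = segment_offset(addr);
    if (off >= d->size)
        return -1;
    *phys = d->base + off;
    return 0;
}
```

With a fixed N the segment index is a plain shift, so the lookup can be cached much like a TLB entry; a hierarchical table would add more levels behind `segment_index`.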

Implementing this efficiently would have similar complexity to implementing paging efficiently.
ErikVikinger
Member
Posts: 30
Joined: Wed Jan 13, 2010 7:59 am
Location: Germany / Nuernberg

Re: Keeping thread TSS and SS0 stack in LDT

Post by ErikVikinger »

Hello,

Owen wrote:The problems I see with most segmentation implementations
What do you mean by "most" segmentation implementations? There is only one in existence, and all of your points apply only to x86. ;)

  1. That's true and a real handicap of x86. In addition, some instructions cannot use the segment registers freely (see movs), which creates high resource pressure (as on an accumulator-centric architecture). In my CPU I will have 16 mostly freely usable segment registers.
  2. That's also true, but in my opinion not a negative.
  3. Segment selectors are not meant for calculations, so this is not a handicap. All segments are completely independent of each other. In my OS I will allocate the selectors (LDT entries) randomly.
Owen wrote:The way to correct this - in my opinion - is to integrate paging and segmentation. That is, the upper N bits of the address index the (potentially hierarchical) segment table, and this table defines each segment.
If N is flexible, then every access to a new address needs a complex lookup; see the problems Itanium's paging has with different/flexible page sizes and its hash-based caches.
In my CPU (which I am developing at the moment) a far pointer has a 16-bit selector in addition to the 32/64-bit offset. One bit of the selector distinguishes between GDT and LDT, and the other 15 bits are an index (every descriptor table can contain 32k entries). All segments of the complete system (all applications and the kernel) share the same linear address space, which is typically identical to the physical address space and additionally contains the peripheral hardware. Paging can be activated independently for each segment (a bit in the descriptor).
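As a rough sketch of that far-pointer format in C: the post does not say which selector bit picks the table, so bit 15 is assumed here, and the descriptor layout (base, size, paging flag) and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SEL_LDT_BIT    0x8000u /* assumption: bit 15 selects the LDT */
#define SEL_INDEX_MASK 0x7FFFu /* remaining 15 bits: index, up to 32768 entries */

typedef struct {
    uint16_t selector;
    uint64_t offset;   /* 32- or 64-bit offset, stored as 64 bits here */
} far_ptr;

/* Hypothetical descriptor: only the fields needed for this sketch. */
typedef struct {
    uint64_t base;  /* base in the shared linear address space */
    uint64_t size;  /* segment size in bytes */
    bool     paged; /* per-segment paging enable bit */
} seg_descriptor;

typedef struct {
    const seg_descriptor *entries;
    uint32_t count; /* at most 32768 */
} desc_table;

uint16_t make_selector(bool ldt, uint16_t index)
{
    assert(index <= SEL_INDEX_MASK);
    return (uint16_t)((ldt ? SEL_LDT_BIT : 0u) | index);
}

bool     sel_is_ldt(uint16_t sel) { return (sel & SEL_LDT_BIT) != 0; }
uint16_t sel_index(uint16_t sel)  { return (uint16_t)(sel & SEL_INDEX_MASK); }

/* Resolve a far pointer to a linear address; returns -1 on a bad index or offset.
 * If the descriptor's paged bit is set, the result would still go through
 * paging, which is not modelled here. */
int far_to_linear(const desc_table *gdt, const desc_table *ldt,
                  far_ptr p, uint64_t *linear)
{
    assert(gdt != NULL && ldt != NULL && linear != NULL);
    const desc_table *t = sel_is_ldt(p.selector) ? ldt : gdt;
    uint16_t i = sel_index(p.selector);
    if (i >= t->count || p.offset >= t->entries[i].size)
        return -1;
    *linear = t->entries[i].base + p.offset;
    return 0;
}
```

For segments with paging switched off, the linear address produced here would already be the physical address.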

The LDT is managed entirely by the kernel; the application can create, resize and delete segments only through syscalls.
Owen wrote:Implementing this efficiently would have similar complexity to implementing paging efficiently.
Yes, but at runtime you can switch off paging and get higher performance (no TLB misses).


Another massive performance boost from segmentation is for IPC: you can easily hand over segments, or parts of them, from one process to another, completely independent of the amount of data (always O(1)). This makes real zero-copy IPC possible across multiple layers (e.g. application -> TCP -> IP -> Ethernet driver -> hardware). With this, the IPC overhead of a microkernel can be small, close to the syscall overhead of a monolithic kernel, while you keep the safety of a microkernel and the sophisticated protection of segmentation.
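To make the O(1) claim concrete, here is a minimal sketch in C of what the kernel side of such a handover might look like. The descriptor layout, the function name and the idea that the caller already holds a free slot in the receiver's LDT are all assumptions for illustration, not a description of the actual design.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor: base and size in the shared linear address space. */
typedef struct {
    uint64_t base;
    uint64_t size;
    bool     present;
} ipc_seg;

/* Give the receiver a window [start, start + len) of the sender's segment by
 * writing one descriptor into a free slot of the receiver's LDT.
 * No payload byte is copied, so the cost does not depend on len.
 * Returns 0 on success, -1 if the request is invalid. */
int grant_subsegment(const ipc_seg *src, uint64_t start, uint64_t len,
                     ipc_seg *dst_slot)
{
    assert(src != NULL && dst_slot != NULL);
    if (!src->present || dst_slot->present)
        return -1;                       /* nothing to share, or slot in use */
    if (start > src->size || len > src->size - start)
        return -1;                       /* window must lie inside the source */
    dst_slot->base    = src->base + start;
    dst_slot->size    = len;
    dst_slot->present = true;
    return 0;
}
```

Each layer in the chain (TCP, IP, driver) could pass the same window on in the same way, so the data itself stays where the application put it.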

berkus wrote:and still no segmentation
Is this an argument?


Greetings
Erik