Segmentation

Posted: Wed Aug 15, 2007 2:32 pm
by pacman
I've created a kernel with paging, and now I'm interested in trying out segmentation (just for the fun of it).

There aren't many segmentation tutorials, and there are a few things I don't quite understand...

What happens when you access memory from a C compiled program? Does the program (compiled C code) manually manage the segment registers or does the CPU/MMU modify them on memory access if a segment is defined in a descriptor table?
In general if you want to use the segmentation mechanism, by having the different segment registers represent segments with different base addresses, you won't be able to use a modern C compiler, and may very well be restricted to just Assembly.
Can anyone tell me why?

Has anyone actually made a multi-segmented memory based kernel in C?

Posted: Wed Aug 15, 2007 3:05 pm
by sancho1980
You cannot do that because C has no language elements that provide for segmentation. The compiler emits code that just assumes that data, code and everything else have the same base address. HLLs like C are there to keep the user away from these nasty things. If you want to do segmentation with high-level elements, try HLA (that's what I'm doing):

http://webster.cs.ucr.edu/AsmTools/HLA/dnld.html

Posted: Thu Aug 16, 2007 4:01 am
by elderK
Segmentation itself isn't a bad, nasty thing. The real-mode 64 KiB limitation, sure, but segmentation itself can be very, very powerful.

~zeii

Posted: Thu Aug 16, 2007 7:33 am
by JamesM
Yet it requires micromanagement of memory addresses: calculating offsets from 4 possible different base locations makes things about 10 times worse for the compiler (as if it doesn't have it bad already!). It also destroys the transparency of the memory space, a property that computer scientists have spent many years designing systems to provide.

Posted: Thu Aug 16, 2007 9:45 am
by pacman
Would this mean that all future applications would also have to be created in ASM/HLA?
The only freely-available C compiler that supports both 32-bit code and multiple segments is Watcom C
I only just found this today. Anyone ever tried/heard of it before?

Posted: Thu Aug 16, 2007 9:58 am
by JamesM
Would this mean that all future applications would also have to be created in ASM/HLA?
Yes, if you needed to use segmentation in user-mode programs.
I only just found this today. Anyone ever tried/heard of it before?
Heard of it, never used it. You could make your own compiler though....

8)

Posted: Thu Aug 16, 2007 10:32 am
by frank
pacman wrote:
The only freely-available C compiler that supports both 32-bit code and multiple segments is Watcom C
I only just found this today. Anyone ever tried/heard of it before?
I'm not so sure that Watcom supports segmentation and 32-bit mode at the same time. It is worth a try though. I have used it before and I did not like it that much. It was just so confusing to use. Of course, I really didn't know what I was doing back then, so maybe it wouldn't be so bad if I tried it again.

Posted: Thu Aug 16, 2007 11:19 am
by Gizmo
You could use segmentation with most compilers if you write an external asm file containing helper functions and call those functions whenever you want to change segments, but this could get tricky. :shock:

I don't personally use this, but a lot of C osdev tutorials use external asm functions to handle anything that can't be handled in plain C.

Posted: Thu Aug 16, 2007 11:20 am
by jnc100
CC386?
Never tried it though. Personally, I think segmentation is a bad idea, although you could switch segments with inline asm in gcc.

Regards,
John.

Posted: Thu Aug 16, 2007 11:38 am
by JAAman
I'm not so sure that Watcom supports segmentation and 32-bit mode
it used to -- though i thought i heard that they abandoned segmentation - idk, maybe im wrong about that, never used it myself
Personally, I think segmentation is a bad idea
i cant agree enough -- everything useful in segmentation, can be done in paging alone, and segmentation with paging is much more complicated (and segmentation without paging is much harder)
although you could switch segments with inline asm in gcc
the biggest problem with using segmentation with gcc is not changing segments (that's easily handled through ASM), it's the fact that GCC assumes that ds.base == ss.base == es.base (and in some cases CS also)

GCC will use DS/SS/ES interchangeably -- referencing the same address with different segment registers, meaning your program will probably crash if the hidden portion of DS differs from that of SS and ES
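
A minimal sketch of the failure mode described above, assuming a deliberately simplified model in which a segment register's hidden part is just a base address. The names `seg_t`, `linear` and `pointer_to_local_is_valid` are hypothetical, chosen only for illustration. It models a pointer to a local variable whose offset is computed relative to SS but is later dereferenced through DS:

```c
#include <stdint.h>

/* Hypothetical model of a segment register's hidden part: only a base. */
typedef struct {
    uint32_t base;   /* linear address where the segment starts */
} seg_t;

/* Linear address = segment base + offset (wraps at 32 bits). */
uint32_t linear(seg_t seg, uint32_t offset)
{
    return seg.base + offset;
}

/*
 * Models what flat-model compiled code does with a pointer to a local:
 * the offset is taken relative to SS, but the dereference uses DS.
 * Returns 1 if both segment registers reach the same linear address,
 * 0 if the dereference would hit the wrong memory.
 */
int pointer_to_local_is_valid(seg_t ds, seg_t ss, uint32_t stack_offset)
{
    return linear(ss, stack_offset) == linear(ds, stack_offset);
}
```

With both bases equal (the flat model) the check passes for every offset; with, say, a DS base of 0x100000 and an SS base of 0x200000 the same offset lands in two different places, which is the crash scenario in the post.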

Posted: Thu Aug 16, 2007 11:56 am
by pacman
I just need to make sure what I'm doing will work 8)

From what I've been reading in the past few hours, I figured that I could access data in different segments with far/huge pointers (I may be entirely wrong) and that you don't need anything special to access data in the same segment (which is probably why flat memory models don't need the segment registers to change).
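
As a rough illustration of the far-pointer idea, here is a sketch that resolves a selector:offset pair against a descriptor table. It assumes a simplified descriptor with only a byte-granular base and limit plus a present flag. `descriptor_t`, `far_ptr_t` and `far_to_linear` are hypothetical names, not any compiler's actual far-pointer support:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified descriptor: base, byte-granular limit, present. */
typedef struct {
    uint32_t base;     /* linear start of the segment */
    uint32_t limit;    /* highest valid offset within the segment */
    int      present;  /* nonzero if the entry is usable */
} descriptor_t;

/* A far pointer names a segment (by selector) plus an offset inside it. */
typedef struct {
    uint16_t selector;
    uint32_t offset;
} far_ptr_t;

/* Bits 3..15 of a selector hold the table index (bit 2 is TI, bits 0..1 RPL). */
unsigned selector_index(uint16_t selector)
{
    return selector >> 3;
}

/*
 * Resolve a far pointer against a descriptor table.
 * Returns 0 and stores the linear address on success, or -1 where real
 * hardware would fault (bad index, segment not present, offset past limit).
 */
int far_to_linear(const descriptor_t *table, size_t entries,
                  far_ptr_t p, uint32_t *linear_out)
{
    unsigned idx = selector_index(p.selector);

    if (idx >= entries || !table[idx].present)
        return -1;
    if (p.offset > table[idx].limit)
        return -1;

    *linear_out = table[idx].base + p.offset;
    return 0;
}
```

A near pointer, by contrast, is just the offset: the segment is whatever the current DS already holds, which is why same-segment accesses need nothing special.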

Here's a run-down of what my code currently does, or will do in the future if my theories are right :wink: . I'm not sure if it's correct but anyway....

When loading an app...
Creates a code segment and a data segment for the app (and possibly a stack).

When dynamically allocating memory...
Creates a data segment the size of the memory to allocate.
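
For creating a data segment sized to an allocation, a sketch of packing a 32-bit protected-mode segment descriptor might look like the following. The function and macro names (`make_descriptor`, `make_data_segment`, `ACCESS_DATA_RW_RING3`, `FLAG_32BIT`, `FLAG_GRAN_4K`) are hypothetical, and the choice of a ring 3 read/write data segment is an assumption made for illustration; only the 8-byte bit layout comes from the x86 architecture:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed access byte: present, DPL 3, data segment, writable. */
#define ACCESS_DATA_RW_RING3 0xF2u
/* Flag nibble bits: D/B (32-bit segment) and G (limit in 4 KiB units). */
#define FLAG_32BIT   0x4u
#define FLAG_GRAN_4K 0x8u

/* Pack base, 20-bit limit, access byte and flag nibble into a descriptor. */
uint64_t make_descriptor(uint32_t base, uint32_t limit,
                         uint8_t access, uint8_t flags)
{
    assert(limit <= 0xFFFFF);   /* the limit field is only 20 bits wide */
    assert(flags <= 0xF);

    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFF);            /* limit 15:0  -> bits 0..15  */
    d |= (uint64_t)(base & 0xFFFFFF) << 16;     /* base 23:0   -> bits 16..39 */
    d |= (uint64_t)access << 40;                /* access byte -> bits 40..47 */
    d |= (uint64_t)((limit >> 16) & 0xF) << 48; /* limit 19:16 -> bits 48..51 */
    d |= (uint64_t)(flags & 0xF) << 52;         /* flags       -> bits 52..55 */
    d |= (uint64_t)(base >> 24) << 56;          /* base 31:24  -> bits 56..63 */
    return d;
}

/* Build a data segment covering `size` bytes starting at `base`. */
uint64_t make_data_segment(uint32_t base, uint32_t size)
{
    assert(size > 0);

    /* Up to 1 MiB fits the 20-bit limit with byte granularity. */
    if (size <= 0x100000)
        return make_descriptor(base, size - 1,
                               ACCESS_DATA_RW_RING3, FLAG_32BIT);

    /* Larger segments: round up to whole 4 KiB pages and set G. */
    uint32_t pages = (size >> 12) + ((size & 0xFFF) != 0);
    return make_descriptor(base, pages - 1,
                           ACCESS_DATA_RW_RING3, FLAG_32BIT | FLAG_GRAN_4K);
}
```

As a sanity check on the layout, `make_descriptor(0, 0xFFFFF, 0x9A, 0xC)` yields 0x00CF9A000000FFFF, the flat ring 0 code descriptor seen in many tutorials. Note that with 4 KiB granularity the segment can end up slightly larger than the requested allocation.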

So would everything work if I use C with far/huge pointers for dynamically allocated memory, and treat everything else as if segmentation just didn't exist?

Sorry if I sound really ignorant about all of this :D (because I am!)

EDIT:

i cant agree enough -- everything useful in segmentation, can be done in paging alone, and segmentation with paging is much more complicated (and segmentation without paging is much harder)
I agree as well - I've already written a paging based kernel and just want to try segmentation for educational purposes and for the fun of it.

Posted: Thu Aug 16, 2007 12:08 pm
by JAAman
the only problem with this is your compiler -- if you use GCC (and most other compilers), then when referencing data, the compiler will sometimes use DS, and other times use SS (and sometimes use ES), assuming that the same offset to all three will point to the same memory location, so just be careful of this issue


the other problem is it isn't compatible with LMode -- though you may have no intention of supporting LMode anyway

Posted: Thu Aug 16, 2007 12:17 pm
by pacman
when referencing data, the compiler will sometimes use DS, and other times use SS (and sometimes use ES), assuming that the same offset to all three will point to the same memory location, so just be careful of this issue
I could probably find ways around this problem. The first thing that comes to mind is using a single data segment per task. It shouldn't be any different to using multiple data segments, except that the OS may have to do more work sorting out memory fragmentation and managing the free space (and the fact that C compiled code will work!).
the other problem is it isnt compatible with LMode -- though you may have no intention of supporting LMode anyway
I don't even know what it is...

EDIT: What does GCC do with FS and GS?

Posted: Thu Aug 16, 2007 12:36 pm
by elderK
;) Jeez, way to think outside the box.

~Z

Posted: Thu Aug 16, 2007 12:43 pm
by JAAman
LMode (long mode) is the 64-bit extension to x86, used by the 64-bit versions of existing OSs. it doesn't support segmentation (apart from the FS and GS base addresses), for the reasons i gave before: it doesn't make sense practically. a segmented design also won't be portable to alternative CPUs, as most CPUs don't have any segmentation


using a single data segment per process can help (as long as that segment is also used for both SS and ES) -- but it does make managing fragmentation your biggest issue: segments must be contiguous, so this could get very complicated. adding paging can help. segmentation is applied first, so the output of segmentation (the linear address) still goes through the page tables. that means the linear range a segment occupies can be remapped to different physical memory just by editing the page tables, allowing you to alter the segments without having to actually move the data
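
The two-stage translation can be sketched as follows, assuming a tiny single-level page table rather than the real two-level x86 structure. `page_table_t`, `seg_to_linear`, `linear_to_physical` and `translate` are hypothetical names used only for this illustration:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u
#define NUM_PAGES 16u   /* tiny hypothetical linear address space */

/* Hypothetical single-level table: linear page number -> physical frame. */
typedef struct {
    uint32_t frame[NUM_PAGES];
} page_table_t;

/* Stage 1, segmentation: logical (segment base + offset) -> linear. */
uint32_t seg_to_linear(uint32_t seg_base, uint32_t offset)
{
    return seg_base + offset;
}

/* Stage 2, paging: linear -> physical through the page table. */
uint32_t linear_to_physical(const page_table_t *pt, uint32_t linear)
{
    uint32_t page = linear / PAGE_SIZE;
    assert(page < NUM_PAGES);   /* outside the modelled address space */
    return pt->frame[page] * PAGE_SIZE + linear % PAGE_SIZE;
}

/* Full translation: segmentation first, then paging. */
uint32_t translate(const page_table_t *pt, uint32_t seg_base, uint32_t offset)
{
    return linear_to_physical(pt, seg_to_linear(seg_base, offset));
}
```

Changing `frame[2]` from one frame number to another changes where a segment based at linear 0x2000 lives physically, while the segment base and every offset the program uses stay the same. A real kernel would still have to copy or swap the page contents if the old data must survive the remap.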