How do I make the connection between code and a GDT segment?

rknox · Post by **rknox** » Sun May 30, 2010 8:10 pm

I’m working in Fedora using GCC and LD.

Examining a dump and map of one executable, I see that the linker has put all the data references into the segment with selector 0x10, with the code at 0x08.

For example, setting the stack pointer in that program uses the instruction 'BC address', where B is the one-byte opcode that moves immediate data to a register, C says the register is SP, and the address has 0x10 as its top byte and the correct offset as its low 3 bytes.

I see that another program has the code at 0x10 and the data at 0x18.

Both are C programs compiled with gcc and linked with ld.

MAYBE - asking the question prompted an idea: maybe the section command takes arguments that tell the assembler which segment register to use. I'll go look.

I am unable to find a linker switch or script command that controls this.

I see that in C, global data goes into sections bss or data or COMMON depending on how it is or is not initialized. In assembler of course I can put data into any section I want.

I can set up GDT segments as I choose and assign them to segment registers, no problem.

I don’t see how to place, for example, .bss in the segment whose offset in the GDT is, for example, 0x18. Right now the linker is placing it at 0x10. Even if I find out which segment register is used for .bss and set that register to 0x18, unless I’m wrong the linker doesn’t know about that and will still put it at 0x10.

The two programs differ in this respect. One is identified in the linker script as architecture i386, the other as i686. I see that the architecture determines which routines are used from a library (I forget the name right now; three letters, maybe starting with D). Does this control which segment register is used? If so, how do I find that library? I’m new to Intel and almost a virgin regarding Unix and its children.

I'm sure there is a way to use the remaining slots in the GDT, and I can load them and assign them to registers, but I don't see how to make the connection to code or data.

Hope one of you can lead me out of my darkness - ****

~ · Post by ~ » Sun May 30, 2010 8:50 pm

I don't know if it helps, but you can load and pass GDT selectors using general-purpose registers.

Assume that you have properly configured three GDT descriptors, at offsets 8, 16 and 24 (decimal) of the GDT.

Assume that selector 8 is a 4GB code selector, 16 is a 4GB data selector and 24 is also a 4GB data selector.

You could assign the stack segment, the default data segment and the extra data segment to selector 16 with something like this, in NASM syntax:

Code: Select all

mov ax,16   ;load 16 into AX
mov ss,ax   ;load 16 into SS
mov ds,ax   ;load 16 into DS
mov es,ax   ;load 16 into ES

And you could assign selector 24 to, say, the FS and GS data segment registers:

Code: Select all

mov ax,24   ;load 24 into AX
mov fs,ax   ;load 24 into FS
mov gs,ax   ;load 24 into GS

For GCC inline assembly, you would use AT&T syntax, which if I remember correctly would look something like this for the previous code:

Code: Select all

movw $16,%ax   # load 16 into AX (GNU as uses '#' for comments on x86; ';' separates statements)
movw %ax,%ss   # now load 16 into SS...
movw %ax,%ds   # and into DS...
movw %ax,%es   # and into ES

movw $24,%ax   # load 24 into AX
movw %ax,%fs   # now load 24 into FS...
movw %ax,%gs   # and into GS
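
Wrapped up for use from C, the same idea might look like the sketch below. The helper names `make_selector` and `load_data_segments` are made up for illustration, and the loader is only meaningful in ring-0 code running with a GDT that actually contains the descriptor; it is defined here but never called. The selector layout (index in bits 15..3, table indicator in bit 2, requested privilege level in bits 1..0) is why GDT byte offsets 8, 16 and 24 double as selector values.

```c
#include <assert.h>
#include <stdint.h>

/* Build a segment selector from a descriptor index.
   bits 15..3 = index, bit 2 = table (0 = GDT, 1 = LDT), bits 1..0 = RPL. */
static inline uint16_t make_selector(unsigned index, unsigned use_ldt, unsigned rpl)
{
    assert(index < 8192u && rpl <= 3u);
    return (uint16_t)((index << 3) | ((use_ldt & 1u) << 2) | rpl);
}

#if defined(__i386__) || defined(__x86_64__)
/* Hypothetical helper: point DS, ES and SS at one data selector.
   Assumes kernel context; calling this from an ordinary user program
   with an arbitrary selector will fault. */
static inline void load_data_segments(uint16_t sel)
{
    __asm__ volatile (
        "movw %0, %%ds\n\t"
        "movw %0, %%es\n\t"
        "movw %0, %%ss\n\t"
        :                /* no outputs */
        : "r"(sel)       /* selector in any 16-bit general-purpose register */
        : "memory");
}
#endif
```

With this, `make_selector(2, 0, 0)` gives 16 (0x10) and `make_selector(3, 0, 0)` gives 24 (0x18), matching the offsets used above.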

gerryg400 · Post by **gerryg400** » Sun May 30, 2010 9:11 pm

The answer to your original question is, I think, that you can't. It is the OS that makes the connection between code and GDT segment. Many OS loaders only really use 2 GDT entries for user programmes: one for code (text, rodata etc.) and one for data (data and bss). Usually the gcc toolchain (ld mainly) is configured to do this for the OS.

Now if you're writing your own OS (I guess you are) you probably want to build a custom toolchain and configure it to do what you want.

Having said all that, I don't think you can get gcc to generate code that has the data and bss in separate segments unless the segments are identical with respect to base address. Specifically, you can't have a variable in the data segment at, say, address 0x1234 in its segment and a variable in the bss at address 0x1234 in another segment. gcc would need to emit code to change DS when accessing data and bss, or would have to use, say, DS for data and FS for bss. It just won't do that.
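
To make the base-address point concrete, here is a tiny model of how the CPU turns a segment plus an offset into a linear address. The `struct segment` type and the two base values in the usage note are invented for illustration; a C pointer holds only the offset part, which is why the compiler needs every segment it uses implicitly to share one base.

```c
#include <stdint.h>

/* Minimal model of a segment: only the base matters for this illustration. */
struct segment {
    uint32_t base;   /* linear address where the segment starts */
};

/* The CPU forms a linear address as segment base + offset
   (32-bit arithmetic, so it wraps around at 4 GiB). */
static inline uint32_t linear_address(struct segment seg, uint32_t offset)
{
    return seg.base + offset;
}
```

If .data lived in a segment with base 0x00100000 and .bss in one with base 0x00200000 (both hypothetical), offset 0x1234 would mean 0x00101234 in one and 0x00201234 in the other, and a plain pointer value of 0x1234 could not say which was meant.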

rknox · Post by **rknox** » Sun May 30, 2010 9:29 pm

I'm new here so I don't fully understand how to use this forum. In response to the first comment: you tell me how to associate a register with a GDT segment, but how do I associate some code with that register? I thought that maybe the section command in the assembler would do it, but all it does is associate the code with a section. I thought maybe there was an ld command that associated a section with a segment register, but I can't find such a command.

And then I'm puzzled, since one program I have uses 0x08 and 0x10 while the other uses 0x10 and 0x18.

I'm writing an operating system and want my data less protected than the code. Code is at privilege level 0, but I want to put data at level 3, since the Intel book says that when you modify data from an interrupt routine, the code should be at 0 and the data at 3. But the system complains when I try to put my stack there, so I wanted to have one segment for the stack and another for the data. I can do this, but the compiler wants to put stack and data at 0x10, per my dump and map.

I will continue to struggle - for every problem there is a workaround. This reminds me of the old days when every code generation tool had a list of "undocumented features" (i.e. bugs) that you had to deduce, but should not use since the next release might fix them.

gerryg400 · Post by **gerryg400** » Sun May 30, 2010 10:09 pm

If you are writing an OS, it is up to you to associate the code, data etc from the executable file with the segments when you load the OS. If you use Grub as a loader you are stuck with whatever it does.

Basically what you do is define the segments you want by loading values into the GDT. You seem to know how to do this part. Then you simply copy your code and data into the segments required and away you go.
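
For the "loading values into the GDT" step, a descriptor can be packed from C along these lines. This is only a sketch: `gdt_entry` is a made-up name, the access bytes 0x9A and 0x92 are the usual ring-0 code and data values, and flags 0xC means 4 KiB granularity with 32-bit default operand size. Building the table and executing `lgdt` are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a legacy 8-byte segment descriptor.
   access: present bit, DPL, S bit and type (0x9A = ring-0 code, 0x92 = ring-0 data)
   flags:  upper nibble of byte 6 (0xC = 4 KiB granularity, 32-bit segment) */
static inline uint64_t gdt_entry(uint32_t base, uint32_t limit,
                                 uint8_t access, uint8_t flags)
{
    assert(limit <= 0xFFFFFu);                       /* the limit field is 20 bits */

    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);                /* limit 15:0  -> bits 0..15  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;         /* base 23:0   -> bits 16..39 */
    d |= (uint64_t)access << 40;                     /* access byte -> bits 40..47 */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;     /* limit 19:16 -> bits 48..51 */
    d |= (uint64_t)(flags & 0xFu) << 52;             /* flags       -> bits 52..55 */
    d |= (uint64_t)(base >> 24) << 56;               /* base 31:24  -> bits 56..63 */
    return d;
}
```

A flat 4GB code segment is then `gdt_entry(0, 0xFFFFF, 0x9A, 0xC)` and the matching data segment is `gdt_entry(0, 0xFFFFF, 0x92, 0xC)`. Which slot of the table each one goes into is what decides whether it is reached through selector 0x08, 0x10 or 0x18.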

Now you seem to be asking another question about privilege level. I've not seen that part of the Intel manual but it doesn't seem correct. All logic says that kernel code and data should be at ring 0. Furthermore, the stack is data, so it should be part of the data segment in most cases. Again, gcc won't like it if the stack and data have different base addresses. The original reason for separate data and stack segments was not to separate the data and stack but rather to allow more memory back when segments were only 64k. Back then compilers had to generate code with lots of segment override prefixes. They don't do that anymore.

Regarding being new to the forum, it seems you are doing everything right. Keep researching and asking questions.

Kenny · Post by **Kenny** » Mon May 31, 2010 5:51 am

From the processor's point of view, you can happily have more than one segment set up for code. When your code wants to jump from the code in one segment to the code in another segment, it just needs to execute a "far jump", which automatically reloads the Code Segment register (CS) with the new required value. It would look something like this:

Code: Select all

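; Segment 0x10, starting at offset 0
FunctionA:
	nop
	nop
	nop
	jmp 0x18:FunctionB	; far jump: reloads CS with selector 0x18

Code: Select all

; Segment 0x18, starting at offset 0 (GDT selectors are multiples of 8, so 0x18 follows 0x10)
FunctionB:
	nop
	nop
	nop
	jmp 0x10:FunctionA	; far jump back: reloads CS with selector 0x10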

This side of it is no problem, the bigger problem is that, AFAIK, tools such as GCC have no concept of this and would not know how to produce these jumps.

Yes, you could easily use a linker script to separate the code out into different sections in the executable file, so that FunctionA was in .text and FunctionB was in something else, maybe .text16. Yes, you could then take this executable file and run it through your OS's executable loader, which would detect the multiple executable sections and load each one into its own executable segment. The problem, as I said, is that you would need some way to let the compiler know that when you call a function, it has to execute a far jump to whatever segment that function was going to be loaded into. (I guess inline assembly might give you a way to test the proof of concept, but it won't be pretty.)
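
The "separate sections" half can also be done straight from C with GCC's `section` attribute, roughly as sketched here. The `IN_SECTION` macro and the lowercase function names are invented for this example, and the attribute is only applied on ELF targets. Note what it does not do: the call from `function_a` is still an ordinary near call, because the compiler has no idea the two sections might end up in different segments.

```c
#if defined(__GNUC__) && defined(__ELF__)
/* Place a function in a named output section and keep it out of line. */
#define IN_SECTION(name) __attribute__((section(name), noinline))
#else
#define IN_SECTION(name)   /* other toolchains: leave it in the default section */
#endif

/* Lands in .text16 (the section name borrowed from the post above). */
IN_SECTION(".text16") int function_b(int x)
{
    return x + 1;
}

/* Stays in .text; the call below is compiled as a plain near call. */
int function_a(int x)
{
    return function_b(x) * 2;
}
```

Running `objdump -h` on the resulting object file should list `.text16` next to `.text`. This works in a normal flat process precisely because both sections share one code segment.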

For completeness, as far as the data goes, a similar problem arises. The processor is quite happy to have multiple segments for data, and you can easily select them using the segment registers, like so.

Code: Select all

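	mov bx, 0x40
	mov ds, bx		; segment registers cannot be loaded from an immediate
	mov esi, HelloString	; 32-bit code assumed, where lodsb reads from DS:ESI
	lodsb			; Load the 'H' from 'Hello World'

	mov bx, 0x48
	mov ds, bx
	mov esi, TeapotString
	lodsb			; Load the 'I' from 'I'm a little Teapot'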

Code: Select all

; Segment 0x40, starting at offset 0
HelloString:
	db "Hello World!", 0

Code: Select all

; Segment 0x48, starting at offset 0
TeapotString:
	db "I'm a little Teapot", 0

Separating these out with a linker script is again trivial, but getting the compiler to generate code to switch to the appropriate segment before accessing the particular value ... I have no idea how you would start.

Feel free to slap me down if this is very far off the mark of what you were asking

rknox · Post by **rknox** » Mon May 31, 2010 9:01 am

Your responses have been very helpful. I now need a few days/weeks? of heavy reading to better understand and digest, after which I will thank you individually. I was confused about the privilege levels: on rereading Vol 3a Chapter 6.12.1.1 I believe that it only applies when the trap handler itself starts at a user level.

I need to understand how the op sys and its clients can be written in a compiler that does not allow control of the segment. We can't have op sys and client code in the same seg! I suspect it is in the task handling features, which I still have not digested.

Again - thanks

Kenny · Post by **Kenny** » Mon May 31, 2010 9:47 am

Don't think I'm trying to put you off here (I'm really not; if we always wanted to take the easy way out, we wouldn't be writing OSes), but I would be remiss if I didn't point out that this would be a fairly unusual way of doing it. Most OSes use virtual memory to separate the memory spaces of the kernel and the different tasks, and don't make heavy use of segmentation.

With virtual memory, we basically get to "renumber" the physical memory without needing to alter the code we're loading at all. The processor is initially configured with a page table (a mapping) for the kernel. As each new task is created, it gets a new page table of its own. As the process handling code of the kernel switches from one task to another, one of the things it does is to load the new page table, changing the processor's "view" of the physical memory to be the one suitable for the task that is about to run.

TaskA cannot interact with the memory of TaskB because there isn't any address in the virtual memory space of TaskA that is mapped to the physical memory that contains any of the data for TaskB.
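
As a toy illustration of that isolation, here is a single-level "page table" modelled in plain C. Everything in it (`struct address_space`, `translate`, the four-page address space) is invented for the example, and real x86 paging uses multi-level tables walked by the hardware rather than a lookup function.

```c
#include <stdint.h>

#define PAGE_SIZE   4096u
#define NUM_PAGES   4u            /* tiny address space, for illustration only */
#define NOT_MAPPED  UINT32_MAX

/* One simplified page table per task: virtual page number -> physical frame. */
struct address_space {
    uint32_t frame[NUM_PAGES];
};

/* Translate a virtual address to a physical one. Returns NOT_MAPPED in the
   cases where real hardware would raise a page fault. */
static inline uint32_t translate(const struct address_space *as, uint32_t vaddr)
{
    uint32_t page = vaddr / PAGE_SIZE;
    if (page >= NUM_PAGES || as->frame[page] == NOT_MAPPED)
        return NOT_MAPPED;
    return as->frame[page] * PAGE_SIZE + vaddr % PAGE_SIZE;
}
```

If TaskA maps its virtual page 0 to frame 10 and TaskB maps its virtual page 0 to frame 20, the same virtual address lands in different physical memory for each task, and no address in TaskA's table leads to frame 20 at all.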

There are many good tutorials and documents about virtual memory, so I shall say no more about it.

Good luck

OSDev.org
