How GDT fits into the picture? Explain through code? O_o

vargadanis · Post by **vargadanis** » Thu Feb 24, 2011 5:34 pm

Hi,

I am interested in how and why this ASM code works the way it does, and how the GDT fits into the picture. I have written a very simple C application to see what the generated ASM code would look like.
With syntax highlighting: http://paste.pocoo.org/show/344118/

Code: Select all

#ASM.C :
#
#void main() {
#  int a, b;
#  a = 100;
#  b = a + 5;
#}

	.file	"asm.c"
	.text
.globl main
	.type	main, @function
main:
	pushl	%ebp                         # push ebp to stack - back it up?
	movl	%esp, %ebp                    # copy stack pointer to base pointer - where does the stack pointer point now?
                                        # what's the connection with GDT?
	subl	$16, %esp                     # decrement stack pointer by 16, allocate memory for local vars, 2x8byte (shorts?)
	movl	$100, -4(%ebp)                # copy literal 100 to -4 base pointer (-4 relative offset of base pointer)
	movl	-4(%ebp), %eax                # copy -4 offset to eax generic register
	addl	$5, %eax                      # add literal 5 to eax register
	movl	%eax, -8(%ebp)                # store the result of addition to -8 relative offset of ebp
	leave				      # what does this line and ret do to the stack? pop?
	ret
	.size	main, .-main
	.ident	"GCC: (Ubuntu/Linaro 4.4.4-14ubuntu5) 4.4.5"
	.section	.note.GNU-stack,"",@progbits

I have a couple of questions...
The Intel manual says:

all memory accesses pass through either the global descriptor table (GDT) or an optional local descriptor table (LDT)

That is simple... Segment descriptor = an entry in the GDT tables... each descriptor has a segment selector that does some technoblah blah I have no idea about O_o

To access a byte in a segment, a segment selector and an offset must be supplied. The segment
selector provides access to the segment descriptor for the segment (in the GDT or LDT). From
the segment descriptor, the processor obtains the base address of the segment in the linear
address space. The offset then provides the location of the byte relative to the base address.

That doesn't make much sense to me. So when I run this application, things happen in the kernel that I do not quite understand. Could someone tell me what they are? How does the GDT fit into the picture?

bewing · Post by **bewing** » Thu Feb 24, 2011 6:15 pm

The simple answer is that the GDT plays almost no visible role in modern programming. The GDT holds the "bases" and "limits" that get loaded into the segment registers. In modern programming, the bases are always set to 0, and the limits are always set to 4 GB (or "as big as possible"). This is called a "flat address space", and it basically means that segmentation is not doing any real work, even though a GDT still has to exist in protected mode.
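
As a rough illustration of the "base" and "limit" idea, here is a minimal C sketch. The `struct segment` type and the `segment_translate` helper are hypothetical names invented for this example, and the model is deliberately simplified: it ignores access size, expand-down segments and privilege checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a segment: just a base and a byte limit. */
struct segment {
    uint32_t base;   /* linear address where the segment starts */
    uint32_t limit;  /* highest valid offset inside the segment */
};

/* The flat model: base 0, limit as large as possible. */
static const struct segment FLAT_SEGMENT = { 0x00000000u, 0xFFFFFFFFu };

/* Writes base + offset to *linear and returns true if the offset is
 * within the limit; returns false otherwise (real hardware would fault). */
static bool segment_translate(const struct segment *seg, uint32_t offset,
                              uint32_t *linear)
{
    assert(seg != NULL && linear != NULL);
    if (offset > seg->limit)
        return false;
    *linear = seg->base + offset;
    return true;
}
```

With `FLAT_SEGMENT`, every offset passes the limit check and the linear address equals the offset, which is why segmentation seems to disappear in a flat address space.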

To understand your code disassembly, you need to understand the concept of a "stack frame". EBP is a special register used for stack frames. It exists for the purpose of keeping track of where the stack was pointing when a function was entered. Whenever you enter a function, the first thing you do is PUSH the old value of EBP (which belongs to the parent function), then copy the current value of the stack pointer into EBP. All the arguments to a function are on the stack just above the stack frame. Any "automatic variables" (that are allocated on the stack) are allocated at known offsets just below EBP. This is why it is convenient to have a stack frame -- it serves as a base pointer for your local variables and arguments. So, as you examine disassemblies, you will find that functions typically start with "PUSH EBP ; MOV EBP, ESP" (in Intel syntax).

After the stack frame is set up, then (as you say) storage for local variables gets set up. Then the actual code for the function happens (in your case, some math). In your case, you can see that the local variable "a" is stored in stack memory at (stack frame pointer - 4), and "b" is stored at (stack frame pointer - 8). The relationship between your math in C, and the math in ASM is obvious.

At the end of the function, any return value is stored in EAX. (your function is void)
Then the "stack frame" needs to be undone, to allow the function to return. In your case, this consists of copying the stack frame pointer back into ESP, doing a POP EBP (to restore the stack frame pointer of the parent function), and then a RET. The LEAVE instruction is a shorthand opcode for undoing a stack frame. It does both steps -- copying EBP to ESP, and doing the POP of the old value of EBP.
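
To make the frame setup and teardown concrete, here is a small C simulation of the instructions in the listing. It is only a sketch: `struct toy_cpu` and its helpers are hypothetical names, and the "stack" is a plain array addressed by byte offsets from its start rather than by real addresses.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64

/* Hypothetical toy machine: a small stack plus the two registers involved. */
struct toy_cpu {
    uint32_t stack[STACK_WORDS];
    uint32_t esp;   /* byte offset into stack[], grows downward */
    uint32_t ebp;
};

static void push32(struct toy_cpu *cpu, uint32_t value)
{
    assert(cpu->esp >= 4);
    cpu->esp -= 4;                         /* make room first ...       */
    cpu->stack[cpu->esp / 4] = value;      /* ... then store at new ESP */
}

static uint32_t pop32(struct toy_cpu *cpu)
{
    assert(cpu->esp + 4 <= STACK_WORDS * 4);
    uint32_t value = cpu->stack[cpu->esp / 4];
    cpu->esp += 4;
    return value;
}

/* pushl %ebp ; movl %esp, %ebp ; subl $locals, %esp */
static void prologue(struct toy_cpu *cpu, uint32_t locals)
{
    push32(cpu, cpu->ebp);
    cpu->ebp = cpu->esp;
    assert(cpu->esp >= locals);
    cpu->esp -= locals;
}

/* leave is equivalent to: movl %ebp, %esp ; popl %ebp */
static void leave_frame(struct toy_cpu *cpu)
{
    cpu->esp = cpu->ebp;
    cpu->ebp = pop32(cpu);
}

/* Replays the body of main() from the listing; returns the value stored in b. */
static uint32_t run_example(struct toy_cpu *cpu)
{
    prologue(cpu, 16);
    cpu->stack[(cpu->ebp - 4) / 4] = 100;            /* movl $100, -4(%ebp) */
    uint32_t eax = cpu->stack[(cpu->ebp - 4) / 4];   /* movl -4(%ebp), %eax */
    eax += 5;                                        /* addl $5, %eax       */
    cpu->stack[(cpu->ebp - 8) / 4] = eax;            /* movl %eax, -8(%ebp) */
    uint32_t b = cpu->stack[(cpu->ebp - 8) / 4];
    leave_frame(cpu);
    return b;
}
```

After `run_example` returns, both `esp` and `ebp` hold the values they had on entry, which is exactly what the caller relies on when RET executes.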

a5498828 · Post by **a5498828** » Thu Feb 24, 2011 6:40 pm

AFAIR someone flamed me in one thread for not being an "advanced enough" programmer. I just wonder if that's going to happen to you here as well.

Code: Select all

pushl   %ebp                         # push ebp to stack - back it up?

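Yes. It backs up the caller's frame pointer, so EBP can then be used to address both the arguments and the local variables. You would not want to address them through ESP, which changes with every push and pop.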

Code: Select all

 movl   %esp, %ebp                    # copy stack pointer to base pointer - where does the stack pointer point now?

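It points at the saved EBP (ESP points to the lowest byte of the saved value), and after this instruction EBP points there too. A push is equivalent to subtracting 4 from ESP and then storing the operand at the address ESP now holds.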

Code: Select all

                                  # what's the connection with GDT?

What GDT? It's just a stack frame.

Code: Select all

 subl   $16, %esp                     # decrement stack pointer by 16, allocate memory for local vars, 2x8byte (shorts?)

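Yes, 16 bytes of local storage. Note that the two ints only need 4 bytes each (they live at -4 and -8); the compiler reserved more than that, most likely to keep the stack aligned.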

Code: Select all

   movl   $100, -4(%ebp)                # copy literal 100 to -4 base pointer (-4 relative offset of base pointer)

In ASM a literal operand like this is called an immediate.

Code: Select all

movl   -4(%ebp), %eax                # copy -4 offset to eax generic register

I don't know this syntax well enough, but I believe it's equivalent to mov eax, [ebp - 4]. So it copies the data at ss:ebp-4 into eax. Copying the offset itself would be done with lea.

Code: Select all

   addl   $5, %eax                      # add literal 5 to eax register

yes

Code: Select all

 movl   %eax, -8(%ebp)                # store the result of addition to -8 relative offset of ebp

yes

Code: Select all

 leave                  # what does this line and ret do to the stack? pop?

Don't use it. Don't use enter or leave. They are missing from the 8086 (they first appeared on the 80186). leave = mov esp, ebp followed by pop ebp. Remember that the prologue used mov ebp, esp, which saved esp in ebp; this reverses it.

That is simple... Segment descriptor = an entry in the GDT tables... each descriptor has a segment selector that does some technoblah blah I have no idea about O_o

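If that's simple, why post a problem related to it?
Each descriptor is indexed in the GDT/LDT, and the index is carried in the segment selector. Each descriptor is 8 bytes, so the selector values (in hex, with the low three bits clear) are 0, 8, 10, 18, 20, etc. Selector 0 is a special one: you can't load it into the stack or code segment register, and you can't access memory through DS or ES (or FS and GS) while they hold 0.
The linear address is obtained by adding the segment descriptor's base to the offset.
The base is either 24 bits (286) or 32 bits (386 and later). In long mode it is treated as 0 for CS, DS, ES and SS; FS and GS can still have a non-zero base.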
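
To show how the base and limit are scattered across the 8 bytes of a descriptor, here is a hedged C sketch. The function names are made up for this example; the bit positions follow the protected-mode segment descriptor layout in the Intel manual, with the access byte and the 4-bit flags nibble passed through as opaque values.

```c
#include <assert.h>
#include <stdint.h>

/* Packs a segment descriptor into its 8-byte form (as a 64-bit integer). */
static uint64_t descriptor_pack(uint32_t base, uint32_t limit20,
                                uint8_t access, uint8_t flags)
{
    assert(limit20 <= 0xFFFFFu && flags <= 0xFu);
    uint64_t d = 0;
    d |= (uint64_t)(limit20 & 0xFFFFu);              /* limit 15:0  -> bits 0-15  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;         /* base 23:0   -> bits 16-39 */
    d |= (uint64_t)access << 40;                     /* access byte -> bits 40-47 */
    d |= (uint64_t)((limit20 >> 16) & 0xFu) << 48;   /* limit 19:16 -> bits 48-51 */
    d |= (uint64_t)flags << 52;                      /* flags       -> bits 52-55 */
    d |= (uint64_t)(base >> 24) << 56;               /* base 31:24  -> bits 56-63 */
    return d;
}

/* Reassembles the 32-bit base from its two pieces. */
static uint32_t descriptor_base(uint64_t d)
{
    return (uint32_t)((d >> 16) & 0xFFFFFFu) | ((uint32_t)(d >> 56) << 24);
}

/* Reassembles the 20-bit limit from its two pieces. */
static uint32_t descriptor_limit20(uint64_t d)
{
    return (uint32_t)(d & 0xFFFFu) | ((uint32_t)((d >> 48) & 0xFu) << 16);
}
```

For example, `descriptor_pack(0, 0xFFFFF, 0x9A, 0xC)` yields `0x00CF9A000000FFFF`, a flat ring 0 code segment of the kind found in many hobby-kernel GDTs.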

To access a byte in a segment, a segment selector and an offset must be supplied. The segment
selector provides access to the segment descriptor for the segment (in the GDT or LDT). From
the segment descriptor, the processor obtains the base address of the segment in the linear
address space. The offset then provides the location of the byte relative to the base address.

Have you read the manual? Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 3A: System Programming Guide, Part 1

vargadanis · Post by **vargadanis** » Thu Feb 24, 2011 7:05 pm

Yeah, that quote is from the intel manual. That was a bit too complicated for me to understand at first.
I have read up on the stack, heap, BSS, TEXT etc. and now I understand how an application is loaded and run by the OS. I confused my lack of knowledge about those with the questions regarding the GDT.
Now I understand a bit more. I had to read more about how memory was accessed when there was only 64 KB or 1 MB available. Segmentation in protected mode is rarely used nowadays; flat memory model is used where the segment is set to 0 and the offset can be anything between 0 and 2^32 to get the memory location. (A logical one, I think; does the physical translation happen in the FSB or northbridge on Intel archs? It doesn't really matter to me.)

I think I am getting there... slooowly and need lotta reading.. thanks for the posts!

a5498828 · Post by **a5498828** » Thu Feb 24, 2011 7:41 pm

Physical translation happens on the CPU. If you read the manual, you will find it in the paging chapter. The CPU always sends physical addresses to the outside. ALWAYS. The virtual address concept is purely internal to the CPU.

Code: Select all

flat memory model is used where the segment is set to 0 and the offset can be anything between 0 and 2^32 to get the memory location.

Yes. The segment base is set to 0. The offset can be 0 to 2^32 - 1, even though the limit field is only 20 bits wide. With the granularity flag set, the limit is counted in 4096-byte units instead of bytes, which stretches the 20-bit limit to cover the full 4 GB.
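
The effect of the granularity flag can be sketched in a few lines of C. `effective_limit` is a hypothetical helper name; it returns the highest valid byte offset for a given 20-bit limit field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Highest valid byte offset for a 20-bit limit field and a granularity flag. */
static uint32_t effective_limit(uint32_t limit20, bool granularity)
{
    assert(limit20 <= 0xFFFFFu);
    if (!granularity)
        return limit20;                  /* limit counted in bytes            */
    return (limit20 << 12) | 0xFFFu;     /* limit counted in 4096-byte units  */
}
```

With `limit20 = 0xFFFFF` the result is `0xFFFFF` (1 MB) without granularity and `0xFFFFFFFF` (4 GB) with it.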

Also remember that a segment selector has 3 parts in it: an index into the table, a table indicator (TI) and a requested privilege level (RPL). The table indicator tells whether to use the GDT or the LDT (the LDT is located through LDTR, which holds the GDT selector of the LDT's descriptor). The RPL is a way of ensuring that more privileged code won't access data which it shouldn't on behalf of a caller. Basically, when you pass a pointer (seg:off) to more privileged code, that code must not use it to reach data the user isn't allowed to access. It uses ARPL to raise the RPL value of the pointer's selector, if necessary, to match the user's CPL (taken from CS), which cannot be changed explicitly.
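
The three selector fields, and the adjustment ARPL performs, can be sketched in C as follows. The struct and function names are hypothetical; `selector_arpl` only models the RPL adjustment, not the ZF flag that the real instruction also sets.

```c
#include <assert.h>
#include <stdint.h>

struct selector_fields {
    uint16_t index;  /* bits 15-3: descriptor index in the table */
    uint8_t  ti;     /* bit 2: 0 = GDT, 1 = LDT                   */
    uint8_t  rpl;    /* bits 1-0: requested privilege level       */
};

static struct selector_fields selector_decode(uint16_t selector)
{
    struct selector_fields f;
    f.index = (uint16_t)(selector >> 3);
    f.ti    = (uint8_t)((selector >> 2) & 1u);
    f.rpl   = (uint8_t)(selector & 3u);
    return f;
}

static uint16_t selector_encode(uint16_t index, uint8_t ti, uint8_t rpl)
{
    assert(index <= 0x1FFFu && ti <= 1u && rpl <= 3u);
    return (uint16_t)((index << 3) | (ti << 2) | rpl);
}

/* Models ARPL: if dest's RPL is numerically lower (more privileged) than
 * src's RPL, raise it to match; otherwise leave dest unchanged. */
static uint16_t selector_arpl(uint16_t dest, uint16_t src)
{
    uint16_t dest_rpl = dest & 3u;
    uint16_t src_rpl  = src & 3u;
    if (dest_rpl < src_rpl)
        dest = (uint16_t)((dest & ~3u) | src_rpl);
    return dest;
}
```

For instance, selector `0x08` decodes to index 1 in the GDT with RPL 0, and `selector_arpl(0x08, 0x1B)` gives `0x0B`: the same descriptor, but now requested at ring 3.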

shadowH3 · Post by **shadowH3** » Wed Mar 09, 2011 2:03 am

Take a look here:

The GDT sets up the CPU for certain 'modes' or segments. I think the first guy explained it best.

You can address these in your code later on. For now you just set them up. It's almost as boring as the IDT (ISR interrupt vectors)... and yes, you will need those in protected mode.

in asm:
jmp 08h:code

This jumps to the label below, but loads CS with selector 08h, i.e. the GDT entry at byte offset 8.

code:

You get into this more when jumping (as it were) to different tasks, switching to ring 3 (user mode), and/or allowing video RAM access later on.

[I'm still getting to this... you need an RM compiler for this (a headache in Pascal, but possible (RARE, but possible)). You need newfrontier and DPMI units that can handle 32-bit code. A 32-bit RTL in assembler doesn't hurt, either. Yes, I patched the crt unit. Running code @ 3 GHz, and no issues...]

So there is a use for this.
Yes, I have a ring 3 switch. I'm debugging a GPF condition at the moment, along with some other code Napalm was so kind to donate on HIS boards in assembler...

You can see what is set if you use vbox/QEMU. QEMU has a monitor mode reachable with Ctrl-Alt-2; type in 'info registers'. The top lines can get cut off, and the hex values can be utterly useless if they are not captured at the right moment (i.e. post interrupt, post error-producing ISR, etc.).

The most important registers to look at are the GDT and IDT load vectors and the CR/CPL values; CPL tells you if you're in ring 0 or ring 3. I have code in the kernel that displays this info if I'm in a working state (not often as of late).

The TSS works (QEMU bombs on a TSS fault if it is not correct), but I have issues with the ring 3 switch.
It's not as easy as it looks.

code.google.com/coffeeos
-Jazz

turdus · Post by **turdus** » Wed Mar 09, 2011 7:44 am

OMG. Do not listen to a[0-9]+, it's total foolishness.
My favourite:

a5498828 wrote: What GDT? It's just a stack frame.

And to answer your question: Intel used segmentation (a kind of memory access method) for both real mode and protected mode.
In real mode the segment registers were 16 bits long and held the base address divided by 16 (so base = segment * 16), and through each one you could see a 64 KB window of memory.
In protected mode a protection mechanism was added (how surprising), but the segment registers remained 16 bits long, so there was no room in them for both the protection bits and the base address. Therefore they decided to create tables holding the necessary information (descriptor tables), and put only an index into the segment registers (selectors).
In long mode they recognized that the base was always 0 and that everybody used a window size of FFFF..Fh, so they dropped the relevant fields from the table, leaving only the protection information; selectors remain the same. That's it in short.
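
The real-mode rule can be written down in a couple of lines of C. This is only a sketch with made-up helper names; it ignores the A20 gate, so addresses slightly above 1 MB are returned as-is rather than wrapped.

```c
#include <assert.h>
#include <stdint.h>

/* Real mode: linear address = segment * 16 + offset. */
static uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Picks one of the many segment:offset pairs that name a given address
 * below 1 MB: the one with the smallest possible offset. */
static void real_mode_split(uint32_t linear, uint16_t *segment, uint16_t *offset)
{
    assert(linear <= 0xFFFFFu);
    *segment = (uint16_t)(linear >> 4);
    *offset  = (uint16_t)(linear & 0xFu);
}
```

Because the segment is only shifted by 4 bits, different pairs overlap: `07C0:0000` and `0000:7C00` both name linear address `0x7C00`.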

Anyway, your nick suggests to me that your IRL name is Varga Denes, meaning you are Hungarian too. Correct?

OSDev.org
