Referencing a variable from assembly stored in a data segment

Posted: Wed Aug 26, 2009 6:21 pm
by zman97211
I've come across another example of the problem I posted about earlier this week. Let me please explain.

I've set up a segment in my assembly source called s_data, of which only 3 bytes are defined. My linker script maps these to load at 0x200000. My GDT has an entry pointing to 0x200000, with a limit of 0x1000. DS is loaded with the selector pointing to that segment.

First, take a look at the following from bochs:

(0) [0x0030007c] 0008:0000007c (unk. ctxt): mov al, byte ptr ds:0x200001 ; a001002000
<bochs:10> s
Next at t=32991012
(0) [0x00300081] 0008:00000081 (unk. ctxt): mov bl, 0xa0              ; b3a0
<bochs:11> sreg
cs:s=0x0008, dh=0x00409830, dl=0x00001000, valid=1
ds:s=0x0010, dh=0x00cf9300, dl=0x0000ffff, valid=7
ss:s=0x0020, dh=0x00409240, dl=0x00007fff, valid=7
es:s=0x0010, dh=0x00409220, dl=0x00001000, valid=1
fs:s=0x0010, dh=0x00409320, dl=0x00001000, valid=1
gs:s=0x0010, dh=0x00409320, dl=0x00001000, valid=1
ldtr:s=0x0000, dh=0x00008200, dl=0x0000ffff, valid=1
tr:s=0x0000, dh=0x00008b00, dl=0x0000ffff, valid=1
gdtr:base=0x00210000, limit=0x2f
idtr:base=0x00000000, limit=0x3ff
<bochs:12> delete 1
<bochs:13> c
00032993553e[CPU0 ] read_virtual_checks(): read beyond limit
00032993553e[CPU0 ] interrupt(): gate descriptor is not valid sys seg (vector=0x0d)
00032993553e[CPU0 ] interrupt(): gate descriptor is not valid sys seg (vector=0x08)
00032993553i[CPU0 ] CPU is in protected mode (active)
00032993553i[CPU0 ] CS.d_b = 32 bit
00032993553i[CPU0 ] SS.d_b = 32 bit
00032993553i[CPU0 ] | EAX=00000000  EBX=00026260  ECX=00000001  EDX=000903f2
00032993553i[CPU0 ] | ESP=00007ef0  EBP=00007f00  ESI=000263d3  EDI=000263e3
00032993553i[CPU0 ] | IOPL=0 id vip vif ac vm RF nt of df if tf sf ZF af PF cf
00032993553i[CPU0 ] | SEG selector     base    limit G D
00032993553i[CPU0 ] | SEG sltr(index|ti|rpl)     base    limit G D
00032993553i[CPU0 ] |  CS:0008( 0001| 0|  0) 00300000 00001000 0 1
00032993553i[CPU0 ] |  DS:0010( 0002| 0|  0) 00200000 00001000 0 1
00032993553i[CPU0 ] |  SS:0020( 0004| 0|  0) 00400000 00007fff 0 1
00032993553i[CPU0 ] |  ES:0010( 0002| 0|  0) 00200000 00001000 0 1
00032993553i[CPU0 ] |  FS:0010( 0002| 0|  0) 00200000 00001000 0 1
00032993553i[CPU0 ] |  GS:0010( 0002| 0|  0) 00200000 00001000 0 1
00032993553i[CPU0 ] | EIP=0000007c (0000007c)
00032993553i[CPU0 ] | CR0=0x60000011 CR2=0x00000000
00032993553i[CPU0 ] | CR3=0x00000000 CR4=0x00000000
(0).[32993553] [0x0030007c] 0008:0000007c (unk. ctxt): mov al, byte ptr ds:0x200001 ; a001002000

The very first line in that output, mov al, byte ptr ds:0x200001, doesn't throw a protection exception, even though the offset is outside the segment limit! I continue execution, and the same line of code is executed later, and as you can see, it does throw one.
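
One way to cross-check the sreg dump is to decode the dh/dl words by hand. The Python sketch below assumes that dh and dl are the high and low 32-bit halves of the cached descriptor in the standard x86 layout (the cs line decodes to the expected base 0x300000 under that assumption); the function name `decode_descriptor` is made up for illustration.

```python
def decode_descriptor(dh, dl):
    """Decode a segment descriptor given as its high (dh) and low (dl) dwords.

    Assumes the standard x86 layout; returns (base, limit_in_bytes, granular).
    """
    base = (dl >> 16) | ((dh & 0xFF) << 16) | (dh & 0xFF000000)
    raw_limit = (dl & 0xFFFF) | (dh & 0x000F0000)
    granular = bool(dh & 0x00800000)  # G bit: limit counted in 4 KiB pages
    limit = (raw_limit << 12) | 0xFFF if granular else raw_limit
    return base, limit, granular


# Values copied from the sreg output above.
for name, dh, dl in [("cs", 0x00409830, 0x00001000),
                     ("ds", 0x00CF9300, 0x0000FFFF),
                     ("es", 0x00409220, 0x00001000)]:
    base, limit, granular = decode_descriptor(dh, dl)
    print(f"{name}: base={base:#010x} limit={limit:#010x} G={int(granular)}")
```

If the layout assumption holds, the ds line captured right after the first mov decodes to base 0 with a 4 GiB limit, whereas the register dump printed at the fault lists DS with base 00200000 and limit 00001000. That difference may be worth checking before concluding that the two executions ran with the same cached descriptor.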

Now, this is where it comes back to the problem I was having earlier. 0x200001 is the location of my byte variable (called cury). DS:0x01 is the same thing. But the linker seems to keep me from accessing it with mov al,byte [cury]. Well, the access does work at least once, as shown above!

Is this an error in bochs?

What is the preferred method of referencing my variable? Please don't point me at something that enables paging; why would I need it?

Below is some of my code; it's sloppy, I'm sure.

console.asm (where my variable is defined and accessed):

;Need access to VID_SEL to print to the video screen
extern	VID_SEL

global	PrintChar

section	s_data

curx		db	0
cury		db	0
attrib		db	0x07

section	s_code

;Procedure PrintChar(byte char)
;
;Prints a character to the screen using current attrib, moves curx and cury accordingly
PrintChar:
.c	equ	8

	push	ebp
	mov	ebp,esp
	push	eax
	push	ebx
	push	ecx
	push	gs

	xor	eax,eax
	mov	al,[cury]
	mov	bl,160
	mul	bl
	xor	ebx,ebx
	mov	bl,byte [curx]
	shl	bl,1
	add	ebx,eax
	
	mov	eax,VID_SEL
	mov	gs,eax
	
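	mov	cl,[ebp+.c]	;32-bit addressing; [bp+.c] only works while the stack offset fits in 16 bits
	mov	[gs:ebx],cl	;ebx holds the full cell offset (upper bits were cleared above)
	mov	cl,[attrib]
	mov	[gs:ebx+1],cl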
	
	inc	byte [curx]
	cmp	byte [curx],80	;columns run 0..79, so wrap as soon as curx reaches 80
	jne	.done
	mov	byte [curx],0
	inc	byte [cury]
	call	Scroll
.done:

	pop	gs
	pop	ecx
	pop	ebx
	pop	eax
	pop	ebp
	ret	1
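
The cursor arithmetic at the top of PrintChar can be sanity-checked outside the emulator. This is a small Python model, not part of the kernel; it assumes the usual 80x25 colour text mode in which each cell takes two bytes (character, then attribute), and the names `cell_offset`, `COLS`, and `ROWS` are illustrative.

```python
COLS, ROWS = 80, 25      # assumed 80x25 text mode
BYTES_PER_CELL = 2       # character byte followed by attribute byte


def cell_offset(curx, cury):
    """Byte offset of the cell at column curx, row cury from the start of video memory."""
    if not (0 <= curx < COLS and 0 <= cury < ROWS):
        raise ValueError("cursor outside the visible screen")
    return cury * COLS * BYTES_PER_CELL + curx * BYTES_PER_CELL


print(cell_offset(0, 0))    # first cell
print(cell_offset(79, 24))  # last cell
```

Under these assumptions the attribute byte of the last cell sits at offset 3999 (0xf9f), which agrees with the limit used for VID_SEL in the GDT.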

gdt.asm defines my GDT using some macros I should really thank someone for:

%include	"gdtn_inc.asm"

global	STACK_SIZE
global	CS_SEL
global	DS_SEL
global	SS_SEL
global	VID_SEL
global	gdtloc

STACK_SIZE	equ	0x8000

SECTION	s_gdt
gdtloc:
	start_gdt
	CS_SEL	desc 0x300000,0x1000,D_CODE + D_DPL0 + D_BIG
	DS_SEL	desc 0x200000,0x1000,D_DATA + D_DPL0 + D_BIG + D_WRITE
	GDT_SEL	desc 0x210000,0x1000,D_DATA + D_DPL0 + D_BIG + D_WRITE
	SS_SEL	desc 0x400000,STACK_SIZE - 1,D_DATA + D_DPL0 + D_BIG + D_WRITE
	VID_SEL	desc 0xB8000,0xf9f,D_DATA + D_DPL0 + D_BIG + D_WRITE
	end_gdt
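
The desc macro comes from gdtn_inc.asm, which is not shown, so its exact output is not known here. As a rough illustration of what any such macro has to produce, the Python sketch below packs a base, limit, access byte, and flag nibble into the two dwords of a standard x86 descriptor. The access byte 0x92 and flag nibble 0x4 are assumptions read back from the es line of the Bochs dump, not taken from the macro source, and `encode_descriptor` is a hypothetical name.

```python
def encode_descriptor(base, limit, access, flags):
    """Pack a segment descriptor into (high dword, low dword).

    access is the 8-bit access byte, flags the 4-bit nibble holding G, D/B, L, AVL.
    """
    if limit > 0xFFFFF:
        raise ValueError("limit field is only 20 bits wide")
    low = ((base & 0xFFFF) << 16) | (limit & 0xFFFF)
    high = ((base & 0xFF000000)
            | ((flags & 0xF) << 20)
            | (limit & 0xF0000)
            | ((access & 0xFF) << 8)
            | ((base >> 16) & 0xFF))
    return high, low


high, low = encode_descriptor(0x200000, 0x1000, access=0x92, flags=0x4)
print(f"dh={high:#010x} dl={low:#010x}")
```

With those inputs the result matches the dh/dl pair Bochs reports for es, which suggests the data descriptor is being built as intended.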

And my linker script:

OUTPUT_ARCH(i386)
OUTPUT_FORMAT(elf32-i386)
INPUT(multiboot.o loader.o gdt.o main.o console.o)
OUTPUT(kernel.bin)
ENTRY(_loader)
phys = 0x00100000;
SECTIONS
{
	. = 0x100000;
	s_multiboot 0x100000 :
	{
		*(s_multiboot)
		. = ALIGN(4096);
	}
	s_loader :
	{
		*(s_loader)
		. = ALIGN(4096);
	}
	. = 0x200000;
	s_data 0x200000 :
	{
		*(s_data)
		. = ALIGN(4096);
	}
	. = 0x210000;
	s_gdt 0x210000 :
	{
		*(s_gdt)
		. = ALIGN(4096);
	}
	. = 0x300000;
	s_code 0x300000 :
	{
		main.o(s_code)
		*(s_code)
		. = ALIGN(4096);
	}
}

Re: Referencing a variable from assembly stored in a data segment

Posted: Thu Aug 27, 2009 6:26 am
by gravaera
I didn't really read the code, but...you say that you defined a selector which has a base of 0x200000, and a limit of 0x1000. So what's wrong with the address 0x200001? It's within the segment. That segment extends up to 0x201000. I don't think anything should happen.

Also, you may want to check whether or not you've made the symbol (cury) global. The linker would need to see it as a 'strong' symbol before you can link to it in another file. That's just a bit of guesstimation without actually having read your logs/code dump.

-All the best
gravaera

Re: Referencing a variable from assembly stored in a data segment

Posted: Thu Aug 27, 2009 9:15 am
by zman97211
gravaera wrote: I didn't really read the code, but...you say that you defined a selector which has a base of 0x200000, and a limit of 0x1000. So what's wrong with the address 0x200001? It's within the segment. That segment extends up to 0x201000. I don't think anything should happen.
According to the Intel documentation, the offset supplied is relative to the base address in the descriptor. So, using the descriptor in the GDT for my data segment, the offset would be 1, relative to the base of 0x200000, which corresponds to the physical address 0x200001.
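
The rule described here can be modelled in a few lines. The sketch below is a simplified Python model of the limit check and base addition for an expand-up data segment with byte granularity; `SegmentFault` and `translate` are made-up names, and a real CPU performs further checks (type, privilege, and so on) that are left out.

```python
class SegmentFault(Exception):
    """Stands in for the #GP raised on a limit violation in this model."""


def translate(base, limit, offset, size=1):
    """Return the linear address for offset, or raise if the access exceeds limit."""
    if offset + size - 1 > limit:
        raise SegmentFault(f"offset {offset:#x} is beyond limit {limit:#x}")
    return (base + offset) & 0xFFFFFFFF


DS_BASE, DS_LIMIT = 0x200000, 0x1000   # values from the GDT entry in the first post

print(hex(translate(DS_BASE, DS_LIMIT, 0x1)))   # offset 1 inside the segment
try:
    translate(DS_BASE, DS_LIMIT, 0x200001)      # the offset encoded in the instruction
except SegmentFault as exc:
    print("fault:", exc)
```

In this model offset 1 maps to linear address 0x200001, while offset 0x200001 is rejected before any base is added, which is the behaviour expected from a segment with a 0x1000 limit.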

All of the basic examples I've seen around work within the bootloader's default environment of a code segment and data segment set up starting at address 0, extending to 0xFFFFFFFF. If I were using this default flat memory model, there would be no problem. I am not working within this flat memory model.

That's why the exception was thrown the second time in the bochs output I posted above.

After I originally made my segments in the GDT, I wrote a few lines to access memory inside and outside the limit, and an exception was raised every time I was above the limit.
gravaera wrote: Also, you may want to check whether or not you've made the symbol (cury) global. The linker would need to see it as a 'strong' symbol before you can link to it in another file.
cury is only accessed within the one source file. No other modules even need to know it exists.
gravaera wrote: That's just a bit of guesstimation without actually having read your logs/code dump.
I appreciate the response.

If nothing else, take a look at the logs. The same instruction (with the same bytecode) is executed twice, with the same selector and same limit, and no change to the GDT (actually, no change to the DS register at all, so the processor's cache of the descriptor shouldn't even be changing). It allows the access once and refuses it the second time.

Steve

Re: Referencing a variable from assembly stored in a data segment

Posted: Thu Aug 27, 2009 1:38 pm
by zman97211
From the page on Segmentation:
# In general if you want to use the segmentation mechanism, by having the different segment registers represent segments with different base addresses, you won't be able to use a modern C compiler, and may very well be restricted to just Assembly.
# So, if you're going to use C, do what the rest of the C world does, which is set up a flat-memory model, use paging, and ignore the fact that segmentation even exists.
Maybe this is why there is no answer for this question? Most everyone seems to stick with C after getting past the loader and jump to a main() function.

All my code is in assembly. Does ld not support segmented memory? Is there another linker out there?

Almost every tutorial or article I find doesn't even touch on the subject I'm asking about. I search and search...

I hope someone on the forums here has an answer or a pointer. And I assure you, after reading another thread on this site, I'm not a novice to programming, I've done my research, etc.

Maybe I need to create a post-link program to go through and fiddle with my symbols. There's got to be a better way.
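
As a rough idea of what such a post-link step would have to compute, here is the arithmetic in Python. It is only a sketch: it does not read or patch ELF files, the symbol addresses are inferred from the order of the three db lines and the 0x200000 section address rather than taken from a real symbol table, and `rebase_symbols` is a hypothetical helper.

```python
def rebase_symbols(symbols, segment_base, segment_limit):
    """Convert absolute link-time addresses into offsets relative to segment_base."""
    rebased = {}
    for name, address in symbols.items():
        offset = address - segment_base
        if not 0 <= offset <= segment_limit:
            raise ValueError(f"{name} at {address:#x} lies outside the segment")
        rebased[name] = offset
    return rebased


# Addresses inferred from the s_data layout in console.asm (an assumption).
linked = {"curx": 0x200000, "cury": 0x200001, "attrib": 0x200002}
print(rebase_symbols(linked, segment_base=0x200000, segment_limit=0x1000))
```

The same subtraction is what a link-time solution would need to arrange, namely that symbols in s_data resolve to addresses relative to the segment base rather than to their load address.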

Steve

Re: Referencing a variable from assembly stored in a data segment

Posted: Thu Aug 27, 2009 2:58 pm
by Cognition
I'm not sure that there's a way to change that behavior, but you'd have to read the manual for your assembler to check. NASM has some conventions for it for 16-bit real mode code I believe, but I'd imagine it doesn't extend to a 32-bit mode. In general you might want to consider if segmentation is really the best tool for what you want to accomplish in this case.

Edit: Reread things a bit closer.

Re: Referencing a variable from assembly stored in a data segment

Posted: Thu Aug 27, 2009 3:30 pm
by zman97211
From an article on segmentation:
Since coinciding logical and linear addresses are simpler to handle, they became standard, such that 64-bit mode now enforces a flat linear address space. But even in flat mode segments are still crucial for x86 protection, the mechanism that defends the kernel from user-mode processes and every process from each other.
This is a thought I had while driving home from work - having the kernel sit in a flat model near the beginning of memory, and letting the user processes run above it in their own segments.

The benefits of protection speak for themselves. However, I thought it a good idea to protect the kernel from itself. Sure, the kernel runs at privilege level 0, and can write into almost any memory location, but if the kernel has a data and a code segment starting at 0 and extending to whatever arbitrary value and overlapping, that means the kernel can accidentally (or me as the programmer can accidentally) overwrite its own code!
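
The concern about overlapping segments comes down to an interval check. A minimal Python sketch, assuming byte-granular segments described by (base, limit) pairs, with `overlaps` as an illustrative name:

```python
def overlaps(seg_a, seg_b):
    """True if two (base, limit) segments share at least one linear address."""
    a_start, a_end = seg_a[0], seg_a[0] + seg_a[1]
    b_start, b_end = seg_b[0], seg_b[0] + seg_b[1]
    return a_start <= b_end and b_start <= a_end


flat_code = (0x0, 0xFFFFFFFF)
flat_data = (0x0, 0xFFFFFFFF)
split_code = (0x300000, 0x1000)   # CS_SEL from the GDT earlier in the thread
split_data = (0x200000, 0x1000)   # DS_SEL from the GDT earlier in the thread

print(overlaps(flat_code, flat_data))    # flat model: the two ranges coincide
print(overlaps(split_code, split_data))  # separate segments: no shared addresses
```

In the flat case every code byte is also reachable through the data segment, whereas the split layout keeps the two ranges disjoint.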

I encourage you to expand on your "read things a bit closer" - as I'm not following your hint. Are you implying after you posted that there might be a way to accomplish it?

I'm going to move on in a flat model for now, at least for code and data for the kernel. Video memory and such will still have its own entry in the GDT, as will some other modules and of course user space code. I will revisit this problem at a later date when I have more experience under my belt.

Steve

Re: Referencing a variable from assembly stored in a data segment

Posted: Thu Aug 27, 2009 4:23 pm
by gravaera
Well good luck with that. =P~