compiler, stack

mansonbound

compiler, stack

Post by mansonbound »

The DJGPP compiler uses the stack for local variables:

Code: Select all

int main()
{
        int i=5;
}
compiles to:

Code: Select all

      .file      "test.c"
      .section .text
      .p2align 4
.globl _main
_main:
      pushl      %ebp
      movl      %esp, %ebp
      subl      $8, %esp
      andl      $-16, %esp
      movl      $5, -4(%ebp)
      movl      %ebp, %esp
      popl      %ebp
      ret
      .ident      "GCC: (GNU) 3.0.3"
And here's what I think: doesn't that affect the data segment, where it is supposed to use the stack segment? ESP points somewhere near the top of the stack in the stack segment. But that code says DS + ESP, which is somewhere in the data segment?

Am I wrong?
df

Re: compiler, stack

Post by df »

Yes, you are wrong. Memory operands that use ESP or EBP as the base register reference SS by default, unless overridden with a CS/DS/ES/FS/GS prefix.

Also, gcc produces flat-model code, so DS will be the same as SS on x86 anyway.
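
To make the default-segment rule concrete, here is a small sketch in C. It is only a lookup table written as a function, not anything gcc or the CPU exposes; the enum and the name default_segment are made up for illustration. It covers ordinary memory operands with no override prefix and ignores special cases such as string instructions.

```c
/* Base registers usable in a 32-bit x86 memory operand. */
enum base_reg {
    REG_EAX, REG_EBX, REG_ECX, REG_EDX,
    REG_ESI, REG_EDI, REG_EBP, REG_ESP
};

/* Hypothetical helper: name the segment register the CPU uses by default
   for an ordinary memory operand with the given base register, when the
   instruction carries no segment-override prefix. */
const char *default_segment(enum base_reg base)
{
    switch (base) {
    case REG_EBP:
    case REG_ESP:
        return "ss";   /* stack-relative addressing, e.g. -4(%ebp) */
    default:
        return "ds";   /* everything else, e.g. (%eax) */
    }
}
```

So the `movl $5, -4(%ebp)` in the first post goes through SS, with no override needed.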
-- Stu --
mansonbound

Re: compiler, stack

Post by mansonbound »

OK, thank you.
roswell

Re: compiler, stack

Post by roswell »

I agree about ESP and EBP: by default they reference memory through SS.

But I'm not sure gcc assumes that SS = DS.

It assumes that DS = ES, but nothing about SS.
Perhaps I'm wrong, but I don't think it's the compiler's responsibility to assume that kind of thing.

If you are sure that gcc assumes something about segments, can you confirm it?

Roswell
df

Re: compiler, stack

Post by df »

roswell wrote:
It assumes that DS = ES, but nothing about SS.
Perhaps I'm wrong, but I don't think it's the compiler's responsibility to assume that kind of thing.

If you are sure that gcc assumes something about segments, can you confirm it?

Sure. gcc produces flat-model code. All segments are the same, so it assumes SS = ES, SS = DS and SS = CS, inasmuch as it knows what SS or DS is. It's all the same: same base, same size, one big flat memory model.
-- Stu --
Tim

Re: compiler, stack

Post by Tim »

gcc does assume that SS==DS (as do most PC C compilers).

Look at this code, for example:

Code: Select all

#include <stdio.h>

char data_string[] = "This string is in the .data section";

void fn(void)
{
    /* The array itself, not just a pointer to it, lives on the stack. */
    char stack_string[] = "This string is on the stack";
    printf("%s\n%s\n", data_string, stack_string);
}
If SS != DS then we'd need two versions of printf() for this example: one which accepted strings on the stack, and one which accepted strings in the data segment.
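
The same point can be seen from the callee's side. The sketch below uses a made-up function, count_chars, as a stand-in for printf(): it receives a plain pointer and has no way to know which segment the string lives in. The helper names demo_stack and demo_data are also just for illustration.

```c
#include <stddef.h>

char data_string[] = "This string is in the .data section";

/* Hypothetical stand-in for printf(): all it receives is an offset.
   Nothing in the pointer says which segment the string lives in. */
size_t count_chars(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')   /* on 32-bit x86, typically a load with no segment override */
        n++;
    return n;
}

size_t demo_stack(void)
{
    char stack_string[] = "This string is on the stack";
    return count_chars(stack_string);   /* same calling sequence as demo_data */
}

size_t demo_data(void)
{
    return count_chars(data_string);
}
```

There is exactly one copy of count_chars, so it can only work for both callers if the stack and the data segment share a base address.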
mansonbound

Re: compiler, stack

Post by mansonbound »

I think that the pointer to the string is on the stack.
If DS == SS, I could not link with -Ttext=0x0, because ESP gets smaller with every push, so the stack would have to sit in front of every program. Also, programs work even though DS != SS (e.g. function calls).
Tim

Re: compiler, stack

Post by Tim »

mansonbound wrote: I think that the pointer to the string is on the stack.
Yes, but that pointer is relative to SS. If SS and DS have different base addresses, and printf tries to access the second string without an SS segment override, it will get the wrong address.

mansonbound wrote: If DS == SS, I could not link with -Ttext=0x0, because ESP gets smaller with every push, so the stack would have to sit in front of every program.
No, it doesn't. You just have to set ESP correctly. gcc expects a flat memory model where the base addresses of CS, DS, ES and SS are the same. The stack is implemented by pointing ESP a long way away from any code or data.

mansonbound wrote: Also, programs work even though DS != SS (e.g. function calls).
Most gcc and VC++ C or C++ programs won't, unless you pass far pointers around. Far pointers are not available in these 32-bit compilers (although this was a common problem in the 16-bit world).
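
For anyone who hasn't met far pointers, here is a toy model in C of the difference. It is a simulation only, not real segment handling: a segment is reduced to its base address, and every type and function name below is made up for illustration.

```c
#include <stdint.h>

/* Toy model, not real hardware: a segment is reduced to its base address. */
struct segment { uint32_t base; };

/* A near pointer is just an offset; the callee has to pick the segment. */
typedef uint32_t near_ptr;

/* A far pointer carries its segment along with the offset. */
struct far_ptr {
    const struct segment *seg;
    uint32_t offset;
};

/* Linear address = segment base + offset. */
uint32_t linear(const struct segment *seg, uint32_t offset)
{
    return seg->base + offset;
}

/* What a callee without segment overrides does: always go through DS. */
uint32_t resolve_near(const struct segment *ds, near_ptr p)
{
    return linear(ds, p);
}

/* With a far pointer the right segment travels with the offset. */
uint32_t resolve_far(struct far_ptr p)
{
    return linear(p.seg, p.offset);
}
```

With made-up bases of 0x1000 for DS and 0x2000 for SS, a stack offset of 0x10 resolves to 0x1010 as a near pointer but 0x2010 as a far pointer. When both bases are equal (the flat model), the two agree and near pointers are enough.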
mansonbound

Re: compiler, stack

Post by mansonbound »

I have a data segment starting at 0x80000 and a stack segment starting at 0x70800, and everything works!

Also, for example:
a local string pointer is held in the stack segment. Therefore gcc reserves space on the stack by lowering ESP:

Code: Select all

subl      $8, %esp
and then compiles this:

Code: Select all

char *s1=s;
(where s is a global string pointer and s1 is a local pointer) to:

Code: Select all

      movl      _s, %eax
      movl      %eax, -4(%ebp)
So only the pointers are in the stack segment. When gcc wants to work with the real data (the string), it first loads the pointer from the stack into EAX, then uses EAX as an offset into the data segment.
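
A minimal C version of this case, for reference. The array name text and the function name are made up; the post only shows s and s1. Note that it covers only a string stored in the data section, not a character array declared inside the function.

```c
char text[] = "hello";   /* the characters live in the data section */
char *s = text;          /* global pointer, also in the data section */

/* Returns 1 if the local copy holds the same address as the global pointer. */
int local_copy_points_to_same_string(void)
{
    char *s1 = s;        /* only the address is copied into the stack slot */
    return s1 == s && s1[0] == 'h';
}
```

Here the stack holds a 4-byte address and nothing else, so every access to the characters goes to the data section.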
Tim

Re: compiler, stack

Post by Tim »

That's not what I'm talking about! :)

Accessing local variables is fine because the x86 implicitly uses SS when accessing memory via EBP or ESP.

Look at this code which forms two pointers to two strings and tries to use them.

Code: Select all

char global_string[] = "This string is global";

int main(void)
{
  24:   e8 00 00 00 00          call   29 <_main+0xd>
/cygdrive/h/temp.c:7
        char stack_string[] = "This string is on the stack";
  29:   8d 45 e0                lea    0xffffffe0(%ebp),%eax
  2c:   8d 7d e0                lea    0xffffffe0(%ebp),%edi
  2f:   be 00 00 00 00          mov    $0x0,%esi
  34:   fc                      cld
  35:   b9 07 00 00 00          mov    $0x7,%ecx
  3a:   f3 a5                   repz movsl %ds:(%esi),%es:(%edi)
Here a string is put onto the stack.

Code: Select all

/cygdrive/h/temp.c:9
        char *p1, *p2;
        p1 = global_string;
  3c:   c7 45 dc 00 00 00 00    movl   $0x0,0xffffffdc(%ebp)
/cygdrive/h/temp.c:10
        p2 = stack_string;
  43:   8d 45 e0                lea    0xffffffe0(%ebp),%eax
  46:   89 45 d8                mov    %eax,0xffffffd8(%ebp)
Here pointers are formed to each string. Note that global_string is referred to by an absolute address (it is zero here because this code has not been linked yet). stack_string is referred to with ebp-relative addressing. p1 is relative to DS, and p2 is relative to SS. So far so good.

Code: Select all

/cygdrive/h/temp.c:11
        puts(p1);
  49:   83 c4 f4                add    $0xfffffff4,%esp
  4c:   8b 45 dc                mov    0xffffffdc(%ebp),%eax
  4f:   50                      push   %eax
  50:   e8 00 00 00 00          call   55 <_main+0x39>
  55:   83 c4 10                add    $0x10,%esp
/cygdrive/h/temp.c:12
        puts(p2);
  58:   83 c4 f4                add    $0xfffffff4,%esp
  5b:   8b 45 d8                mov    0xffffffd8(%ebp),%eax
  5e:   50                      push   %eax
  5f:   e8 00 00 00 00          call   64 <_main+0x48>
  64:   83 c4 10                add    $0x10,%esp
Here puts is called twice, with the pointers to the strings we just got. But each call is made in exactly the same way: the fact that stack_string is on the stack is ignored. Inside puts there will be instructions which fetch each character of the string without a segment override, so the characters will be obtained from the data segment. puts will get the wrong segment for stack_string unless either:
(1) you pass it a far pointer to the string (which is impossible with gcc, which I used here)
(2) the base addresses of SS and DS are identical

gcc assumes (2) on the x86.
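
For anyone who wants to reproduce the listing, the source lines scattered through the disassembly can be pieced back together roughly as follows. This is a reconstruction, not the original temp.c: the statements sit directly in main() in the listing, and the wrapper name run_example and the return-value checks are additions for illustration.

```c
#include <stdio.h>

char global_string[] = "This string is global";

/* Hypothetical wrapper so the body can be called from other code; in the
   listing above these statements sit directly in main(). */
int run_example(void)
{
    char stack_string[] = "This string is on the stack";
    char *p1, *p2;

    p1 = global_string;   /* absolute address, filled in by the linker */
    p2 = stack_string;    /* computed from EBP at run time */

    /* Both calls pass a bare 32-bit offset; puts() cannot tell them apart. */
    if (puts(p1) == EOF)
        return -1;
    if (puts(p2) == EOF)
        return -1;
    return 0;
}
```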
mansonbound

Re: compiler, stack

Post by mansonbound »

OK, so how can I prevent the stack from destroying data when DS == SS?
Tim

Re: compiler, stack

Post by Tim »

Put ESP a long way away from your code and data.

If you're using paging it's easy to stop the stack overflowing and writing over your code and/or data: put some nonpresent pages between your data and the bottom of your stack.
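
In a kernel you would do this by leaving the page-table entries below the stack not present. The same idea can be tried out in user space, which is what the sketch below does: it assumes a POSIX system with mmap and mprotect, and the function name alloc_guarded_stack is made up. It is an analogy for experimenting, not kernel code.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc in strict modes */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: reserve a stack of at least `usable` bytes with one
   inaccessible page below it. Returns the initial stack top (the stack
   grows down from here), or NULL on failure. */
void *alloc_guarded_stack(size_t usable)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return NULL;

    size_t page = (size_t)page_size;
    size_t rounded = (usable + page - 1) / page * page;  /* whole pages */
    size_t total = rounded + page;                       /* plus the guard page */

    char *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* The lowest page becomes the guard: any access to it faults. */
    if (mprotect(base, page, PROT_NONE) != 0) {
        munmap(base, total);
        return NULL;
    }
    return base + total;
}
```

A stack that overflows runs into the guard page and faults instead of silently writing over whatever lies below it.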

If you're using segmentation then there's no good way of detecting stack overflows. If you've ever read Operating Systems: Design and Implementation by Tanenbaum, you'll see that this is the same problem he had with Minix. He got his kernel to check the stack pointer on every syscall, although he was aiming for compatibility with every CPU from the 8086 upwards.
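
A check of that kind could look roughly like the sketch below. This is not the Minix code, just an illustration of the idea; the struct and function names are made up, and it assumes the kernel records the allowed stack range for each task.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-task record: the stack grows down from `top`
   and must never go below `limit`. */
struct stack_bounds {
    uintptr_t limit;   /* lowest address the stack may use */
    uintptr_t top;     /* initial stack pointer */
};

/* Meant to be called on entry to every system call with the saved
   stack pointer of the calling task. */
bool stack_pointer_ok(const struct stack_bounds *b, uintptr_t sp)
{
    return sp >= b->limit && sp <= b->top;
}
```

The weakness is that the overflow is only noticed at the next syscall, after the damage may already be done.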

On the whole it's easier to go for a flat paged architecture, especially if you're using a compiler like gcc.