compiler, stack

Posted: Fri Mar 01, 2002 11:18 am
by mansonbound
The DJGPP compiler uses the stack for local variables:

Code: Select all

int main()
{
        int i=5;
}
compiles to:

Code: Select all

      .file      "test.c"
      .section .text
      .p2align 4
.globl _main
_main:
      pushl      %ebp
      movl      %esp, %ebp
      subl      $8, %esp
      andl      $-16, %esp
      movl      $5, -4(%ebp)
      movl      %ebp, %esp
      popl      %ebp
      ret
      .ident      "GCC: (GNU) 3.0.3"
And here's what I think: doesn't that affect the dataseg, where it is supposed to use the stackseg? esp points somewhere near the top of the stack in the stackseg. But that code says dataseg+esp, which is somewhere in the dataseg?

Am I wrong?

Re: compiler, stack

Posted: Fri Mar 01, 2002 12:20 pm
by df
Yes, you are wrong: memory accesses through esp and ebp reference SS by default, unless overridden with a ds/cs/es/fs/gs prefix.

Also, gcc produces flat code, so ds will have the same base as ss on x86 anyway.
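The default-segment rule df describes can be written down as a small C sketch. The enum and function names here are made up for illustration, and the sketch only covers ordinary memory operands with a base register (string instructions, which use ES for their destination, are left out).

```c
#include <assert.h>

/* Base registers usable in a 32-bit x86 memory operand (illustrative names). */
enum base_reg {
    BASE_EAX, BASE_ECX, BASE_EDX, BASE_EBX,
    BASE_ESP, BASE_EBP, BASE_ESI, BASE_EDI
};

/* Segment registers (illustrative names). */
enum seg_reg { SEG_ES, SEG_CS, SEG_SS, SEG_DS, SEG_FS, SEG_GS };

/* Segment the CPU uses for a data access through `base` when the
 * instruction carries no segment-override prefix:
 * ESP and EBP default to SS, every other base register defaults to DS. */
enum seg_reg default_segment(enum base_reg base)
{
    assert(base >= BASE_EAX && base <= BASE_EDI);
    return (base == BASE_ESP || base == BASE_EBP) ? SEG_SS : SEG_DS;
}
```

So `movl $5, -4(%ebp)` from the first post goes through SS, because its base register is EBP.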

Re: compiler, stack

Posted: Fri Mar 01, 2002 12:44 pm
by mansonbound
ok
ty

Re: compiler, stack

Posted: Fri Mar 01, 2002 10:00 pm
by roswell
I agree for esp and ebp, by default they reference memory based on ss.

But I'm not sure gcc assumes that ss=ds.

It assumes that ds=es, but nothing about ss.
Perhaps I'm wrong, but I think it's not the compiler's responsibility to assume that kind of thing.

If you are sure that gcc assumes something about segments, can you confirm it?

Roswell

Re: compiler, stack

Posted: Fri Mar 01, 2002 11:33 pm
by df
roswell wrote:
"It assumes that ds=es, but nothing about ss.
Perhaps I'm wrong, but I think it's not the compiler's responsibility to assume that kind of thing.

If you are sure that gcc assumes something about segments, can you confirm it?"
Sure. gcc produces flat model code. All segments are the same. Therefore it assumes ss=es, ss=ds, ss=cs, inasmuch as it knows what ss is, or what ds is. It's all the same: same base, same size, all one big flat memory model.

Re: compiler, stack

Posted: Mon Mar 04, 2002 1:36 am
by Tim
gcc does assume that SS==DS (as do most PC C compilers).

Look at this code, for example:

Code: Select all

#include <stdio.h>

char data_string[] = "This string is in the .data section";

void fn(void)
{
    char stack_string[] = "This string is on the stack";
    printf("%s\n%s\n", data_string, stack_string);
}
If SS != DS then we'd need two versions of printf() for this example: one which accepted strings on the stack, and one which accepted strings in the data segment.

Re: compiler, stack

Posted: Mon Mar 04, 2002 6:52 am
by mansonbound
I think that the pointer to the string is on the stack.
If ds == ss I could not link with -Ttext=0x0, because esp gets smaller with every push, so the stack would have to sit in front of every program. Also, programs work although ds != ss (e.g. function calls).

Re: compiler, stack

Posted: Mon Mar 04, 2002 9:36 am
by Tim
mansonbound wrote: "I think that the pointer to the string is on the stack."
Yes, but that pointer is relative to SS. If SS and DS have a different base address, and printf tries to access the second string without an SS segment override, it will get the wrong address.
mansonbound wrote: "If ds == ss I could not link with -Ttext=0x0, because esp gets smaller with every push, so the stack would have to sit in front of every program."
No, it doesn't. You just have to set ESP correctly. gcc expects a flat memory model where the base addresses of CS, DS, ES and SS are the same. The stack is implemented by pointing ESP a long way away from any code or data.
mansonbound wrote: "Also, programs work although ds != ss (e.g. function calls)."
Most gcc and VC++ C or C++ programs won't, unless you pass far pointers around. That is impossible with these 32-bit compilers (although this was a common problem in the 16-bit world).
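To put numbers on the "wrong address" point, here is a rough C model of how a segment base and an offset combine into a linear address. The struct and function names are invented for this sketch, and limit and access checks are ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified segment: only a base address, no limit or access rights. */
struct segment {
    uint32_t base;
};

/* Linear address formed from a segment base and a 32-bit offset. */
uint32_t linear_address(struct segment seg, uint32_t offset)
{
    return seg.base + offset;
}

/* Returns 1 if an offset taken relative to SS reaches the same byte
 * when it is later dereferenced relative to DS, 0 otherwise. */
int near_pointer_is_safe(struct segment ss, struct segment ds, uint32_t offset)
{
    return linear_address(ss, offset) == linear_address(ds, offset);
}

/* A startup check a flat-model kernel could run (hypothetical helper). */
void require_flat_model(struct segment ss, struct segment ds)
{
    assert(ss.base == ds.base);
}
```

With an SS base of 0x70800 and a DS base of 0x80000, the offset 0xFFF0 lands at linear 0x807F0 through SS but at 0x8FFF0 through DS, so `near_pointer_is_safe` returns 0. With both bases equal it returns 1 for every offset.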

Re: compiler, stack

Posted: Mon Mar 04, 2002 10:55 am
by mansonbound
I have a dataseg starting at 0x80000 and the stackseg starting at 0x70800, and everything works!

Also, for example:
a local string pointer is held in the stack segment. Therefore gcc reserves space below esp:

Code: Select all

subl      $8, %esp
and then compiles this:

Code: Select all

char *s1=s;
where s is a global string pointer and s1 is a local pointer
to:

Code: Select all

      movl      _s, %eax
      movl      %eax, -4(%ebp)
So only the pointers are in the stackseg. When gcc wants to work with the real data (the string), it first loads the pointer from the stack into the eax register, then uses eax as an offset into the dataseg.

Re: compiler, stack

Posted: Tue Mar 05, 2002 3:11 am
by Tim
That's not what I'm talking about! :)

Accessing local variables is fine because the x86 implicitly uses SS when accessing memory via EBP or ESP.

Look at this code which forms two pointers to two strings and tries to use them.

Code: Select all

char global_string[] = "This string is global";

int main(void)
{
  24:   e8 00 00 00 00          call   29 <_main+0xd>
/cygdrive/h/temp.c:7
        char stack_string[] = "This string is on the stack";
  29:   8d 45 e0                lea    0xffffffe0(%ebp),%eax
  2c:   8d 7d e0                lea    0xffffffe0(%ebp),%edi
  2f:   be 00 00 00 00          mov    $0x0,%esi
  34:   fc                      cld
  35:   b9 07 00 00 00          mov    $0x7,%ecx
  3a:   f3 a5                   repz movsl %ds:(%esi),%es:(%edi)
Here a string is put onto the stack.

Code: Select all

/cygdrive/h/temp.c:9
        char *p1, *p2;
        p1 = global_string;
  3c:   c7 45 dc 00 00 00 00    movl   $0x0,0xffffffdc(%ebp)
/cygdrive/h/temp.c:10
        p2 = stack_string;
  43:   8d 45 e0                lea    0xffffffe0(%ebp),%eax
  46:   89 45 d8                mov    %eax,0xffffffd8(%ebp)
Here pointers are formed to each string. Note that global_string is referred to by an absolute address (it is zero here because this code has not been linked yet). stack_string is referred to with ebp-relative addressing. p1 is relative to DS, and p2 is relative to SS. So far so good.

Code: Select all

/cygdrive/h/temp.c:11
        puts(p1);
  49:   83 c4 f4                add    $0xfffffff4,%esp
  4c:   8b 45 dc                mov    0xffffffdc(%ebp),%eax
  4f:   50                      push   %eax
  50:   e8 00 00 00 00          call   55 <_main+0x39>
  55:   83 c4 10                add    $0x10,%esp
/cygdrive/h/temp.c:12
        puts(p2);
  58:   83 c4 f4                add    $0xfffffff4,%esp
  5b:   8b 45 d8                mov    0xffffffd8(%ebp),%eax
  5e:   50                      push   %eax
  5f:   e8 00 00 00 00          call   64 <_main+0x48>
  64:   83 c4 10                add    $0x10,%esp
Here puts is called twice, with the pointers to the strings we just got. But each call is made in exactly the same way: the fact that stack_string is on the stack is ignored. Inside puts there will be instructions which fetch each character of the string without a segment override, so the characters will be obtained from the data segment. puts will get the wrong segment for stack_string unless either:
(1) you pass it a far pointer to the string (which is impossible with gcc, which I used here)
(2) the base addresses of SS and DS are identical

gcc assumes (2) on the x86.
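For comparison, option (1) can be imitated in plain C with a toy memory array. This is only a model: `struct far_ptr`, `read_near` and `read_far` are hypothetical names, and a real far pointer carries a selector that the CPU maps to a base through a descriptor table, not the base itself.

```c
#include <assert.h>
#include <stdint.h>

enum { MEM_SIZE = 256 };

/* Toy linear memory standing in for RAM. */
uint8_t memory[MEM_SIZE];

/* Hypothetical far pointer: the segment travels with the offset. */
struct far_ptr {
    uint32_t seg_base;
    uint32_t offset;
};

/* Near read: the callee only gets an offset and always assumes DS. */
uint8_t read_near(uint32_t ds_base, uint32_t offset)
{
    assert(ds_base + offset < MEM_SIZE);
    return memory[ds_base + offset];
}

/* Far read: the callee uses whichever segment the caller passed along. */
uint8_t read_far(struct far_ptr p)
{
    assert(p.seg_base + p.offset < MEM_SIZE);
    return memory[p.seg_base + p.offset];
}
```

If a byte is stored at offset 0x10 of a stack segment based at 0x40, `read_far` finds it, while `read_near` with a data segment based at 0x80 reads a different byte. That is the mismatch puts would hit if SS and DS had different bases.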

Re: compiler, stack

Posted: Wed Mar 06, 2002 3:14 am
by mansonbound
OK... so how can I prevent the stack from destroying data when ds == ss?

Re: compiler, stack

Posted: Wed Mar 06, 2002 7:10 am
by Tim
Put ESP a long way away from your code and data.

If you're using paging it's easy to stop the stack overflowing and writing over your code and/or data: put some nonpresent pages between your data and the bottom of your stack.
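A minimal sketch of the nonpresent-pages idea, assuming a single 32-bit x86 page table held as an array of 1024 entries. The function name and constants are made up for illustration, and a real kernel would also have to flush the affected TLB entries, which is left out here.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_PRESENT 0x1u    /* bit 0 of a 32-bit x86 page-table entry */
#define PAGE_SIZE   4096u
#define PT_ENTRIES  1024u   /* entries in one x86 page table */

/* Clear the present bit on `count` entries starting at index `first`.
 * Any later access to those pages raises a page fault, so a stack that
 * grows down into them is caught before it reaches the data below. */
void make_guard_pages(uint32_t *page_table, uint32_t first, uint32_t count)
{
    assert(first <= PT_ENTRIES && count <= PT_ENTRIES - first);
    for (uint32_t i = first; i < first + count; i++)
        page_table[i] &= ~PTE_PRESENT;
}
```

The guard entries would sit between the last data page and the lowest stack page, so the stack can use everything above them and faults as soon as it grows past its bottom.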

If you're using segmentation then there's no good way of detecting stack overflows. If you've ever read Operating Systems: Design and Implementation by Tanenbaum, you'll see that this is the same problem he had with Minix. He got his kernel to check the stack pointer on every syscall, although he was aiming for compatibility with every CPU from the 8086 upwards.
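A check of that kind might look roughly like the following. This is not the Minix code, just a sketch: `struct proc` and its fields are hypothetical, and it assumes the stack grows down from `stack_top` towards the end of the data area.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-process record kept by the kernel. */
struct proc {
    uint32_t data_end;   /* first address above the process's data */
    uint32_t stack_top;  /* highest address the stack may use */
};

/* Called on syscall entry with the saved user stack pointer.
 * Returns 1 if the stack is still clear of the data area, 0 if the
 * stack pointer has run into the data or left the stack region. */
int stack_pointer_ok(const struct proc *p, uint32_t sp)
{
    assert(p->data_end < p->stack_top);
    return sp > p->data_end && sp <= p->stack_top;
}
```

The weakness is the one Tim mentions: the check only runs at syscall time, so a stack that overflows between two syscalls can already have overwritten data before anything notices.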

On the whole it's easier to go for a flat paged architecture, especially if you're using a compiler like gcc.