ELF, newlib, gcc... crash [solved]

TyrelHaveman
Member
Posts: 40
Joined: Thu Sep 20, 2007 11:20 pm
Location: Bellingham, WA

Post by TyrelHaveman »

Hi All,
I've got my own toolchain built with binutils, gcc, and newlib. In newlib I've created a very basic syscalls.c as described in the wiki. The only calls that are implemented to any real degree are write() and _exit().
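A stub of this kind might look roughly like the sketch below. It is not the poster's syscalls.c or the wiki's version: `stub_write`, `console_putc` and the in-memory console buffer are hypothetical stand-ins, and a real port would name the function write() and hand each byte to the kernel instead.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical console sink: a real port would trap into the kernel here. */
static char   console_buf[256];
static size_t console_len = 0;

static void console_putc(char c)
{
    if (console_len < sizeof console_buf - 1) {
        console_buf[console_len++] = c;
        console_buf[console_len] = '\0';
    }
}

/* Stand-in for write(): only stdout (1) and stderr (2) are supported. */
int stub_write(int fd, const char *buf, int len)
{
    if (fd != 1 && fd != 2) {
        errno = EBADF;          /* every other descriptor is rejected */
        return -1;
    }
    if (buf == NULL || len < 0) {
        errno = EINVAL;
        return -1;
    }
    for (int i = 0; i < len; i++)
        console_putc(buf[i]);
    return len;                 /* report all bytes as written */
}
```

Returning the byte count matters because stdio treats a short or failed write as an error and may retry or set the stream's error flag.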

write() seems to be working now: I'm able to output to my console from a program. However, I'm having a problem around _exit(). The failure actually happens before _exit() is called, while exit() is still running. exit() is either called explicitly by my code or implicitly after main() returns, because _start consists of calling main and then exit (I believe this comes from the crt0.o provided by GCC, yes?).

The problem appears to be that at some point during exit(), before _exit() is called, a call is made through an invalid pointer. I get a general protection fault with EIP set to 0x00000001, which is not a valid address in my OS (user code starts at 0x80000000).

This is what exit() looks like in newlib:

Code:

/*
 * Exit, flushing stdio buffers if necessary.
 */

void
_DEFUN (exit, (code),
        int code)
{
  __call_exitprocs (code, NULL);

  if (_GLOBAL_REENT->__cleanup)
    (*_GLOBAL_REENT->__cleanup) (_GLOBAL_REENT);
  _exit (code);
}
This is the output of my GPF handler:

Code:

exceptions.c:isr_13:330: EXCEPTION: General Protection Exception (Triple Fault), 00000000

exceptions.c:dump_registers:108: EAX=FFFFFFFF EBX=00000000 ECX=00010000 EDX=00010000
exceptions.c:dump_registers:109: EIP=00000001 ESP(k)=00236FE4 EBP=DFFFFFB8 ESP(u)=DFFFFFA0
exceptions.c:dump_registers:110: EDI=0 ESI=00000000
exceptions.c:dump_registers:111: CS=0000001B SS=00000023 DS=00000023 ES=00000023 FS=00000023 GS=00000023
exceptions.c:dump_stack:116: User Stack dump:
exceptions.c:dump_stack:157: DFFFFFA0   Argument: 00000000 (0)
exceptions.c:dump_stack:157: DFFFFFA4   Argument: 00000000 (0)
exceptions.c:dump_stack:141: DFFFFFA8 Return address: 80007260
exceptions.c:dump_stack:145: DFFFFFAC   Argument: DFFFFFCC (pointer to stack)
exceptions.c:dump_stack:157: DFFFFFB0   Argument: 00000000 (0)
exceptions.c:dump_stack:157: DFFFFFB4   Argument: 00000000 (0)
exceptions.c:dump_stack:145: DFFFFFB8   Argument: DFFFFFD8 (pointer to stack)
exceptions.c:dump_stack:141: DFFFFFBC Return address: 80000174
exceptions.c:dump_stack:157: DFFFFFC0   Argument: 00000000 (0)
exceptions.c:dump_stack:157: DFFFFFC4   Argument: 00000001 (1)
exceptions.c:dump_stack:145: DFFFFFC8   Argument: DFFFFFFC (pointer to stack)
exceptions.c:dump_stack:157: DFFFFFCC   Argument: 00000000 (0)
exceptions.c:dump_stack:157: DFFFFFD0   Argument: 00200034 (2097204)
exceptions.c:dump_stack:145: DFFFFFD4   Argument: DFFFFFE4 (pointer to stack)
exceptions.c:dump_stack:157: DFFFFFD8   Argument: 00000000 (0)
exceptions.c:dump_stack:141: DFFFFFDC Return address: 80000085
exceptions.c:dump_stack:141: DFFFFFE0 Return address: 80000085
exceptions.c:dump_stack:157: DFFFFFE4   Argument: 00000001 (1)
exceptions.c:dump_stack:145: DFFFFFE8   Argument: DFFFFFFC (pointer to stack)
exceptions.c:dump_stack:157: DFFFFFEC   Argument: 00000000 (0)
exceptions.c:dump_stack:157: DFFFFFF0   Argument: 00000000 (0)
exceptions.c:dump_stack:157: DFFFFFF4   Argument: 00000000 (0)
exceptions.c:dump_stack:157: DFFFFFF8   Argument: 00000000 (0)
exceptions.c:dump_stack:157: DFFFFFFC   Argument: 5B00005F (1526726751)
exceptions.c:dump_stack:163: Dump complete.
process.c:end_current_thread:459: Ending thread 002148CC
_start is at 0x80000080. Where my stack dump says "Return address", the value could equally be a function pointer. The last address pushed on the stack appears to be 0x80007260. There is no code at that address: it lies just past the last function in the ELF executable's code, which is in the .fini section (which I do not call explicitly) and which just calls __do_global_dtors_aux.
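The labelling in a dump like this can be sketched as a simple range check on each stack word. This is only an illustration, not the code in exceptions.c: `USER_CODE_BASE` comes from the post, while `USER_CODE_END`, `USER_STACK_LOW`, `USER_STACK_TOP` and `classify_stack_word` are hypothetical names with values chosen to match the dump.

```c
#include <stdint.h>

/* Assumed layout: only USER_CODE_BASE is stated in the post; the other
 * bounds are hypothetical values picked to match the dump. */
#define USER_CODE_BASE  0x80000000u
#define USER_CODE_END   0x80007260u   /* first address past the last function */
#define USER_STACK_LOW  0xDFFF0000u
#define USER_STACK_TOP  0xE0000000u

typedef enum { WORD_VALUE, WORD_CODE, WORD_STACK } word_kind;

/* Guess what a 32-bit word found on the user stack most likely is. */
word_kind classify_stack_word(uint32_t w)
{
    if (w >= USER_CODE_BASE && w < USER_CODE_END)
        return WORD_CODE;    /* return address or function pointer */
    if (w >= USER_STACK_LOW && w < USER_STACK_TOP)
        return WORD_STACK;   /* saved frame pointer or pointer to a local */
    return WORD_VALUE;       /* plain argument or data */
}
```

With an exclusive upper bound taken from the end of the loaded code, 0x80007260 would no longer be labelled as a code address, which makes a suspicious value stand out in the dump.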

Let me know if there's any additional info I can provide that might help diagnose this. I've been stuck on it for a little while, and I'm a little bit mystified. I previously had a problem that turned out to be related to not setting up the stack properly (with command line args etc.) at startup. I believe I'm doing that correctly now (you can see it at the beginning of the stack), but maybe it's still slightly wrong. Or maybe this is occurring because I haven't fully implemented one of the syscalls.c functions. I really don't know.

Thanks very much in advance for any insight you can provide!
Last edited by TyrelHaveman on Tue Dec 09, 2008 8:03 pm, edited 1 time in total.
AJ
Member
Posts: 2646
Joined: Sun Oct 22, 2006 7:01 am
Location: Devon, UK

Re: ELF, newlib, gcc... crash

Post by AJ »

Hi,

I'm not familiar with the workings of newlib, but have a look at the function called by:

Code:

__call_exitprocs (code, NULL);
Is it possible this is calling an array of stored exit function pointers? If so, and for some reason one of those function pointers is NULL, you are attempting to run code at the invalid address of 0x00000000. For some reason, this is perhaps resulting in an invalid opcode --> double fault --> triple fault?
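The idea of a table of stored exit function pointers can be illustrated with a simplified model. This is a sketch only and not newlib's actual __call_exitprocs; `register_exit_proc`, `run_exit_procs` and the demo handlers are hypothetical names.

```c
#include <stddef.h>

#define MAX_EXIT_PROCS 32

typedef void (*exit_fn)(void);

static exit_fn exit_procs[MAX_EXIT_PROCS];
static int     exit_proc_count = 0;

/* Register a handler; returns 0 on success, -1 on NULL or a full table. */
int register_exit_proc(exit_fn fn)
{
    if (fn == NULL || exit_proc_count >= MAX_EXIT_PROCS)
        return -1;
    exit_procs[exit_proc_count++] = fn;
    return 0;
}

/* Run handlers in reverse registration order; returns how many ran. */
int run_exit_procs(void)
{
    int ran = 0;
    while (exit_proc_count > 0) {
        exit_fn fn = exit_procs[--exit_proc_count];
        if (fn != NULL) {       /* skip empty slots instead of jumping to 0 */
            fn();
            ran++;
        }
    }
    return ran;
}

/* Demo handlers that record the order in which they were called. */
static int demo_trace[4];
static int demo_trace_len = 0;
void demo_handler_a(void) { demo_trace[demo_trace_len++] = 1; }
void demo_handler_b(void) { demo_trace[demo_trace_len++] = 2; }
```

A NULL check only catches empty slots. If the table or its count is uninitialised or corrupted, a slot can hold a garbage value such as 0x00000001, and the call still faults.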

Perhaps you can discover more by refining your exception handlers to provide more debug information. Also, with your system call int handler, make sure that you handle unknown syscall numbers gracefully, rather than just having a sparse function pointer array, so you can tell if user code is attempting to call an invalid syscall #.
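One way to handle unknown syscall numbers gracefully is to bounds-check the number and test the table slot before calling through it. The sketch below assumes a hypothetical numbering (`SYS_EXIT`, `SYS_WRITE`, `SYS_MAX`) and stub handlers; none of these names come from the poster's kernel.

```c
#include <errno.h>
#include <stddef.h>

typedef long (*syscall_fn)(long a0, long a1, long a2);

/* Hypothetical syscall numbers; the table is deliberately sparse. */
enum { SYS_EXIT = 1, SYS_WRITE = 4, SYS_MAX = 8 };

/* Placeholder handlers: a real kernel would do the actual work here. */
static long sys_exit_stub(long code, long a1, long a2)
{
    (void)a1; (void)a2;
    return code;
}

static long sys_write_stub(long fd, long buf, long len)
{
    (void)fd; (void)buf;
    return len;
}

static const syscall_fn syscall_table[SYS_MAX] = {
    [SYS_EXIT]  = sys_exit_stub,
    [SYS_WRITE] = sys_write_stub,
};

/* Reject out-of-range numbers and empty slots instead of calling NULL. */
long dispatch_syscall(unsigned long num, long a0, long a1, long a2)
{
    if (num >= SYS_MAX || syscall_table[num] == NULL)
        return -ENOSYS;
    return syscall_table[num](a0, a1, a2);
}
```

The rejection branch is also a natural place to log the offending number, so that user code calling an unimplemented syscall shows up in the kernel output.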

Cheers,
Adam
JamesM
Member
Posts: 2935
Joined: Tue Jul 10, 2007 5:27 am
Location: York, United Kingdom

Re: ELF, newlib, gcc... crash

Post by JamesM »

Hi,

You shouldn't be using the "standard" crt0.o. There is no such thing - if you create an i686-elf toolchain you don't even get one, which means the one you have was made for a different OS, such as Linux. You should create your own; then you know *exactly* what is being called.

exit() runs several procedures, most importantly _atexit() which closes file descriptors etc. You could look into that too.

Cheers,

James
TyrelHaveman
Member
Posts: 40
Joined: Thu Sep 20, 2007 11:20 pm
Location: Bellingham, WA

Re: ELF, newlib, gcc... crash

Post by TyrelHaveman »

JamesM wrote: Hi,

You shouldn't be using the "standard" crt0.o. There is no such thing - if you create an i686-elf toolchain you don't even get one, which means the one you have was made for a different OS, such as Linux. You should create your own; then you know *exactly* what is being called.
Well crt0.o is coming from somewhere, either newlib or gcc, I can't remember which right now. I am getting crtbegin and such from gcc. In any case, the _start function I'm getting out of whatever crt0 looks perfectly fine (call main, call exit), so I'm not concerned that that might be causing problems.

I tried to read __call_exitprocs (code, NULL); and found the code hard to follow. It was unclear where entries get added to the list, among other things. I will look at it again tonight and see if I can find anything. I was hoping someone familiar with newlib would know what specifically it expects to be implemented, set up on the stack, or otherwise provided.

Thanks,
Tyrel
JamesM
Member
Posts: 2935
Joined: Tue Jul 10, 2007 5:27 am
Location: York, United Kingdom

Re: ELF, newlib, gcc... crash

Post by JamesM »

Hi,

It is coming from gcc. What did you target your toolchain at? It's obviously bringing in a crt0.o and it's not what you want. If you're using newlib you'll need to call _init_signal() in crt0.c and a stock one definitely won't do that - you're lucky it's not calling some random syscall specific to whatever OS it was designed for.

James
TyrelHaveman
Member
Posts: 40
Joined: Thu Sep 20, 2007 11:20 pm
Location: Bellingham, WA

Re: ELF, newlib, gcc... crash

Post by TyrelHaveman »

JamesM wrote: Hi,

It is coming from gcc. What did you target your toolchain at? It's obviously bringing in a crt0.o and it's not what you want. If you're using newlib you'll need to call _init_signal() in crt0.c and a stock one definitely won't do that - you're lucky it's not calling some random syscall specific to whatever OS it was designed for.

James
I created my own target, called i586-pc-zebu, which is what I'm using to build binutils, gcc, and newlib, exactly as described in the wiki except that I'm also building the libgcc stuff. http://wiki.osdev.org/OS_Specific_Toolchain

I just realized that that article actually shows how to write crt0.S, and I obviously did that as well, but have since forgotten about it. I read the part about adding _init_signal -- looks like I should be doing that, and I'm not sure why I didn't. I will add that tonight and see if it fixes my problem. I will post another reply and let you know.

Thanks for your help!

Tyrel
TyrelHaveman
Member
Posts: 40
Joined: Thu Sep 20, 2007 11:20 pm
Location: Bellingham, WA

Re: ELF, newlib, gcc... crash

Post by TyrelHaveman »

JamesM wrote: Hi,

It is coming from gcc. What did you target your toolchain at? It's obviously bringing in a crt0.o and it's not what you want. If you're using newlib you'll need to call _init_signal() in crt0.c and a stock one definitely won't do that - you're lucky it's not calling some random syscall specific to whatever OS it was designed for.

James
This fixed it. I just added call _init_signal before call main in crt0.S in my newlib/sys/myos directory. It no longer crashes on exit.
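The resulting call order can be rendered in C as a rough sketch. The real crt0.S is assembly (call _init_signal, call main, call exit); the `fake_*` functions and `start_sequence` below are hypothetical stand-ins that only record the order in which the steps run.

```c
#include <stddef.h>

/* Trace of which step ran, so the order can be inspected. */
static char   startup_trace[8];
static size_t startup_len = 0;

static void record(char step)
{
    if (startup_len < sizeof startup_trace - 1) {
        startup_trace[startup_len++] = step;
        startup_trace[startup_len] = '\0';
    }
}

/* Stand-ins for _init_signal, main and exit; bodies are placeholders. */
static void fake_init_signal(void) { record('S'); }

static int fake_main(int argc, char **argv)
{
    (void)argc; (void)argv;
    record('M');
    return 0;
}

static void fake_exit(int status)
{
    (void)status;
    record('E');
}

/* C rendering of the call order in the fixed startup code. */
void start_sequence(int argc, char **argv)
{
    fake_init_signal();                 /* the added call, before main */
    fake_exit(fake_main(argc, argv));   /* then main, then exit with its result */
}
```

The point is only the ordering: library initialisation runs before main, and exit receives main's return value.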

Thanks again very much! Now I can get on to more exciting things.