Reading the ELF file

richi18007
Member
Member
Posts: 35
Joined: Mon Mar 07, 2011 1:41 pm

Reading the ELF file

Post by richi18007 »

Hello, everyone.
I have developed a loader (foolproof code, no errors). Now I have to develop a stub function that fills in the segment tables and related information so the loader can load the image. The test code is simple:

Code: Select all

void ELF_Print(char* msg);


char  s1[40] = "Hi ! This is the first string\n";

int main(int argc, char** argv)
{
   char  s2[40] = "Hi ! This is the second string\n";

   ELF_Print(s1);
   ELF_Print(s2);

   return 0;
}
And this is what I do in the stub function

Code: Select all

int Parse_ELF_Executable(
	char *exeFileData,
	ulong_t exeFileLength,
    struct Exe_Format *exeFormat)
{	
	//var declaration
	int i;
	
	//map an elfHeader to the exeFileData
	elfHeader* eh = (elfHeader*) exeFileData;
	//check the elf-identification
	//reject the file if ANY of the four magic bytes differs
	if(eh->ident[0] != 0x7F || 
			eh->ident[1] != 'E' ||
			eh->ident[2] != 'L' ||
			eh->ident[3] != 'F')
	{
		//TODO is it necessary, that the other bytes are checked too?
		Print("\nnot an executable file\n");
		return ENOEXEC;
	}
	if(eh->ident[4] == 0 ||
			eh->ident[4] == 2)
	{
		Print("\ninvalid class or 64bit object\n");
		return ENOEXEC;
	}
	
	//check header for correct data
	/*if(eh->type != 2)
	{ //executable type - first we only implement exec files
		Print("\nunimplemented executable type: %d\n", eh->type);
		return ENOEXEC;
	}*/
	/*if(eh->machine != 3)
	{ //machine type (EM_386 is 3) - we implement this on x86 machines
		Print("\nunsupported machine type: %d\n", eh->machine);
		return ENOEXEC;
	}*/
	
	//fill in data to exeFormat
	//- set nr. of exeSegments
	exeFormat->numSegments = eh->phnum;
	//- set the entry address
	exeFormat->entryAddr = eh->entry;
	//- get exeSegments
	if(eh->phoff == 0)
	{
		Print("\nno program header available\n");
		return ENOEXEC;
	}
	
	int nextProgramHeaderOffset = (eh->phoff);
	for(i=0; i<eh->phnum; i++)
	{
		if(i >= EXE_MAX_SEGMENTS)
		{ //the number of segments must be less or equals the maximum numbers set for the exeFormat
			Print("\nmaximum number of segments exceeded\n");
			return ENOEXEC;
		}
		
		//map the programHeader to the exeFileData
		programHeader* ph = (programHeader*) (exeFileData+nextProgramHeaderOffset);
	
		//fill in the sections in the exeFormat
		(exeFormat->segmentList[i]).offsetInFile = ph->offset;
		(exeFormat->segmentList[i]).lengthInFile = ph->fileSize;
		(exeFormat->segmentList[i]).startAddress = ph->vaddr;
		(exeFormat->segmentList[i]).sizeInMemory = ph->memSize;
		(exeFormat->segmentList[i]).protFlags	 = ph->flags;
		
		//get the address of the next programHeader
		nextProgramHeaderOffset += eh->phentsize;
	}
	
    return 0;
}
That is, I just copy the segments from the ELF file into these structures. That's all there is to do; linking (dynamic linking etc.) will all be handled by the linker.
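
A minimal sketch of the identification check, assuming the file starts with the standard four-byte ELF magic (0x7F, 'E', 'L', 'F'). The helper name elf_magic_ok is hypothetical and not part of the loader in this thread. A file should be rejected when any one of the four bytes differs, so the sketch compares all four at once:

```c
#include <string.h>

/* Hypothetical helper: returns 1 when the buffer starts with the ELF magic,
 * 0 otherwise. A mismatch in any single byte is enough to reject the file. */
int elf_magic_ok(const unsigned char *ident, unsigned long length)
{
    static const unsigned char magic[4] = { 0x7F, 'E', 'L', 'F' };

    if (ident == NULL || length < sizeof magic)
        return 0;                 /* too short to hold the magic at all */
    return memcmp(ident, magic, sizeof magic) == 0;
}
```

Passing the file length as well lets the check refuse buffers that are too short to contain a header in the first place.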

The code prints the global string fine, but it does not print the local string. Is there something I'm missing? Or is it the loader?

Debugging showed that one of the segments is reported as "segment type 1685382481 offset 0". That does not look like a valid segment to me either.
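
For reference, 1685382481 is 0x6474e551 in hexadecimal, which is the value of PT_GNU_STACK in the common ELF headers: a GNU extension that describes stack permissions and is not a loadable segment. objdump lists such an entry as STACK. Below is a small sketch for naming program header types while debugging; the function name is hypothetical and only a few types are covered.

```c
#include <stdint.h>

/* Program header type values from the ELF specification, plus one GNU
 * extension. Only a handful are listed in this sketch. */
#define PT_NULL      0u
#define PT_LOAD      1u
#define PT_DYNAMIC   2u
#define PT_GNU_STACK 0x6474e551u

/* Hypothetical debugging helper: maps a p_type value to a readable name. */
const char *segment_type_name(uint32_t type)
{
    switch (type) {
    case PT_NULL:      return "PT_NULL";
    case PT_LOAD:      return "PT_LOAD";
    case PT_DYNAMIC:   return "PT_DYNAMIC";
    case PT_GNU_STACK: return "PT_GNU_STACK";
    default:           return "unknown";
    }
}
```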

Re: Reading the ELF file

Post by richi18007 »

So, I'm supposed to copy all the segments into this ELF executable structure passed as an argument. Do I need to copy only those headers which are to be loaded? Or should the loader (a separate function) do this work?
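
One possible arrangement, sketched under assumptions: the stub copies only PT_LOAD program headers and skips the rest, so the loader never sees entries that do not describe memory to be loaded. The field order follows the standard 32-bit ELF program header, but the names Phdr32, Segment and copy_loadable are hypothetical and do not come from the code in this thread.

```c
#include <stddef.h>
#include <stdint.h>

#define PT_LOAD 1u

/* Standard 32-bit ELF program header layout (field names are illustrative). */
typedef struct {
    uint32_t type, offset, vaddr, paddr, fileSize, memSize, flags, align;
} Phdr32;

/* Illustrative stand-in for one entry of the loader's segment list. */
typedef struct {
    uint32_t offsetInFile, lengthInFile, startAddress, sizeInMemory, protFlags;
} Segment;

/* Copies loadable headers into out[]. Returns the number copied, or -1 if
 * there are more loadable segments than maxOut or a segment's file data
 * does not lie inside the file. */
int copy_loadable(const Phdr32 *ph, size_t phnum, uint32_t fileLength,
                  Segment *out, size_t maxOut)
{
    size_t i, n = 0;

    for (i = 0; i < phnum; i++) {
        if (ph[i].type != PT_LOAD)
            continue;                  /* skip stack markers, notes, etc. */
        if (n >= maxOut)
            return -1;
        if (ph[i].offset > fileLength ||
            ph[i].fileSize > fileLength - ph[i].offset)
            return -1;                 /* segment data is not inside the file */
        out[n].offsetInFile = ph[i].offset;
        out[n].lengthInFile = ph[i].fileSize;
        out[n].startAddress = ph[i].vaddr;
        out[n].sizeInMemory = ph[i].memSize;
        out[n].protFlags    = ph[i].flags;
        n++;
    }
    return (int)n;
}
```

Whether the filtering belongs in the stub or in the loader is a design choice. Doing it in the stub keeps the loader simple, at the cost of hiding the other headers from it.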

Also, I dug deeper into this and found out the following:

If I write char *str = "This is second string", it prints the result just fine.
But if I write char str[40] = "This is second string", it does not. Here's what happens after I load the segments into the data structure.

The following function is called:

Code: Select all

static int Spawn_Program(char *exeFileData, struct Exe_Format *exeFormat)
struct Exe_Format is where I loaded all the segments from the program headers, as shown in the code above.
For each segment, it finds the upper limit of the virtual memory (user space) address as

Code: Select all

ulong_t topva = segment->startAddress + segment->sizeInMemory;
if (topva > maxva)
{
    maxva = topva;
}
Then it works out the space required from 0 to maxva in terms of pages, plus the pages for the stack segment,
and allocates that much memory, which is then pointed to by the variable *VirSize*.
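
The page arithmetic can be sketched as follows, assuming 4 KiB pages (an assumption about the kernel, not something stated in the post). The names PAGE_SIZE_BYTES, round_up_to_page and pages_needed are hypothetical.

```c
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u   /* assumed page size */

/* Rounds an address up to the next page boundary. The sketch ignores
 * overflow for addresses within one page of the 32-bit limit. */
uint32_t round_up_to_page(uint32_t addr)
{
    return (addr + PAGE_SIZE_BYTES - 1u) & ~(PAGE_SIZE_BYTES - 1u);
}

/* Pages covering [0, maxva) plus the pages reserved for the stack. */
uint32_t pages_needed(uint32_t maxva, uint32_t stackBytes)
{
    return (round_up_to_page(maxva) + round_up_to_page(stackBytes))
           / PAGE_SIZE_BYTES;
}
```

For example, segments ending at 0x10a4 and 0x20ec give maxva = 0x20ec, which rounds up to 0x3000, i.e. three pages before any stack pages are added.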

Now, returning to the c.exe code to be loaded: it makes an ELF_Print library call to print the string. The called function does the following simple job:

Code: Select all

__asm__ __volatile__ ("int $0x90" : : "a" (msg));
That is, it puts the pointer to the string in eax and raises interrupt 0x90. This interrupt calls the print interrupt handler, also a very simple function; all it does is

Code: Select all

static void Printrap_Handler( struct Interrupt_State* state )
{
  char * msg = (char *)virtSpace + state->eax;

  Print(msg);

  g_needReschedule = true;
  return;
}
virtSpace points to the malloc'ed memory for the segments allocated previously (pages, not frames, from 0 to maxva).
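
A hedged sketch of how a handler could validate a user-supplied offset before turning it into a kernel pointer. It assumes the user address space is one contiguous block of `size` bytes starting at `base`, as described above. The function name and parameters are hypothetical, and this is not the kernel's actual code.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns a pointer to the NUL-terminated string at user offset `uaddr`,
 * or NULL if the offset lies outside the block or the string is not
 * terminated before the end of the block. */
const char *user_string(const char *base, uint32_t size, uint32_t uaddr)
{
    uint32_t i;

    if (base == NULL || uaddr >= size)
        return NULL;
    for (i = uaddr; i < size; i++) {
        if (base[i] == '\0')
            return base + uaddr;   /* terminator found inside the block */
    }
    return NULL;
}
```

A check like this turns a bad offset into a visible error instead of garbage output, which can make it easier to see which addresses the user program really passes in.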

So, I have two questions now:

1) char * works but char [] does not. It seems they are in different segments, because when a char [] value is passed to the interrupt handler, it resolves incorrectly to garbage.
2) What the hell is happening over here?
I am attaching the code for both files as well.
Attachments
lprog.c
The file which does all the stuff with interrupts etc. mentioned above

The code for c.exe is mentioned in the above posts
Solar
Member
Member
Posts: 7615
Joined: Thu Nov 16, 2006 12:01 pm
Location: Germany

Re: Reading the ELF file

Post by Solar »

Did you look at the compiled files to check your assumptions about the strings turning up in specific sections? (Hint: objdump / nm are of immense help with this.)
Every good solution is obvious once you've found it.

Re: Reading the ELF file

Post by richi18007 »

I loaded three segments, as verified by objdump:

Code: Select all

    LOAD off    0x00001000 vaddr 0x00001000 paddr 0x00001000 align 2**12
         filesz 0x000000a4 memsz 0x000000a4 flags r-x
    LOAD off    0x000010c0 vaddr 0x000020c0 paddr 0x000020c0 align 2**12
         filesz 0x00000028 memsz 0x0000002c flags rw-
   STACK off    0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**2
         filesz 0x00000000 memsz 0x00000000 flags rw-
Also, the disassembly gives

Code: Select all

  1055:	8b 44 24 1c          	mov    0x1c(%esp),%eax
    1059:	89 04 24             	mov    %eax,(%esp)
    105c:	e8 07 00 00 00       	call   1068 <ELF_Print>
Clearly, before calling ELF_Print(), all it does is push the return address on the stack and pass the address of the string in main, which was allocated on the stack (not pasting it for brevity ;) ).

But still, no conclusion :(

Re: Reading the ELF file

Post by Solar »

What I meant is, are the strings actually in the binary and where your code expects them to be?

Re: Reading the ELF file

Post by richi18007 »

Solar wrote: What I meant is, are the strings actually in the binary and where your code expects them to be?
Dude, I found out that all the addresses were correct except the s2[40] one. The debugger shows the address of the string to be printed as virtual_base_address+48856 (in decimal), while objdump shows the following sequence of instructions:

Code: Select all

 109d:	8d 44 24 18          	lea    0x18(%esp),%eax
    10a1:	89 04 24             	mov    %eax,(%esp)
    10a4:	e8 07 00 00 00       	call   10b0 <ELF_Print>
I know that it takes offset 0x18 from the stack pointer within the stack segment, puts the result in eax, and goes on. But what is the value of the stack segment? Here, the ebp value is 48820, but I still could not find the location of the string in the stack :(

I'm posting the relevant objdump output. Please help me :|

Code: Select all

/*
 * Entry point.  Calls user program's main() routine, then exits.
 */
void _Entry(void)
{
    1000:	55                   	push   %ebp
    1001:	89 e5                	mov    %esp,%ebp
    1003:	83 ec 18             	sub    $0x18,%esp

    /* Call main(); arguments won't be needed */
    main(0, 0);
    1006:	c7 44 24 04 00 00 00 	movl   $0x0,0x4(%esp)
    100d:	00 
    100e:	c7 04 24 00 00 00 00 	movl   $0x0,(%esp)
    1015:	e8 06 00 00 00       	call   1020 <main>

    /* make the inter-selector jump back */
  __asm__ __volatile__ ("leave");
    101a:	c9                   	leave  
  __asm__ __volatile__ ("lret");
    101b:	cb                   	lret   

}
    101c:	c9                   	leave  
    101d:	c3                   	ret    
    101e:	90                   	nop
    101f:	90                   	nop

00001020 <main>:

char  s1[40] = "Hi ! This is the first string\n";
//char  s2[40] = "Hi ! This is the second string\n";
int kkm;
int main(int argc, char** argv)
{
    1020:	55                   	push   %ebp
    1021:	89 e5                	mov    %esp,%ebp
    1023:	83 e4 f0             	and    $0xfffffff0,%esp
    1026:	83 ec 40             	sub    $0x40,%esp
   char  s2[40] = "Hi ! This is the second string\n"; 
    1029:	c7 44 24 18 48 69 20 	movl   $0x21206948,0x18(%esp)
    1030:	21 
    1031:	c7 44 24 1c 20 54 68 	movl   $0x69685420,0x1c(%esp)
    1038:	69 
    1039:	c7 44 24 20 73 20 69 	movl   $0x73692073,0x20(%esp)
    1040:	73 
    1041:	c7 44 24 24 20 74 68 	movl   $0x65687420,0x24(%esp)
    1048:	65 
    1049:	c7 44 24 28 20 73 65 	movl   $0x63657320,0x28(%esp)
    1050:	63 
    1051:	c7 44 24 2c 6f 6e 64 	movl   $0x20646e6f,0x2c(%esp)
    1058:	20 
    1059:	c7 44 24 30 73 74 72 	movl   $0x69727473,0x30(%esp)
    1060:	69 
    1061:	c7 44 24 34 6e 67 0a 	movl   $0xa676e,0x34(%esp)
    1068:	00 
    1069:	c7 44 24 38 00 00 00 	movl   $0x0,0x38(%esp)
    1070:	00 
    1071:	c7 44 24 3c 00 00 00 	movl   $0x0,0x3c(%esp)
    1078:	00 
   ELF_Print("1st one");
    1079:	c7 04 24 ba 10 00 00 	movl   $0x10ba,(%esp)
    1080:	e8 2b 00 00 00       	call   10b0 <ELF_Print>
   ELF_Print(s1);
    1085:	c7 04 24 e0 20 00 00 	movl   $0x20e0,(%esp)
    108c:	e8 1f 00 00 00       	call   10b0 <ELF_Print>
   ELF_Print("2nd one");
    1091:	c7 04 24 c2 10 00 00 	movl   $0x10c2,(%esp)
    1098:	e8 13 00 00 00       	call   10b0 <ELF_Print>
   ELF_Print(s2); 
    109d:	8d 44 24 18          	lea    0x18(%esp),%eax
    10a1:	89 04 24             	mov    %eax,(%esp)
    10a4:	e8 07 00 00 00       	call   10b0 <ELF_Print>

   return 0;
    10a9:	b8 00 00 00 00       	mov    $0x0,%eax
}
For everything else, the values here are the same as the values seen in the interrupt handler while debugging. But as mentioned above, I don't know how to compare the string allocated on the stack by the above code with the string address seen in the interrupt handler (i.e. 48856) :( Also, if you could give a link after explaining, that would be nice.
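
The stack arithmetic in the listing can be reproduced by hand. A sketch, assuming a flat 32-bit address space and using hypothetical helper names: main's prologue aligns esp down to 16 bytes and subtracts 0x40, and `lea 0x18(%esp),%eax` then yields esp + 0x18 as the address of s2. Working backwards from the observed 48856 gives esp = 48832 inside main, which is 16-byte aligned, as the prologue requires.

```c
#include <stdint.h>

/* esp after "and $0xfffffff0,%esp; sub $0x40,%esp", given the value that
 * ebp holds inside main (set by "mov %esp,%ebp"). */
uint32_t esp_after_prologue(uint32_t ebp)
{
    return (ebp & 0xfffffff0u) - 0x40u;
}

/* Address produced by "lea 0x18(%esp),%eax", i.e. where s2 starts. */
uint32_t s2_address(uint32_t esp)
{
    return esp + 0x18u;
}

/* Inverse: the esp that main must have had for a given s2 address. */
uint32_t esp_from_s2(uint32_t s2)
{
    return s2 - 0x18u;
}
```

These are offsets relative to the base of the stack segment, so whether adding them to the base of the allocated block is correct depends on how the loader sets up the user segments and the initial esp.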

Re: Reading the ELF file

Post by richi18007 »

I'm not using any linker script,
just a plain linker command:

ld -o "something" -Ttext 1000 ("start from 1000").
Can I see the stack and base pointer with any program/script? Even if you could just tell me the default value of esp/ebp, that would do.
Thanks in advance.
And please elaborate on your suggestions.
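
One way to look at frame and stack addresses from inside a C program built with GCC or Clang is the `__builtin_frame_address` builtin together with the address of a local variable. This is only a sketch with hypothetical function names. Inside a hobby kernel the values would be printed through its own Print routine, and the initial esp/ebp are whatever the loader sets up before jumping to the entry point rather than a fixed default.

```c
#include <stdint.h>

/* Frame address of this helper (GCC/Clang builtin). On x86 with frame
 * pointers enabled this normally corresponds to ebp; if the compiler
 * inlines the helper, the caller's frame is reported instead. */
uintptr_t current_frame_address(void)
{
    return (uintptr_t)__builtin_frame_address(0);
}

/* Address of a local variable, which lies close to the current stack
 * pointer. Only the numeric value is returned; it is never dereferenced. */
uintptr_t approximate_stack_pointer(void)
{
    volatile char marker = 0;
    return (uintptr_t)&marker;
}
```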