Reading the ELF file

Posted: Tue Apr 12, 2011 3:07 am
by richi18007
Hello, everyone.
I have developed a loader (foolproof code, no errors). Now I have to write a stub function that fills in the segment tables and related information so the loader can load the image. The test code is simple:

Code: Select all

void ELF_Print(char* msg);


char  s1[40] = "Hi ! This is the first string\n";

int main(int argc, char** argv)
{
   char  s2[40] = "Hi ! This is the second string\n";

   ELF_Print(s1);
   ELF_Print(s2);

   return 0;
}
And this is what I do in the stub function

Code: Select all

int Parse_ELF_Executable(
	char *exeFileData,
	ulong_t exeFileLength,
    struct Exe_Format *exeFormat)
{	
	//var declaration
	int i;
	
	//map an elfHeader to the exeFileData
	elfHeader* eh = (elfHeader*) exeFileData;
	//check the elf-identification
	if(eh->ident[0] != 0x7F ||
			eh->ident[1] != 'E' ||
			eh->ident[2] != 'L' ||
			eh->ident[3] != 'F')
	{
		//TODO is it necessary, that the other bytes are checked too?
		Print("\nnot an executable file\n");
		return ENOEXEC;
	}
	if(eh->ident[4] == 0 ||
			eh->ident[4] == 2)
	{
		Print("\ninvalid class or 64bit object\n");
		return ENOEXEC;
	}
	
	//check header for correct data
	/*if(eh->type != 2)
	{ //executable type - first we only implement exec files
		Print("\nunimplemented executable type: %d\n", eh->type);
		return ENOEXEC;
	}*/
	/*if(eh->machine != 2)
	{ //machine type - we implement this on x86 machines
		Print("\nunsupported machine type: %d\n", eh->machine);
		return ENOEXEC;
	}*/
	
	//fill in data to exeFormat
	//- set nr. of exeSegments
	exeFormat->numSegments = eh->phnum;
	//- set the entry address
	exeFormat->entryAddr = eh->entry;
	//- get exeSegments
	if(eh->phoff == 0)
	{
		Print("\nno program header available\n");
		return ENOEXEC;
	}
	
	int nextProgramHeaderOffset = (eh->phoff);
	for(i=0; i<eh->phnum; i++)
	{
		if(i >= EXE_MAX_SEGMENTS)
		{ //the number of segments must be less or equals the maximum numbers set for the exeFormat
			Print("\nmaximum number of segments exceeded\n");
			return ENOEXEC;
		}
		
		//map the programHeader to the exeFileData
		programHeader* ph = (programHeader*) (exeFileData+nextProgramHeaderOffset);
	
		//fill in the sections in the exeFormat
		(exeFormat->segmentList[i]).offsetInFile = ph->offset;
		(exeFormat->segmentList[i]).lengthInFile = ph->fileSize;
		(exeFormat->segmentList[i]).startAddress = ph->vaddr;
		(exeFormat->segmentList[i]).sizeInMemory = ph->memSize;
		(exeFormat->segmentList[i]).protFlags	 = ph->flags;
		
		//get the address of the next programHeader
		nextProgramHeaderOffset += eh->phentsize;
	}
	
    return 0;
}
That is, I just copy the segments from the ELF file into these structures. That is all there is to do; linking (dynamic linking etc.) will all be handled by the linker.

The code prints the global string fine, but it does not print the local string. Is there something I'm missing? Or is it the loader?

Debugging showed that one of the segments is reported as "segment type 1685382481 offset 0". This is not a valid segment either.

Re: Reading the ELF file

Posted: Tue Apr 12, 2011 7:30 am
by richi18007
So, I'm supposed to copy all the segments into this ELF executable structure passed as an argument. Do I need to copy only those headers that are to be loaded? Or should the loader (a separate function) do this work?

Also, I dug deeper into this and found the following:

If I write char *str = "This is second string", it prints the result just fine.
But if I write char str[40] = "This is second string", it does not. Here's what happens after I load the segments into the data structure.

The following function is called:

Code: Select all

static int Spawn_Program(char *exeFileData, struct Exe_Format *exeFormat)
struct Exe_Format is where I loaded all the segments from the program headers, as shown in the code above.
For each segment, it finds the upper limit of the virtual memory (user space) address as

Code: Select all

ulong_t topva = segment->startAddress + segment->sizeInMemory;
if (topva > maxva)
{
    maxva = topva;
}
Then it works out the space required from 0 to maxva in pages, plus the pages for the stack segment,
and allocates that much memory, which is then pointed to by the variable *VirSize*.

Now back to the c.exe code to be loaded. It makes an ELF_Print library call to print the string. The called function does the following simple job:

Code: Select all

__asm__ __volatile__ ("int $0x90" : : "a" (msg));
// That is, put the pointer to the string in eax and raise interrupt 0x90. This interrupt calls the print interrupt handler, also a very simple function; all it does is

Code: Select all

static void Printrap_Handler( struct Interrupt_State* state )
{
  char * msg = (char *)virtSpace + state->eax;

  Print(msg);

  g_needReschedule = true;
  return;
}
virtSpace points to the malloc'ed memory for the segments allocated previously (pages, not frames, from 0 to maxva).

So, I have two questions now:

1) char * works but char [] does not. It seems they are in different segments, because when a char [] value is passed to the interrupt handler, it resolves incorrectly to garbage.
2) What the hell is happening here?
I am also attaching the code for both files.

Re: Reading the ELF file

Posted: Tue Apr 12, 2011 7:35 am
by Solar
Did you look at the compiled files to check your assumptions about the strings turning up in specific sections? (Hint: objdump / nm are of immense help with this.)

Re: Reading the ELF file

Posted: Tue Apr 12, 2011 8:03 am
by richi18007
I loaded three segments, as verified by objdump:

Code: Select all

    LOAD off    0x00001000 vaddr 0x00001000 paddr 0x00001000 align 2**12
         filesz 0x000000a4 memsz 0x000000a4 flags r-x
    LOAD off    0x000010c0 vaddr 0x000020c0 paddr 0x000020c0 align 2**12
         filesz 0x00000028 memsz 0x0000002c flags rw-
   STACK off    0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**2
         filesz 0x00000000 memsz 0x00000000 flags rw-
Also, the disassembly gives

Code: Select all

  1055:	8b 44 24 1c          	mov    0x1c(%esp),%eax
    1059:	89 04 24             	mov    %eax,(%esp)
    105c:	e8 07 00 00 00       	call   1068 <ELF_Print>
Clearly, before calling ELF_Print(), all it does is push the return address on the stack and push the address of the string in main, which was allocated on the stack (not pasting it for brevity ;) ).

But still, no conclusion :(

Re: Reading the ELF file

Posted: Tue Apr 12, 2011 8:17 am
by Solar
What I meant is, are the strings actually in the binary and where your code expects them to be?

Re: Reading the ELF file

Posted: Tue Apr 12, 2011 9:19 am
by richi18007
Solar wrote: What I meant is, are the strings actually in the binary and where your code expects them to be?
Dude, I found that all the addresses were correct except the s2[40] one. The debugger shows the address of the string to be printed as virtual_base_address + 48856 (in decimal), while objdump shows the following sequence of instructions:

Code: Select all

 109d:	8d 44 24 18          	lea    0x18(%esp),%eax
    10a1:	89 04 24             	mov    %eax,(%esp)
    10a4:	e8 07 00 00 00       	call   10b0 <ELF_Print>
I know that it takes the entry at offset 0x18 in the stack segment, puts it in eax, and goes on. But what is the value of the stack segment? Here, the ebp value is 48820, but I still could not find the location of the string on the stack :(

I'm posting the relevant objdump output below. Please help me :|

Code: Select all

/*
 * Entry point.  Calls user program's main() routine, then exits.
 */
void _Entry(void)
{
    1000:	55                   	push   %ebp
    1001:	89 e5                	mov    %esp,%ebp
    1003:	83 ec 18             	sub    $0x18,%esp

    /* Call main(); arguments won't be needed */
    main(0, 0);
    1006:	c7 44 24 04 00 00 00 	movl   $0x0,0x4(%esp)
    100d:	00 
    100e:	c7 04 24 00 00 00 00 	movl   $0x0,(%esp)
    1015:	e8 06 00 00 00       	call   1020 <main>

    /* make the inter-selector jump back */
  __asm__ __volatile__ ("leave");
    101a:	c9                   	leave  
  __asm__ __volatile__ ("lret");
    101b:	cb                   	lret   

}
    101c:	c9                   	leave  
    101d:	c3                   	ret    
    101e:	90                   	nop
    101f:	90                   	nop

00001020 <main>:

char  s1[40] = "Hi ! This is the first string\n";
//char  s2[40] = "Hi ! This is the second string\n";
int kkm;
int main(int argc, char** argv)
{
    1020:	55                   	push   %ebp
    1021:	89 e5                	mov    %esp,%ebp
    1023:	83 e4 f0             	and    $0xfffffff0,%esp
    1026:	83 ec 40             	sub    $0x40,%esp
   char  s2[40] = "Hi ! This is the second string\n"; 
    1029:	c7 44 24 18 48 69 20 	movl   $0x21206948,0x18(%esp)
    1030:	21 
    1031:	c7 44 24 1c 20 54 68 	movl   $0x69685420,0x1c(%esp)
    1038:	69 
    1039:	c7 44 24 20 73 20 69 	movl   $0x73692073,0x20(%esp)
    1040:	73 
    1041:	c7 44 24 24 20 74 68 	movl   $0x65687420,0x24(%esp)
    1048:	65 
    1049:	c7 44 24 28 20 73 65 	movl   $0x63657320,0x28(%esp)
    1050:	63 
    1051:	c7 44 24 2c 6f 6e 64 	movl   $0x20646e6f,0x2c(%esp)
    1058:	20 
    1059:	c7 44 24 30 73 74 72 	movl   $0x69727473,0x30(%esp)
    1060:	69 
    1061:	c7 44 24 34 6e 67 0a 	movl   $0xa676e,0x34(%esp)
    1068:	00 
    1069:	c7 44 24 38 00 00 00 	movl   $0x0,0x38(%esp)
    1070:	00 
    1071:	c7 44 24 3c 00 00 00 	movl   $0x0,0x3c(%esp)
    1078:	00 
   ELF_Print("1st one");
    1079:	c7 04 24 ba 10 00 00 	movl   $0x10ba,(%esp)
    1080:	e8 2b 00 00 00       	call   10b0 <ELF_Print>
   ELF_Print(s1);
    1085:	c7 04 24 e0 20 00 00 	movl   $0x20e0,(%esp)
    108c:	e8 1f 00 00 00       	call   10b0 <ELF_Print>
   ELF_Print("2nd one");
    1091:	c7 04 24 c2 10 00 00 	movl   $0x10c2,(%esp)
    1098:	e8 13 00 00 00       	call   10b0 <ELF_Print>
   ELF_Print(s2); 
    109d:	8d 44 24 18          	lea    0x18(%esp),%eax
    10a1:	89 04 24             	mov    %eax,(%esp)
    10a4:	e8 07 00 00 00       	call   10b0 <ELF_Print>

   return 0;
    10a9:	b8 00 00 00 00       	mov    $0x0,%eax
}
For everything else, the values here are the same as the values in the interrupt handler while debugging. But as mentioned above, I don't know how to compare the string allocated on the stack by the above code with the value of the string address in the interrupt handler :( (i.e. 48856). Also, if you could give a link after explaining, that would be nice.

Re: Reading the ELF file

Posted: Thu Apr 14, 2011 11:15 pm
by richi18007
I'm not using any linker script,
just a plain linker command:

ld -o "something" -Ttext 1000 ("start from 1000")
Can I see the stack and base pointer with any program or script? Even if you could just tell me the default value of esp/ebp, that would do.
Thanks in advance.
And please elaborate on your suggestions.