
From file to execution - about sections and loaders

Posted: Thu May 28, 2015 10:59 pm
by hubris
I have a question about how the loader gets from a file to execution (see title). My understanding is limited because I cannot seem to find an explicit reference (although I know one must be out there somewhere).

So, by inference: parse the file looking for section descriptions, which typically give an offset and a length, and of course these are relative to the base address of the executable.

So I assume the role of the Loader is to load/parse the file, and then place the contents of the file (sections) at the appropriate addresses given by the section descriptors. None of this is needed of course if you have a flat binary image.
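
A minimal sketch of that idea in C, assuming a made-up descriptor layout: `struct section_desc` and `load_sections` are hypothetical names, not taken from any real format. Real formats such as ELF and PE carry the same three pieces of information in their own structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical section descriptor; each real format has its own layout. */
struct section_desc {
    uint32_t file_offset;  /* where the section's bytes start in the file */
    uint32_t size;         /* number of bytes to copy */
    uint32_t load_offset;  /* destination, relative to the image base */
};

/* Copy every section from the file buffer to its place in the image.
 * Returns 0 on success, -1 if a descriptor points outside either buffer. */
static int load_sections(const uint8_t *file, size_t file_len,
                         uint8_t *image, size_t image_len,
                         const struct section_desc *secs, size_t count)
{
    assert(file != NULL && image != NULL);
    for (size_t i = 0; i < count; i++) {
        const struct section_desc *s = &secs[i];
        if (s->file_offset > file_len || s->size > file_len - s->file_offset)
            return -1;  /* source range lies outside the file */
        if (s->load_offset > image_len || s->size > image_len - s->load_offset)
            return -1;  /* destination range lies outside the image */
        memcpy(image + s->load_offset, file + s->file_offset, s->size);
    }
    return 0;
}
```

With a flat binary the table is effectively a single entry with `file_offset` 0, which is why no parsing is needed in that case.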

Different formats store this information using differing internal structures.

So am I on the right path when I think I need a specific loader for each type of format?

P.S. So good to have people to talk to about this stuff; no one else within my circle is even vaguely interested. Introversion and OCD have their place.

Re: From file to execution - about sections and loaders

Posted: Fri May 29, 2015 12:00 am
by Schol-R-LEA
hubris wrote:I have a question about how the loader gets from a file to execution (see title). My understanding is limited because I cannot seem to find an explicit reference (although I know one must be out there somewhere).
The go-to reference works for this subject are Assemblers and Loaders by David Salomon and Linkers and Loaders by John R. Levine. While neither is entirely up to date, they are excellent resources for the subject. Levine's book specifically covers the COFF, PE, and ELF formats, which are the most relevant ones for most systems today (though it doesn't cover the 64-bit versions of them). The former book is freely available online as a PDF, while the manuscript for the latter (but not the final version) is freely available as well.
hubris wrote:So I assume the role of the Loader is to load/parse the file, and then place the contents of the file (sections) at the appropriate addresses given by the section descriptors. None of this is needed of course if you have a flat binary image.
Correct as far as it goes, though depending on the format and the linking parameters, it may have to resolve the addresses as well. While the default for ELF files (IIUC) is to resolve addresses at link time and load to a fixed location in the process' virtual memory space, it is also possible to define an executable so that the addresses are relocatable at load time. In the case of shared libraries (again, in my understanding), the address relocation is always done at runtime, on a per-process basis, because otherwise there could be conflicts in the shared memory locations of different shared libraries. What makes this possible is that the shared code sections can be mapped to different locations in different processes by the MMU.
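
To make the load-time case concrete, here is a rough sketch of the simplest kind of fixup, assuming the file supplies a list of image offsets that each hold a 32-bit absolute address. `apply_relocations` and its parameters are hypothetical; real ELF and PE relocation records carry more information (types, symbols, addends).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Add the difference between the actual and the preferred load address to
 * every 32-bit absolute address named in the fixup list. Values are read
 * and written in host byte order, which matches the image on x86. */
static void apply_relocations(uint8_t *image, size_t image_len,
                              const uint32_t *fixups, size_t count,
                              uint32_t preferred_base, uint32_t actual_base)
{
    /* Unsigned arithmetic wraps, so this also works when loading lower. */
    uint32_t delta = actual_base - preferred_base;

    assert(image != NULL && image_len >= sizeof(uint32_t));
    for (size_t i = 0; i < count; i++) {
        uint32_t value;
        assert(fixups[i] <= image_len - sizeof value);
        memcpy(&value, image + fixups[i], sizeof value);
        value += delta;
        memcpy(image + fixups[i], &value, sizeof value);
    }
}
```

If the image lands at its preferred base, `delta` is zero and the loop changes nothing, which is the "resolved at link time" case.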

Mind you, those cases where the addresses are resolved ahead of time are a lot easier for the loader to deal with, and not just because it doesn't need to resolve the addresses: as I understand it, the layout of such executables can be made in such a way that the 'loading' is done by mapping the executable section into memory pages, and then simply letting the MMU handle it. If anyone can confirm this, and provide more details, I would be grateful.

Also, executable formats usually carry additional information, such as constants to be loaded into the global R/O memory, the initial values of initialized globals, the size of the zero-filled global variable space (the .bss section), and so forth.
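
The .bss point can be sketched like this: the file records only how large the zero-filled area is, so a loader copies the bytes that are actually stored and clears the rest. `load_segment` is a hypothetical helper, and the two sizes stand in for whatever fields the format really uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Load one region whose in-memory size may exceed its size in the file.
 * The first file_size bytes come from the file; the remainder (the .bss
 * part) is zero-filled. Returns -1 if the sizes are inconsistent. */
static int load_segment(uint8_t *dest, size_t mem_size,
                        const uint8_t *src, size_t file_size)
{
    assert(dest != NULL && src != NULL);
    if (file_size > mem_size)
        return -1;
    memcpy(dest, src, file_size);
    memset(dest + file_size, 0, mem_size - file_size);
    return 0;
}
```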
hubris wrote:Different formats store this information using differing internal structures.

So am I on the right path when I think I need a specific loader for each type of format?
Well, yes, if you are going to support multiple formats. It is very unusual for any system to do that, though, unless one of the formats is a legacy format that is being phased out (e.g., AOUT under Linux circa 2000-2005 - technically, it is still possible to use AOUT, but there's hardly ever a reason to do so).

Re: From file to execution - about sections and loaders

Posted: Fri May 29, 2015 6:33 am
by hubris
Thank you for the pointers :lol: I am off to read the documents you have recommended, back in a week (I guess)

Re: From file to execution - about sections and loaders

Posted: Sat May 30, 2015 4:40 am
by hubris
Well, it was a good read for the reference, but I am still wandering in the dark. I have what I think is a successful loader which loads an executable, locates the entry point, and jumps to that entry point. QEMU and BOCHS both run, but I have a problem. In the newly loaded kernel the only thing I do is write a message directly to the video buffer at 0x000B8000. Nothing appears on the screen, but the cursor moves along as if it has displayed the characters.

I am using Cygwin to build a 32-bit kernel.

So I am wondering if anyone has had a similar issue?

Re: From file to execution - about sections and loaders

Posted: Sat May 30, 2015 5:09 am
by glauxosdev
Hi,

Are you sure you don't print black characters on black background? Are you sure you are in text mode? Are you sure you have understood how video memory in text mode works?

To help you a bit, the first byte of the video memory (0x000B8000) contains the ASCII code of the character that is meant to appear at the top-left corner. The second byte (0x000B8001) contains the attributes of this character. The third byte contains the ASCII code of the second character and the fourth contains its attributes.
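
As a small sketch of that layout in C: each screen position is one 16-bit cell, character in the low byte and attribute in the high byte. `vga_cell` and `vga_puts` are hypothetical names, and the buffer is passed in as a pointer so the same code can be pointed at 0x000B8000 in a kernel or at an ordinary array for testing; an 80x25 mode is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { VGA_COLS = 80, VGA_ROWS = 25 };  /* assumes 80x25 text mode */

/* Combine a character and its attribute into one 16-bit cell. On a
 * little-endian CPU the character lands in the first (even) byte. */
static uint16_t vga_cell(char ch, uint8_t attr)
{
    return (uint16_t)(((uint16_t)attr << 8) | (uint8_t)ch);
}

/* Write a string starting at cell index pos; returns the next free index.
 * In a kernel, buf would be (volatile uint16_t *)0xB8000. */
static size_t vga_puts(volatile uint16_t *buf, size_t pos,
                       const char *s, uint8_t attr)
{
    assert(buf != NULL && s != NULL);
    while (*s != '\0' && pos < (size_t)(VGA_COLS * VGA_ROWS))
        buf[pos++] = vga_cell(*s++, attr);
    return pos;
}
```

An attribute of 0x07 gives light grey on black; an attribute of 0x00 would be the black-on-black case asked about above.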

Hope this helps,
glauxosdev

Re: From file to execution - about sections and loaders

Posted: Sat May 30, 2015 8:20 am
by SpyderTL
And, just FYI, writing characters to video memory (0xb8000) won't move the cursor. You have to communicate with the VGA controller via I/O to move the cursor.
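
For reference, a sketch of what that I/O looks like, assuming the standard colour-mode CRTC ports (index 0x3D4, data 0x3D5) and an 80-column mode. To keep it runnable outside a kernel, the hypothetical `vga_cursor_writes` only computes the four port writes; in a kernel each pair would be sent with an `out` instruction in the order given.

```c
#include <assert.h>
#include <stdint.h>

/* One byte to be written to one I/O port. */
struct port_write {
    uint16_t port;
    uint8_t  value;
};

/* Fill out[0..3] with the writes that move the hardware cursor to
 * (row, col): select CRTC register 0x0F (cursor location low) and write
 * the low byte, then register 0x0E (cursor location high) and the high
 * byte of the linear cell index. */
static void vga_cursor_writes(unsigned row, unsigned col,
                              struct port_write out[4])
{
    assert(row < 25 && col < 80);
    uint16_t pos = (uint16_t)(row * 80 + col);

    out[0] = (struct port_write){0x3D4, 0x0F};
    out[1] = (struct port_write){0x3D5, (uint8_t)(pos & 0xFF)};
    out[2] = (struct port_write){0x3D4, 0x0E};
    out[3] = (struct port_write){0x3D5, (uint8_t)(pos >> 8)};
}
```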

Re: From file to execution - about sections and loaders

Posted: Sat May 30, 2015 12:20 pm
by glauxosdev
Hi,
writing characters to video memory (0x8b000) won't move the cursor.
Don't mess things up, video memory is at 0x000B8000. I suppose this is just a typo, so I don't blame you. But, please, edit your post.

Regards,
glauxosdev

Re: From file to execution - about sections and loaders

Posted: Sat May 30, 2015 6:10 pm
by hubris
To respond to the last three replies: I am sure of the mode, as I use it to print messages in protected mode prior to jumping to the kernel, so I know the mode is correct. As an additional piece of info, I have explicitly moved a character to the memory address as the first line of the kernel and it has appeared on the screen, so I know I am not printing black on black.
In my print routine I control the position of the cursor so that it remains synchronised with the printed characters, which is why I can see that my routine is called: the cursor moves as if the characters had been printed.
The area where I am expecting the message to appear has some characters that were printed using the int in real mode, so I am expecting them to be overwritten, and they are not; the cursor still moves over these previously printed characters and they remain visible. I have tried printing a binary zero character and the position is blanked out, so even if the string were composed of zeroes it should still overwrite the previously displayed characters.
I have read the (or one of the) articles on osdev about printing to the screen in protected mode, and one of the issues mentioned is not correctly setting up the .rodata section. However, the kernel stub I am using has only a .text section, which I explicitly name in the assembly, and no other sections appear in the object or executable files, so no clues there either.

Re: From file to execution - about sections and loaders

Posted: Sat May 30, 2015 6:28 pm
by gerryg400
I suspect you have a bug. It's quite difficult for us to imagine what it is without seeing all your code, understanding how you build it, and seeing what happens when it executes.

Re: From file to execution - about sections and loaders

Posted: Sat May 30, 2015 7:15 pm
by SpyderTL
glauxosdev wrote:Hi,
writing characters to video memory (0x8b000) won't move the cursor.
Don't mess things up, video memory is at 0x000B8000. I suppose this is just a typo, so I don't blame you. But, please, edit your post.

Regards,
glauxosdev
Yep. I seem to have a bit of dyslexia when it comes to hexadecimal...

Hex-lexia, if you will.


Re: From file to execution - about sections and loaders

Posted: Sat May 30, 2015 7:19 pm
by hubris
After some more exploration, I realised that I had not mentioned that my kernel is in PE format, which may have some bearing on the issue. I have been comparing the hexdump of my kernel with the information in the PE headers and they do not meet my expectations, which probably means I am not understanding the format correctly.

So what I am doing is parsing the headers to find the entry point offset and the image base address and using the base+entry point to identify the kernel address to jump to, which I do. I know this jump is correct because the BOCHS debugger shows me the correct instructions being executed at that address.

So the lines relevant to this discussion are

Code:

identity        db      'runner 02', 0      ; NUL-terminated string to display

global prolog
prolog:                                     ; entry point, directly after the string

The identity is the string I am attempting to display, so my entry point address is 0x00400214, and given there is only one section I calculate that the address of the string is 0x0040020A; this is the address that I can see being passed to the display routine. However, when I look at the hexdump this string is at 0x00000608, and when I inspect the bytes at address 0x00400608 they are the actual value, whereas the bytes at the calculated address 0x0040020A are all zeroes.

So this all points to something to do with the way I am interpreting how to load a PE format file I guess.
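
For anyone following along, the header parsing described above (entry point offset plus image base) can be sketched in C roughly as follows. This assumes a 32-bit PE32 image; the field offsets come from the published PE layout, while `pe32_find_entry`, `struct pe_entry`, and the `rd16`/`rd32` helpers are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Little-endian reads that do not depend on alignment or host order. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

struct pe_entry {
    uint32_t image_base;  /* preferred load address */
    uint32_t entry_rva;   /* entry point, relative to image_base */
};

/* Returns 0 and fills *out for a PE32 file, -1 for anything else. */
static int pe32_find_entry(const uint8_t *file, size_t len,
                           struct pe_entry *out)
{
    assert(file != NULL && out != NULL);
    if (len < 0x40 || file[0] != 'M' || file[1] != 'Z')
        return -1;                        /* no DOS header */
    uint32_t pe = rd32(file + 0x3C);      /* e_lfanew: offset of "PE\0\0" */
    if (pe > len || len - pe < 4 + 20 + 32)
        return -1;                        /* headers would run off the end */
    if (memcmp(file + pe, "PE\0\0", 4) != 0)
        return -1;
    const uint8_t *opt = file + pe + 4 + 20;  /* skip signature + COFF header */
    if (rd16(opt) != 0x10B)
        return -1;                        /* 0x10B = PE32; PE32+ differs */
    out->entry_rva  = rd32(opt + 16);     /* AddressOfEntryPoint */
    out->image_base = rd32(opt + 28);     /* ImageBase */
    return 0;
}
```

The address to jump to is then `image_base + entry_rva`, provided the sections really end up at the addresses the headers describe.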

Re: From file to execution - about sections and loaders-sol

Posted: Sat May 30, 2015 7:55 pm
by hubris
The clue was in the difference between what I expected for the addresses and what was actually being executed. My linker script positioned the .text section at 0x00400000, but the actual kernel.bin file located the .text section at 0x00400400 because of the prefixed header information. So the answer was to ensure that the LMA and the file address of the .text section matched: I changed the linker script to position the .text section at 0x00400400 and everything worked.
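
One way to check for this kind of mismatch, sketched in C under the assumption that the whole file is copied verbatim to a load address: translate an RVA to its file offset through the section table, then compare where that byte ends up with where the linker assumed it would be. `struct pe_section`, `rva_to_file_offset`, and `flat_load_matches` are hypothetical names; the three fields mirror VirtualAddress, SizeOfRawData, and PointerToRawData from the PE section header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The parts of a PE section header needed for address translation. */
struct pe_section {
    uint32_t virtual_address;  /* RVA of the section when loaded */
    uint32_t raw_size;         /* bytes stored in the file */
    uint32_t raw_offset;       /* file offset of those bytes */
};

/* Find the file offset holding the byte for rva. Returns 0 on success,
 * -1 if no section stores that RVA in the file. */
static int rva_to_file_offset(const struct pe_section *secs, size_t count,
                              uint32_t rva, uint32_t *file_off)
{
    assert(file_off != NULL);
    for (size_t i = 0; i < count; i++) {
        const struct pe_section *s = &secs[i];
        if (rva >= s->virtual_address &&
            rva - s->virtual_address < s->raw_size) {
            *file_off = s->raw_offset + (rva - s->virtual_address);
            return 0;
        }
    }
    return -1;
}

/* Nonzero when a file copied as-is to load_base places the byte for rva
 * at the address the linker assumed, image_base + rva. */
static int flat_load_matches(const struct pe_section *secs, size_t count,
                             uint32_t load_base, uint32_t image_base,
                             uint32_t rva)
{
    uint32_t off;
    if (rva_to_file_offset(secs, count, rva, &off) != 0)
        return 0;
    return load_base + off == image_base + rva;
}
```

When the check fails, the options are to copy each section to its own address instead of loading the file flat, or, as done here, to adjust the linker script so the two addresses coincide.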

thank you all for your help, even the dead ends are good because they remove possibilities sooner.