From file to execution - about sections and loaders
I have a question about how the loader goes from a file to execution (see title). My understanding is limited because I cannot seem to find an explicit reference (although I know one must be out there somewhere).
So, by inference: the loader parses the file looking for section descriptors, each of which typically gives an offset and a length, and of course these are relative to the base address of the executable.
So I assume the role of the loader is to load/parse the file, and then place the contents of the file (sections) at the appropriate addresses given by the section descriptors. None of this is needed, of course, if you have a flat binary image.
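As a rough illustration of the idea above, here is a minimal Python sketch of a loader copying sections into a memory image. The `Section` record and its field names are hypothetical stand-ins for whatever descriptor a real format provides, and memory is modelled as a `bytearray` rather than real address space.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Section:
    # Hypothetical descriptor; real formats use their own field names.
    file_offset: int   # where the section's bytes start in the file
    size: int          # number of bytes stored in the file
    load_offset: int   # destination, relative to the image base


def load_sections(file_bytes: bytes, sections: List[Section], image_size: int) -> bytearray:
    """Copy each section from the file into a zero-filled memory image."""
    image = bytearray(image_size)
    for s in sections:
        data = file_bytes[s.file_offset:s.file_offset + s.size]
        if len(data) != s.size:
            raise ValueError("section extends past end of file")
        if s.load_offset + s.size > image_size:
            raise ValueError("section does not fit in the image")
        image[s.load_offset:s.load_offset + s.size] = data
    return image
```

A flat binary is the degenerate case: one "section" at file offset 0 whose load offset is also 0, so no descriptor table is needed.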
Different formats store this information using differing internal structures.
So am I on the right path when I think I need a specific loader for each type of format?
PS: so good to have people to talk to about this stuff; no one else within my circle is even vaguely interested. Introversion and OCD have their place.
- Schol-R-LEA
Re: From file to execution - about sections and loaders
hubris wrote: I have a question about how the loader goes from a (see title). My understanding is limited because I cannot seem to find an explicit reference (although I know they must be out there somewhere).
The go-to reference works for this subject are Assemblers and Loaders by David Salomon and Linkers and Loaders by John R. Levine. While neither is entirely up to date, they are excellent resources for the subject. Levine's book specifically covers the COFF, PE, and ELF formats, which are the most relevant ones for most systems today (though it doesn't cover the 64-bit versions of them). The former book is freely available online as a PDF, while the manuscript for the latter (but not the final version) is freely available as well.
hubris wrote: So I assume the role of the Loader is to load/parse the file, and then place the contents of the file (sections) at the appropriate addresses given by the section descriptors. None of this is needed of course if you have a flat binary image.
Correct as far as it goes, though depending on the format and the linking parameters, it may have to resolve the addresses as well. While the default for ELF files (IIUC) is to resolve addresses at link time and load to a fixed location in the process' virtual memory space, it is also possible to define an executable so that the addresses are relocatable at load time. In the case of shared libraries (again, in my understanding), the address relocation is always done at runtime, on a per-process basis, because otherwise there could be conflicts in the shared memory locations of different shared libraries. What makes this possible is that the shared code sections can be mapped to different locations in different processes by the MMU.
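The load-time relocation mentioned above can be sketched roughly as follows. This is a simplified model and not any particular format's relocation scheme: it assumes the file supplies a list of offsets at which 32-bit little-endian absolute addresses are stored, and the loader adds the difference between the actual and the preferred base to each of them. The function and parameter names are made up for illustration.

```python
import struct


def apply_relocations(image: bytearray, reloc_offsets, preferred_base: int, actual_base: int) -> None:
    """Patch 32-bit absolute addresses in place after loading at a different base."""
    delta = actual_base - preferred_base
    for off in reloc_offsets:
        (addr,) = struct.unpack_from("<I", image, off)
        # Mask to 32 bits so the addition wraps as it would on hardware.
        struct.pack_into("<I", image, off, (addr + delta) & 0xFFFFFFFF)
```

When the image is loaded at its preferred base, `delta` is zero and the loop changes nothing, which is the "resolved at link time" case.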
Mind you, those cases where the addresses are resolved ahead of time are a lot easier for the loader to deal with, and not just because it doesn't need to resolve the addresses: as I understand it, the layout of such executables can be made in such a way that the 'loading' is done by mapping the executable section into memory pages, and then simply letting the MMU handle it. If anyone can confirm this, and provide more details, I would be grateful.
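On the mapping question, a small user-space sketch may help illustrate the mechanism, with the caveat that it only shows the host OS's file-mapping facility and not an actual kernel loader. It uses Python's `mmap` module to map a region of a file read-only instead of copying it; `map_segment` is a name made up for this example. File-backed mappings need the file offset aligned to the mapping granularity, which is presumably one reason linkers align the loadable parts of an executable within the file.

```python
import mmap


def map_segment(path: str, file_offset: int, length: int) -> mmap.mmap:
    """Map `length` bytes of the file starting at `file_offset`, read-only."""
    if file_offset % mmap.ALLOCATIONGRANULARITY != 0:
        raise ValueError("file offset must be aligned to the mapping granularity")
    with open(path, "rb") as f:
        # The mapping stays valid after the file object is closed.
        return mmap.mmap(f.fileno(), length, access=mmap.ACCESS_READ, offset=file_offset)
```

No bytes are copied up front here; pages are brought in on first access, which is the behaviour the paragraph above is asking about.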
Also, executable formats usually have additional information, such as constants to be loaded into the global R/O memory, the size of the zero-initialized global variable space (the .bss section), the initial values of initialized globals (typically the .data section), and so forth.
hubris wrote: Different formats store this information using differing internal structures. So am I on the right path when I think I need a specific loader for each type of format?
Well, yes, if you are going to support multiple formats. It is very unusual for any system to do that, though, unless one of the formats is a legacy format that is being phased out (e.g., AOUT under Linux circa 2000-2005 - technically, it is still possible to use AOUT, but there's hardly ever a reason to do so).
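If a system did support more than one format, the dispatch could look something like the sketch below. The magic numbers are the real ones (`\x7fELF` for ELF, `MZ` for the DOS stub that precedes a PE image), but the `LOADERS` table and its placeholder entries are purely hypothetical.

```python
def detect_format(header: bytes) -> str:
    """Guess the executable format from the first bytes of the file."""
    if header[:4] == b"\x7fELF":
        return "elf"
    if header[:2] == b"MZ":
        # DOS stub; a PE image also carries a "PE\0\0" signature further in.
        return "pe-or-dos"
    return "unknown"


# Hypothetical per-format loaders; each would parse its own header layout.
LOADERS = {
    "elf": lambda data: "would parse ELF program headers",
    "pe-or-dos": lambda data: "would parse PE section table",
}


def load(data: bytes) -> str:
    """Pick the loader that matches the file's magic number."""
    kind = detect_format(data)
    if kind not in LOADERS:
        raise ValueError("unsupported executable format")
    return LOADERS[kind](data)
```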
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
Re: From file to execution - about sections and loaders
Thank you for the pointers. I am off to read the documents you have recommended; back in a week (I guess).
Re: From file to execution - about sections and loaders
Well, the references were a good read, but I am still wandering in the dark. I have what I think is a successful loader which loads an executable, locates the entry point, and jumps to that entry point. QEMU and BOCHS both run, but I have a problem. In the newly loaded kernel the only thing I do is write a message directly to the video buffer at 0x000B8000. Nothing appears on the screen, but the cursor moves along as if the characters had been displayed.
I am using Cygwin to build a 32-bit kernel.
So I am wondering if anyone has had a similar issue?
Re: From file to execution - about sections and loaders
Hi,
Are you sure you don't print black characters on black background? Are you sure you are in text mode? Are you sure you have understood how video memory in text mode works?
To help you a bit, the first byte of the video memory (0x000B8000) contains the ASCII code of the character that is meant to appear at the top-left corner. The second byte (0x000B8001) contains the attributes of this character. The third byte contains the ASCII code of the second character and the fourth contains its attributes.
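The layout described above can be modelled in a few lines. In this sketch a `bytearray` stands in for the memory at 0x000B8000, 80 columns are assumed, and `put_char` is a made-up helper name; the default attribute 0x07 is light grey on black.

```python
COLS = 80  # standard 80x25 text mode


def put_char(vram: bytearray, row: int, col: int, ch: str, attr: int = 0x07) -> None:
    """Store a character cell: character byte first, attribute byte second."""
    off = (row * COLS + col) * 2
    vram[off] = ord(ch) & 0xFF
    vram[off + 1] = attr & 0xFF
```

An attribute of 0x00 would give black on black, which is the first possibility raised above.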
Hope this helps,
glauxosdev
Re: From file to execution - about sections and loaders
And, just FYI, writing characters to video memory (0xb8000) won't move the cursor. You have to communicate with the VGA controller via I/O to move the cursor.
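For reference, moving the hardware cursor in the colour text modes goes through the CRT controller at ports 0x3D4 (index) and 0x3D5 (data), using registers 0x0E and 0x0F for the high and low bytes of the cursor position. The sketch below only computes the sequence of port writes; the helper name is made up, and the actual `out` instructions would have to be issued by the kernel.

```python
CRTC_INDEX = 0x3D4  # CRT controller index port (colour modes)
CRTC_DATA = 0x3D5   # CRT controller data port


def cursor_port_writes(row: int, col: int, cols: int = 80):
    """Return the (port, value) writes that move the hardware cursor."""
    pos = row * cols + col
    return [
        (CRTC_INDEX, 0x0E), (CRTC_DATA, (pos >> 8) & 0xFF),  # cursor location high
        (CRTC_INDEX, 0x0F), (CRTC_DATA, pos & 0xFF),         # cursor location low
    ]
```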
Last edited by SpyderTL on Sat May 30, 2015 7:04 pm, edited 1 time in total.
Project: OZone
Source: GitHub
Current Task: LIB/OBJ file support
"The more they overthink the plumbing, the easier it is to stop up the drain." - Montgomery Scott
Re: From file to execution - about sections and loaders
Hi,
SpyderTL wrote: writing characters to video memory (0x8b000) won't move the cursor.
Don't mess things up, video memory is at 0x000B8000. I suppose this is just a typo, so I don't blame you. But, please, edit your post.
Regards,
glauxosdev
Re: From file to execution - about sections and loaders
To respond to the last three replies: I am sure of the mode, as I use it to print messages in protected mode prior to jumping to the kernel, so I know the mode is correct. As an additional piece of info, I have explicitly moved a character to the memory address and it appeared on the screen as the first line of the kernel, so I know I am not printing black on black.
In my print routine I control the position of the cursor so that it remains synchronised with the printed character, which is why I can see that my routine is being called: the cursor moves as if the character had been printed.
The area where I am expecting the message to appear has some characters that were printed using the int in real mode, so I am expecting them to be overwritten, and they are not. The cursor still moves over these previously printed characters, and they remain visible. I have tried printing a binary zero character and the position is blanked out, so even if the string were composed of zeroes it should still overwrite the previously displayed characters.
I have read the (or one of the) articles on osdev about printing to the screen in protected mode, and one of the issues mentioned is not correctly setting up the .rodata section. However, the kernel stub I am using has only a .text section, which I explicitly name in the assembly, and no other sections appear in the object or executable files, so no clues there either.
Re: From file to execution - about sections and loaders
I suspect you have a bug. It's quite difficult for us to imagine what it is without seeing all your code, understanding how you build it, and seeing what happens when it executes.
If a trainstation is where trains stop, what is a workstation ?
Re: From file to execution - about sections and loaders
glauxosdev wrote: Don't mess things up, video memory is at 0x000B8000. I suppose this is just a typo, so I don't blame you. But, please, edit your post.
Yep. I seem to have a bit of dyslexia when it comes to hexadecimal...
Hex-lexia, if you will.
Re: From file to execution - about sections and loaders
After some more exploration, I realised that I had not mentioned that my kernel is in PE format, which may have some bearing on the issue. I have been comparing the hexdump of my kernel with the information in the PE headers and they do not meet my expectations, which probably means I am not understanding the format correctly.
So what I am doing is parsing the headers to find the entry point offset and the image base address, and using base + entry point as the kernel address to jump to. I know this jump is correct because the BOCHS debugger shows me the correct instructions being executed at that address.
So the lines relevant to this discussion are:
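Code:
identity db 'runner 02', 0   ; NUL-terminated string, 10 bytes, placed just before the entry point
global prolog                ; export the entry point symbol for the linker
prolog:                      ; kernel entry point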
The identity is the string I am attempting to display. My entry point address is 0x00400214, and given that there is only one section, I calculate that the address of the string is 0x0040020A; this is the address that I can see being passed to the display routine. However, when I look at the hexdump, this string is at 0x00000608, and when I inspect the bytes at address 0x00400608 they hold the actual string, whereas the bytes at the calculated address 0x0040020A are all zeroes.
So this all points to a problem with the way I am interpreting how to load a PE format file, I guess.
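For anyone following along, the header parsing described in this post (image base plus entry point offset) might look roughly like the sketch below. It assumes a 32-bit PE32 image; the offsets used are the documented ones (`e_lfanew` at 0x3C, a 20-byte COFF header after the 4-byte signature, `AddressOfEntryPoint` at offset 16 and `ImageBase` at offset 28 of the optional header). The function name is made up, and this is not the loader code from the thread.

```python
import struct


def pe_entry_address(data: bytes) -> int:
    """Return ImageBase + AddressOfEntryPoint for a PE32 image."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    opt = e_lfanew + 4 + 20  # skip signature and 20-byte COFF file header
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic != 0x10B:
        raise ValueError("not a PE32 optional header")
    (entry_rva,) = struct.unpack_from("<I", data, opt + 16)
    (image_base,) = struct.unpack_from("<I", data, opt + 28)
    return image_base + entry_rva
```

Note that the result is an address in the loaded image, not a position in the file; the two only coincide if the file is laid out that way.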
Re: From file to execution - about sections and loaders-sol
The clue was in the difference between the addresses I expected and what was actually being executed. My linker script positioned the .text section at 0x00400000, but in the actual kernel.bin file the .text section ended up at 0x00400400 because of the header information prefixed to it. The answer was to ensure that the LMA and the file address of the .text section matched, so I changed the linker script to position the .text section at 0x00400400, and everything worked.
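The mismatch described above comes down to the difference between where a byte sits in the file and where the linker assumed it would sit in memory. Here is a hedged sketch of that translation. The field names mirror the PE section header, the helper names are made up, and any numbers fed to it would be illustrative rather than taken from the kernel in this thread.

```python
from dataclasses import dataclass


@dataclass
class SectionHeader:
    # Field names mirror the PE section header; values come from the file.
    virtual_address: int      # RVA of the section once loaded
    size_of_raw_data: int     # bytes stored in the file
    pointer_to_raw_data: int  # file offset of those bytes


def rva_to_file_offset(rva: int, sections) -> int:
    """Translate an RVA into the file offset that holds its bytes."""
    for s in sections:
        if s.virtual_address <= rva < s.virtual_address + s.size_of_raw_data:
            return rva - s.virtual_address + s.pointer_to_raw_data
    raise ValueError("RVA is not backed by any section's raw data")


def flat_load_is_consistent(sections) -> bool:
    """True if copying the file verbatim to the image base puts every
    section where the linker expects it (file offset == RVA)."""
    return all(s.pointer_to_raw_data == s.virtual_address for s in sections)
```

A loader that copies the file verbatim only works when `flat_load_is_consistent` holds, which is effectively what the linker script change arranges; the alternative is to place each section individually using its header.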
Thank you all for your help; even the dead ends are good because they remove possibilities sooner.