The correct starting location of a program

icecoder · Post by **icecoder** » Sat Jun 26, 2010 2:19 am

Hi,
After several days spent reading the Intel manuals and learning how segmentation works in protected mode, I managed to set up a correct GDT and enter protected mode. Now I am facing a big question mark...

How does a C program know where it is located?
OK, that may sound like a strange question, so let me explain in a few steps what I want to do:

1) I created an LBR that looks in my FAT16 disk (image) for a file called BOOT.BIN.
2) BOOT.BIN enables A20, saves the BIOS memory map, sets up a GDT and enables protected mode.
3) Then it should load my C kernel from the disk image, using a small driver written on the fly for the occasion.

But imagine a useless kernel like this:

Code:

void other_code(void);

void main()
{
    goto hello;
    /* (2 megabytes of space here) */
hello:
    other_code();
}

There are 2 megabytes between the jump and the label, so it should use a far jump, but how does it compute the absolute address of the 'hello' label? In assembler I just add the starting address to the relative one to obtain the absolute address, but how does C know where the program is located?

Something to do with the loader?

(Hope you're able to understand what I tried to explain..)

EDIT: maybe "paging" is the answer?

thepowersgang · Post by **thepowersgang** » Sat Jun 26, 2010 5:06 am

The simple answer is, the linker tells it.

The way you actually load the C code is by using assembly code that reads it from disk, places it in memory at the address it was linked to, and passes control to it. You don't just create a binary with 2 MB of empty space and load all that.

I suggest reading the tutorials on the wiki to get a better grasp of how to do things.

icecoder · Post by **icecoder** » Sat Jun 26, 2010 5:45 am

thepowersgang wrote:The simple answer is, the linker tells it.

The way you actually load the C code is by using assembly code that reads it from disk and places it in memory at the place it was linked to and passes control to it. You don't just create a binary with 2mb of empty space and load all that.

I suggest reading the tutorials on the wiki to get a better grasp of how to do things.

Thank you for the reply, that's very useful. So, if I tell the linker to place it at 0x500, then after reading the header I have to load the whole thing at 0x500 and then execute it. Got it.

Obviously, I was not willing to create a file with 2 MB of empty space.
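
As a rough illustration of "read the header, place it where it was linked", here is a minimal sketch in C, assuming the kernel is a 32-bit ELF file. It is a hosted simulation, not bootloader code: `mem` is a hypothetical buffer standing in for physical memory starting at address 0, `load_elf32` is a name made up for this example, and the structure layouts follow the ELF32 specification. It does not validate offsets against the file size, so it trusts the image.

```c
#include <stdint.h>
#include <string.h>

#define PT_LOAD 1u /* program header type: loadable segment */

/* ELF32 file header, laid out as in the ELF specification (52 bytes). */
typedef struct {
    uint8_t  e_ident[16];
    uint16_t e_type, e_machine;
    uint32_t e_version, e_entry, e_phoff, e_shoff, e_flags;
    uint16_t e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx;
} Elf32_Ehdr;

/* ELF32 program header (32 bytes): one per segment. */
typedef struct {
    uint32_t p_type, p_offset, p_vaddr, p_paddr;
    uint32_t p_filesz, p_memsz, p_flags, p_align;
} Elf32_Phdr;

/* Copies every PT_LOAD segment of `image` to the address the linker chose
   for it (p_paddr), inside `mem`, which simulates memory from address 0.
   Returns the entry point from the header, or 0 if the image is rejected.
   Assumption: offsets inside `image` are trusted and not bounds-checked. */
uint32_t load_elf32(const uint8_t *image, uint8_t *mem, uint32_t mem_size)
{
    Elf32_Ehdr eh;
    memcpy(&eh, image, sizeof eh);
    if (memcmp(eh.e_ident, "\177ELF", 4) != 0)
        return 0; /* not an ELF file */

    for (uint16_t i = 0; i < eh.e_phnum; i++) {
        Elf32_Phdr ph;
        memcpy(&ph, image + eh.e_phoff + (uint32_t)i * eh.e_phentsize, sizeof ph);
        if (ph.p_type != PT_LOAD)
            continue;
        /* Reject segments that would not fit in the simulated memory. */
        if (ph.p_filesz > ph.p_memsz || ph.p_paddr > mem_size ||
            ph.p_memsz > mem_size - ph.p_paddr)
            return 0;
        /* Bytes present in the file are copied; the remainder (.bss) is zeroed. */
        memcpy(mem + ph.p_paddr, image + ph.p_offset, ph.p_filesz);
        memset(mem + ph.p_paddr + ph.p_filesz, 0, ph.p_memsz - ph.p_filesz);
    }
    return eh.e_entry;
}
```

In a real loader the destination would be physical memory itself rather than a buffer, and control would be passed to the returned entry point once every segment is in place.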

gravaera · Post by **gravaera** » Sat Jun 26, 2010 8:36 am

Hi:

Technically, variables in C with automatic scope are allocated on the stack, so there is no gap between the emitted machine code instructions. And in C there is no way to execute outside of the context of a function, so the issue you raised does not arise.

--All the best,
gravaera

icecoder · Post by **icecoder** » Sun Jun 27, 2010 1:18 am

gravaera wrote:Hi:

Technically, variables in C with automatic scope are allocated on the stack, so there is no gap between the emitted machine code instructions. And in C there is no way to execute outside of the context of a function, so the issue you raised does not arise.

--All the best,
gravaera

Thank you for the reply. Yes, I know that C local variables live on the stack, but I was asking about absolute addresses (e.g. "how does C far-call without knowing the place I put it?"). I didn't know that programs need to be loaded at the address they were linked for, so yesterday I studied the ELF format and found the information thepowersgang was talking about.

Thank you.

Combuster · Post by **Combuster** » Sun Jun 27, 2010 5:22 am

icecoder wrote:how does C far-call

It does not. In fact, all the regular function calls it emits are relative by default and therefore position-independent, so if your program has no data references you can already put it anywhere. Some compilers can even generate truly position-independent code.
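
To make the point about relative calls concrete, here is a small sketch in C, assuming the usual x86 near `CALL` encoding (opcode `E8` followed by a 32-bit displacement, 5 bytes in total). The helper name `call_rel32` is made up for this example. The displacement is measured from the end of the call instruction, so it does not change when caller and target are moved by the same amount.

```c
#include <stdint.h>

/* Size in bytes of an x86 near CALL with a 32-bit displacement (E8 + rel32). */
#define CALL_REL32_SIZE 5u

/* Displacement stored in the instruction: the target address minus the
   address of the instruction that follows the CALL. The subtraction is done
   in unsigned arithmetic so that it wraps, then reinterpreted as signed. */
int32_t call_rel32(uint32_t call_addr, uint32_t target_addr)
{
    return (int32_t)(target_addr - (call_addr + CALL_REL32_SIZE));
}
```

For example, `call_rel32(0x1000, 0x2000)` and `call_rel32(0x101000, 0x102000)` give the same displacement, which is why code that only uses such calls keeps working wherever it is loaded. References to data by absolute address are the part that depends on the link address.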

icecoder · Post by **icecoder** » Sun Jun 27, 2010 6:10 am

Combuster wrote:
icecoder wrote:how does C far-call
It does not. In fact, all regular function calls it does use are by default position independent, so if your program has no data references you can already put it anywhere. Some compilers can even make truly position-independent code.

That's new to me. I will certainly look into it, thank you.

Tosi · Post by **Tosi** » Tue Jun 29, 2010 11:06 pm

For a kernel, there is no correct starting location.
It can be loaded into any block of memory that is not already in use by the BIOS* or memory-mapped hardware.
A typical i386 kernel may be loaded at 1 MB, or it could be a higher half kernel, and so on.
To get it there, you could do at least three things: one Good, and two Bad I can think of.

The first bad way is the easiest, of course: assemble a flat binary with all data/function accesses offset by the address you're loading at, and use a flat memory model.
A better way is to assemble a flat binary kernel with no offsets, but create a GDT whose kernel code and data segment bases equal the address your kernel is loaded at.
The good way, and the one that many real-world kernels use, is to make the kernel a regular executable file (e.g., .EXE for Windows, ELF for Linux) that is parsed and set up by the bootloader. If you use a known binary format, then you don't have to write your own bootloader.
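
For the second option above (segment bases equal to the load address), the base has to be packed into the GDT entry in three separate pieces. A minimal sketch in C of that packing, assuming the standard 8-byte i386 segment descriptor layout; the function name `gdt_descriptor` and the example access/flags values are illustrative, not taken from this thread.

```c
#include <stdint.h>

/* Packs a 32-bit base, a 20-bit limit, the access byte and the 4-bit flags
   nibble into one 8-byte i386 segment descriptor. */
uint64_t gdt_descriptor(uint32_t base, uint32_t limit, uint8_t access, uint8_t flags)
{
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);            /* limit 15:0  -> bits 0-15  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;     /* base 23:0   -> bits 16-39 */
    d |= (uint64_t)access << 40;                 /* access byte -> bits 40-47 */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48; /* limit 19:16 -> bits 48-51 */
    d |= (uint64_t)(flags & 0xFu) << 52;         /* flags       -> bits 52-55 */
    d |= (uint64_t)(base >> 24) << 56;           /* base 31:24  -> bits 56-63 */
    return d;
}
```

As a usage example, a ring 0 code segment with 4 KiB granularity (access `0x9A`, flags `0xC`) and a base of 1 MB would be `gdt_descriptor(0x100000, 0xFFFFF, 0x9A, 0xC)`. A kernel assembled with no offsets then sees its own first byte at offset 0 of that segment.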

A regular executable file loaded in userspace is a different story. It may be loaded at any available physical address, but could always be mapped to the same virtual address. Or it could be mapped, through paging or some similar mechanism, to a virtual address specified in the executable file itself. Either way, both the application and the kernel know the loading address.
