The correct starting location of a program

icecoder
Posts: 16
Joined: Tue Jun 15, 2010 3:28 am

The correct starting location of a program

Post by icecoder »

Hi,
After several days spent reading the Intel manuals and learning how segmentation works in protected mode, I managed to set up a correct GDT and enter protected mode. Now I am facing a big question mark...

How does a C program know where it is located?
OK, that may sound like a strange question, so let me explain in a few steps what I want to do:

1) I created an LBR that looks on my FAT16 disk (image) for a file called BOOT.BIN.
2) BOOT.BIN enables the A20 line, saves the BIOS memory map, sets up a GDT and enables protected mode.
3) Then it should load my C kernel from the disk image, using a small driver written on the fly for the occasion.

But imagine a useless kernel like this:

Code: Select all

void other_code(void);

void main(void)
{
    goto hello;
    /* (2 megabytes of space here) */
hello:
    other_code();
}
There are 2 megabytes between the jump and the label, so it should use a far jump, but how does it compute the absolute address of the 'hello' label? In assembler I just add the starting address to the relative one to obtain the absolute address, but how does C know where the program is located?

Something to do with the loader?

(Hope you're able to understand what I tried to explain..)

EDIT: maybe "paging" is the answer?
thepowersgang
Member
Posts: 734
Joined: Tue Dec 25, 2007 6:03 am
Libera.chat IRC: thePowersGang
Location: Perth, Western Australia

Re: The correct starting location of a program

Post by thepowersgang »

The simple answer is, the linker tells it.

The way you actually load the C code is with assembly code that reads it from disk, places it in memory at the address it was linked for, and passes control to it. You don't just create a binary with 2 MB of empty space and load all that.

I suggest reading the tutorials on the wiki to get a better grasp of how to do things.
icecoder
Posts: 16
Joined: Tue Jun 15, 2010 3:28 am

Re: The correct starting location of a program

Post by icecoder »

thepowersgang wrote:The simple answer is, the linker tells it.

The way you actually load the C code is by using assembly code that reads it from disk and places it in memory at the place it was linked to and passes control to it. You don't just create a binary with 2mb of empty space and load all that.

I suggest reading the tutorials on the wiki to get a better grasp of how to do things.
Thank you for the reply, that's very useful. So, if I tell the linker to place the kernel at 0x500, then after reading the header I have to map the whole thing at 0x500 and execute it. Got it.

Obviously, I did not intend to create a file with 2 MB of empty space :D
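
The step described above (read the header, copy each loadable segment to the address the linker chose, then jump to the entry point) can be illustrated with a minimal sketch. It assumes a 32-bit ELF kernel and uses the segment's physical address field (`p_paddr`) as the destination. The function name `load_elf32` and the `mem`/`mem_base` window that stands in for physical memory are hypothetical, chosen so the sketch can run as an ordinary user-space program; a real bootloader would copy straight to the physical address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Minimal ELF32 structures (field order as in the ELF specification). */
typedef struct {
    unsigned char e_ident[16];
    uint16_t e_type, e_machine;
    uint32_t e_version, e_entry, e_phoff, e_shoff, e_flags;
    uint16_t e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx;
} Elf32_Ehdr;

typedef struct {
    uint32_t p_type, p_offset, p_vaddr, p_paddr;
    uint32_t p_filesz, p_memsz, p_flags, p_align;
} Elf32_Phdr;

#define PT_LOAD 1u

/* Copies every PT_LOAD segment of `image` to the address it was linked at.
 * `mem` is a buffer standing in for memory that starts at address
 * `mem_base` (an assumption of this sketch).
 * Returns the entry point, or 0 if the image is malformed or does not fit. */
uint32_t load_elf32(const uint8_t *image, size_t image_len,
                    uint8_t *mem, uint32_t mem_base, size_t mem_len)
{
    Elf32_Ehdr eh;

    assert(image != NULL && mem != NULL);
    if (image_len < sizeof eh)
        return 0;
    memcpy(&eh, image, sizeof eh);
    if (memcmp(eh.e_ident, "\177ELF", 4) != 0)
        return 0;

    for (uint16_t i = 0; i < eh.e_phnum; i++) {
        Elf32_Phdr ph;
        size_t off = (size_t)eh.e_phoff + (size_t)i * eh.e_phentsize;

        if (off + sizeof ph > image_len)
            return 0;
        memcpy(&ph, image + off, sizeof ph);
        if (ph.p_type != PT_LOAD)
            continue;

        /* Reject segments that lie outside the file or the memory window. */
        if (ph.p_filesz > ph.p_memsz)
            return 0;
        if ((size_t)ph.p_offset + ph.p_filesz > image_len)
            return 0;
        if (ph.p_paddr < mem_base)
            return 0;
        size_t dst = ph.p_paddr - mem_base;
        if (dst + ph.p_memsz > mem_len)
            return 0;

        /* Copy the bytes stored in the file, then zero the rest (.bss). */
        memcpy(mem + dst, image + ph.p_offset, ph.p_filesz);
        memset(mem + dst + ph.p_filesz, 0, ph.p_memsz - ph.p_filesz);
    }
    return eh.e_entry;
}
```

After a successful call, the loader would jump to the returned entry point. Note that nothing here relocates the code: the kernel only works because it ends up at the address it was linked for.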
gravaera
Member
Posts: 737
Joined: Tue Jun 02, 2009 4:35 pm
Location: Supporting the cause: Use \tabs to indent code. NOT \x20 spaces.

Re: The correct starting location of a program

Post by gravaera »

Hi:

Technically, variables in C with automatic scope are allocated on the stack, so there is no gap between the emitted machine code instructions. And in C there is no way to execute outside of the context of a function, so the issue you raised does not arise.

--All the best,
gravaera
icecoder
Posts: 16
Joined: Tue Jun 15, 2010 3:28 am

Re: The correct starting location of a program

Post by icecoder »

gravaera wrote:Hi:

Technically, variables in C with automatic scope are allocated on the stack, so there is no gap between the emitted machine code instructions. And in C there is no way to execute outside of the context of a function, so the issue you raised does not arise.

--All the best,
gravaera
Thank you for the reply. Yes, I know that C local variables live on the stack, but I was asking about absolute addresses (e.g. "how does C far-call without knowing where I put it?"). I didn't know that programs need to be mapped like this, so yesterday I studied the ELF format and found the information thepowersgang was talking about.

Thank you.
Combuster
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance

Re: The correct starting location of a program

Post by Combuster »

icecoder wrote:how does C far-call
It does not. In fact, the regular function calls it does emit are position-independent by default (on x86 a near call or jump encodes its target relative to the current instruction), so if your program has no data references you can already put it anywhere. Some compilers can even generate truly position-independent code.
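
To make the point about relative calls concrete, here is a small sketch of the arithmetic behind an x86 near `CALL`/`JMP` with a 32-bit displacement (opcodes E8/E9, 5 bytes long). The helper names `rel32_for` and `resolve_rel32` are hypothetical; the sketch only models how the operand is computed and does not emit real machine code.

```c
#include <assert.h>
#include <stdint.h>

/* Operand of a 5-byte near CALL/JMP: the distance from the END of the
 * instruction to the target. Arithmetic is modulo 2^32, as on the CPU. */
int32_t rel32_for(uint32_t insn_addr, uint32_t target_addr)
{
    int32_t rel = (int32_t)(target_addr - (insn_addr + 5u));

    /* Sanity check: decoding the operand must give the target back. */
    assert(insn_addr + 5u + (uint32_t)rel == target_addr);
    return rel;
}

/* Where the CPU lands: address of the next instruction plus the operand. */
uint32_t resolve_rel32(uint32_t insn_addr, int32_t rel32)
{
    return insn_addr + 5u + (uint32_t)rel32;
}
```

For the 2 MB jump from the first post, `rel32_for(0x500, 0x500 + 5 + 0x200000)` and `rel32_for(0x100000, 0x100000 + 5 + 0x200000)` produce the same operand, `0x200000`. The encoded instruction is identical wherever the code is loaded, which is why no absolute address is needed for it.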
icecoder
Posts: 16
Joined: Tue Jun 15, 2010 3:28 am

Re: The correct starting location of a program

Post by icecoder »

Combuster wrote:
icecoder wrote:how does C far-call
It does not. In fact, all regular function calls it does use are by default position independent, so if your program has no data references you can already put it anywhere. Some compilers can even make truly position-independent code.
That's new to me. I will certainly look into it, thank you.
Tosi
Member
Posts: 255
Joined: Tue Jun 15, 2010 9:27 am
Location: Flyover State, United States

Re: The correct starting location of a program

Post by Tosi »

For a kernel, there is no correct starting location.
It can be loaded into any block of memory that is not already in use by the BIOS* or memory-mapped hardware.
A typical i386 kernel may be loaded at 1 MB, or it could be a higher half kernel, and so on.
To get it there, you could do at least three things that I can think of: one good and two bad.

The bad way is the easiest, of course. Assemble a flat binary with all data and function accesses offset by the address you're loading at, and use a flat memory model.
A better way is to assemble a flat binary kernel with no offsets, but create a GDT whose kernel code and data segment bases equal the address your kernel is loaded at.
The good way, and the one that many real-world kernels use, is to make the kernel a regular executable file (e.g., .EXE for Windows, ELF for Linux) which is parsed and set up by the bootloader. If you use a known binary format, then you don't have to write your own bootloader.
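
For the second option above (a zero-based flat binary plus a GDT whose segment base equals the load address), the descriptor could be built roughly as follows. This is a minimal sketch of the legacy 8-byte code/data descriptor layout from the Intel manuals; the function name `gdt_entry` is hypothetical, and the access byte and flags in the usage note are common example values, not something taken from this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Packs one 8-byte GDT descriptor for a 32-bit code or data segment.
 *   base   : linear address where the segment starts (the load address)
 *   limit  : 20-bit segment limit
 *   access : access byte (present, DPL, type bits)
 *   flags  : upper 4 flag bits (granularity, default operand size, ...) */
uint64_t gdt_entry(uint32_t base, uint32_t limit, uint8_t access, uint8_t flags)
{
    assert(limit <= 0xFFFFFu); /* the limit field is only 20 bits wide */
    assert(flags <= 0xFu);     /* only 4 flag bits exist */

    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);            /* limit bits 0..15   */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;     /* base bits 0..23    */
    d |= (uint64_t)access << 40;                 /* access byte        */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48; /* limit bits 16..19  */
    d |= (uint64_t)(flags & 0xFu) << 52;         /* flags              */
    d |= (uint64_t)((base >> 24) & 0xFFu) << 56; /* base bits 24..31   */
    return d;
}
```

As a usage example, `gdt_entry(0x100000, 0xFFFFF, 0x9A, 0xC)` describes a code segment based at 1 MB and `gdt_entry(0x100000, 0xFFFFF, 0x92, 0xC)` the matching data segment. With those loaded, offset 0 inside the segment refers to the first byte of the kernel, so a binary assembled with no offsets works unchanged.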

A regular executable file loaded in userspace is a different story. It may be loaded to any available physical address, but could be always mapped to the same virtual address. Or it could be mapped through paging or some other similar mechanism to some virtual address specified in the executable file itself. Either way, both the application and the kernel know about the loading address.