
Code that can execute from anywhere in memory

Posted: Mon Jul 09, 2007 6:21 am
by synthetix
As part of my operating system project (still private until I bring every base component up, about 2 years left :D ), I have finished designing the basic code for the HAL, which I call SysCore. As my operating system is extremely modular (I insist on that), lots of things will be external to SysCore's core (extension management, boot code, maybe some method of communicating with the kernel), including CPU management (the first thing loaded after SysCore). I rely on GRUB to load things into memory, which means I do not know where my modules are going to be in memory. So, how do you create code that can execute from 0x12345678 as well as 0x74927583 (made-up numbers; the important thing is that I must be able to execute the code wherever it is in memory)?

Thanks in advance for any information (and for being kind enough to answer the simple questions of a 15-year-old).

Re: Code that can execute from anywhere in memory

Posted: Mon Jul 09, 2007 7:41 am
by Brendan
Hi,
synthetix wrote:As part of my operating system project (still private until I bring every base component up, about 2 years left :D ), I have finished designing the basic code for the HAL, which I call SysCore. As my operating system is extremely modular (I insist on that), lots of things will be external to SysCore's core (extension management, boot code, maybe some method of communicating with the kernel), including CPU management (the first thing loaded after SysCore). I rely on GRUB to load things into memory, which means I do not know where my modules are going to be in memory. So, how do you create code that can execute from 0x12345678 as well as 0x74927583 (made-up numbers; the important thing is that I must be able to execute the code wherever it is in memory)?
There are several ways...

The first way is to use position independent code from a high level language (e.g. ELF). I'm not too sure how this works though...

The next way is to use segmentation, and set CS base, DS base and ES base to the address of your code, and then use FS and GS to access data that doesn't depend on where your code is loaded (e.g. video display memory).

Another way would be to keep the address of your code in a general register, and then add this general register to everything. For example, use "mov eax,[FOO+ebp]" instead of "mov eax,[FOO]", and "mov eax,[functionTable+eax*4+ebp]; add eax,ebp; call eax" instead of "call [functionTable+eax*4]". I can guarantee this will drive you insane.
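The "add the base register to everything" idea can be sketched in C, where a base pointer plays the role of EBP. This is only an illustration under assumptions: the struct layout and the names module_image, build_image and read_foo are made up for the example and are not from SysCore or GRUB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical module layout: internal references are stored as offsets
   from the start of the image, never as absolute addresses. */
struct module_image {
    uint32_t foo_offset;  /* offset of foo from the image base */
    uint32_t foo;         /* plays the role of FOO */
};

/* Fill in an image; nothing written here depends on where it will live. */
static void build_image(struct module_image *img, uint32_t value)
{
    img->foo_offset = (uint32_t)offsetof(struct module_image, foo);
    img->foo = value;
}

/* Rough equivalent of "mov eax,[FOO+ebp]": base plays the role of EBP. */
static uint32_t read_foo(const unsigned char *base, size_t image_size)
{
    uint32_t off, value;
    memcpy(&off, base, sizeof off);            /* foo_offset is the first field */
    assert((size_t)off + sizeof value <= image_size);
    memcpy(&value, base + off, sizeof value);  /* base + offset */
    return value;
}
```

Copying the same image to two different buffers and calling read_foo on each should give the same value, because only the base changes and every reference is rebuilt from it.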

You could use a mixture of the last 2 methods - e.g. set CS base to the address of your code, then use "mov eax,[cs:FOO]" instead of "mov eax,[FOO]" and "mov [FOO+ebp],eax" instead of "mov [FOO],eax". This will probably drive you insane too, just not quite as quickly.

Of course you could relocate your code to a fixed physical address (e.g. "rep movsd"), or use paging to map it to a fixed linear address, and avoid the problem entirely.
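The "relocate to a fixed address" option is essentially a block copy. A minimal hosted C sketch, assuming the loader gives you the start and end of the module; the function name relocate_module and its parameters are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy a module from wherever the boot loader put it to a fixed address
   (the one the module was linked for). memmove is used because the source
   and destination ranges may overlap. Returns the number of bytes copied. */
static size_t relocate_module(void *fixed_dst,
                              const void *mod_start,
                              const void *mod_end)
{
    const unsigned char *start = mod_start;
    const unsigned char *end = mod_end;

    assert(end >= start);              /* mod_end is one past the last byte */
    size_t size = (size_t)(end - start);
    memmove(fixed_dst, start, size);
    return size;
}
```

In a real kernel there is no C library, so memmove and assert would have to be your own routines (or the "rep movsd" loop mentioned above); the destination must also be memory that nothing else is using.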


Cheers,

Brendan

Posted: Wed Jul 11, 2007 1:17 am
by JamesM
Just pass -PIC to your linker. That way it produces position independent code (all calls/jumps are relative to the current EIP and not absolute). For more detail check out any shared object/dynamic loading tutorial because they use the same concept.
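With the GNU toolchain, position independent code is normally requested from the compiler (gcc's -fPIC or -fPIE) rather than from the linker. A small sketch, assuming a 32-bit x86 target; the file name module.c and the function module_entry are made up for illustration:

```c
/* module.c: hypothetical module source.
 *
 * Assumed build command (GCC, 32-bit x86):
 *     gcc -m32 -ffreestanding -fPIC -c module.c -o module.o
 *
 * With -fPIC the compiler avoids absolute addresses in the code it
 * generates and reaches call_count through a base it works out at run
 * time from the current instruction pointer.
 */
static int call_count;  /* module-private state */

int module_entry(int x)
{
    call_count++;
    return x + call_count;
}
```

Whether the linked result still needs something to process relocations at load time depends on how the module is linked and which symbols it exports, so it is worth inspecting the object file (for example with objdump -r) before relying on it.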

Posted: Wed Jul 11, 2007 6:33 am
by bluecode
If you are in long mode you can use RIP-relative addressing...

No linker option

Posted: Thu Jul 12, 2007 6:31 am
by synthetix
I use a GNU toolchain, and the linker does not seem to have a -PIC option. It has the -pic-executable option, but this relies on a dynamic loader to point symbols to their addresses. As even the CPU management code is a module, I do not have memory management at that point. The linker also has an option called -pie (position-independent executable?), but it has no description associated with it.

If I cannot get this to work, I'm gonna have to implement CPU management in SysCore itself, and I do not want this.

I thought that EIP-relative addresses were a good idea, but I can't seem to get that to work...