Direct memory addressing VS. variable addressing in C.

Post by 01000101 »

I have been working on a way to pack my driver data into memory areas I call apartments and buildings. There is one building per device type, but there can be multiple apartments within a building (multiple devices of the same type). Each driver 'type' allocates the amount of memory I have calculated it to need, and if multiple devices of the same type are found, there are then two apartments within one building, so to speak.

I have also specified the location of variables for each apartment aligned with the amount of space allocated per apartment.

Say each device requires 16k of memory; then the second device would start initializing variables at offset 16k and allocate another 16k for its needs.
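The 16k-per-apartment layout described above comes down to simple address arithmetic. A minimal sketch, assuming a hypothetical building base address; the names `APARTMENT_SIZE` and `apartment_base` are illustrative and do not come from the original post:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: every apartment (device instance) gets a fixed 16 KiB slot. */
#define APARTMENT_SIZE ((uintptr_t)16 * 1024)

/* Returns the start address of apartment `index` inside a building that
 * begins at `building_base`. The building base is hypothetical here. */
static uintptr_t apartment_base(uintptr_t building_base, unsigned index)
{
    /* Keep the building aligned to the slot size so offsets stay predictable. */
    assert(building_base % APARTMENT_SIZE == 0);
    return building_base + (uintptr_t)index * APARTMENT_SIZE;
}
```

With a building at 0x100000, the second device (index 1) would start at 0x104000, which is the "offset 16k" from the post.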

My main question is: with this kind of variable assignment, which must be aligned and precise, should I just access that memory via something like *(unsigned char*)0x12345678, or would it be just as efficient to declare a variable and then re-assign its memory position to suit the memory map?

I personally think cutting out the variables would cut out some overhead.

Post by speal »

Barring optimization and other goofy things I can't address without seeing all the code:

*(unsigned char*)0xABCDEF = 12345;

will result in the same general machine instructions as:

unsigned char* address = (unsigned char*)0xABCDEF;
*address = 12345;

You may save yourself some headaches if you stick things in variables. A compiler, even with optimizations turned off, will handle both of these in very similar (or identical) ways.
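The two forms above can be compared side by side in a small sketch. This is only an illustration under assumptions: `fake_device`, `DEVICE_ADDR` and the function names are hypothetical, and the address points at an ordinary buffer so the code can run outside a kernel. `volatile` is added because stores to device memory should not be optimized away.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the fixed region so the sketch runs in user space.
 * In a kernel, DEVICE_ADDR would be a literal such as 0xABCDEF. */
static unsigned char fake_device[16];
#define DEVICE_ADDR ((uintptr_t)fake_device)

/* Form 1: cast the address and store through it in one expression. */
static void store_direct(unsigned char value)
{
    *(volatile unsigned char *)DEVICE_ADDR = value;
}

/* Form 2: keep the address in a pointer variable, then store through it. */
static void store_via_pointer(unsigned char value)
{
    volatile unsigned char *address = (volatile unsigned char *)DEVICE_ADDR;
    *address = value;
}

/* Both forms must leave the same byte in memory. */
static void check_forms_agree(void)
{
    store_direct(0x12);
    assert(fake_device[0] == 0x12);
    store_via_pointer(0x34);
    assert(fake_device[0] == 0x34);
}
```

Note that a store through an unsigned char pointer keeps only the low 8 bits, so a value such as 12345 (0x3039) ends up as 0x39 in memory.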

I hope I got the gist of the problem..

Edit: kind of a silly mistake:
mov [0xABCDEF], %rax
vs.
lea %rax, [address wrt %rip]
mov [%rax], 12345

The first line would be possible (and 1 cycle faster I believe), but fixing the address of everything in your kernel seems like a bad idea. You'd end up with the second example if you want to support a variable number of devices (like an array), and I assume you do.
Neptune - 64 bit microkernel OS in D - www.devlime.com

Post by JackScott »

Using the preprocessor may also be possible in this situation. Save the cycle, save a variable, and save the day?
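A preprocessor-based version might look like the sketch below. The macro `REG8`, the name `STATUS_REG` and the backing array `fake_region` are assumptions made for illustration; in a kernel the macro argument would be a literal address such as 0x00500000.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: treat an address as an 8-bit register.
 * volatile stops the compiler from caching or dropping the access. */
#define REG8(addr) (*(volatile uint8_t *)(uintptr_t)(addr))

/* Stand-in storage so the sketch can run in user space. */
static uint8_t fake_region[4];
#define STATUS_REG REG8(&fake_region[0])

static void set_status(uint8_t value)
{
    STATUS_REG = value;   /* expands to a plain store, no pointer variable */
}

static uint8_t get_status(void)
{
    return STATUS_REG;    /* expands to a plain load */
}

/* Quick self-check: a written value can be read back. */
static void status_roundtrip_check(void)
{
    set_status(0xFF);
    assert(get_status() == 0xFF);
}
```

Because the macro expands to an lvalue, `STATUS_REG` can be read and assigned like a variable without any storage being reserved for a pointer.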

Post by exkor »

speal wrote: mov [0xABCDEF], %rax
vs.
lea %rax, [address wrt %rip]
mov [%rax], 12345
in LongMode

 mov [imm], r0-r15 ;7 bytes, write to mem on static addr

 ;you'll probably need to declare an additional variable:
 ;15 bytes, 11bytes for dword
 var dq 0
 mov [var], rax
 
 ;7bytes also
 lea rax, [rcx+127]
 mov [rax], rax

 ;6 bytes
 lea eax, [rcx+127]  ;if rcx replaced with r8-r15 then +1 byte
 mov [rax], rax    
Using an offset outside the signed-byte range (-128..+127) adds 3 more bytes.
Using r8-r15 in "mov [rax], reg64" instead of rax adds 1 more byte,
so lea+mov is 11 bytes in the worst-case scenario.

I think doing such optimization in C is pointless.

Post by 01000101 »

Also, this may be a very basic ASM question, but when performing a mov such as:

mov byte [0x00500000], byte 0xFF;
No registers are altered, correct?
Wouldn't that be another advantage over the other method, which uses variables or registers to store the value?

Post by exkor »

01000101 wrote: no registers are altered correct?
Yes, that's an advantage.
The only disadvantage of this method is that you can't relocate your data block dynamically, other than by using paging, I guess.

Post by JamesM »

01000101 wrote: I personally think, cutting out the variables would cutout some overhead.
Let the compiler deal with it. Internally the compiler creates temporary variables all over the shop, for example:

a = *(unsigned int*)0x1000;
Would internally turn into:

unsigned int *tmp = (unsigned int *)0x1000;
a = *tmp;
Similarly for arithmetic operations:

unsigned int a = b + c - (d+e);
->

unsigned int a = b + c;
unsigned int tmp = d + e;
a = a - tmp;
There is a name for the form the compiler generates, but it escapes me for the moment. The idea is there is only one operation per line, which makes assembly code easier to generate.
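The one-operation-per-line form described above is commonly known as three-address code. The sketch below writes the same lowering out by hand; the function names and the temporaries `t1` to `t3` are illustrative only, and a real compiler's intermediate form will differ in detail.

```c
#include <assert.h>

/* The expression as a programmer would write it. */
static unsigned int direct(unsigned int b, unsigned int c,
                           unsigned int d, unsigned int e)
{
    return b + c - (d + e);
}

/* The same computation with at most one operation per statement. */
static unsigned int lowered(unsigned int b, unsigned int c,
                            unsigned int d, unsigned int e)
{
    unsigned int t1 = b + c;
    unsigned int t2 = d + e;
    unsigned int t3 = t1 - t2;
    return t3;
}

/* Both versions must agree, including when unsigned arithmetic wraps. */
static void check_equivalent(void)
{
    assert(direct(10, 5, 3, 4) == lowered(10, 5, 3, 4));
    assert(direct(0, 0, 1, 0) == lowered(0, 0, 1, 0));
}
```

Whether `t1`, `t2` and `t3` end up in registers, on the stack, or disappear entirely is decided later by the optimizer, which is why the source-level variables say little about the final instructions.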

Once that form has been created, assembly code is created and optimisations take place - where to store each temporary - register? stack? can two instructions be merged because of complex addressing modes? etc.

So really the variable declarations in your C code have absolutely no correlation with what the compiler outputs (on anything over -O0).
01000101 wrote: Also, this may be a very basic ASM question, but when performing a mov such as:

mov byte [0x00500000], byte 0xFF;
no registers are altered correct?
wouldn't that be another advantage over the other method using variables or registers to store the value?
Whether the code you give is better or worse than two instructions that achieve the same result depends wholly on the processor in question.

On the one hand, the entire operation is achieved in one instruction. Which is good.

On the other hand, that instruction might be heavily microcoded, which slows things down (remember that not all instructions in CISC architectures are as heavily optimised - ones that compilers use get optimised more).

The instruction you give (a store immediate) is so common that I would personally consider it more efficient than a register-move followed by a register-store. However, also bear in mind that a store-immediate instruction takes up more space than a register store, so if you're using the same constant over and over I would recommend storing it temporarily somewhere.

Post by bewing »

Yes, basically you are turning variables into #define or EQU statements. It is certainly best to let the compiler/linker deal with the details. This only works in virtual memory, of course, and doesn't work well at all in physical mem. But when you put things at known memory locations, it DOES create more opportunities for tightening up the assembler code. Heck, the entire *concept* of assembler SIB byte addressing [base + offset + index*size] assumes that either "base" or "offset" is a known *fixed constant* memory address. Without known fixed constant memory addresses, that entire CPU feature becomes much less useful, and much less of an enhancement to your code.

Post by exkor »

JamesM wrote: However also bear in mind that a store immediate instruction takes up more space than a register store, so if you're using the same constant over again I would recommend storing it temporarily somewhere.
In protected-mode x86, savings start when you "mov" the same constant (a byte, like 01000101 wants) 6 or more times.
'mov cl, 3' is not considered because it's a slight hit on performance in most cases.
The same goes for long-mode x86-64 if r8-r15 are not used.

;29bytes
mov ecx, 3
mov [0723872h], cl
mov [0723872h], cl
mov [0723872h], cl
mov [0723872h], cl

;28 bytes
mov byte [0723872h], 3
mov byte [0723872h], 3
mov byte [0723872h], 3
mov byte [0723872h], 3

However x86 is optimized for eax reg:
;25 bytes in ProtectedMode, same 29byte in LongMode
mov eax, 5
mov [0723872h], al
mov [0723872h], al
mov [0723872h], al
mov [0723872h], al

Post by ~ »

exkor wrote:
JamesM wrote: However also bear in mind that a store immediate instruction takes up more space than a register store, so if you're using the same constant over again I would recommend storing it temporarily somewhere.
In protected-mode x86, savings start when you "mov" the same constant (a byte, like 01000101 wants) 6 or more times.
'mov cl, 3' is not considered because it's a slight hit on performance in most cases.
The same goes for long-mode x86-64 if r8-r15 are not used.

;29bytes
mov ecx, 3
mov [0723872h], cl
mov [0723872h], cl
mov [0723872h], cl
mov [0723872h], cl

;28 bytes
mov byte [0723872h], 3
mov byte [0723872h], 3
mov byte [0723872h], 3
mov byte [0723872h], 3

However x86 is optimized for eax reg:
;25 bytes in ProtectedMode, same 29byte in LongMode
mov eax, 5
mov [0723872h], al
mov [0723872h], al
mov [0723872h], al
mov [0723872h], al
Wouldn't it be better to do:

mov dword[0x723872],0x05050505
That will use 12 bytes in 16-bit mode (Unreal Mode), 11 bytes in 64-bit mode and 10 bytes in 32-bit mode.

Also, why copy the same value several times to the same memory location?
YouTube:
http://youtube.com/@AltComp126

My x86 emulator/kernel project and software tools/documentation:
http://master.dl.sourceforge.net/projec ... 7z?viasf=1

Post by exkor »

01000101 wants 1 byte, not a dword. Optimizations must be precise: you must know exactly what you want to optimize. If you want general optimization, leave it to your high-level compiler, unless you are optimizing algorithms.
You can change the addresses; the size of the instruction will not change.

Post by 01000101 »

Even though I was talking about moving bytes of data, if what is said above is true, wouldn't moving dwords be more efficient and optimised as far as machine instructions per asm line?

Post by JamesM »

01000101 wrote: wouldn't moving dwords be more efficient and optimised as far as machine instructions per asm line?
Moving around an architecture's native word size is always efficient - it's what the vast majority of mov's are.

Post by mrvn »

speal wrote:Barring optimization and other goofy things I can't address without seeing all the code:

*(unsigned char*)0xABCDEF = 12345;

will result in the same general machine instructions as:

unsigned char* address = (unsigned char*)0xABCDEF;
*address = 12345;
I think you mean

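static unsigned char* const address = (unsigned char*)0xCAFEBABE;
*address = 12345;
The static const makes this a pointer with static storage and a value that never changes. Note that the const has to come after the *, so that the pointer is constant rather than the byte it points to; with const in front of the type, the store would not compile. For any optimizing compiler this will be just like a #define.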


That said, what about the following?

struct Memory_Mapped_Device {
  uint32_t reg_foo;
  uint32_t reg_bla;
  uint32_t reg_blub;
 ...
} *device = (struct Memory_Mapped_Device *)0xCAFEBAB0;

device->reg_foo = 17;
device->reg_bla = 23;
Isn't that much more readable than using addresses like below?

*(uint32_t*)0xCAFEBAB0 = 17;
*(uint32_t*)0xCAFEBAB4 = 23;
I'm pretty certain the two will result in the same or speedwise equivalent code. The former might even be better, as the compiler can put the address into a register and access it with an offset. The latter might not see 0xCAFEBAB4 as being 0xCAFEBAB0 + 4.
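The equivalence of the struct form and the raw-address form can be checked in a small sketch. The struct and function names below are hypothetical, the fields are marked `volatile` on the assumption that they are device registers, and the functions take the base as a parameter so the code can be tried on ordinary memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register block; field names follow the post above. */
struct memory_mapped_device {
    volatile uint32_t reg_foo;   /* offset 0 */
    volatile uint32_t reg_bla;   /* offset 4 */
    volatile uint32_t reg_blub;  /* offset 8 */
};

/* Compile-time checks that the struct layout matches the raw offsets. */
static_assert(offsetof(struct memory_mapped_device, reg_bla) == 4,
              "reg_bla must sit at base + 4");
static_assert(offsetof(struct memory_mapped_device, reg_blub) == 8,
              "reg_blub must sit at base + 8");

/* Struct form: one base pointer, fields reached by offset. */
static void init_by_struct(struct memory_mapped_device *device)
{
    device->reg_foo = 17;
    device->reg_bla = 23;
}

/* Raw form: each register addressed separately from the same base. */
static void init_by_address(uintptr_t base)
{
    *(volatile uint32_t *)(base + 0) = 17;
    *(volatile uint32_t *)(base + 4) = 23;
}
```

The static assertions are the useful part of the design: if a field is added or reordered, the build fails instead of the driver silently writing to the wrong register.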

MfG
Goswin
Life - Don't talk to me about LIFE!
So long and thanks for all the fish.

Post by 01000101 »

I posted a few code snippets in an earlier post about relocating a struct. I think that would also be a fair approach to make things more readable, all while controlling the memory allocation process for variables.

Basically, fill a struct with variables, then move the entire struct, and once it is in place, the variables stack up from the base of the struct.
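One way to read the struct-relocation idea is that "moving" the struct means re-basing a pointer to it, after which every field sits at a fixed offset from the new base. The sketch below is a guess at that approach, not the code from the earlier post: `struct apartment`, its fields and `apartment_at` are hypothetical, and the 16k slot size is taken from the start of the thread.

```c
#include <assert.h>
#include <stdint.h>

#define APARTMENT_SIZE 16384u  /* assumption: 16k per device instance */

/* Hypothetical per-device state; the fields stack up from the base. */
struct apartment {
    uint32_t io_base;
    uint32_t irq;
    uint8_t  scratch[APARTMENT_SIZE - 2 * sizeof(uint32_t)];
};

/* The struct must fill its slot exactly, or later apartments drift. */
static_assert(sizeof(struct apartment) == APARTMENT_SIZE,
              "apartment must be exactly one slot");

/* Re-base the struct onto apartment `index` of a building.
 * `building` must be suitably aligned for struct apartment. */
static struct apartment *apartment_at(void *building, unsigned index)
{
    return (struct apartment *)((unsigned char *)building
                                + (uintptr_t)index * APARTMENT_SIZE);
}
```

A driver would then write `apartment_at(building, 1)->irq = 5;` for the second device and let the compiler work out the field offsets.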