hi
This is a general question: I realised that after compiling and linking what is going to become my kernel, procedures aren't automatically aligned. Is this bad? I thought that actually everything should be aligned, but thinking about it, that would mean every single instruction would also need alignment. So I suppose it's not bad to call procedures at unaligned addresses?
code alignment
-
- Member
- Posts: 199
- Joined: Fri Jul 13, 2007 6:37 am
- Location: Stuttgart/Germany
- Contact:
Some (or maybe all) AMD and Intel PM microprocessors fetch the code stream on 16-byte boundaries. So if your code is not aligned on a 16-byte boundary, the microprocessor fetches the 16-byte block that contains your EIP and then needs to fetch further 16-byte blocks in order to read the rest of your code.
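To make the fetch-block argument concrete, here is a minimal C sketch that counts how many aligned 16-byte blocks a run of code bytes touches. The 16-byte block size is taken from the post above; the names `FETCH_BLOCK` and `fetch_blocks_spanned` are made up for illustration, and this models only the address arithmetic, not any particular processor's front end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed fetch-block size in bytes (the figure quoted in the post). */
#define FETCH_BLOCK 16u

/* Number of aligned FETCH_BLOCK-sized blocks touched by the byte
 * range [start, start + length). Hypothetical helper for illustration. */
size_t fetch_blocks_spanned(uintptr_t start, size_t length)
{
    assert(length > 0);
    uintptr_t first = start / FETCH_BLOCK;                /* block holding the first byte */
    uintptr_t last  = (start + length - 1) / FETCH_BLOCK; /* block holding the last byte  */
    return (size_t)(last - first + 1);
}
```

With these assumptions, 16 bytes of code starting at 0x1000 sit in one block, while the same 16 bytes starting at 0x100C straddle two.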
Alignment of code can be beneficial for time-critical subroutines where a label aligned on a certain boundary is called over and over again. However, I don't think aligning unimportant labels that are called only a few times matters. In fact, you will waste cache space and your code segment will grow in size unnecessarily.
It is a good programming practice to align subroutine entries on a 16-byte boundary but not every loop label. For example:
In the listing below, the [__MyProcedure] label, which is the entry point of the procedure, is aligned on a 16-byte boundary, which is good. But when you also align the [.LoopLabel] local label, you force the processor to decode and execute all of the NOPs that pad it out to a 16-byte boundary every time the procedure is entered. That is not optimal.
Code:
  ALIGN 0x10 NOP
  __MyProcedure:
    MOV     ECX , 10
    ALIGN 0x10 NOP
    .LoopLabel:
      ; Process something here
      DEC     ECX
      JNZ     .LoopLabel
    RET
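For a sense of how much padding the second ALIGN costs, here is a rough C sketch of the arithmetic an align-to-boundary directive performs. The helper names `align_padding` and `align_up` are invented for this example, and it assumes the alignment is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of padding needed to bring addr up to the next multiple of
 * alignment; 0 if addr is already aligned. Hypothetical helper. */
uintptr_t align_padding(uintptr_t addr, uintptr_t alignment)
{
    /* alignment must be a non-zero power of two */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (alignment - (addr & (alignment - 1))) & (alignment - 1);
}

/* addr rounded up to the next multiple of alignment. */
uintptr_t align_up(uintptr_t addr, uintptr_t alignment)
{
    return addr + align_padding(addr, alignment);
}
```

Assuming 32-bit code, where MOV ECX, 10 encodes as 5 bytes, and a procedure entry that itself sits on a 16-byte boundary, the loop label would land 5 bytes into the block, so `align_padding(5, 16)` gives 11 bytes of filler that are run through on every call before the loop starts.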
XCHG wrote: It is a good programming practice to align subroutine entries on a 16-byte boundary but not every loop label.
I take it that you meant to say 16-bit boundary. Why 16 bit? Why not dword alignment on a 32-bit machine?
sancho1980 wrote: I take it that you meant to say 16-bit boundary. Why 16 bit? Why not dword alignment on a 32-bit machine?
I strongly suspect he meant 16 bytes. 16-bit alignment is pointless, and 32-bit is not much less pointless. You're most likely trying to make sure the function loads a full cache line to start with, so you align it to a power of 2, preferably the cache line size. x86 cache lines have ranged from 16 to 64 bytes, so 16 bytes is a useful amount. I prefer 64 bytes, but I think I also align on 16-byte boundaries.
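If the kernel is built from C with GCC or Clang (an assumption, since the original poster did not name a toolchain), one way to request cache-line alignment for a single hot procedure is the `aligned` function attribute. The sketch below assumes a 64-byte line; `CACHE_LINE`, `hot_routine` and `entry_is_aligned` are illustrative names only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed cache line size in bytes; real hardware may differ. */
#define CACHE_LINE 64

/* Ask the compiler to place the first instruction of this function
 * on a CACHE_LINE boundary (GCC/Clang extension, not ISO C). */
__attribute__((aligned(CACHE_LINE)))
int hot_routine(int x)
{
    return x * 2 + 1;
}

/* Non-zero if entry is a multiple of alignment (a power of two). */
int entry_is_aligned(uintptr_t entry, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (entry & (alignment - 1)) == 0;
}
```

Calling `entry_is_aligned((uintptr_t)hot_routine, CACHE_LINE)` lets you check the result at run time. GCC also has a `-falign-functions=n` option that applies a minimum alignment to every function, at the cost of the extra padding discussed earlier in the thread.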