In most OS tutorials that I have read, they usually set SP to FFFFh, or ESP to 0000FFFFh.
I don't care about the exact numerical value, but please note that it is an odd value!
Yet the stack grows by 2 or 4 bytes at a time, so SP (or ESP) will never actually reach zero, will it?
Stack Pointer
Re:Stack Pointer
That's silly -- SP should be a multiple of 2 bytes (i.e. even, not odd), and ESP should be a multiple of 4 bytes.
For example, in 16-bit mode, the CPU accesses the stack 16 bits at a time. That is, when you POP something, it asks the hardware for a single word. The hardware can only access words on word boundaries. So if SP is odd, the CPU must ask for two individual bytes, which is two memory accesses. If SP is even, the CPU can ask for one word.
The same applies to ESP in 32-bit mode, except the worst case is when (ESP mod 4) == 1. Then, the CPU must ask for two single bytes and a word, which is three memory accesses.
The best thing to do is set SP/ESP to point just beyond the end of the stack. For instance, if your stack goes from 0x0 to 0x400, set SP to 0x400, not 0x3FF. Remember that the first PUSH subtracts 2 from SP before writing to memory, so the first PUSH will write to locations 0x3FE and 0x3FF. Because 0x3FE is even, it can do this in one operation instead of two.
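To make the arithmetic above concrete, here is a rough sketch in Python, used purely as a calculator. The names `push16` and `word_accesses` are made up for illustration, and the access count follows the simplified model described in this post (one access for an even address, two for an odd one), not any particular chip's bus behaviour.

```python
def push16(sp):
    """Model a 16-bit PUSH: SP is decremented by 2 first (wrapping at 64 KiB),
    then the word is written to the two bytes starting at the new SP."""
    new_sp = (sp - 2) & 0xFFFF
    return new_sp, [new_sp, (new_sp + 1) & 0xFFFF]


def word_accesses(addr):
    """Simplified model from the post: a word at an even address takes one
    access, a word at an odd address takes two byte accesses."""
    return 1 if addr % 2 == 0 else 2


for start in (0x400, 0x3FF, 0xFFFF):
    sp, written = push16(start)
    print(f"SP={start:#06x}: first PUSH writes {[hex(a) for a in written]}, "
          f"{word_accesses(sp)} access(es)")
```

Starting from 0x400 the first PUSH lands on 0x3FE and 0x3FF in one access. Starting from an odd value such as 0x3FF or FFFFh, SP stays odd after every PUSH, so under this model every stack access is split in two.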
Re:Stack Pointer
All just slightly off. You set ESP to exactly the end of the stack, not beyond.
Tim Robinson wrote: For example, in 16-bit mode, the CPU accesses the stack 16 bits at a time. That is, when you POP something, it asks the hardware for a single word. The hardware can only access words on word boundaries. So if SP is odd, the CPU must ask for two individual bytes, which is two memory accesses. If SP is even, the CPU can ask for one word.
The same applies to ESP in 32-bit mode, except the worst case is when (ESP mod 4) == 1. Then, the CPU must ask for two single bytes and a word, which is three memory accesses.
The best thing to do is set SP/ESP to point just beyond the end of the stack.
When accessing any variable that isn't aligned, the processor fetches the native-size word that contains the first part, and the native-size word that contains the second part. On a PPro+ that is 8 bytes, or 64 bits. It then masks out the parts it doesn't need, shifts them around and adds the two together. No special magic, just load, and, shift & add in a single op.
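The load, shift, mask and combine steps can be sketched roughly as follows. This is only an illustration in Python of the idea, not how any real CPU is wired: `aligned_fetch` and `load32` are hypothetical names, the 8-byte fetch width is the PPro+ figure quoted above, and the two parts are combined with OR, which gives the same result as adding because their bits never overlap.

```python
BUS = 8  # bytes per aligned fetch (assumption: the 8-byte PPro+ figure from the post)


def aligned_fetch(mem, addr):
    """Fetch the aligned BUS-byte word that contains addr, as a little-endian int."""
    base = addr - (addr % BUS)
    return int.from_bytes(mem[base:base + BUS], "little")


def load32(mem, addr):
    """Read a 32-bit little-endian value at any address using aligned fetches only.
    Returns (value, number_of_fetches)."""
    off = addr % BUS
    value = aligned_fetch(mem, addr) >> (8 * off)    # shift the first part down
    fetches = 1
    if off + 4 > BUS:                                # value spills into the next word
        high = aligned_fetch(mem, addr + BUS)        # the following aligned word
        value |= high << (8 * (BUS - off))           # shift the second part up
        fetches = 2
    return value & 0xFFFFFFFF, fetches               # mask out the unneeded bits


mem = bytes(range(32))           # byte at address n holds the value n
for addr in (4, 5):
    value, fetches = load32(mem, addr)
    print(f"addr {addr}: value {value:#010x}, {fetches} fetch(es)")
```

The read at address 4 fits inside one 8-byte word, while the read at address 5 straddles two of them and needs the second fetch.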
So, having a 32-bit value unaligned at some odd address doesn't mean it's slow. It just could be slower. 3 out of every 8 possible byte alignments of a 32-bit value give an extra access, but if you align it to a 16-bit boundary, that's 1 out of 4. Align it to a 32-bit boundary and it's 0 out of 2. In short, aligning it "naturally" makes for no alignment errors.
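Those fractions can be checked by brute force. The sketch below assumes the 8-byte fetch width mentioned above and simply counts which starting offsets make a 4-byte value cross into the next 8-byte word; `crossings` is a made-up helper name.

```python
BUS = 8   # assumed fetch width in bytes (the post's PPro+ figure)
SIZE = 4  # a 32-bit value


def crossings(step):
    """Count the offsets (multiples of step within one BUS-byte word) at which
    a SIZE-byte value would spill into the next word."""
    offsets = range(0, BUS, step)
    crossing = [o for o in offsets if o + SIZE > BUS]
    return len(crossing), len(offsets)


for step in (1, 2, 4):
    extra, total = crossings(step)
    print(f"aligned to {step} byte(s): {extra} out of {total} "
          f"placements need an extra access")
```

With byte alignment the offending offsets are 5, 6 and 7; with 16-bit alignment only offset 6 remains; with 32-bit alignment none do.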