FlashBurn wrote: Yeah, and my point is that since GCC 4.0 they say that GCC doesn't change the direction flag, and I set it in my assembly stub!
Actually, it's since GCC 4.3.0 that GCC no longer emits the cld instruction before an inlined memory function. The SysV ABI assumes that the direction flag is clear on entry to every function. Versions of GCC before 4.3.0 emitted cld to make sure of this, but that is no longer the case. You SHOULD clear the direction flag yourself before doing things like memset (or in your case, zeromem4b) to ensure that the memory operation runs in the proper direction. If your bootloader clears DF and nothing ever sets it again, you may be safe. But cld and std are not privileged instructions; userspace is free to set and clear DF as it pleases.
My larger point here is that you are writing a kernel. You are free to define your own ABI. Even if you don't and are using an existing one (like the SysV ABI), you cannot rely on the compiler to completely implement the ABI all by itself, especially when using such things as inline assembly.
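To make the direction flag hazard concrete, here is a rough sketch of what rep stosl does when DF happens to be set. fill4b_descending is a hypothetical helper made up for illustration, not something from this thread; it assumes GCC-style inline assembly on x86 and falls back to a plain C loop elsewhere. It sets DF on purpose, stores downward from the last element, and clears DF again so the surrounding C code sees the state the ABI expects.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store `value` into `count` dwords, starting at
 * `last` and walking DOWN through memory, which is how rep stosl
 * behaves when DF is set. */
static inline void fill4b_descending(uint32_t *last, uint32_t value, size_t count)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ volatile ("std\n\t"        /* DF = 1: edi decrements after each store */
                      "rep stosl\n\t"
                      "cld"            /* back to DF = 0 before any C code runs */
                      : "+D"(last), "+c"(count)
                      : "a"(value)
                      : "cc", "memory");
#else
    /* Portable fallback with the same effect (assumption: non-x86 build). */
    for (size_t i = 0; i < count; i++)
        *(last - i) = value;
#endif
}
```

Calling fill4b_descending(&buf[4], v, 3) writes buf[4], buf[3] and buf[2]. A zeromem4b that meant to write buf[4], buf[5] and buf[6] would hit the wrong memory in exactly this way if it ran with DF left set.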
Solution 1:
Code:
static inline void zeromem4b(uint32_t *dst, uint32_t count) {
    uint32_t tmp1, tmp2;
    asm volatile("xor %%eax,%%eax\n\trep stosl"
                 :"=c"(tmp1),"=D"(tmp2)
                 :"0"(count),"1"(dst)
                 :"%eax", "cc", "memory");
}
What's the point of saving ecx and edi to two variables that are otherwise never used? I considered doing that, but the dummy variables are just clutter. GCC does need to be told that ecx and edi change, because it assumes input-only operands still hold their original values after the asm, and it won't let you list an input register as a clobber as well. Tying the inputs to outputs is how you say that, and a "+" operand says the same thing without the extra variables. As I pointed out in my previous post (and as is pointed out in the URL I posted), using "cc" and "memory" is a good idea as well. You should use the memory clobber because you're writing to memory through the pointer in edi. Without the memory clobber, GCC knows what happens to the registers in the operand lists, but it doesn't know that a memory location was changed. The memory clobber fixes that.
I'm not going to bother explaining the cc clobber again. As for the %eax clobber, I'd keep it: the asm overwrites eax with the xor, and as far as I can tell from the GCC docs the memory clobber only covers memory, not registers.
Solution 2:
Code:
static inline void zeromem4b(uint32_t *dst, uint32_t count) {
    uint32_t tmp1, tmp2;
    asm volatile("xor %%eax,%%eax\n\trep stosl"
                 :"+c"(count),"+D"(dst)
                 :
                 :"%eax", "cc", "memory");
}
The last one is also silly because (I think) it changes the values of count and dst.
Technically, ecx and edi are both changed by either solution you posted. The "=" modifier signifies an operand is write only (an output), and the "+" modifier says the operand is both an input and an output.
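A small sketch of the two modifiers side by side, in case it helps. copy_via_asm and add_in_place are made-up names for illustration; the asm assumes GCC on x86, with a plain C fallback for anything else.

```c
#include <stdint.h>

/* "=r": write-only output. The asm only writes `out`; whatever was in
 * it before is irrelevant, so GCC need not load it first. */
static inline uint32_t copy_via_asm(uint32_t src)
{
    uint32_t out;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ ("movl %1, %0" : "=r"(out) : "r"(src));
#else
    out = src;                  /* fallback (assumption: non-x86 build) */
#endif
    return out;
}

/* "+r": read-write operand. The asm reads the old value of `acc` and
 * then overwrites it with the sum. */
static inline uint32_t add_in_place(uint32_t acc, uint32_t addend)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ ("addl %1, %0" : "+r"(acc) : "r"(addend) : "cc");
#else
    acc += addend;              /* fallback (assumption: non-x86 build) */
#endif
    return acc;
}
```

With rep stosl, ecx and edi are both read and then modified, which is the "+" case rather than the "=" case.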
You could further optimize your function by using the following code:
Code:
static inline void zeromem4b(uint32_t *dst, uint32_t count) {
    /* "+": rep stosl modifies ecx and edi, and GCC assumes that
       input-only operands come back unchanged */
    asm volatile ( "cld; rep stosl"
                   : "+c"(count), "+D"(dst)
                   : "a"(0)
                   : "cc", "memory" );
}
Note I used cld, because doing so is correct under the SysV ABI (especially for inlined functions!). I used "a"(0) because GCC may be able to arrange things so that zero is already in the EAX register, resulting in one less movl (or xor) after its optimization phase. That also means you don't have to specify "%eax" as a clobber: the asm only reads EAX and never writes it, and GCC already knows the register is in use because it is an input.
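If you want to try this outside the kernel, the following is a sketch under a few assumptions of mine. The name zeromem4b_sketch is hypothetical, count is widened to size_t so the same code also builds for x86-64 (where rep counts in the full rcx), and count and dst are read-write operands because the GCC manual warns against modifying input-only operands. Non-x86 builds get a plain C loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace-testable variant: zero `count` dwords at dst. */
static inline void zeromem4b_sketch(uint32_t *dst, size_t count)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ volatile ("cld\n\t"        /* make sure stores run upward */
                      "rep stosl"
                      : "+D"(dst), "+c"(count)   /* both are modified */
                      : "a"(0)                   /* eax is only read */
                      : "cc", "memory");
#else
    /* Portable fallback (assumption: non-x86 build). */
    while (count--)
        *dst++ = 0;
#endif
}
```

Zeroing the middle of a small array and checking that the elements on either side are untouched is a quick way to confirm both the direction and the count.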
Hmm, in case you're not aware, the way GCC handles 'static inline' has changed in GCC 4.3.0 as compared to previous versions as well, if you're using C99.
Please spend some time reading the GCC manual. It goes into all kinds of detail about inline assembly, and it seems clear to me that inline assembly is at least somewhat confusing to you. Don't worry though, it's a syntactical nightmare, and a lot of people have problems with the finer points of it, myself included.