Code:
void put_zmm_value()
{
struct zmm_value buffer;
asm volatile(
"vmovdqa64 %%zmm0, (%[buffer]) \n": :[buffer]"r"(buffer):"%zmm0");
for(int i=0;i<8;i++)
printf("%lx ",buffer.word[i]);
}
Code:
void put_zmm_value()
{
struct zmm_value buffer;
/* The asm writes buffer, so it has to be an output operand ("=m"), not an input. */
asm volatile(
"vmovdqa64 %%zmm0, %[buffer] \n": [buffer]"=m"(buffer): :"%zmm0");
for(int i=0;i<8;i++)
printf("%lx ",buffer.word[i]);
}
sunnysideup wrote: The purpose of this code is to print out the value of the zmm0 register.
Why? According to the ABI, there is nothing useful in zmm0 at this point in time.
Code:
struct zmm_value
{
uint64_t word[8];
} __attribute__((packed)) __attribute__ ((aligned(64)));
void set_zmm_value(struct zmm_value* val_address)
{
asm volatile
("vmovntdqa (%[val_address]),%%zmm0\n":: [val_address]"r"(val_address):"%zmm0");
}
Code:
asm volatile("" ::: "memory");
Korona wrote: That's a memory barrier for the compiler.¹ Yes, it is extended asm. The memory clobber forces all loads/stores to globally visible variables that occur before/after the barrier in program order to happen before/after the barrier.
Alright, I understand why it's used now: as a way to ensure that the compiler does no compile-time reordering across this 'barrier'.
Code:
static void force_read(uint8_t *p) {
asm volatile("" : : "r"(*p) : "memory");
}
sunnysideup wrote: However, why does it work this way?
It's an assembler statement with a memory clobber. The fact that it's empty is incidental to this. The memory clobber tells GCC that this statement will change "memory", but not which memory and in what way it will be changed. Therefore, GCC cannot assume anything about the state of memory, and must write all changes to memory before the statement, and read all things that are in memory again after the statement.
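To make that concrete, here is a minimal sketch of the empty statement used as a compiler barrier between two stores. The names shared_data, shared_ready, compiler_barrier and publish are made up for illustration, and the code uses the `__asm__` spelling only so that it also builds with strict -std flags. Keep in mind this only restrains the compiler; it emits no instruction, so it does not order anything on the CPU side.

```c
#include <stdint.h>

/* Hypothetical globals, only here to have something the barrier can order. */
uint32_t shared_data;
uint32_t shared_ready;

/* Empty asm with a "memory" clobber: no instruction is emitted, but the
 * compiler must assume any memory may have been read or written here. */
static inline void compiler_barrier(void)
{
    __asm__ volatile("" ::: "memory");
}

/* Hypothetical helper: write the payload, then the flag. */
uint32_t publish(uint32_t value)
{
    shared_data = value;   /* this store has to be emitted before the barrier */
    compiler_barrier();
    shared_ready = 1;      /* the compiler may not hoist this store above the barrier */
    return shared_ready;   /* read back after the barrier */
}
```

Comparing the assembly output with and without the compiler_barrier() call at -O2 is probably the easiest way to see what it does for a given compiler version.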
sunnysideup wrote: I've also read this piece of code:
Code:
static void force_read(uint8_t *p) { asm volatile("" : : "r"(*p) : "memory"); }
It's supposed to force a read from memory location p. But how does it work, i.e. why does gcc make it work that way?
This time it is a memory clobber and an input constraint. So in addition to the above, this statement requires that the value of "*p" be put into a register beforehand. The statement is empty and doesn't do anything with the value, but GCC doesn't know that, and is therefore forced to emit a read of this memory location. And since memory is clobbered, even multiple reads of this location have to be read, since they might have changed now.
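As a rough illustration of the two effects (input operand plus clobber), here is a small sketch. poll_flag is a hypothetical helper, not something from the posts above, and `__asm__` is just the spelling that also works with strict -std flags.

```c
#include <stdint.h>

static inline void force_read(uint8_t *p)
{
    /* "r"(*p): the value of *p has to be in a register at this point.
     * "memory": afterwards the compiler may not rely on values it cached. */
    __asm__ volatile("" : : "r"(*p) : "memory");
}

/* Hypothetical helper: count how many of n polls see a non-zero byte. */
unsigned poll_flag(uint8_t *flag, unsigned n)
{
    unsigned seen = 0;
    for (unsigned i = 0; i < n; i++) {
        force_read(flag);   /* the value of *flag is needed as an asm input */
        if (*flag)          /* after the clobber, *flag has to be loaded again */
            seen++;
    }
    return seen;
}
```

In a plain single-threaded run this behaves like an ordinary loop; the difference only shows up in the generated code, where the load inside the loop cannot be hoisted out of it.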
sunnysideup wrote: However, why does it work this way? Or as a mathematician would say - can you derive it from first principles?
Because the GCC developers say so.
Here's the part of the manual that explains it:
Using the "memory" clobber effectively forms a read/write memory barrier for the compiler.
nullplan wrote: And since memory is clobbered, even multiple reads of this location have to be read, since they might have changed now.
But this holds true only if you use functions like this one with memory barriers to access that location. If you also access it without a memory barrier and the function gets inlined, the read may be combined with prior accesses. There is also no guarantee that the read will occur after all prior statements if the function is inlined.
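A sketch of that caveat, for what it's worth. The second half uses a volatile-qualified access, which is a different technique from the asm trick and is not something proposed in this thread; it is only shown as a commonly used alternative that GCC treats as one access per evaluation. read_once, plain_then_force and read_twice are hypothetical names.

```c
#include <stdint.h>

static inline void force_read(uint8_t *p)
{
    __asm__ volatile("" : : "r"(*p) : "memory");
}

/* Alternative (not from the thread): access through a volatile lvalue. */
static inline uint8_t read_once(const uint8_t *p)
{
    return *(const volatile uint8_t *)p;
}

uint8_t plain_then_force(uint8_t *p)
{
    uint8_t v = *p;   /* plain read, no barrier in front of the next line */
    force_read(p);    /* once inlined, the compiler may reuse v as the asm
                         input instead of loading *p a second time */
    return v;
}

uint8_t read_twice(const uint8_t *p)
{
    uint8_t a = read_once(p);
    uint8_t b = read_once(p);   /* separate volatile access, not merged with the first */
    return (uint8_t)(a + b);
}
```

Whether the load in plain_then_force really gets merged depends on compiler version and flags, so checking the disassembly is the only way to be sure.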
Code:
static void force_read(uint8_t *p)
{
asm volatile("" : : "r"(*p) : "memory");
}
Code:
static inline void force_read(uint8_t *p)
{
asm volatile("" : : "r"(*p) : "memory");
}
Code:
#include <stdio.h>
inline void force_read(int* address)
{
asm volatile ("": :"r"(*address):"memory" );
}
int main()
{
int a;
scanf("%d",&a);
//I'm doing some calculations with a
a *= 12432;
a += 1231;
force_read(&a); //This shouldn't actually compile to a memory read
}
Code:
.LC0:
.string "%d"
main:
subq $24, %rsp
movl $.LC0, %edi
xorl %eax, %eax
leaq 12(%rsp), %rsi
call __isoc99_scanf
imull $12432, 12(%rsp), %eax
addl $1231, %eax
movl %eax, 12(%rsp)
xorl %eax, %eax
addq $24, %rsp
ret
Code:
inline void force_read(int* address)
{
asm volatile ("mov (%[addr]) , %%rax": :[addr]"m"(address):"memory","rax" ); //Do we even need the memory clobber??
}
sunnysideup wrote: Clearly, there is no "forced read"...
Well, there is. The snippet (an empty string) is instantiated with the value you wanted in a register, namely EAX. The question is what this is even supposed to prove and how it is useful in any way. You are calling force_read() on a local variable. But local variables are in normal memory, where a forced read is both unnecessary and not useful. Or if it is, I don't know how. force_read() is useful for MMIO, where you sometimes need to read a register because that has side effects, even if you don't need the value. But that means, in terms of C, that you are reading a non-local object through a pointer. For example:
Code:
struct whatever *foo = get_foo();
force_read(&foo->reg);
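Filled out into something that compiles on its own, that could look roughly like the sketch below. The layout of struct whatever, the fake_device object standing in for a mapped register block, and ack_device are all assumptions made up for the example; on real hardware get_foo() would return a pointer into the device's MMIO range.

```c
#include <stdint.h>

/* Hypothetical register block; the real layout depends on the device. */
struct whatever {
    uint8_t reg;
    uint8_t other;
};

/* Stand-in for a mapped device so the sketch runs on ordinary memory. */
static struct whatever fake_device = { 0x5a, 0 };

struct whatever *get_foo(void)
{
    return &fake_device;
}

static void force_read(uint8_t *p)
{
    __asm__ volatile("" : : "r"(*p) : "memory");
}

/* Hypothetical helper: read the register only for its side effect. */
void ack_device(void)
{
    struct whatever *foo = get_foo();
    force_read(&foo->reg);   /* the value is discarded; the read is the point */
}
```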