What's the best assembly sequences for vm64z indexing?
Posted: Fri Apr 10, 2020 9:11 am
by blackoil
e.g.
e0 = 0;
e1 = 8;
e2 = 16;
...
Re: What's the best assembly sequences for vm64z indexing?
Posted: Fri Apr 10, 2020 9:45 am
by nullplan
You are going to have to write a little bit more than that. What is it you wish to do? If I have to google it, I can't answer your question.
Re: What's the best assembly sequences for vm64z indexing?
Posted: Fri Apr 10, 2020 10:03 am
by blackoil
to form [ gpr_base + zmm0 + displacement ], vm64z.
It's a bit slow to use instruction mov zmm0, [ vm64z_index_from_memory ]
Re: What's the best assembly sequences for vm64z indexing?
Posted: Fri Apr 10, 2020 11:29 am
by bzt
blackoil wrote: to form [ gpr_base + zmm0 + displacement ], vm64z.
It's a bit slow to use instruction mov zmm0, [ vm64z_index_from_memory ]
I lost you there. If zmm0 is supposed to be a floating point / SIMD register, then you can't use it for indexing and you can't use "mov". You have to use special instructions like "movaps" with those registers. Otherwise you can speed up the read by using only aligned values and prefetch.
Btw, with indexed addressing you can encode a scale factor of 1, 2, 4 or 8 (a left shift of up to 3 bits) and a base in a single mov instruction (like [rbx + rax*8]), and reading memory with it into a GPR is not slow at all. Read about addressing modes in the Intel spec.
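As a rough illustration of the scaled addressing mentioned above, here is a minimal C sketch of how an effective address of the form [base + index*scale + disp] is computed. The function name effective_address and its parameters are made up for this example; the real computation is done by the CPU's address generation hardware, not by code like this.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of [base + index*scale + disp].
   The scale is encoded as a shift count of 0..3, i.e. a factor of 1, 2, 4 or 8. */
static uint64_t effective_address(uint64_t base, uint64_t index,
                                  unsigned shift, int32_t disp)
{
    assert(shift <= 3); /* only four scale factors can be encoded */
    /* the displacement is sign-extended to 64 bits before the add */
    return base + (index << shift) + (uint64_t)(int64_t)disp;
}
```

With base = 0x1000, index = 2 and shift = 3, this models [rbx + rax*8] and yields 0x1010.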
Cheers,
bzt
Re: What's the best assembly sequences for vm64z indexing?
Posted: Fri Apr 10, 2020 8:17 pm
by blackoil
I used pseudocode before. This is what I mean:
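kxnorw k1, k1, k1 ; set all mask bits: the AVX-512 gather needs a write mask and clears it as elements complete
vmovdqa64 zmm0, [index64] ; vindex instruction from armv8 can do this without memory read
vgatherqpd zmm1{k1}, [ rbx + zmm0 ] ; the zmm0 contains offsets for each element of zmm1.
align 64 ; vmovdqa64 faults on an operand that is not 64-byte aligned
index64: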
dq 0
dq 16
dq 32
dq 48
dq 64
dq 80
dq 96
dq 112
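For anyone who does not have the gather semantics in their head, here is a scalar C sketch of what the load plus gather above is meant to compute, as far as I understand it. The names gather_qpd_model and make_offsets are invented for this illustration, and the model ignores the write mask and any fault behaviour of the real instruction.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: builds the offsets 0, stride, 2*stride, ...
   With stride = 16 this is the same table as index64. */
static void make_offsets(int64_t byte_offset[8], int64_t stride)
{
    for (int lane = 0; lane < 8; ++lane)
        byte_offset[lane] = lane * stride;
}

/* Hypothetical scalar model of an 8-lane gather with qword indices:
   lane i loads one double from base + byte_offset[i] (scale factor 1). */
static void gather_qpd_model(double out[8], const void *base,
                             const int64_t byte_offset[8])
{
    assert(out != NULL && base != NULL && byte_offset != NULL);
    for (int lane = 0; lane < 8; ++lane) {
        /* memcpy avoids alignment and aliasing problems in the model */
        memcpy(&out[lane], (const char *)base + byte_offset[lane],
               sizeof(double));
    }
}
```

With a stride of 16 bytes and 8-byte elements, the model picks every second double starting at the base pointer.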
Re: What's the best assembly sequences for vm64z indexing?
Posted: Mon Apr 13, 2020 10:03 am
by Octocontrabass
I don't think x86 has any way to do that without an extra memory access to load the indices.
Why is there an 8-byte gap between each of the values you want to load?