OSDev.org

Posted: **Thu Nov 13, 2008 9:02 pm**

Hey guys, I'm trying to use vectorization and the XMM registers (SSE) to speed up my program, which multiplies floating-point numbers.

I understand how I can load bits 0-63 into bits 64-127 of my register. My problem is, how can I load two unique floats into bits 0-63? Can I load one float into 0-31, then bit shift it, then load another? If so, how would I do this?

Here is what I have so far:

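   movaps (%r13), %xmm1      # load A into xmm1 (address must be 16-byte aligned)
   psllq $32, %xmm1          # shift A[0:31] to A[32:63] (shl does not accept XMM operands)
   movaps (%r13), %xmm1      # reload A; this overwrites the shifted value
   movlhps %xmm1, %xmm1      # copy A[0:63] to A[64:127]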

Posted: **Fri Nov 14, 2008 4:04 am**

Try looking up MOVD and PSHUFD
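
For anyone following along, here is a rough sketch of how MOVD and PSHUFD could be combined for this, written with C intrinsics so it is easy to compile and test. It assumes an x86-64 compiler with SSE2; the helper name `pack_ab_ab` is made up for illustration, and the extra PUNPCKLDQ step is my own addition, not something suggested in the thread.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also pulls in the SSE header */
#include <string.h>     /* memcpy, used to reinterpret float bits */

/* Hypothetical helper: build the vector [a, b, a, b] from two scalars. */
static __m128 pack_ab_ab(float a, float b)
{
    int abits, bbits;
    memcpy(&abits, &a, sizeof abits);  /* raw bit pattern of a */
    memcpy(&bbits, &b, sizeof bbits);  /* raw bit pattern of b */

    __m128i va = _mm_cvtsi32_si128(abits);    /* MOVD:      [a, 0, 0, 0] */
    __m128i vb = _mm_cvtsi32_si128(bbits);    /* MOVD:      [b, 0, 0, 0] */
    __m128i lo = _mm_unpacklo_epi32(va, vb);  /* PUNPCKLDQ: [a, b, 0, 0] */

    /* PSHUFD with immediate 0x44: lanes 0,1 are copied into lanes 2,3. */
    __m128i all = _mm_shuffle_epi32(lo, _MM_SHUFFLE(1, 0, 1, 0));

    return _mm_castsi128_ps(all);             /* reinterpret only, no instruction */
}
```

Storing the result with `_mm_storeu_ps` should show `a` in bits 0-31, `b` in bits 32-63, and the same pair repeated in bits 64-127, which is what the MOVLHPS line in the original snippet was after.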

Posted: **Fri Nov 14, 2008 8:36 am**

Will using MOVD allow me to load something into bits [0, 31] and [32, 63] without overwriting the other set of bits? That is, can I load something into [0, 31] without overwriting [32, 63]?
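
A note on this question: MOVD into an XMM register zero-extends, so bits 32-127 are cleared and a second MOVD into the same register wipes the first value. If the two floats sit next to each other in memory, one alternative is to load both at once with MOVLPS, which writes bits 0-63 and leaves bits 64-127 alone. The sketch below shows both behaviours with C intrinsics, assuming an x86-64 compiler with SSE2; the function names are hypothetical.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also pulls in the SSE header */
#include <string.h>     /* memcpy, used to reinterpret float bits */

/* MOVD: lane 0 receives the value, lanes 1-3 are zeroed. */
static __m128 movd_zero_extends(float a)
{
    int bits;
    memcpy(&bits, &a, sizeof bits);
    return _mm_castsi128_ps(_mm_cvtsi32_si128(bits));  /* [a, 0, 0, 0] */
}

/* MOVLPS: bits 0-63 come from p[0], p[1]; bits 64-127 of v are preserved. */
static __m128 load_low_pair(__m128 v, const float *p)
{
    return _mm_loadl_pi(v, (const __m64 *)p);
}

/* MOVLPS followed by MOVLHPS: [p0, p1, p0, p1]. */
static __m128 load_pair_and_mirror(const float *p)
{
    __m128 v = load_low_pair(_mm_setzero_ps(), p);
    return _mm_movelh_ps(v, v);  /* copy bits 0-63 into bits 64-127 */
}
```

With `v = [1, 2, 3, 4]` and `p = {10, 20}`, `load_low_pair` should give `[10, 20, 3, 4]`. In AT&T syntax the last helper corresponds roughly to `movlps (%r13), %xmm1` followed by `movlhps %xmm1, %xmm1`.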

Posted: **Fri Nov 14, 2008 5:13 pm**

RTFM, really

SIMD programming
