Hi everybody, I am working on a program whose speed HAS to be improved. It deals with huge four-dimensional arrays. The author of the program told me it would be very good if I could find a way to apply SIMD extensions to the algorithm; however, he also told me he is not sure whether these instructions are applicable to the algorithm. I do not want to get into the details of the algorithm and data structures. Can you give me some idea of how to apply the processor's SIMD extensions to the algorithm? I mean, I am trying to find a way to tell whether the algorithm can be made SIMD-"applicable" without rewriting it. Thanks in advance.
Arif -Ozgunh82-
simd algorithm
Re:simd algorithm
If the code is in C/C++, then just increase the compiler's optimization level; it should do it itself (provided the architecture it is compiling for supports SSE).
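To make the suggestion above concrete, here is a minimal sketch of the kind of loop a compiler has the best chance of vectorizing on its own: unit-stride access, no dependency between iterations, and `restrict` to promise that the arrays do not overlap. The function name `add_arrays` and its parameters are hypothetical and are not taken from the program being discussed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel used only for illustration.
 * 'restrict' (C99) tells the compiler that dst, a and b never alias,
 * which removes one common obstacle to auto-vectorization. */
void add_arrays(float *restrict dst, const float *restrict a,
                const float *restrict b, size_t n)
{
    assert(dst != NULL && a != NULL && b != NULL);

    /* Contiguous, independent iterations: a good vectorization candidate. */
    for (size_t i = 0; i < n; ++i)
        dst[i] = a[i] + b[i];
}
```

With GCC this could be built with something like `gcc -std=c99 -O3 -msse2 -c kernel.c` (the file name is an assumption). Recent GCC releases can report which loops were vectorized with `-fopt-info-vec`; whether a particular loop is actually vectorized depends on the compiler version and on the code itself.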
Re:simd algorithm
Do you mean the compiler automatically generates MMX instructions itself? I am not sure about that. Thank you.
- Kevin McGuire
- Member
- Posts: 843
- Joined: Tue Nov 09, 2004 12:00 am
- Location: United States
Re:simd algorithm
I think that is called auto-vectorization, and from what I hear and have seen it does not work very well right now, at least for GCC < 4.0.
I have actually seen it do almost nothing for the Rosetta@Home project when they turned on -msse. That was mainly because they ordered their data in a serial (interleaved) fashion:
[xyz][xyz][xyz]
You need to break it down into one of the following (which is abnormal, since most people like to use struct{float x, y, z;}):
[xxxx][yyyy][zzzz]
or
[xxxxxxxxxxxxx...]
[yyyyyyyyyyyy....]
[zzzzzzzzzz.......]
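The layout change described above can be sketched in C as a one-time conversion from the usual array of structures to separate x, y and z arrays. This is only an illustration: the type names `Vec3` and `Vec3SoA` and the function `aos_to_soa` are hypothetical, and the caller is assumed to have allocated the three output arrays.

```c
#include <assert.h>
#include <stddef.h>

/* Interleaved layout: [xyz][xyz][xyz]... */
typedef struct { float x, y, z; } Vec3;

/* Split layout: [xxxx...] [yyyy...] [zzzz...]
 * Each pointer must refer to caller-allocated storage for n floats. */
typedef struct {
    float *x;
    float *y;
    float *z;
    size_t n;
} Vec3SoA;

/* Copy out->n elements from the interleaved array into the split arrays. */
void aos_to_soa(const Vec3 *in, Vec3SoA *out)
{
    assert(in != NULL && out != NULL);
    assert(out->x != NULL && out->y != NULL && out->z != NULL);

    for (size_t i = 0; i < out->n; ++i) {
        out->x[i] = in[i].x;
        out->y[i] = in[i].y;
        out->z[i] = in[i].z;
    }
}
```

Once the data is split like this, four consecutive x values sit next to each other in memory and can be loaded into one SSE register with a single instruction, which is not possible with the interleaved layout.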
Then, for MSVC or GCC, use xmmintrin.h and the __m128 data type, along with the _mm_* intrinsics.
http://www.tuleriit.ee/progs/rexample.php
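As a rough sketch of what those intrinsics look like in use, the function below computes x*x + y*y + z*z for every element of three separate arrays, four elements per iteration. The function name `length_squared_sse` and the choice of operation are assumptions made for illustration; unaligned loads and stores are used so that the arrays need no special alignment, and a scalar loop handles any leftover elements.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>  /* SSE: __m128 and the _mm_* intrinsics */

/* out[i] = x[i]*x[i] + y[i]*y[i] + z[i]*z[i] for i in [0, n). */
void length_squared_sse(const float *x, const float *y, const float *z,
                        float *out, size_t n)
{
    assert(x != NULL && y != NULL && z != NULL && out != NULL);

    size_t i = 0;

    /* Vector part: process four floats per iteration. */
    for (; i + 4 <= n; i += 4) {
        __m128 vx = _mm_loadu_ps(x + i);
        __m128 vy = _mm_loadu_ps(y + i);
        __m128 vz = _mm_loadu_ps(z + i);

        __m128 sum = _mm_add_ps(_mm_add_ps(_mm_mul_ps(vx, vx),
                                           _mm_mul_ps(vy, vy)),
                                _mm_mul_ps(vz, vz));

        _mm_storeu_ps(out + i, sum);
    }

    /* Scalar tail: the remaining 0 to 3 elements. */
    for (; i < n; ++i)
        out[i] = x[i] * x[i] + y[i] * y[i] + z[i] * z[i];
}
```

If the arrays are known to be 16-byte aligned, `_mm_load_ps` and `_mm_store_ps` can replace the unaligned variants. On 32-bit x86 targets GCC needs `-msse` for this to compile; on x86-64 SSE is available by default.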