Cg vs. glsl vs. ARB for GPGPU
Hey!
I was just hacking around, trying out GPGPU for fun, starting with trying to speed up a naive Bayesian classifier.
Looks like I have two portable options: nVidia's Cg and 3DLabs' GLSL. ARB assembly is portable too, but I'm not so sure about it because of the lack of branching.
I'm not thinking about optimizations right now (I don't even know what's possible); if I go the Cg/GLSL route, I assume the compilers would handle that for me.
Does anyone have experience with either or both? Which one looks more promising for automatic code generation from an AST of operations? Or is good old ARB the best choice for automatic code gen?
Thanks in advance
--
prashant
Re: Cg vs. glsl vs. ARB for GPGPU
GLSL has been part of the OpenGL standard since 2.0, so that's your best bet if you want to be portable across cards. Cg compiles down to GLSL or ARB programs, so that works in your favor too.
I'm not sure ARB programs are even supported anymore; at least I'm not aware of any new development on them in the last three years or so.
Of course, if you have an nVidia GeForce 8/9, there's always CUDA.
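If you want to see what a given card/driver actually exposes, something like this works (untested plain-C sketch; it assumes a current GL context already exists, and older gl.h headers may need glext.h for the GLSL version enum):

    #include <stdio.h>
    #include <GL/gl.h>
    /* GL_SHADING_LANGUAGE_VERSION may live in <GL/glext.h> on older headers */

    /* Print the GL and GLSL versions the current context supports.
       glGetString returns NULL if no context is current. */
    void print_gl_versions(void)
    {
        const GLubyte *gl = glGetString(GL_VERSION);
        const GLubyte *sl = glGetString(GL_SHADING_LANGUAGE_VERSION);
        printf("GL:   %s\n", gl ? (const char *)gl : "no current context?");
        printf("GLSL: %s\n", sl ? (const char *)sl : "not available (pre-2.0 driver)");
    }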
Re: Cg vs. glsl vs. ARB for GPGPU
I have an nVidia 8800, but I'm not sure I want to use CUDA; I want this thing to be 100% portable.
Looks like GLSL is the way to go. I'm following some nice GLSL tutorials and should be done with this small project pretty soon.
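Just to make the target concrete, here's roughly the kind of kernel I expect my code generator to spit out, embedded as a C string the way the tutorials do it. This is only a sketch: the texture layout, uniform names and feature count are all made up, and it assumes discretized features with per-class log-probabilities baked into a lookup texture (one pass per class, argmax done on the CPU after readback):

    /* Hypothetical generated kernel: per-sample log-posterior for ONE class.
       samples:  one sample per texel row, feature i at x = (i + 0.5) / N
       logProbs: log P(feature_i = v | class), indexed by (feature, value) */
    static const char *naive_bayes_frag =
        "#version 110\n"
        "#define NUM_FEATURES 8\n"
        "uniform sampler2D samples;\n"
        "uniform sampler2D logProbs;\n"
        "uniform float logPrior;      /* log P(class) */\n"
        "void main() {\n"
        "    float acc = logPrior;\n"
        "    for (int i = 0; i < NUM_FEATURES; ++i) {\n"
        "        float fx = (float(i) + 0.5) / float(NUM_FEATURES);\n"
        "        float v  = texture2D(samples, vec2(fx, gl_TexCoord[0].t)).r;\n"
        "        acc += texture2D(logProbs, vec2(fx, v)).r;\n"
        "    }\n"
        "    gl_FragColor = vec4(acc);\n"
        "}\n";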
This brings up some more interesting thoughts:
Given a Java/CIL binary, is it possible to use the vectorization techniques currently applied for MMX/SSE2/3 (with some tweaks where needed) and have the VM offload the work to the GPU, much like Microsoft Accelerator does? The problem of frequent texture uploads/downloads seems like a hard one; I wonder whether it's solvable at all.
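To be clear about what I mean by the upload/download problem, the per-"kernel launch" round trip looks roughly like this in C (a sketch only; it assumes the GL context, an FBO bound as the render target, and the shader are already set up, and the names are made up):

    #include <GL/gl.h>

    /* One GPGPU "kernel launch": upload inputs, draw a full-screen quad so
       the fragment shader runs once per output texel, then read back.
       Steps 1 and 3 cross the bus, which is what hurts if the VM did this
       for every little array operation. */
    void run_pass(GLuint input_tex, int w, int h,
                  const float *input, float *output)
    {
        /* 1. upload: CPU -> GPU */
        glBindTexture(GL_TEXTURE_2D, input_tex);
        glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, w, h,
                        GL_LUMINANCE, GL_FLOAT, input);

        /* 2. compute: rasterize a quad covering the output */
        glBegin(GL_QUADS);
        glTexCoord2f(0.0f, 0.0f); glVertex2f(-1.0f, -1.0f);
        glTexCoord2f(1.0f, 0.0f); glVertex2f( 1.0f, -1.0f);
        glTexCoord2f(1.0f, 1.0f); glVertex2f( 1.0f,  1.0f);
        glTexCoord2f(0.0f, 1.0f); glVertex2f(-1.0f,  1.0f);
        glEnd();

        /* 3. download: GPU -> CPU */
        glReadPixels(0, 0, w, h, GL_LUMINANCE, GL_FLOAT, output);
    }

My guess is the VM would have to fuse operations and keep intermediates resident in textures, only reading back at the end, rather than paying for steps 1 and 3 on every operation; whether that's feasible starting from a Java/CIL binary is the part I'm unsure about.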
What do you think?
--
prashant