I am writing a library that uses a surface (to re-sample and write to a texture) for a performance gain:
...
surface<void, 2> my_surf2D; //allows writing to a texture
...
The target platform GPU has compute capability 2.0 and I can compile my code with:
nvcc -arch=sm_20 ...
and it works just fine.
The problem arises when I try to develop and debug the library on my laptop, which has an NVIDIA ION GPU with compute capability 1.1 (I would also like my library to be backwards compatible). I know this architecture does not support surfaces, so I used the __CUDA_ARCH__ macro in my device code to define an alternate code path for the older architecture:
#if (__CUDA_ARCH__ < 200)
#warning using kernel for CUDA ARCH < 2.0
...
temp_array[...] = tex3D(my_tex,X,Y,Z+0.5f);
#else
...
surf2Dwrite( tex3D(my_tex,X,Y,Z+0.5f), my_surf2D, ix*4, iy,cudaBoundaryModeTrap);
#endif
However, when I compile with:
nvcc -gencode arch=compute_11,code=sm_11
I get this error:
ptxas PTX/myLibrary.ptx, line 1784; fatal : Parsing error near '.surf': syntax error
When I look at the PTX file, I see what appears to be the surface declaration:
.surf .u32 _ZN16LIB_15my_surf2DE;
If I try to put a similar macro around the surface declaration in my source code:
#ifdef __CUDACC__
#if __CUDA_ARCH__ < 200
#warning skipping surface declaration for nvcc trajectory
#else
surface ...
#endif
#else
#warning keeping surface declaration by default
surface ...
#endif
I get an error saying the surface variable is undefined in the host code, at the call that binds the CUDA surface to an array. Should I add the macro around the bind call as well?
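To illustrate what the host pass may be seeing, here is a minimal sketch that builds with a plain C++ compiler. It assumes that __CUDA_ARCH__ is undefined outside device compilation, so the preprocessor evaluates it as 0 inside #if. The macro and function names are hypothetical and are not part of my library.

```cpp
#include <cassert>

// Assumption: in nvcc's host pass, as in any plain C++ compiler,
// __CUDA_ARCH__ is not defined. An undefined identifier inside #if
// evaluates to 0, so the comparison below is true on the host.
#if __CUDA_ARCH__ < 200
#define NAIVE_GUARD_SKIPS_SURFACE 1  // hypothetical name: declaration would be skipped
#else
#define NAIVE_GUARD_SKIPS_SURFACE 0
#endif

// Stricter guard: skip only when compiling device code for an old architecture.
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ < 200)
#define STRICT_GUARD_SKIPS_SURFACE 1
#else
#define STRICT_GUARD_SKIPS_SURFACE 0
#endif

// Hypothetical helpers that expose the preprocessor result at run time.
inline int naive_guard_skips_surface() { return NAIVE_GUARD_SKIPS_SURFACE; }
inline int strict_guard_skips_surface() { return STRICT_GUARD_SKIPS_SURFACE; }
```

If this is right, the guard I wrote hides the surface declaration from the host code as well, which would match the "undefined" error. I have not verified whether the stricter guard alone is enough to avoid the ptxas error when targeting sm_11.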
I'm not sure whether this is possible or whether I goofed somewhere. Please help.