My CUDA kernel looks like this:
// Arguments are parenthesized so an expression such as a + b expands correctly.
#define MY_AWESOME_MACRO(foo, bar) ((foo) * (bar) * 123 + 456)
__global__ void my_CUDA_kernel(int* cool, float* beans) {
// Some computation.
}
Should I define my macro inside or outside of the function? I searched around and found examples of both placements. Is there any downside to doing it one way or the other?
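To make the two placements concrete, here is a minimal sketch of what I am comparing. It is plain host-side C++ rather than a kernel so it can be compiled without a GPU; as far as I understand, nvcc runs the same preprocessor, so the behaviour should carry over to `__global__` functions. The names `SCALE_OUTSIDE`, `SCALE_INSIDE`, and the three functions are made up for illustration. The point it shows is that the preprocessor does not know about braces, so a macro defined inside a function body stays visible after that function ends unless it is removed with `#undef`.

```cpp
// Placement 1: defined outside any function.
// Visible from this line to the end of the translation unit.
#define SCALE_OUTSIDE(foo, bar) ((foo) * (bar) * 123 + 456)

int uses_outside(int a, int b) {
    return SCALE_OUTSIDE(a, b);
}

int defines_inside(int a, int b) {
    // Placement 2: defined inside a function body.
    // The braces of the function do NOT limit the macro's visibility.
#define SCALE_INSIDE(foo, bar) ((foo) * (bar) * 123 + 456)
    return SCALE_INSIDE(a, b);
}

// SCALE_INSIDE is still defined here, after the closing brace above.
int uses_inside_later(int a, int b) {
    return SCALE_INSIDE(a, b);
}

// To actually limit the macro to a region of the file, undefine it explicitly.
#undef SCALE_INSIDE
```

If that reading is right, the two placements expand to identical code, and the only practical difference is where the definition sits in the file and whether it is followed by an `#undef`.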