cuda - Bound CUDA texture reads zero

Question

I'm trying to read values from a texture and write them back to global memory. I'm sure the writing part works, because I can put constant values in the kernel and see them in the output.

__global__ void
bartureKernel( float* g_odata, int width, int height) 
{
    unsigned int x = blockIdx.x*blockDim.x + threadIdx.x;
    unsigned int y = blockIdx.y*blockDim.y + threadIdx.y;

    if(x < width && y < height) {
            unsigned int idx = (y*width + x);
            g_odata[idx] = tex2D(texGrad, (float)x, (float)y).x;

    }
}

The texture I want to use is a 2D float texture with two channels, so I defined it as:

texture<float2, 2, cudaReadModeElementType> texGrad;

And the code that calls the kernel initializes the texture with a non-zero constant value:

float* d_data_grad = NULL;

cudaMalloc((void**) &d_data_grad, gradientSize * sizeof(float));
CHECK_CUDA_ERROR;

texGrad.addressMode[0] = cudaAddressModeClamp;
texGrad.addressMode[1] = cudaAddressModeClamp;
texGrad.filterMode = cudaFilterModeLinear;
texGrad.normalized = false;

cudaMemset(d_data_grad, 50, gradientSize * sizeof(float));
CHECK_CUDA_ERROR;

cudaBindTexture(NULL, texGrad, d_data_grad, cudaCreateChannelDesc<float2>(), gradientSize * sizeof(float));

float* d_data_barture = NULL;
cudaMalloc((void**) &d_data_barture, outputSize * sizeof(float));
CHECK_CUDA_ERROR;

dim3 dimBlock(8, 8, 1);
dim3 dimGrid( ((width-1) / dimBlock.x)+1, ((height-1) / dimBlock.y)+1, 1);

bartureKernel<<< dimGrid, dimBlock, 0 >>>( d_data_barture, width, height);

Setting every byte of the texture to 50 doesn't make much sense in the context of floats, but it should at least give me non-zero values to read.
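As a quick sanity check of that expectation, the value below is what each float would hold after the memset, assuming IEEE-754 single precision. Every byte is 50 (0x32), so each 32-bit word is 0x32323232: sign bit 0, exponent field 0x64 = 100, mantissa field 0x323232.

```latex
\[
  \texttt{0x32323232}
  \;\longrightarrow\;
  2^{\,100-127}\left(1 + \frac{\texttt{0x323232}}{2^{23}}\right)
  = 2^{-27}\left(1 + \frac{3289650}{8388608}\right)
  \approx 1.04 \times 10^{-8}
\]
```

So a correct read would return a very small but non-zero number, not exactly zero.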

But all I read is zeros...

score 8 · Accepted Answer

You are using cudaBindTexture to bind your texture to memory allocated by cudaMalloc. In the kernel, you are using the tex2D function to read values from the texture. That is why it is reading zeros.

If you bind a texture to linear memory using cudaBindTexture, it is read inside the kernel using tex1Dfetch.
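A minimal sketch of that pairing follows, keeping cudaMalloc and cudaBindTexture and switching the read to tex1Dfetch. It assumes a toolkit that still provides the legacy texture reference API used in the question (older than CUDA 12). The names texGradLinear and fetchKernel, the sizes, and the constant values are hypothetical, and error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// 1D texture reference over linear memory; each texel is one float2.
// (texGradLinear and fetchKernel are hypothetical names for this sketch.)
texture<float2, 1, cudaReadModeElementType> texGradLinear;

__global__ void fetchKernel(float* out, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height) {
        int idx = y * width + x;
        // tex1Dfetch takes an integer texel index; no filtering is applied.
        out[idx] = tex1Dfetch(texGradLinear, idx).x;
    }
}

int main()
{
    const int width = 64, height = 32;             // hypothetical sizes
    const size_t texels = (size_t)width * height;

    // Host data: a non-zero constant in both channels.
    std::vector<float2> h_grad(texels, make_float2(50.0f, -50.0f));

    float2* d_grad = NULL;
    cudaMalloc((void**)&d_grad, texels * sizeof(float2));
    cudaMemcpy(d_grad, h_grad.data(), texels * sizeof(float2),
               cudaMemcpyHostToDevice);

    // Bind the linear allocation; the last argument is a size in bytes.
    cudaBindTexture(NULL, texGradLinear, d_grad,
                    cudaCreateChannelDesc<float2>(), texels * sizeof(float2));

    float* d_out = NULL;
    cudaMalloc((void**)&d_out, texels * sizeof(float));

    dim3 block(8, 8, 1);
    dim3 grid((width - 1) / block.x + 1, (height - 1) / block.y + 1, 1);
    fetchKernel<<<grid, block>>>(d_out, width, height);

    std::vector<float> h_out(texels);
    cudaMemcpy(h_out.data(), d_out, texels * sizeof(float),
               cudaMemcpyDeviceToHost);
    printf("first value read: %f\n", h_out[0]);

    cudaUnbindTexture(texGradLinear);
    cudaFree(d_out);
    cudaFree(d_grad);
    return 0;
}
```

Note that the allocation and the bound size are both counted in float2 elements here, so the texel type and the buffer agree.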

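The tex2D function is used to read only from textures that are bound to pitch linear memory (allocated by cudaMallocPitch) using cudaBindTexture2D, or from textures that are bound to a cudaArray using cudaBindTextureToArray.

Here is a basic table. The rest you can read in the programming guide.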

Memory type           Allocated using      Bound using              Read in kernel by
Linear memory         cudaMalloc           cudaBindTexture          tex1Dfetch
Pitch linear memory   cudaMallocPitch      cudaBindTexture2D        tex2D
cudaArray             cudaMallocArray      cudaBindTextureToArray   tex1D or tex2D
3D cudaArray          cudaMalloc3DArray    cudaBindTextureToArray   tex3D
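To illustrate the pitch linear row, here is a minimal sketch that keeps the tex2D read from the question and changes the allocation and binding to match it. It assumes a toolkit that still provides the legacy texture reference API (older than CUDA 12). The name readKernel, the sizes, and the constant values are hypothetical, and error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// 2D texture reference with two float channels, declared as in the question.
texture<float2, 2, cudaReadModeElementType> texGrad;

// readKernel is a hypothetical name for this sketch.
__global__ void readKernel(float* out, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height) {
        // +0.5f addresses the texel centre, so linear filtering
        // returns the stored value instead of a blend of neighbours.
        out[y * width + x] = tex2D(texGrad, x + 0.5f, y + 0.5f).x;
    }
}

int main()
{
    const int width = 64, height = 32;             // hypothetical sizes
    const size_t texels = (size_t)width * height;

    // Host data: a non-zero constant in both channels.
    std::vector<float2> h_grad(texels, make_float2(50.0f, -50.0f));

    // Pitch linear allocation: rows are padded to the returned pitch (bytes).
    float2* d_grad = NULL;
    size_t pitch = 0;
    cudaMallocPitch((void**)&d_grad, &pitch, width * sizeof(float2), height);
    cudaMemcpy2D(d_grad, pitch, h_grad.data(), width * sizeof(float2),
                 width * sizeof(float2), height, cudaMemcpyHostToDevice);

    texGrad.addressMode[0] = cudaAddressModeClamp;
    texGrad.addressMode[1] = cudaAddressModeClamp;
    texGrad.filterMode = cudaFilterModeLinear;
    texGrad.normalized = false;

    // Width and height are in texels, pitch is in bytes.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float2>();
    cudaBindTexture2D(NULL, texGrad, d_grad, desc, width, height, pitch);

    float* d_out = NULL;
    cudaMalloc((void**)&d_out, texels * sizeof(float));

    dim3 block(8, 8, 1);
    dim3 grid((width - 1) / block.x + 1, (height - 1) / block.y + 1, 1);
    readKernel<<<grid, block>>>(d_out, width, height);

    std::vector<float> h_out(texels);
    cudaMemcpy(h_out.data(), d_out, texels * sizeof(float),
               cudaMemcpyDeviceToHost);
    printf("first value read: %f\n", h_out[0]);

    cudaUnbindTexture(texGrad);
    cudaFree(d_out);
    cudaFree(d_grad);
    return 0;
}
```

Binding to a cudaArray with cudaBindTextureToArray would be the other route to tex2D; the pitch linear variant is shown only because it stays closest to the original code.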

score 3 · Answer

To add on, tex1Dfetch takes an integer index. The other fetch functions take floating-point coordinates, and with unnormalized coordinates and linear filtering you have to add 0.5 to the integer texel position to hit the texel center and get the exact stored value.
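The reason for the 0.5 offset can be read off the linear filtering formula given in the texture fetching section of the CUDA programming guide. For a 2D texture T with unnormalized coordinates (x, y), the fetch returns:

```latex
\[
\begin{aligned}
  \mathrm{tex}(x, y) ={}& (1-\alpha)(1-\beta)\,T[i,\,j] + \alpha(1-\beta)\,T[i+1,\,j] \\
                        &+ (1-\alpha)\beta\,T[i,\,j+1] + \alpha\beta\,T[i+1,\,j+1],
\end{aligned}
\]
\[
  x_B = x - 0.5,\quad i = \lfloor x_B \rfloor,\quad \alpha = x_B - i,
  \qquad
  y_B = y - 0.5,\quad j = \lfloor y_B \rfloor,\quad \beta = y_B - j .
\]
```

For example, x = 3 gives x_B = 2.5, so i = 2 and alpha = 0.5, which is an equal blend of texels 2 and 3. With x = 3.5, x_B = 3, so i = 3 and alpha = 0, and the fetch returns texel 3 unchanged.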

I'm curious why you allocate a float buffer and bind it to a float2 texture. It may give ambiguous results. A float2 is not a 2D float texture; it is a pair of floats, which can for example be used to represent a complex number.

typedef struct {float x; float y;} float2;

I think this tutorial will help you understand how to use texture memory in CUDA: http://www.drdobbs.com/parallel/cuda-supercomputing-for-the-masses-part/218100902

The kernel you have shown does not benefit much from using a texture. However, when used properly, by exploiting locality, texture memory can improve performance considerably. It is also useful for interpolation.
