Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

c++ - 2d char array to CUDA kernel

I need help transferring a char[][] array to a CUDA kernel. This is my code:

__global__ 
void kernel(char** BiExponent){
  for(int i=0; i<500; i++)
     printf("%c",BiExponent[1][i]); // I want print line 1
}

int main(){
  char (*Bi2dChar)[500] = new char [5000][500];
  char **dev_Bi2dChar;

  ...//HERE I INPUT DATA TO Bi2dChar

  size_t host_orig_pitch = 500 * sizeof(char);
  size_t pitch;
  cudaMallocPitch((void**)&dev_Bi2dChar, &pitch, 500 * sizeof(char), 5000);
  cudaMemcpy2D(dev_Bi2dChar, pitch, Bi2dChar, host_orig_pitch, 500 * sizeof(char), 5000, cudaMemcpyHostToDevice);
  kernel <<< 1, 512 >>> (dev_Bi2dChar);
  free(Bi2dChar); cudaFree(dev_Bi2dChar);
}

I use: nvcc.exe" -gencode=arch=compute_20,code="sm_20,compute_20" --use-local-env --cl-version 2012 -ccbin

Thanks for your help.


1 Reply


Despite its name, cudaMemcpy2D doesn't handle 2-dimensional (i.e. double pointer, **) arrays in C. It copies a 2D region of linear memory, and its documentation specifies single pointers for both source and destination, not double pointers.

Generally speaking, moving arbitrary double-pointer C arrays between the host and the device is more complicated than moving a single-pointer array, because every row pointer stored in the array must itself be a valid device address.

If you really want to handle the double-pointer array, search for "CUDA 2D Array" and you'll find various examples of how to do it (for example, the answer given by @talonmies).

Often, an easier approach is simply to "flatten" the array so it can be referenced by a single pointer, i.e. char[] instead of char[][], and then use index arithmetic to simulate 2-dimensional access.
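The index arithmetic can be checked on the host before involving the GPU. The following is only a minimal sketch: flat_index and flattening_matches are hypothetical helper names that do not appear in the question, and the 4x6 grid is an arbitrary small size chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Row-major offset of element (row, col) in a flattened array whose rows
// are `width` elements long. Note that the multiplier is the row WIDTH,
// not the number of rows.
inline std::size_t flat_index(std::size_t row, std::size_t col, std::size_t width) {
    assert(col < width);  // a column past the row end would alias the next row
    return row * width + col;
}

// Fills a small char[4][6], copies it byte for byte into a flat buffer,
// and checks that flat[row * width + col] matches grid[row][col] everywhere.
inline bool flattening_matches() {
    const std::size_t rows = 4, width = 6;
    char grid[4][6];
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < width; ++c)
            grid[r][c] = static_cast<char>('a' + r * width + c);

    // A char[rows][width] array is contiguous, so one memcpy flattens it.
    std::vector<char> flat(rows * width);
    std::memcpy(flat.data(), grid, sizeof(grid));

    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < width; ++c)
            if (flat[flat_index(r, c, width)] != grid[r][c])
                return false;
    return true;
}
```

With the sizes from the question (5000 rows of 500 characters), line 1 therefore starts at offset flat_index(1, 0, 500), which is 500, not 5000.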

Your flattened code would look something like this: (the code you provided is an uncompilable, incomplete snippet, so mine is also)

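#define XDIM 5000   // number of rows (lines)
#define YDIM 500    // characters per row

__global__ 
void kernel(char* BiExponent){
  for(int i=0; i<YDIM; i++)
     printf("%c",BiExponent[(1*YDIM)+i]); // print line 1: row index times row width, plus column
}

int main(){
  char (*Bi2dChar)[YDIM] = new char [XDIM][YDIM];
  char *dev_Bi2dChar;

  ...//HERE I INPUT DATA TO Bi2dChar

  cudaMalloc((void**)&dev_Bi2dChar,XDIM*YDIM * sizeof(char));
  // cudaMemcpy takes (dst, src, byte count, kind); it has no pitch argument
  cudaMemcpy(dev_Bi2dChar, &(Bi2dChar[0][0]), XDIM*YDIM * sizeof(char), cudaMemcpyHostToDevice);
  kernel <<< 1, 1 >>> (dev_Bi2dChar); // one thread, so the line is printed once
  cudaDeviceSynchronize(); // wait for the kernel so the device printf output is flushed
  delete[] Bi2dChar; cudaFree(dev_Bi2dChar); // memory from new[] must be released with delete[]
}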

If you want a pitched array, you can create it similarly with cudaMallocPitch and cudaMemcpy2D, but you will still be working with single-pointer arrays, not double-pointer arrays.
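A pitched version might look roughly like the sketch below. It is an illustration under assumptions rather than a drop-in fix: ROWS, COLS, print_row and the placeholder fill data are hypothetical names and values that are not in the question, and error handling is reduced to a single check. The key points are that the device pointer is a plain char*, and that the kernel steps between rows using the pitch returned by cudaMallocPitch, which may be larger than the row width.

```cuda
#include <cstdio>
#include <cstring>
#include <cuda_runtime.h>

#define ROWS 5000   // number of lines (hypothetical name)
#define COLS 500    // characters per line (hypothetical name)

// Prints one row of a pitched allocation. `pitch` is the row stride in
// bytes as returned by cudaMallocPitch; it can exceed COLS due to padding.
__global__ void print_row(const char* base, size_t pitch, int row) {
    const char* line = base + row * pitch;   // start of the requested row
    for (int i = 0; i < COLS; i++)
        printf("%c", line[i]);
}

int main() {
    char (*host)[COLS] = new char[ROWS][COLS];
    memset(host, 'a', ROWS * COLS);          // placeholder data for illustration
    memset(host[1], 'b', COLS);              // make row 1 distinguishable

    char* dev = nullptr;                     // single pointer, not char**
    size_t pitch = 0;
    cudaError_t err = cudaMallocPitch((void**)&dev, &pitch,
                                      COLS * sizeof(char), ROWS);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMallocPitch failed: %s\n", cudaGetErrorString(err));
        delete[] host;
        return 1;
    }

    // The host rows are tightly packed, so the source pitch is COLS bytes;
    // the destination pitch is whatever cudaMallocPitch reported.
    cudaMemcpy2D(dev, pitch, host, COLS * sizeof(char),
                 COLS * sizeof(char), ROWS, cudaMemcpyHostToDevice);

    print_row<<<1, 1>>>(dev, pitch, 1);      // one thread prints row 1
    cudaDeviceSynchronize();                 // flush device-side printf

    cudaFree(dev);
    delete[] host;
    return 0;
}
```

Passing the pitch to the kernel as a separate argument is the usual design here, because the padded row stride is only known at run time.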

