Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


Parallel processing - How do I compile a .cu file that uses dynamic parallelism for use with MATLAB?

I'm trying to run a simple example of dynamic parallelism in CUDA. The code in the .cu file is:

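// Child kernel: each thread increments its own element.
__global__ void child_launch(int *data) {
   data[threadIdx.x] = data[threadIdx.x]+1;
}

// Parent kernel: each thread writes its index, then thread 0 launches the child grid.
__global__ void parent_launch(int *data) {
   data[threadIdx.x] = threadIdx.x;

   // Make sure every thread in the block has written before the child is launched.
   __syncthreads();

   if (threadIdx.x == 0) {
       child_launch<<< 1, 256 >>>(data);
       // Wait for the child grid to finish.
       cudaDeviceSynchronize();
   }

   __syncthreads();
}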

Here parent_launch is the kernel I want MATLAB to run. Any thread of parent_launch can launch a grid of blocks running child_launch; in this example only thread 0 does so.

I tried to run it by compiling the .cu file into a .ptx file and then executing the following commands in MATLAB:

   k = parallel.gpu.CUDAKernel('file_name.ptx', 'file_name.cu');
   k.GridSize = [1,256];
   k.ThreadBlockSize = [1,256];
   r1 = feval(k, data); % data is an int32 array on the GPU

The problem is that when I try to compile the .cu file, I get the following error:

error: kernel launch from __device__ or __global__ functions requires separate compilation mode

Does anyone know how to fix it?
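
For context on the error: nvcc refuses a kernel launch inside device code unless relocatable device code is enabled (-rdc=true), and the result then has to be linked against the device runtime library (cudadevrt). Below is a minimal standalone sketch of the same two kernels with a host main(), so the build flags can be checked outside MATLAB. The file name dynpar_sketch.cu, the sm_70 architecture, and the host-side check are assumptions for illustration and are not part of my original setup. The device-side cudaDeviceSynchronize() is left out in this sketch because it was deprecated in CUDA 11.6 and is unavailable in newer toolkits, and the parent does not read the child's results anyway.

```cuda
// Standalone sketch, not the MATLAB workflow.
// Assumed build line (replace sm_70 with your GPU's architecture;
// dynamic parallelism needs compute capability 3.5 or newer):
//   nvcc -arch=sm_70 -rdc=true dynpar_sketch.cu -o dynpar_sketch -lcudadevrt
#include <cstdio>
#include <cuda_runtime.h>

#define N 256

// Child kernel: each thread increments its own element.
__global__ void child_launch(int *data) {
    data[threadIdx.x] = data[threadIdx.x] + 1;
}

// Parent kernel: each thread writes its index, then thread 0 launches the child.
__global__ void parent_launch(int *data) {
    data[threadIdx.x] = threadIdx.x;

    // All writes by this block are complete before the child grid is launched.
    __syncthreads();

    if (threadIdx.x == 0) {
        // The parent grid is not considered finished until this child grid
        // has finished, so the host-side synchronize below covers both.
        child_launch<<<1, N>>>(data);
    }
}

int main() {
    int host[N];
    int *dev = nullptr;

    cudaError_t err = cudaMalloc(&dev, N * sizeof(int));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    parent_launch<<<1, N>>>(dev);

    // Waits for the parent grid and, implicitly, its child grid.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        cudaFree(dev);
        return 1;
    }

    err = cudaMemcpy(host, dev, sizeof(host), cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemcpy failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Each element should hold its index plus one if both kernels ran.
    int mismatches = 0;
    for (int i = 0; i < N; ++i) {
        if (host[i] != i + 1) {
            ++mismatches;
        }
    }
    printf("mismatches: %d\n", mismatches);
    return mismatches != 0;
}
```

Whether a PTX file built with -rdc=true can be loaded through parallel.gpu.CUDAKernel is a separate question that this sketch does not settle. MATLAB's mexcuda command documents a -dynamic option for MEX files that use dynamic parallelism, which may be the more practical route.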



1 Reply

Waiting for an expert to answer.
