tensorflow2.0 - TensorFlow GPU profiling

I am training a model using the TF Keras API. The issue I am having is that I cannot maximise GPU usage: the GPU is under-utilised in both memory and compute.

When profiling the model, I can see a lot of operations labelled _Send, which I assume are data transfers between the GPU and CPU.

Since I am using Keras, I am not placing variables on devices directly, so I am not clear on why this is occurring or how to optimise it.

Another interesting side effect is that larger batches seem to make training slower, with long waits while the GPU gets data from the CPU.
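For context on the waits described above, here is a minimal sketch of an input pipeline that overlaps CPU-side batch preparation with GPU compute. It assumes the training data fits in memory as arrays; the helper name build_input_pipeline and its parameters are hypothetical and not taken from my actual code.

```python
def build_input_pipeline(features, labels, batch_size, shuffle_buffer=10_000):
    """Return a batched, prefetched tf.data.Dataset (hypothetical helper)."""
    # Imported inside the function so that defining the helper does not
    # itself load TensorFlow.
    import tensorflow as tf

    dataset = tf.data.Dataset.from_tensor_slices((features, labels))
    dataset = dataset.shuffle(shuffle_buffer)
    dataset = dataset.batch(batch_size)
    # prefetch lets the CPU prepare upcoming batches while the GPU runs the
    # current step. tf.data.AUTOTUNE needs TF 2.4+; older 2.x releases
    # expose it as tf.data.experimental.AUTOTUNE.
    return dataset.prefetch(tf.data.AUTOTUNE)
```

The returned dataset would be passed straight to model.fit, without a separate batch_size argument, since batching is already done in the pipeline. Whether this removes the waits depends on where the bottleneck really is.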

The profiler also suggests:

59.4 % of the total step time sampled is spent on 'Kernel Launch'. It could be due to CPU contention with tf.data. In this case, you may try to set the environment variable TF_GPU_THREAD_MODE=gpu_private.

I have set this environment variable at the top of the notebook, with no effect. I am also not clear on how to check whether it is having the intended effect.
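A minimal sketch of how the variable could be set and sanity-checked in a notebook. It assumes the setting is only picked up when TensorFlow initialises the GPU, so the safest place for it is before the first import of tensorflow. The check only confirms the ordering, not that the runtime actually honoured the setting.

```python
import os
import sys

# Record whether TensorFlow had already been imported in this process.
# If it had, setting the variable now may be too late to affect GPU setup.
tf_already_imported = "tensorflow" in sys.modules

# Variable name and value are taken from the profiler's suggestion.
os.environ["TF_GPU_THREAD_MODE"] = "gpu_private"

if tf_already_imported:
    print("TensorFlow was imported before the variable was set; "
          "restart the kernel and run this cell first.")
else:
    print("TF_GPU_THREAD_MODE =", os.environ["TF_GPU_THREAD_MODE"])
```

After this cell, re-running the profiler and comparing the share of step time attributed to 'Kernel Launch' would be one way to judge whether the setting changed anything.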

Your help here would be greatly appreciated. I have read all the available guides in the TensorFlow docs.

Question from: https://stackoverflow.com/questions/65875612/tensorflow-gpu-profiling
