Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


tensorflow2.0 - Tensorflow GPU profiling

I am training a model using the TF Keras API. The issue is that I cannot maximise GPU usage: the GPU is under-utilised in both memory and compute.

When profiling the model, I can see a lot of operations labelled _Send, which I assume are data transfers between the GPU and the CPU.

Since I am using Keras, I am not placing variables on a device explicitly, so I am not clear on why this is occurring or how to optimise it.
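One way to see where ops actually end up is to turn on device placement logging. The following is a minimal sketch, not a diagnosis of the model in question: it assumes TensorFlow 2.x, and the small matmul is only a hypothetical stand-in for real model ops. _Send ops generally appear where a tensor crosses a device boundary, so ops that unexpectedly land on the CPU are a reasonable first thing to look for.

```python
import tensorflow as tf

# Log the device each op is placed on. Enable this before any ops are created.
tf.debugging.set_log_device_placement(True)

# List the GPUs TensorFlow can see; an empty list means everything runs on CPU.
gpus = tf.config.list_physical_devices("GPU")
print("Visible GPUs:", gpus)

# A tiny stand-in op; its placement is logged when it executes.
a = tf.random.uniform((4, 4))
b = tf.matmul(a, a)
print("matmul ran on:", b.device)
```

The placement log is verbose, so it is probably best enabled for a single training step rather than a full run.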

Another interesting side effect is that larger batches seem to make training slower, with very long waits for the GPU to get data from the CPU.
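Long GPU waits for data often point at the input pipeline. The question does not say how the data is fed, so the sketch below is purely illustrative: it assumes in-memory NumPy arrays (random stand-in data), an assumed batch size of 256, and TensorFlow 2.4 or later for tf.data.AUTOTUNE. It shows a tf.data pipeline that prefetches batches, so the CPU prepares the next batch while the GPU is busy with the current one.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in data: 1,024 samples with 32 features each.
features = np.random.rand(1024, 32).astype("float32")
labels = np.random.randint(0, 2, size=(1024,)).astype("int32")

BATCH_SIZE = 256  # assumed value; tune for the real model

dataset = (
    tf.data.Dataset.from_tensor_slices((features, labels))
    .shuffle(buffer_size=1024)
    .batch(BATCH_SIZE)
    # Prepare upcoming batches on the CPU while the GPU works on the current one.
    .prefetch(tf.data.AUTOTUNE)  # tf.data.experimental.AUTOTUNE before TF 2.4
)

# Pull one batch to confirm the pipeline yields what the model expects.
first_x, first_y = next(iter(dataset))
print(first_x.shape, first_y.shape)
```

A dataset built this way can be passed directly to model.fit(dataset, epochs=...). Whether it helps here depends on where the profiler shows the time going.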

The profiler also suggests:

59.4 % of the total step time sampled is spent on 'Kernel Launch'. It could be due to CPU contention with tf.data. In this case, you may try to set the environment variable TF_GPU_THREAD_MODE=gpu_private.

I have set this environment variable at the top of the notebook, with no visible effect, and I am not clear on how to check whether it is actually being applied.
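For reference, a minimal sketch of setting the variable from inside a notebook. It assumes the variable is read when TensorFlow initialises the GPU devices, so it has to be in the process environment before that happens. If TensorFlow was already imported in the running kernel, a kernel restart is likely needed first.

```python
import os

# Set the variable before TensorFlow is imported, so it is already in the
# process environment when the GPU devices are initialised.
os.environ["TF_GPU_THREAD_MODE"] = "gpu_private"

import tensorflow as tf  # imported only after the variable is set

# Confirm the running kernel actually sees the value.
print("TF_GPU_THREAD_MODE =", os.environ.get("TF_GPU_THREAD_MODE"))
print("GPUs:", tf.config.list_physical_devices("GPU"))
```

This only confirms that the process sees the value, not that TensorFlow acted on it. Re-running the profiler and comparing the 'Kernel Launch' share before and after is probably the more meaningful check.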

Your help here would be greatly appreciated. I have read all the available guides in the TensorFlow docs.

Question from: https://stackoverflow.com/questions/65875612/tensorflow-gpu-profiling

He who fights the dragon too long becomes a dragon himself; gaze too long into the abyss, and the abyss gazes back…

1 Reply

Waiting for answers
