Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


0 votes
303 views
in Technique by (71.8m points)

python - Is it possible to split a network across multiple GPUs in tensorflow?

I plan to run a very large recurrent network (e.g. 2048x5). Is it possible to define each layer on a different GPU in TensorFlow? How should I implement the model to achieve the best efficiency? I understand there is overhead for inter-GPU or GPU-CPU-GPU communication.



1 Reply

0 votes
by (71.8m points)

Splitting a large model across multiple GPUs is certainly possible in TensorFlow, but doing it optimally is a hard research problem. In general, you will need to do the following:

  1. Wrap large contiguous regions of your code in a with tf.device(...): block, naming the different GPUs:

    with tf.device("/gpu:0"):
      # Define first layer.
    
    with tf.device("/gpu:1"):
      # Define second layer.
    
    # Define other layers, etc.
    
  2. When building your optimizer, pass the optional argument colocate_gradients_with_ops=True to the optimizer.minimize() method, so that each layer's gradient ops are placed on the same device as the corresponding forward ops:

    loss = ...
    optimizer = tf.train.AdagradOptimizer(0.01)
    train_op = optimizer.minimize(loss, colocate_gradients_with_ops=True)
    
  3. (Optional.) If any of the operations in your model cannot run on a GPU, you may need to enable "soft placement" in the tf.ConfigProto when you create your tf.Session, so that those ops fall back to the CPU:

    config = tf.ConfigProto(allow_soft_placement=True)
    sess = tf.Session(config=config)
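Putting the three steps together, here is a minimal end-to-end sketch. The layer sizes, the squared-error loss, and the TF 2.x compatibility shim are illustrative assumptions on my part, not part of the original answer:

```python
import numpy as np
import tensorflow as tf

# The answer above targets TF 1.x; this shim lets the sketch also run
# under TF 2.x via the v1 compatibility layer (an assumption here).
if not hasattr(tf, "Session"):
    tf = tf.compat.v1
    tf.disable_eager_execution()

# Hypothetical two-layer network; sizes and loss are illustrative only.
x = tf.placeholder(tf.float32, [None, 2048])

with tf.device("/gpu:0"):
    w1 = tf.Variable(tf.random_normal([2048, 2048], stddev=0.01))
    h1 = tf.tanh(tf.matmul(x, w1))  # first layer lives on GPU 0

with tf.device("/gpu:1"):
    w2 = tf.Variable(tf.random_normal([2048, 2048], stddev=0.01))
    h2 = tf.tanh(tf.matmul(h1, w2))  # second layer lives on GPU 1

loss = tf.reduce_mean(tf.square(h2))
optimizer = tf.train.AdagradOptimizer(0.01)
# Keep each layer's gradient ops on the same device as its forward ops.
train_op = optimizer.minimize(loss, colocate_gradients_with_ops=True)

# Soft placement falls back to CPU on machines with fewer (or no) GPUs.
config = tf.ConfigProto(allow_soft_placement=True)
with tf.Session(config=config) as sess:
    sess.run(tf.global_variables_initializer())
    batch = np.random.randn(4, 2048).astype(np.float32)
    _, loss_val = sess.run([train_op, loss], feed_dict={x: batch})
```

With allow_soft_placement=True the same graph runs unchanged on a single-GPU or CPU-only machine, which makes the device split easy to test before deploying to a multi-GPU box.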
    
