
python - Where is the code for gradient descent?

I'm running some experiments with TensorFlow and want to look at the implementation of some functions, just to see exactly how some things are done. I started with the simple case of tf.train.GradientDescentOptimizer. I downloaded the zip of the full source code from GitHub, ran some searches over the source tree, and got to:

C:\tensorflow-master\tensorflow\python\training\gradient_descent.py

class GradientDescentOptimizer(optimizer.Optimizer):

  def _apply_dense(self, grad, var):
    return training_ops.apply_gradient_descent(

Okay, so presumably the actual code is in apply_gradient_descent. I searched for that... and it's not there. There are only three occurrences in the entire source tree, all of which are uses, not definitions.

What about training_ops? There does exist a source file with a suggestive name:

C:\tensorflow-master\tensorflow\python\training\training_ops.py

from tensorflow.python.training import gen_training_ops
# go/tf-wildcard-import
# pylint: disable=wildcard-import
from tensorflow.python.training.gen_training_ops import *
# pylint: enable=wildcard-import

... the above is the entire content of that file. Hmm.
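As an aside, a quick way to see where that wildcard-imported module actually lives is Python's inspect module (a sketch assuming an installed TensorFlow build, since gen_training_ops.py only exists after the build step):

import inspect
from tensorflow.python.training import gen_training_ops

# gen_training_ops.py is emitted by the build, so this path points into
# the installed package / build output rather than the source checkout.
print(inspect.getsourcefile(gen_training_ops))

# The generated wrapper is a thin stub that records the op in the graph;
# there is no gradient-descent arithmetic anywhere in it.
print(inspect.getsource(gen_training_ops.apply_gradient_descent)[:500])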

I did find this file:

C:\tensorflow-master\tensorflow\python\BUILD

tf_gen_op_wrapper_private_py(
    name = "training_ops_gen",
    out = "training/gen_training_ops.py",
)

which seems to confirm that gen_training_ops.py and other such files are generated code, produced during the build process - but where is the source code they are generated from?
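For what it's worth, here is a sanity check that ApplyGradientDescent really is one of these machine-generated ops (a sketch assuming TF 2.x, where the generated wrappers are also exposed under tf.raw_ops):

import tensorflow as tf

# Every op wrapper generated at build time is exposed under tf.raw_ops,
# so finding ApplyGradientDescent there is consistent with the BUILD rule
# above producing it from an op registration rather than from
# hand-written Python.
print("ApplyGradientDescent" in dir(tf.raw_ops))  # True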

So this is the point at which I give up and ask for help. Can anyone familiar with the TensorFlow code base point me to where the relevant source code is?



1 Reply


The implementation goes further down into the native C++ code. Here's the ApplyGradientDescent GPU implementation (core/kernels/training_ops_gpu.cu.cc):

template <typename T>
struct ApplyGradientDescent<GPUDevice, T> {
  void operator()(const GPUDevice& d, typename TTypes<T>::Flat var,
                  typename TTypes<T>::ConstScalar lr,
                  typename TTypes<T>::ConstFlat grad) {
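    // lr arrives as a scalar tensor; reshape it to a length-1 tensor and
    // broadcast it across grad so the update below compiles to a single
    // element-wise Eigen expression on the GPU device.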
    Eigen::array<typename TTypes<T>::Tensor::Index, 1> bcast;
    bcast[0] = grad.dimension(0);
    Eigen::Sizes<1> single;
    var.device(d) -= lr.reshape(single).broadcast(bcast) * grad;
  }
};

The CPU implementation is here (core/kernels/training_ops.cc):

template <typename T>
struct ApplyGradientDescent<CPUDevice, T> {
  void operator()(const CPUDevice& d, typename TTypes<T>::Flat var,
                  typename TTypes<T>::ConstScalar lr,
                  typename TTypes<T>::ConstFlat grad) {
    var.device(d) -= grad * lr();
  }
};
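
Both kernels implement the same update rule, var <- var - lr * grad. For comparison, here is that semantics in plain NumPy (a sketch of what the kernels compute, not TensorFlow code):

import numpy as np

# The element-wise update both kernels perform: var <- var - lr * grad.
def apply_gradient_descent(var, lr, grad):
    var -= lr * grad  # in place, mirroring the kernels' var.device(d) -= ...
    return var

var = np.array([1.0, 2.0, 3.0])
grad = np.array([0.5, 0.5, 0.5])
print(apply_gradient_descent(var, lr=0.1, grad=grad))  # [0.95 1.95 2.95]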
