[problem solving] which nodes are run in tensorflow?
2022-06-27 13:30:00 【Can__ er】
As we all know, TensorFlow first defines the graph structure and then drives its nodes to perform computation. Once the graph is built, it has to be launched in a Session; you can create a session explicitly or use the default one.
The two typical ways to drive nodes are session.run() and tensor.eval(); calling tensor.eval() is equivalent to calling tf.get_default_session().run(tensor). The difference between the two is explained in detail in this article:
The following takes a closer look at the run(self, fetches, feed_dict=None, options=None, run_metadata=None) method.
fetches and feed_dict are the two commonly used parameters. fetches names the nodes (placeholders, variables, and other tensors or operations) whose computed values you want to retrieve from the graph, while feed_dict feeds data into the corresponding placeholders in the graph; it is a dictionary that is only valid within that single call.
One important thing to note is that TensorFlow does not run the whole graph; it computes only the part related to the values you want to fetch. Many answers get this wrong. Look at the official docstring, which says "running the necessary graph fragment":
'''
This method runs one "step" of TensorFlow computation, by running the
necessary graph fragment to execute every `Operation` and evaluate every
`Tensor` in `fetches`, substituting the values in `feed_dict` for the
corresponding input values.
'''
This "compute only what you use" behavior feels like pull-based (lazy) evaluation. It also explains why, with one complete network defined, you sometimes feed in X and y for training and sometimes only X for prediction: the fetched prediction nodes have no dependency on the training-only inputs, so those need not be passed in.
However, there is one exception: the layer parameters of the network itself. Look at the following code:
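The pull-based behavior can be illustrated with a tiny dependency-tracing evaluator (again NOT real TensorFlow; Node, run, and the node names X, y, pred, loss are invented for this sketch): fetching a node recursively evaluates only its ancestors, so fetching the prediction never touches y.

```python
# Toy dependency-tracing evaluator (NOT real TensorFlow) illustrating
# "running the necessary graph fragment".
class Node:
    def __init__(self, name, deps=(), fn=None):
        self.name, self.deps, self.fn = name, deps, fn

def run(fetch, feed_dict):
    """Evaluate only the ancestors of `fetch`, substituting fed values."""
    if fetch in feed_dict:                  # placeholder supplied by the caller
        return feed_dict[fetch]
    if fetch.fn is None:                    # unfed placeholder: dead end
        raise ValueError("placeholder %s must be fed" % fetch.name)
    # Recurse only into the fragment this fetch actually depends on.
    return fetch.fn(*(run(d, feed_dict) for d in fetch.deps))

X = Node("X")                               # placeholder
y = Node("y")                               # placeholder
pred = Node("pred", (X,), lambda x: 2 * x)                  # needs only X
loss = Node("loss", (pred, y), lambda p, t: (p - t) ** 2)   # needs X and y

print(run(pred, {X: 3}))        # fetching pred never traces into y -> 6
print(run(loss, {X: 3, y: 5}))  # fetching loss needs both inputs -> 1
```

Fetching loss with only X fed raises an error at y, exactly the "gives the wrong answer in many answers" scenario: the whole graph is defined, but only the fetched fragment is pulled.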
X = tf.placeholder(tf.float32, shape=[None, n_inputs])
cross_entropy_loss = NetWork_loss(X)
grads_and_vars = optimizer.compute_gradients(cross_entropy_loss)
# print(grads_and_vars)
gradients = [grad for grad, variable in grads_and_vars]
gradient_placeholders = []
grads_and_vars_feed = []
for grad, variable in grads_and_vars:
    gradient_placeholder = tf.placeholder(tf.float32)
    # gradient_placeholder = tf.placeholder(tf.float32, shape=grad.get_shape())
    gradient_placeholders.append(gradient_placeholder)
    grads_and_vars_feed.append((gradient_placeholder, variable))
# training_op = optimizer.apply_gradients(grads_and_vars_feed)
training_op = grads_and_vars_feed
This network definition computes the loss; that is, cross_entropy_loss is passed to the optimizer, which computes the gradients.
compute_gradients returns a list of tuples, each holding the gradient and the parameter (variable) of one layer.
The latter part of the code is for applying gradient updates, but here we use the "policy gradient" method; that is, in reinforcement learning the gradients are applied only after several steps.
The gradient applied here is no longer the derivative computed directly in the neural network; it is hooked to the rewards of your reinforcement learning. Without going into the fine detail: through a loop, the computed gradient of each layer is taken out, and a new gradient, reweighted according to your own policy, is fed back in.
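The reweighting step can be sketched numerically (a minimal illustration with made-up numbers, not the article's actual network; discounted_rewards, step_grads, and fed_grad are invented names): each stored per-step gradient is scaled by the discounted reward of its step, then averaged, and that average is what gets fed through the gradient placeholders.

```python
# Minimal numeric sketch of reward-weighted gradient averaging, the core
# of the policy-gradient trick described above. All numbers are made up.
def discounted_rewards(rewards, gamma=0.95):
    """Accumulate rewards backwards: each step's return includes the future."""
    out, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        out.append(running)
    return list(reversed(out))

# One gradient value per step for a single (scalar) parameter.
step_grads = [0.5, -0.2, 0.1]
rewards = [1.0, 0.0, -1.0]
disc = discounted_rewards(rewards)

# Scale each step's gradient by its discounted return, then average:
# this is the "new gradient" fed into the gradient placeholders.
fed_grad = sum(g * r for g, r in zip(step_grads, disc)) / len(step_grads)
print(fed_grad)
```

In a real agent the same scaling would be applied element-wise to each layer's gradient array before building the feed_dict over gradient_placeholders.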
Here comes the key point. When I execute the following code:
with tf.Session() as sess:
    init.run()
    feed_dict = {}
    for var_index, gradient_placeholder in enumerate(gradient_placeholders):
        feed_dict[gradient_placeholder] = [9.]
    a = sess.run(training_op, feed_dict=feed_dict)
It did not raise an error! We can clearly see that computing training_op requires grads_and_vars_feed, and each tuple in it uses variable; what is ultimately needed is therefore cross_entropy_loss, and that loss is computed from the fed-in X.
Going back to the source, this training_op has two kinds of inputs to fill, yet we only fed in the gradients. Why does it execute without feeding X?
Through my exploration, I found that changing variable to grad makes it raise an error demanding that X be fed, even though both come out of the same loop.
After step-by-step breakpoint debugging, it was finally solved. It turns out that variable refers to the network parameters, which are member variables inside the defined graph structure; they were already initialized by init.run(), so the graph marks such variables directly as "already computed". grad, by contrast, must pass through X, that is, be computed step by step along the normal path above, so without feeding X it stays marked "not computed", and pulling it traces all the way back to X.
Therefore, TensorFlow does not run the whole graph but only the part related to the values you want to fetch, and this "related part" means the part not yet marked "computed". Network parameters in particular are "inherent": although on the surface they seem to be obtained function by function, in fact they are not, which is very misleading.
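The conclusion can be sketched by extending the toy-evaluator idea (once more NOT real TensorFlow; Var, Placeholder, run, and all node names are invented): an initialized variable already holds a value, so the backward trace stops there, while a gradient node keeps tracing back to the unfed placeholder X.

```python
# Toy sketch (NOT real TensorFlow): initialized variables count as
# "already computed", so a fetch that depends only on variables and fed
# placeholders succeeds, while fetching a gradient traces back to X.
class Var:
    def __init__(self, value):
        self.value = value          # set by "initialization" (init.run())

class Placeholder:
    pass

def run(node, feed_dict):
    if isinstance(node, Var):
        return node.value           # trace stops: already computed
    if isinstance(node, Placeholder):
        if node not in feed_dict:
            raise ValueError("placeholder must be fed")
        return feed_dict[node]
    deps, fn = node                 # plain op node: (dependencies, function)
    return fn(*(run(d, feed_dict) for d in deps))

X = Placeholder()
w = Var(0.5)                                  # network parameter, initialized
grad_ph = Placeholder()                       # gradient placeholder
grad = ((X, w), lambda x, p: 2 * x * p)       # "real" gradient: depends on X
apply_step = ((grad_ph, w), lambda g, p: p - 0.1 * g)  # needs only fed grad + var

print(run(apply_step, {grad_ph: 9.0}))        # works without feeding X
try:
    run(grad, {})                             # pulling grad traces back to X
except ValueError as e:
    print(e)
```

This mirrors the observation above: swapping the variable for the gradient in the fetched tuple changes which placeholders the trace reaches, and hence whether X must be fed.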