[problem solving] which nodes are run in tensorflow?
2022-06-27 13:30:00 【Can__ er】
As we all know, TensorFlow first defines the graph structure and then drives its nodes to perform computation. Once the graph is built, it has to be launched in a Session; you can create a session explicitly or use the default one.
The two typical ways to drive nodes are session.run() and tensor.eval(); calling tensor.eval() is equivalent to calling tf.get_default_session().run(tensor). The difference between the two is explained in detail in this article:
The following takes a closer look at the run(self, fetches, feed_dict=None, options=None, run_metadata=None) method.
fetches and feed_dict are the two commonly used parameters. fetches names the nodes (placeholders, variables, and other tensors or operations) whose computed values you want to retrieve from the graph, while feed_dict feeds data into the corresponding placeholders in the graph; it is a dictionary that is only valid within that single call.
One important thing to note is that TensorFlow does not run the whole graph; it computes only the part related to the values you want to fetch. Many answers get this wrong. Look at the official docstring, which says "running the necessary graph fragment":
'''
This method runs one "step" of TensorFlow computation, by running the
necessary graph fragment to execute every `Operation` and evaluate every
`Tensor` in `fetches`, substituting the values in `feed_dict` for the
corresponding input values.
'''
This "compute only what you use" behavior feels like pull-based (lazy) evaluation. It also explains why, with one complete network defined, you sometimes feed in X and y for training and sometimes only X for prediction: the fetched prediction nodes have no dependency on the training-only inputs, so those need not be passed in.
However, there is one exception: the layer parameters of the network itself. Look at the following code:
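The pull-based behavior can be illustrated with a tiny dependency-tracing evaluator (again NOT real TensorFlow; Node, run, and the node names X, y, pred, loss are invented for this sketch): fetching a node recursively evaluates only its ancestors, so fetching the prediction never touches y.

```python
# Toy dependency-tracing evaluator (NOT real TensorFlow) illustrating
# "running the necessary graph fragment".
class Node:
    def __init__(self, name, deps=(), fn=None):
        self.name, self.deps, self.fn = name, deps, fn

def run(fetch, feed_dict):
    """Evaluate only the ancestors of `fetch`, substituting fed values."""
    if fetch in feed_dict:                  # placeholder supplied by the caller
        return feed_dict[fetch]
    if fetch.fn is None:                    # unfed placeholder: dead end
        raise ValueError("placeholder %s must be fed" % fetch.name)
    # Recurse only into the fragment this fetch actually depends on.
    return fetch.fn(*(run(d, feed_dict) for d in fetch.deps))

X = Node("X")                               # placeholder
y = Node("y")                               # placeholder
pred = Node("pred", (X,), lambda x: 2 * x)                  # needs only X
loss = Node("loss", (pred, y), lambda p, t: (p - t) ** 2)   # needs X and y

print(run(pred, {X: 3}))        # fetching pred never traces into y -> 6
print(run(loss, {X: 3, y: 5}))  # fetching loss needs both inputs -> 1
```

Fetching loss with only X fed raises an error at y, exactly the "gives the wrong answer in many answers" scenario: the whole graph is defined, but only the fetched fragment is pulled.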
X = tf.placeholder(tf.float32, shape=[None, n_inputs])
cross_entropy_loss = NetWork_loss(X)
grads_and_vars = optimizer.compute_gradients(cross_entropy_loss)
# print(grads_and_vars)
gradients = [grad for grad, variable in grads_and_vars]
gradient_placeholders = []
grads_and_vars_feed = []
for grad, variable in grads_and_vars:
    gradient_placeholder = tf.placeholder(tf.float32)
    # gradient_placeholder = tf.placeholder(tf.float32, shape=grad.get_shape())
    gradient_placeholders.append(gradient_placeholder)
    grads_and_vars_feed.append((gradient_placeholder, variable))
# training_op = optimizer.apply_gradients(grads_and_vars_feed)
training_op = grads_and_vars_feed
This network definition computes the loss; that is, cross_entropy_loss is passed to the optimizer, which computes the gradients.
compute_gradients returns a list of tuples, each holding the gradient and the parameter (variable) of one layer.
The latter part of the code is for applying gradient updates, but here we use the "policy gradient" method; that is, in reinforcement learning the gradients are applied only after several steps.
The gradient applied here is no longer the derivative computed directly in the neural network; it is hooked to the rewards of your reinforcement learning. Without going into the fine detail: through a loop, the computed gradient of each layer is taken out, and a new gradient, reweighted according to your own policy, is fed back in.
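The reweighting step can be sketched numerically (a minimal illustration with made-up numbers, not the article's actual network; discounted_rewards, step_grads, and fed_grad are invented names): each stored per-step gradient is scaled by the discounted reward of its step, then averaged, and that average is what gets fed through the gradient placeholders.

```python
# Minimal numeric sketch of reward-weighted gradient averaging, the core
# of the policy-gradient trick described above. All numbers are made up.
def discounted_rewards(rewards, gamma=0.95):
    """Accumulate rewards backwards: each step's return includes the future."""
    out, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        out.append(running)
    return list(reversed(out))

# One gradient value per step for a single (scalar) parameter.
step_grads = [0.5, -0.2, 0.1]
rewards = [1.0, 0.0, -1.0]
disc = discounted_rewards(rewards)

# Scale each step's gradient by its discounted return, then average:
# this is the "new gradient" fed into the gradient placeholders.
fed_grad = sum(g * r for g, r in zip(step_grads, disc)) / len(step_grads)
print(fed_grad)
```

In a real agent the same scaling would be applied element-wise to each layer's gradient array before building the feed_dict over gradient_placeholders.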
Here comes the key point. When I execute the following code:
with tf.Session() as sess:
    init.run()
    feed_dict = {}
    for var_index, gradient_placeholder in enumerate(gradient_placeholders):
        feed_dict[gradient_placeholder] = [9.]
    a = sess.run(training_op, feed_dict=feed_dict)
It did not raise an error! We can clearly see that computing training_op requires grads_and_vars_feed, and each tuple in it uses variable; what is ultimately needed is therefore cross_entropy_loss, and that loss is computed from the fed-in X.
Going back to the source, this training_op has two kinds of inputs to fill, yet we only fed in the gradients. Why does it execute without feeding X?
Through my exploration, I found that changing variable to grad makes it raise an error demanding that X be fed, even though both come out of the same loop.
After step-by-step breakpoint debugging, it was finally solved. It turns out that variable refers to the network parameters, which are member variables inside the defined graph structure; they were already initialized by init.run(), so the graph marks such variables directly as "already computed". grad, by contrast, must pass through X, that is, be computed step by step along the normal path above, so without feeding X it stays marked "not computed", and pulling it traces all the way back to X.
Therefore, TensorFlow does not run the whole graph but only the part related to the values you want to fetch, and this "related part" means the part not yet marked "computed". Network parameters in particular are "inherent": although on the surface they seem to be obtained function by function, in fact they are not, which is very misleading.
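The conclusion can be sketched by extending the toy-evaluator idea (once more NOT real TensorFlow; Var, Placeholder, run, and all node names are invented): an initialized variable already holds a value, so the backward trace stops there, while a gradient node keeps tracing back to the unfed placeholder X.

```python
# Toy sketch (NOT real TensorFlow): initialized variables count as
# "already computed", so a fetch that depends only on variables and fed
# placeholders succeeds, while fetching a gradient traces back to X.
class Var:
    def __init__(self, value):
        self.value = value          # set by "initialization" (init.run())

class Placeholder:
    pass

def run(node, feed_dict):
    if isinstance(node, Var):
        return node.value           # trace stops: already computed
    if isinstance(node, Placeholder):
        if node not in feed_dict:
            raise ValueError("placeholder must be fed")
        return feed_dict[node]
    deps, fn = node                 # plain op node: (dependencies, function)
    return fn(*(run(d, feed_dict) for d in deps))

X = Placeholder()
w = Var(0.5)                                  # network parameter, initialized
grad_ph = Placeholder()                       # gradient placeholder
grad = ((X, w), lambda x, p: 2 * x * p)       # "real" gradient: depends on X
apply_step = ((grad_ph, w), lambda g, p: p - 0.1 * g)  # needs only fed grad + var

print(run(apply_step, {grad_ph: 9.0}))        # works without feeding X
try:
    run(grad, {})                             # pulling grad traces back to X
except ValueError as e:
    print(e)
```

This mirrors the observation above: swapping the variable for the gradient in the fetched tuple changes which placeholders the trace reaches, and hence whether X must be fed.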