When dealing with large-scale data that cannot be fully loaded into memory, we usually use one of two options:
- Use tfrecords
- Use tf.data.Dataset.from_generator()
Parallel use of tfrecords was introduced earlier, so it is not covered again here. If you don't want to generate intermediate tfrecord files, a generator is what you need.
This post mainly records how to parallelize from_generator(). In tf.data, parallelization is mainly achieved through map and its num_parallel_calls argument. In some scenarios, however, our generator() contains processing logic, so it cannot be parallelized directly. The easiest fix is to move the logic out of generator() and implement it with map.
Parallelizing a tf.data.Dataset generator
If generator() contains complex logic, we simplify it so that the generator only yields simple values such as indices. The processing part of generator() is wrapped in py_function and then applied with map.
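import tensorflow as tf

def func(i):
    i = i.numpy()  # Extract the plain value from the EagerTensor
    x, y = your_processing_function(training_set[i])
    return x, y

z = list(range(len(training_set)))  # The index generator
# Indices need an integer type wide enough for the dataset size;
# tf.uint8 would overflow once there are more than 255 samples.
dataset = tf.data.Dataset.from_generator(
    lambda: z, output_signature=tf.TensorSpec(shape=(), dtype=tf.int64))
dataset = dataset.map(
    lambda i: tf.py_function(func=func, inp=[i], Tout=[tf.uint8, tf.float32]),
    num_parallel_calls=tf.data.AUTOTUNE)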
Because tf.py_function cannot infer the shapes of its outputs, the output shape of a tensor is sometimes unknown and has to be set explicitly:
dataset = dataset.batch(8)

def _fixup_shape(x, y):
    x.set_shape([None, None, None, nb_channels])  # n, h, w, c
    y.set_shape([None, nb_classes])  # n, nb_classes
    return x, y

dataset = dataset.map(_fixup_shape)
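The pieces above can be put together into a small end-to-end sketch. Everything concrete in it is an assumption made for illustration: training_set is a tiny in-memory list of random images standing in for a real large dataset, and your_processing_function is a hypothetical one-hot encoder, not the processing used in the original post.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in data: 20 images of 4x4x3 with integer labels.
nb_channels, nb_classes = 3, 3
rng = np.random.default_rng(0)
training_set = [
    (rng.integers(0, 256, size=(4, 4, nb_channels), dtype=np.uint8),
     int(rng.integers(0, nb_classes)))
    for _ in range(20)
]

def your_processing_function(sample):
    """Hypothetical per-sample processing: keep the image, one-hot the label."""
    image, label = sample
    one_hot = np.zeros(nb_classes, dtype=np.float32)
    one_hot[label] = 1.0
    return image, one_hot

def func(i):
    # Runs eagerly inside tf.py_function, so .numpy() is available.
    x, y = your_processing_function(training_set[int(i.numpy())])
    return x, y

def load(i):
    # Graph-side wrapper: unpack the py_function outputs into a tuple.
    x, y = tf.py_function(func=func, inp=[i], Tout=[tf.uint8, tf.float32])
    return x, y

def _fixup_shape(x, y):
    # py_function outputs have unknown shapes; restore the known parts.
    x.set_shape([None, None, None, nb_channels])  # n, h, w, c
    y.set_shape([None, nb_classes])               # n, nb_classes
    return x, y

# The generator only yields indices; the heavy work happens in map().
dataset = tf.data.Dataset.from_generator(
    lambda: range(len(training_set)),
    output_signature=tf.TensorSpec(shape=(), dtype=tf.int64),
)
dataset = dataset.map(load, num_parallel_calls=tf.data.AUTOTUNE)
dataset = dataset.batch(8).map(_fixup_shape)

batches = list(dataset)
```

Note that the body of tf.py_function runs as ordinary Python and is therefore subject to the global interpreter lock, so how much num_parallel_calls helps depends on whether the processing spends its time in code that releases the lock (for example I/O or NumPy routines).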
tf.Tensor and tf.EagerTensor
Why tf.py_function? Let's start with the difference between tf.Tensor and tf.EagerTensor.
An EagerTensor is evaluated immediately: you can get its value at any time through .numpy().
A Tensor is not evaluated immediately. It is a node in a static graph, and its value is only available after data has been fed in and the graph has been run.
The function passed to map only tells the dataset that every sample must go through that function before it is used. The function is therefore applied when the dataset is iterated, and it runs as static graph logic.
The full class names of the two types are:
tensorflow.python.framework.ops.EagerTensor
tensorflow.python.framework.ops.Tensor
What role does tf.py_function play here? The TensorFlow documentation describes it as follows:
Wraps a python function into a TensorFlow op that executes it eagerly.
As noted above, map runs as static graph logic, so the arguments of the mapped function are Tensor objects by default. After wrapping the function with tf.py_function(), the arguments become EagerTensor objects.
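The difference can be observed directly. The following is a minimal sketch, assuming TensorFlow 2.x; the function names graph_side and eager_side and the observed dictionary are made up for illustration. It records whether code is executing eagerly inside the function passed to map and inside the function wrapped by tf.py_function.

```python
import tensorflow as tf

observed = {}

def eager_side(i):
    # Wrapped by tf.py_function: runs as ordinary Python on an EagerTensor.
    observed["py_function_eager"] = tf.executing_eagerly()
    observed["py_function_type"] = type(i).__name__
    return i.numpy() * 2  # .numpy() works because the value is concrete

def graph_side(i):
    # Traced by tf.data into a graph: `i` is symbolic and has no value yet.
    observed["map_eager"] = tf.executing_eagerly()
    return tf.py_function(func=eager_side, inp=[i], Tout=tf.int64)

dataset = tf.data.Dataset.range(3).map(graph_side)
values = [int(v) for v in dataset]
```

After iterating, observed["map_eager"] is False while observed["py_function_eager"] is True, which is why i.numpy() can only be called inside the wrapped function.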
References
【2】https://blog.csdn.net/qq_27825451/article/details/105247211
【3】https://www.tensorflow.org/guide/data_performance#parallelizing_data_extraction