Speeding up training: parallelizing a tf.data.Dataset generator
2022-06-12 04:37:00 【u012804784】
When dealing with large-scale data that cannot be loaded fully into memory, we usually choose between two options:
- Use tfrecords
- Use tf.data.Dataset.from_generator()
Parallel use of tfrecords was covered in an earlier post and is not repeated here. If you do not want to generate intermediate tfrecord files, a generator is what you need.
This post records how to parallelize from_generator(). In tf.data, parallelization is mainly achieved through map and its num_parallel_calls argument. In some cases, however, generator() contains processing logic of its own and cannot be parallelized directly. The simplest fix is to move that logic out of generator() and into map.
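The refactoring itself can be sketched in plain Python, independent of TensorFlow. All names here (`training_set`, `process`, `heavy_generator`, `index_generator`) are hypothetical and only illustrate the split: the first generator does the processing itself, while the second only yields indices, so the processing can live in a separate function that a later `map` call could run in parallel.

```python
# Hypothetical in-memory stand-in for a large dataset.
training_set = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]


def process(sample):
    """Hypothetical per-sample processing: returns (features, label)."""
    features = [v * 2 for v in sample]
    label = sum(sample)
    return features, label


def heavy_generator():
    """Before: the generator processes every sample itself, sequentially."""
    for sample in training_set:
        yield process(sample)


def index_generator():
    """After: the generator only yields indices; processing happens elsewhere."""
    yield from range(len(training_set))


before = list(heavy_generator())
after = [process(training_set[i]) for i in index_generator()]
print(before == after)
```

Both versions produce the same samples. The difference is only where the work happens, which is what makes the second form suitable for a parallel `map`.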
Parallelizing a tf.data.Dataset generator
If generator() contains complex logic, simplify it so that the generator only yields indices. The processing part of generator() is then wrapped in py_function and applied with map.
import tensorflow as tf

def func(i):
    i = i.numpy()  # decode the index from the EagerTensor
    x, y = your_processing_function(training_set[i])
    return x, y

z = list(range(len(training_set)))  # the index generator
# Use an integer type wide enough for every index; tf.uint8 cannot hold indices above 255.
dataset = tf.data.Dataset.from_generator(lambda: z, tf.int64)
dataset = dataset.map(lambda i: tf.py_function(func=func,
                                               inp=[i],
                                               Tout=[tf.uint8,
                                                     tf.float32]
                                               ),
                      num_parallel_calls=tf.data.AUTOTUNE)
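For reference, here is a self-contained variant of the same pattern that can be run as is. It is a sketch under assumptions, not the author's exact code: the toy `training_set`, the `load_sample` function, and its processing step are hypothetical, and `tf.data.Dataset.range` is swapped in for `from_generator` as the index source because it already yields int64 indices. The toy set has 300 samples, more than a `tf.uint8` index could address.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy data standing in for `training_set`: 300 samples, 4 features each.
training_set = np.arange(300 * 4, dtype=np.float32).reshape(300, 4)


def load_sample(i):
    """Runs eagerly inside tf.py_function, so `i` is an EagerTensor."""
    i = int(i.numpy())
    x = training_set[i] / 10.0  # hypothetical processing step
    y = np.int64(i % 2)         # hypothetical label
    return x, y


# The index source: yields 0, 1, ..., len(training_set) - 1 as int64 scalars.
dataset = tf.data.Dataset.range(len(training_set))
dataset = dataset.map(
    # tuple(...) makes the two outputs separate dataset components.
    lambda i: tuple(
        tf.py_function(func=load_sample, inp=[i], Tout=[tf.float32, tf.int64])
    ),
    num_parallel_calls=tf.data.AUTOTUNE,
)

first_x, first_y = next(iter(dataset))
num_samples = sum(1 for _ in dataset)
print(first_x.numpy(), first_y.numpy(), num_samples)
```

Note that `tf.py_function` executes Python code, so how much speedup the parallel calls give depends on how much of the processing releases the Python interpreter lock (for example NumPy or file I/O).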
Because shapes are inferred implicitly, the output shape of a tensor is sometimes unknown and has to be set explicitly:
dataset = dataset.batch(8)

def _fixup_shape(x, y):
    x.set_shape([None, None, None, nb_channels])  # n, h, w, c
    y.set_shape([None, nb_classes])  # n, nb_classes
    return x, y

dataset = dataset.map(_fixup_shape)
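The effect of the shape fix can be checked by inspecting `dataset.element_spec` before and after. The following is a minimal sketch with hypothetical values: `make_sample`, the 8x8 image size, and the values of `nb_channels` and `nb_classes` are assumptions chosen only for illustration.

```python
import numpy as np
import tensorflow as tf

nb_channels, nb_classes = 3, 5  # hypothetical sizes


def make_sample(i):
    """Hypothetical processing function returning an image and a label vector."""
    x = np.zeros((8, 8, nb_channels), dtype=np.float32)
    y = np.zeros((nb_classes,), dtype=np.float32)
    return x, y


dataset = tf.data.Dataset.range(16).map(
    lambda i: tuple(
        tf.py_function(make_sample, inp=[i], Tout=[tf.float32, tf.float32])
    )
).batch(8)

# tf.py_function only declares dtypes, so the static shapes are unknown here.
shapes_before = [spec.shape for spec in dataset.element_spec]


def fixup_shape(x, y):
    x.set_shape([None, None, None, nb_channels])  # n, h, w, c
    y.set_shape([None, nb_classes])               # n, nb_classes
    return x, y


dataset = dataset.map(fixup_shape)
shapes_after = [spec.shape for spec in dataset.element_spec]
print(shapes_before, shapes_after)
```

Layers that need a known channel or class dimension at graph-construction time can only be built once the static shape has been restored this way.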
tf.Tensor and tf.EagerTensor
To see why tf.py_function is needed, start with the difference between tf.Tensor and tf.EagerTensor.
An EagerTensor is evaluated immediately: its value is available at any time through numpy().
A Tensor is not evaluated immediately: it is a node in a static graph, and its value is only available after data has been fed in and the operation has run.
The function passed to map only tells the dataset that every sample must go through that function before it is used. Its operations are therefore executed each time the dataset is iterated, and they run as static-graph logic.
tensorflow.python.framework.ops.EagerTensor
tensorflow.python.framework.ops.Tensor
What role does tf.py_function play here?
Wraps a python function into a TensorFlow op that executes it eagerly.
As noted earlier, map runs as static-graph logic, so the arguments of the mapped function are Tensor objects by default. After wrapping with tf.py_function(), the arguments become EagerTensor objects.
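The difference can be observed directly. In this minimal sketch the `observed` dictionary and both function names are hypothetical. It records whether TensorFlow is executing eagerly, and the type of the argument, inside the mapped function and inside the function wrapped by `tf.py_function`.

```python
import tensorflow as tf

observed = {}


def inside_py_function(x):
    # Executed eagerly for every element: `x` is an EagerTensor with a concrete value.
    observed["py_function_eager"] = tf.executing_eagerly()
    observed["py_function_type"] = type(x).__name__
    return x.numpy() * 2


def inside_map(x):
    # Traced into a graph: `x` is a symbolic tensor, not a concrete value.
    observed["map_eager"] = tf.executing_eagerly()
    observed["map_type"] = type(x).__name__
    return tf.py_function(inside_py_function, inp=[x], Tout=tf.int64)


dataset = tf.data.Dataset.range(3).map(inside_map)
values = [int(v.numpy()) for v in dataset]
print(values, observed)
```

The exact class name recorded for the symbolic tensor inside `map` varies between TensorFlow versions, which is why the sketch also records `tf.executing_eagerly()` as a version-independent signal.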