Speeding Up Training by Parallelizing a tf.data.Dataset Generator
2022-06-12 04:40:00 【u012804784】
When a dataset is too large to load into memory all at once, there are usually two options:
- Use tfrecords
- Use tf.data.Dataset.from_generator()
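Parallel use of tfrecords was covered in an earlier post and is not repeated here. If you do not want to produce intermediate tfrecord files, a generator is what you need.
This post records how to parallelize from_generator(). In tf.data, parallelism is achieved mainly through map and its num_parallel_calls argument. In some scenarios, however, generator() contains processing logic that cannot be parallelized directly. The simplest approach is to pull that logic out of generator() and implement it with map.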
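As a minimal sketch of this idea, the generator below only yields raw values, and the transformation is moved into map so that tf.data can run it on several elements concurrently. The names `raw_values` and `square` are hypothetical and used only for illustration.

```python
import tensorflow as tf


def raw_values():
    # Hypothetical generator: yields raw items only, with no heavy processing.
    for i in range(6):
        yield i


def square(x):
    # Hypothetical processing step, moved out of the generator into map().
    return x * x


dataset = tf.data.Dataset.from_generator(
    raw_values,
    output_signature=tf.TensorSpec(shape=(), dtype=tf.int64),
)
# map() may process several elements at once; by default the output order
# stays the same as the input order.
dataset = dataset.map(square, num_parallel_calls=tf.data.AUTOTUNE)

result = [int(v) for v in dataset.as_numpy_iterator()]
print(result)
```

This works directly because `square` uses only TensorFlow-compatible operations. Arbitrary Python logic needs the py_function wrapper described next.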
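Parallelizing a tf.data.Dataset generator
To handle complex logic in generator(), we simplify the generator so that it only performs lightweight operations such as index lookup. The processing part of generator() is wrapped in py_function and then applied with map.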
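```python
import tensorflow as tf

# training_set and your_processing_function stand for your own data and processing logic.
def func(i):
    i = i.numpy()  # Decode the index from the EagerTensor object
    x, y = your_processing_function(training_set[i])
    return x, y

z = list(range(len(training_set)))  # The index generator
# Yield indices as int64 so that datasets with more than 255 samples are indexed correctly
dataset = tf.data.Dataset.from_generator(
    lambda: z, output_signature=tf.TensorSpec(shape=(), dtype=tf.int64))
dataset = dataset.map(
    lambda i: tf.py_function(func=func, inp=[i], Tout=[tf.uint8, tf.float32]),
    num_parallel_calls=tf.data.AUTOTUNE)
```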
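The snippet above leaves `training_set` and `your_processing_function` undefined. The following self-contained sketch fills them with hypothetical stand-ins (small synthetic images and a one-hot label) so that the pipeline can be run end to end. The data, shapes, and label rule are assumptions for illustration, not the author's actual setup.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for the real data: 300 tiny uint8 "images".
training_set = [np.full((4, 4, 3), i % 256, dtype=np.uint8) for i in range(300)]


def your_processing_function(sample):
    # Hypothetical processing: return the image and a one-hot label (2 classes).
    label = np.eye(2, dtype=np.float32)[int(sample[0, 0, 0]) % 2]
    return sample, label


def func(i):
    i = i.numpy()  # the index arrives as an EagerTensor
    x, y = your_processing_function(training_set[i])
    return x, y


# The generator only yields indices; int64 covers all 300 of them.
dataset = tf.data.Dataset.from_generator(
    lambda: range(len(training_set)),
    output_signature=tf.TensorSpec(shape=(), dtype=tf.int64),
)
dataset = dataset.map(
    lambda i: tf.py_function(func=func, inp=[i], Tout=[tf.uint8, tf.float32]),
    num_parallel_calls=tf.data.AUTOTUNE,
)

samples = list(dataset.as_numpy_iterator())
print(len(samples), samples[0][0].shape, samples[0][1].shape)
```

Each element comes out as an `(x, y)` pair because tf.data treats the list returned by tf.py_function as a tuple of components.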
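Because the outputs of tf.py_function carry no static shape information, the shape of the resulting tensors is sometimes unknown and has to be set explicitly: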
```python
dataset = dataset.batch(8)

# nb_channels and nb_classes are the channel and class counts of your own data.
def _fixup_shape(x, y):
    x.set_shape([None, None, None, nb_channels])  # n, h, w, c
    y.set_shape([None, nb_classes])  # n, nb_classes
    return x, y

dataset = dataset.map(_fixup_shape)
```
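One way to see the effect of the shape fix is to compare `element_spec` before and after it. The sketch below assumes `nb_channels = 3` and `nb_classes = 2`, and uses a hypothetical `load` function in place of real processing.

```python
import numpy as np
import tensorflow as tf

nb_channels, nb_classes = 3, 2  # assumed values for illustration


def load(i):
    # Hypothetical loader: a fixed-size image and a one-hot label.
    x = np.zeros((4, 4, nb_channels), dtype=np.uint8)
    y = np.eye(nb_classes, dtype=np.float32)[int(i.numpy()) % nb_classes]
    return x, y


dataset = tf.data.Dataset.range(16).map(
    lambda i: tf.py_function(func=load, inp=[i], Tout=[tf.uint8, tf.float32]),
    num_parallel_calls=tf.data.AUTOTUNE,
).batch(8)

spec_before = dataset.element_spec  # shapes are unknown at this point


def _fixup_shape(x, y):
    x.set_shape([None, None, None, nb_channels])  # n, h, w, c
    y.set_shape([None, nb_classes])  # n, nb_classes
    return x, y


dataset = dataset.map(_fixup_shape)
spec_after = dataset.element_spec  # rank and last dimensions are now known

print(spec_before[0].shape, "->", spec_after[0].shape)
print(spec_before[1].shape, "->", spec_after[1].shape)
```

Layers that need a known rank or channel count at graph-construction time can then consume the dataset.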
tf.Tensor and tf.EagerTensor
To understand why tf.py_function is needed, first look at the difference between tf.Tensor and tf.EagerTensor.
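An EagerTensor is evaluated immediately: its value is available at any time through numpy().
A Tensor is not: it is a component of a static graph, and its value is available only after data has been fed in and the computation has run.
The function passed to map is not executed right away. It only tells the dataset that each sample must go through the function before it is used, so the function's operations run only when the dataset is iterated. This is static-graph logic.
The two corresponding classes are:
tensorflow.python.framework.ops.EagerTensor
tensorflow.python.framework.ops.Tensor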
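The difference can be observed with `tf.executing_eagerly()`. In this minimal sketch (the `inspect` function and the `calls` list are hypothetical names), the Python body of the mapped function runs while map builds its graph, outside eager mode, whereas the values themselves are computed later during iteration.

```python
import tensorflow as tf

# Outside of any graph, tensors are eager and their values can be read directly.
eager_value = tf.constant(3)
print(int(eager_value.numpy()))

calls = []


def inspect(x):
    # This Python body runs when map() traces the function into a graph,
    # not once per element, so eager execution is off here.
    calls.append(tf.executing_eagerly())
    return x + 1


dataset = tf.data.Dataset.range(3).map(inspect)
values = [int(v) for v in dataset.as_numpy_iterator()]
print(calls, values)
```

Calling `x.numpy()` inside `inspect` would fail, because `x` is a graph tensor with no concrete value at tracing time.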
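What does tf.py_function do here?
Wraps a python function into a TensorFlow op that executes it eagerly.
As mentioned above, map follows static-graph logic, so the arguments of the mapped function are Tensors by default. Once the function is wrapped with tf.py_function(), its arguments become EagerTensors.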
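A small sketch of this behaviour, using a hypothetical `eager_fn`: inside tf.py_function the code runs eagerly once per element, so the argument exposes `.numpy()` and ordinary Python logic can be applied to it.

```python
import tensorflow as tf

log = []


def eager_fn(x):
    # Inside tf.py_function the argument is an EagerTensor, so .numpy() works.
    log.append(tf.executing_eagerly())
    return x.numpy() * 10


dataset = tf.data.Dataset.range(3).map(
    lambda x: tf.py_function(func=eager_fn, inp=[x], Tout=tf.int64)
)
values = [int(v) for v in dataset.as_numpy_iterator()]
print(log, values)
```

The trade-off is that the wrapped function is ordinary Python and is still subject to the global interpreter lock, so the speed-up from num_parallel_calls depends on how much of the work releases it (for example I/O or NumPy operations).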