Image data preprocessing
2022-07-08 00:55:00 【booze-J】
1. Download the dataset
First, download the cat and dog dataset from the Internet:
Cat and dog classification dataset download address: https://pan.baidu.com/s/1i4SKqWH
Password: d8mt
2. Dataset partitioning
In the freshly downloaded data, the train and test folders each contain cat and dog pictures mixed together. They need to be reorganized so that the cats and the dogs in train and test are placed in separate subfolders. The file structure is as follows:
|_image
    |_train
        |_dog
        |_cat
    |_test
        |_dog
        |_cat
To keep the training time down, only 2000 pictures are used for training and 1000 pictures for validation here. You can choose the sizes of the training and test sets yourself.
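The re-division described above can be scripted. The following is only a minimal sketch: it assumes the downloaded pictures sit together in one folder and are named like cat.1.jpg and dog.1.jpg (the name cat.1.jpg appears later in this article; the dog naming and the helper split_by_class are assumptions made for illustration).

```python
import shutil
from pathlib import Path


def split_by_class(src_dir, dst_dir, per_class, classes=("cat", "dog")):
    """Copy the first `per_class` pictures of each class into dst_dir/<class>/.

    Hypothetical helper: assumes files are named "<class>.<number>.jpg".
    Returns a dict mapping each class name to the number of files copied.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    copied = {}
    for name in classes:
        target = dst / name
        target.mkdir(parents=True, exist_ok=True)
        # Sort by the numeric part so that cat.2.jpg comes before cat.10.jpg.
        files = sorted(src.glob(f"{name}.*.jpg"),
                       key=lambda p: int(p.name.split(".")[1]))
        selected = files[:per_class]
        for path in selected:
            shutil.copy(path, target / path.name)
        copied[name] = len(selected)
    return copied
```

For example, split_by_class("downloads/train", "image/train", per_class=1000) would fill image/train with 1000 cats and 1000 dogs. The downloads/train path is a placeholder, and the even split between the two classes is also an assumption.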
3. Data preprocessing code
The code was run in jupyter-notebook. The code blocks in this article follow the order of the notebook cells, so to run them, simply paste them into jupyter-notebook.
from keras.preprocessing.image import ImageDataGenerator,array_to_img,img_to_array,load_img
- rotation_range is a value in degrees (0~180) that sets the range within which pictures are randomly rotated
- width_shift_range and height_shift_range set how far pictures are randomly shifted horizontally and vertically, each given as a fraction (0~1) of the total width or height
- rescale is a factor by which every pixel value is multiplied (Keras applies it after the other transformations). The RGB channels of our pictures are integers in 0~255, and values that large can be hard for a model to process, so we set rescale to 1/255 to bring them into the range 0~1
- shear_range is the intensity of the random shear transformation (see the shear transformation for reference)
- zoom_range is the range for random zooming
- horizontal_flip randomly flips pictures horizontally; it is suitable when a horizontal flip does not change the meaning of the picture
- fill_mode specifies how new pixels are filled in when an operation such as rotation or a horizontal or vertical shift requires it
datagen = ImageDataGenerator(
    rotation_range=40,        # Random rotation angle
    width_shift_range=0.2,    # Random horizontal shift
    height_shift_range=0.2,   # Random vertical shift
    rescale=1./255,           # Scale pixel values to 0~1
    shear_range=0.2,          # Random shear
    zoom_range=0.2,           # Random zoom
    horizontal_flip=True,     # Random horizontal flip
    fill_mode="nearest"       # How new pixels are filled
)
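To make these ranges concrete, here is a rough sketch of how such settings could translate into one random draw per picture. It is an illustration in plain Python, not Keras's actual implementation: the function sample_augmentation and its return format are made up for this example, and it assumes every value is drawn uniformly from a symmetric range.

```python
import random


def sample_augmentation(height, width, rotation_range=40,
                        width_shift_range=0.2, height_shift_range=0.2,
                        shear_range=0.2, zoom_range=0.2,
                        horizontal_flip=True, rng=random):
    """Draw one set of illustrative augmentation parameters."""
    return {
        # Rotation angle in degrees, anywhere in [-40, 40].
        "rotation_deg": rng.uniform(-rotation_range, rotation_range),
        # Shifts are fractions of the picture size, converted to pixels here.
        "shift_x_px": rng.uniform(-width_shift_range, width_shift_range) * width,
        "shift_y_px": rng.uniform(-height_shift_range, height_shift_range) * height,
        # Shear intensity, drawn from [-0.2, 0.2].
        "shear": rng.uniform(-shear_range, shear_range),
        # A zoom_range of 0.2 means a zoom factor in [0.8, 1.2].
        "zoom": rng.uniform(1 - zoom_range, 1 + zoom_range),
        # Flip roughly half of the pictures.
        "flip": horizontal_flip and rng.random() < 0.5,
    }


params = sample_augmentation(height=280, width=300)
print(params)
```

With a 280 x 300 picture, a shift range of 0.2 corresponds to at most about 60 pixels horizontally and 56 pixels vertically.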
Here we use a single picture to demonstrate the effect of this processing:
import os

# Make sure the output folder exists before pictures are saved into it
os.makedirs("temp", exist_ok=True)

# Load the picture
img = load_img("./image/train/cat/cat.1.jpg")
# Convert the picture to array format
x = img_to_array(img)
# (280, 300, 3) = (H, W, channels)
print(x.shape)
# Add a batch dimension, because four-dimensional input is required during training
x = x.reshape((1,) + x.shape)
# (1, 280, 300, 3) = (batch_size, H, W, channels)
print(x.shape)

i = 0
# Generate 21 pictures
# flow produces randomly transformed pictures; save_prefix is the prefix of the new file names
for batch in datagen.flow(x, batch_size=1, save_to_dir='temp', save_prefix="cat", save_format="jpeg"):
    # The loop body runs 21 times; flow never stops on its own, so break manually
    i += 1
    if i > 20:
        break
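The break is needed because the iterator returned by flow keeps producing batches indefinitely. The counting can be checked with a stand-in generator; endless_batches below is a hypothetical placeholder for datagen.flow, not a Keras function.

```python
import itertools


def endless_batches():
    """Stand-in for an iterator that never stops on its own."""
    for n in itertools.count():
        yield f"batch-{n}"


produced = 0
i = 0
for batch in endless_batches():
    produced += 1      # one picture would be generated and saved here
    i += 1
    if i > 20:         # first true on the 21st pass
        break

print(produced)        # prints 21

# An equivalent way to take exactly 21 batches without a manual counter:
taken = list(itertools.islice(endless_batches(), 21))
print(len(taken))      # prints 21
```

With the counter starting at 0 and the check i > 20, the loop produces 21 batches, which matches the "Generate 21 pictures" comment.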
The test picture:
The result of running the code:
As you can see, this data augmentation works quite well!