YOLOv5 input end (II) | CSDN creative check-in
2022-07-03 05:07:00 【TT ya】
I'm a beginner. I write these posts like a notebook to record what I've learned, and I hope they also help others who are just starting out. Corrections from more experienced readers are welcome. If anything here infringes on your rights, let me know and I will remove it.
I. Adaptive anchor box calculation
YOLOv5 starts from initial anchor boxes, each with a preset width and height, and these can differ from dataset to dataset. During training, the network outputs prediction boxes based on the initial anchor boxes and compares them with the ground-truth boxes. The difference between the two is computed, then backpropagated to update the network parameters, and training continues iteration after iteration.
Definition of an anchor box: an anchor box is described only by the width and height of its border. At first glance it seems unanchored, since a box of a given size could sit at countless positions on the image. It still needs a center point, and that center is supplied by the points of the feature map extracted by the subsequent network. That is why an initial anchor box does not need to specify a center location.
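To make the idea concrete, here is a minimal sketch of how a width/height pair becomes a box once a feature-map cell supplies the center. The helper name `anchor_to_box` and the choice to center the box on the middle of the cell are assumptions for illustration only; in the real model the network's predictions then adjust this starting box.

```python
def anchor_to_box(anchor_wh, cell, stride):
    """Place an anchor of size (w, h) in pixels on a feature-map cell.

    Hypothetical helper: the center is taken as the middle of the cell,
    mapped back to image pixels with the stride of that feature map.
    """
    w, h = anchor_wh
    col, row = cell
    cx = (col + 0.5) * stride
    cy = (row + 0.5) * stride
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)  # x1, y1, x2, y2


# The same 10x13 anchor yields a different box at every cell of a stride-8 map.
box_a = anchor_to_box((10, 13), cell=(0, 0), stride=8)
box_b = anchor_to_box((10, 13), cell=(3, 2), stride=8)
print(box_a)
print(box_b)
```

The anchor itself carries no position; only the cell index and the stride decide where the box lands.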
In the yolov5s.yaml file, the anchors are initialized as:
anchors:
- [10,13, 16,30, 33,23] # P3/8
- [30,61, 62,45, 59,119] # P4/16
- [116,90, 156,198, 373,326] # P5/32
In YOLOv5, adaptive anchor calculation is optional. If you find it does not work well, you can turn it off by adding the "--noautoanchor" option.
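The anchor rows listed above are flat lists, and each row can be read as three (width, height) pairs for one detection level. A minimal sketch of that grouping follows, with the values copied from the snippet; the names `anchors`, `to_pairs`, and `anchor_pairs` are illustrative only.

```python
# Values copied from the yolov5s.yaml snippet; comments give the level and stride.
anchors = [
    [10, 13, 16, 30, 33, 23],       # P3/8
    [30, 61, 62, 45, 59, 119],      # P4/16
    [116, 90, 156, 198, 373, 326],  # P5/32
]


def to_pairs(flat):
    """Group a flat [w0, h0, w1, h1, ...] list into (w, h) tuples."""
    return list(zip(flat[0::2], flat[1::2]))


anchor_pairs = [to_pairs(row) for row in anchors]
for stride, pairs in zip((8, 16, 32), anchor_pairs):
    print(f"stride {stride}: {pairs}")
```

Small anchors sit on the stride-8 level and large ones on the stride-32 level, which matches the P3/P4/P5 comments in the file.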
The option is defined in train.py:
parser.add_argument('--noautoanchor', action='store_true', help='disable AutoAnchor')
II. Adaptive image scaling
1. Principle analysis
The images we collect usually come in different sizes, but the network needs a consistent image size for training.
If we simply use resize to force every image to that size, the picture gets distorted, which in turn affects the results.
A better approach is letterbox, the adaptive image scaling technique.
Note: the mosaic path used in training does not apply letterbox; letterbox is used in the detect process.
Training images end up the same size because mosaic stitches parts of 4 images into one large image of fixed size, so there is no need for letterbox there.
See the previous post for details: YOLOv5 Input End (I): Mosaic Data Augmentation (TT ya's blog on CSDN)
letterbox keeps the aspect ratio of the image and fills the missing area with gray borders to reach the fixed size.
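A small worked example may help here. The sketch below only does the arithmetic (no image library involved) and compares a plain resize with an aspect-preserving scale plus padding. The function name `letterbox_dims` and the 800x600 input are made up for illustration.

```python
def letterbox_dims(h, w, target=640):
    """Return (new_h, new_w, pad_h, pad_w) for an aspect-preserving fit.

    pad_h and pad_w are the total padding needed to reach target x target.
    """
    r = min(target / h, target / w)  # one ratio shared by both sides
    new_h, new_w = round(h * r), round(w * r)
    return new_h, new_w, target - new_h, target - new_w


h, w = 600, 800
# A plain resize to 640x640 uses two different ratios, so shapes get stretched.
stretch_h, stretch_w = 640 / h, 640 / w
# Letterbox uses the smaller ratio for both sides and pads the rest with gray.
new_h, new_w, pad_h, pad_w = letterbox_dims(h, w)
print(stretch_h, stretch_w)
print(new_h, new_w, pad_h, pad_w)
```

With a single ratio the content keeps its proportions, and the leftover height is covered by padding instead of stretching.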
Next, let's walk through the code to see how it works.
2. Code analysis
This part is the letterbox function in utils/augmentations.py.
First, get the height and width of the current image, then make sure new_shape is a (height, width) pair: a single integer is expanded to a square. After that, compute the scale ratio.
import cv2
import numpy as np

def letterbox(im, new_shape=(640, 640), color=(114, 114, 114), auto=True, scaleFill=False, scaleup=True, stride=32):
    # Resize and pad image while meeting stride-multiple constraints
    shape = im.shape[:2]  # current shape [height, width]
    if isinstance(new_shape, int):  # a single int means a square target
        new_shape = (new_shape, new_shape)
    # Scale ratio (new / old): take the smaller of the two ratios so the whole image fits
    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
    if not scaleup:  # only scale down, do not scale up (for better val mAP)
        r = min(r, 1.0)
Then compute the padding:
First, compute the width and height of the scaled image.
Then compute the padding still needed to reach the target size and, in auto mode, take its remainder modulo stride.
Here stride is 2 raised to the number of downsampling steps in the model (this relates to the receptive field). YOLOv5 downsamples 5 times, so stride is 2^5 = 32.
Finally, the padding is halved, because it is split between top and bottom, and between left and right.
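The effect of the remainder step can be checked with plain arithmetic. The sketch below assumes a 640x640 target and an image already scaled to 640x427 (width x height); both numbers are illustrative, and Python's `%` operator stands in for `np.mod`. Because the target is itself a multiple of stride, keeping only the remainder of the padding still leaves each side a multiple of stride.

```python
stride = 2 ** 5                # 5 downsampling steps give a stride of 32
target_w, target_h = 640, 640  # assumed target size
new_w, new_h = 640, 427        # assumed size after aspect-preserving scaling

# Full padding needed to reach the target size.
dw, dh = target_w - new_w, target_h - new_h
# auto mode: keep only the remainder modulo stride (minimum rectangle).
dw_min, dh_min = dw % stride, dh % stride

padded_w, padded_h = new_w + dw_min, new_h + dh_min
print(dw, dh, dw_min, dh_min)
print(padded_w, padded_h, padded_w % stride, padded_h % stride)

# The padding is then split evenly between the two opposite sides.
half_dw, half_dh = dw_min / 2, dh_min / 2
print(half_dw, half_dh)
```

The padded image is smaller than the full 640x640 square but still divisible by the stride, which is what the minimum-rectangle mode is after.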
    # Compute padding
    ratio = r, r  # width, height ratios
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))  # (width, height) of the scaled image
    dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding still needed
    if auto:  # minimum rectangle
        # stride is 2 to the power of the number of downsampling steps (5 in YOLOv5, so 32)
        dw, dh = np.mod(dw, stride), np.mod(dh, stride)  # wh padding
    elif scaleFill:  # stretch
        dw, dh = 0.0, 0.0
        new_unpad = (new_shape[1], new_shape[0])
        ratio = new_shape[1] / shape[1], new_shape[0] / shape[0]  # width, height ratios
    # Halve the padding: it is shared by top/bottom and left/right
    dw /= 2  # divide padding into 2 sides
    dh /= 2
Transform and return:
Call resize to scale the image, then determine the padding for the top, bottom, left, and right (integers greater than or equal to 0).
Then call copyMakeBorder to add the padding.
Finally, return the processed image, the scale ratios, and the padding (dw, dh).
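Two details in these steps are easy to miss: how a fractional half-padding becomes integer borders, and how the returned ratio and (dw, dh) can be used afterwards. The sketch below mirrors the rounding used in the function; `split_padding` and `map_back` are hypothetical helpers (not part of the code shown), and `map_back` is only an approximate way to map a point from the letterboxed image back to the original image.

```python
def split_padding(half):
    """Turn a per-side padding such as 10.5 into two integer borders.

    Same rounding as in letterbox: the -0.1 / +0.1 offsets push a .5 value
    down on one side and up on the other, so the two borders add up to the
    full padding.
    """
    return int(round(half - 0.1)), int(round(half + 0.1))


def map_back(x, y, ratio, pad):
    """Hypothetical helper: undo the padding first, then undo the scaling.

    ratio is (width_ratio, height_ratio) and pad is the per-side (dw, dh),
    in the order returned by letterbox.
    """
    return (x - pad[0]) / ratio[0], (y - pad[1]) / ratio[1]


top, bottom = split_padding(10.5)
left, right = split_padding(0.0)
# A point at (320, 250) in the padded image, scale 0.8, 10.5 px padding above.
orig_x, orig_y = map_back(320, 250, ratio=(0.8, 0.8), pad=(0.0, 10.5))
print(top, bottom, left, right)
print(orig_x, orig_y)
```

Returning the ratio and padding alongside the image is what makes this reverse mapping possible once detection has run on the padded image.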
    if shape[::-1] != new_unpad:  # resize
        im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
    # Round the per-side padding to non-negative integers; the -0.1 / +0.1 offsets
    # make a .5 value round down on one side and up on the other
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
    return im, ratio, (dw, dh)
Criticism and corrections are welcome in the comment area. Thank you ~