YOLOv5 input stage (II) | CSDN creative check-in
2022-07-03 05:07:00 【TT ya】
I'm a beginner writing these posts as study notes to record what I've learned. I hope they also help others who are just getting started, and I welcome corrections from more experienced readers. If anything here infringes on your rights, contact me and I will remove it.
Contents
I. Adaptive anchor box calculation
I. Adaptive anchor box calculation
YOLOv5 starts from a set of anchor boxes with preset widths and heights, and these can differ from dataset to dataset. During training, the network outputs prediction boxes based on the initial anchor boxes and compares them with the ground-truth boxes. The difference between the two is computed and backpropagated to update the network parameters, and this repeats over the training iterations.
Definition of an anchor box: it is described only by the width and height of its border. At first glance that seems under-specified, since a box of a given size could be placed at countless positions in the image, so a center point is also needed. That center point is supplied by the points of the feature map extracted by the later network layers, which is why an initial anchor box does not need to specify a center position.
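To make the role of the center point concrete, here is a minimal sketch of placing one anchor at one feature-map point. The function name anchor_to_box, its parameters, and the example values (a 10x13 anchor on a stride-8 feature map) are my own illustration, and it assumes a cell maps to the center of its stride x stride patch. The actual YOLOv5 decoding also applies predicted offsets and scales, which are left out here.

```python
def anchor_to_box(anchor_wh, cell_rc, stride):
    """Place an anchor (w, h in pixels) at the centre of a feature-map cell.

    Returns (x1, y1, x2, y2) in input-image pixels.
    """
    w, h = anchor_wh
    row, col = cell_rc
    # Assumption: a cell corresponds to the centre of its stride x stride patch
    cx = (col + 0.5) * stride
    cy = (row + 0.5) * stride
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


# The same 10x13 anchor placed at two different feature-map points
box_a = anchor_to_box((10, 13), (0, 0), 8)
box_b = anchor_to_box((10, 13), (3, 5), 8)
print(box_a, box_b)
```

The anchor itself carries only a size; the position comes entirely from which feature-map point it is attached to.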
In the yolov5s.yaml file, the anchors are initialized as:
anchors:
- [10,13, 16,30, 33,23] # P3/8
- [30,61, 62,45, 59,119] # P4/16
- [116,90, 156,198, 373,326] # P5/32
In YOLOv5 the adaptive anchor box calculation is optional. If you find it does not work well for your data, you can turn it off by adding the
"--noautoanchor" option.
The option is defined in the train.py file:
parser.add_argument('--noautoanchor', action='store_true', help='disable AutoAnchor')
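As a quick check of how this kind of flag behaves, here is a small standalone sketch using only the standard argparse module. It reuses the add_argument line above, while the parser object and the two parse_args calls are just for illustration and are not taken from train.py.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--noautoanchor', action='store_true', help='disable AutoAnchor')

# Without the flag, a store_true option defaults to False
default_opt = parser.parse_args([])
# With the flag, it becomes True
disabled_opt = parser.parse_args(['--noautoanchor'])

print(default_opt.noautoanchor, disabled_opt.noautoanchor)
```

So the adaptive anchor calculation stays enabled unless the flag is passed explicitly.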
II. Adaptive image scaling
1. Principle analysis
The images we collect usually come in different sizes, but for network training the input image size must be consistent.
If we simply use resize to force every image to the target size, the picture is distorted whenever its aspect ratio changes, and that in turn affects the results.
So a better approach is used: letterbox adaptive image scaling.
Note: the train process does not use letterbox adaptive image scaling; it is only used in the detect process.
During train, the images already end up the same size, because Mosaic stitches parts of 4 images into one large image of fixed size, so there is no need for letterbox.
For details, see the previous post: YOLOv5 input stage (I): Mosaic data augmentation | CSDN creative check-in (TT ya's blog, CSDN)
letterbox adaptive image scaling keeps the aspect ratio of the original image and fills the remaining area with gray borders to reach the required size.
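To see the difference in numbers, here is a minimal sketch comparing a plain resize with aspect-preserving scaling for a 1280x720 image and a 640x640 target. The function names plain_resize_scales and letterbox_fixed are made up for this illustration, and the second one always pads to the full target size, which is a simplification of the real function.

```python
def plain_resize_scales(h, w, target=640):
    """Per-axis scale factors (x, y) when forcing the image to target x target."""
    return target / w, target / h


def letterbox_fixed(h, w, target=640):
    """One shared scale factor, plus the total gray padding needed per axis."""
    r = min(target / h, target / w)
    new_w, new_h = int(round(w * r)), int(round(h * r))
    return r, (new_w, new_h), (target - new_w, target - new_h)


sx, sy = plain_resize_scales(720, 1280)
r, new_size, pad = letterbox_fixed(720, 1280)
print(sx, sy)            # different factors per axis, so the content is stretched
print(r, new_size, pad)  # one factor, with the remaining height filled by padding
```

With a plain resize the width and height are scaled by different factors, while the letterbox approach scales both by 0.5 and makes up the missing 280 rows with gray borders.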
Next, let's walk through the code to see how it works.
2. Code analysis
This part is the letterbox function in utils/augmentations.py.
First, get the height and width of the current image, then make sure new_shape is a (height, width) pair: if a single integer was passed, it is expanded to (new_shape, new_shape).
def letterbox(im, new_shape=(640, 640), color=(114, 114, 114), auto=True, scaleFill=False, scaleup=True, stride=32):
    # Resize and pad image while meeting stride-multiple constraints
    shape = im.shape[:2]  # current shape [height, width]
    if isinstance(new_shape, int):  # a single int means a square target
        new_shape = (new_shape, new_shape)  # store the target size as (height, width)
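Compute the scale ratio, taking the smaller of the height and width ratios so that the whole image fits inside new_shape. If scaleup is False, the ratio is capped at 1.0, so the image is only shrunk and never enlarged.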
    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])  # scale ratio (new / old)
    if not scaleup:  # only scale down, do not scale up (for better val mAP)
        r = min(r, 1.0)  # cap the ratio at 1.0 so the image is never enlarged
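The effect of scaleup is easiest to see on an image that is smaller than the target. The sketch below isolates just this step; scale_ratio is a hypothetical helper name, and the image sizes are example values.

```python
def scale_ratio(shape, new_shape=(640, 640), scaleup=True):
    """Scale ratio for an image of shape (height, width) to fit inside new_shape."""
    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
    if not scaleup:
        r = min(r, 1.0)  # never enlarge
    return r


small = (240, 320)   # height, width: smaller than the 640x640 target
large = (720, 1280)  # larger than the target

print(scale_ratio(small), scale_ratio(small, scaleup=False))
print(scale_ratio(large), scale_ratio(large, scaleup=False))
```

The small image is enlarged by 2.0 by default and left at 1.0 when scaleup is False, while the large image is shrunk by 0.5 in both cases.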
Then compute the padding:
First, compute the width and height of the scaled image.
Then compute the total padding needed in each direction. When auto is True, take it modulo stride, so only the minimum padding that makes each side a multiple of stride is kept (assuming new_shape itself is a multiple of stride).
Here stride is 2 raised to the number of downsampling steps in the model (this relates to the receptive field). YOLOv5 downsamples 5 times, so stride is 2^5 = 32.
Finally, halve the padding, because it is split between top and bottom, and between left and right.
    # Compute padding
    ratio = r, r  # width, height ratios
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))  # scaled size as (width, height)
    dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding (total per axis)
    if auto:  # minimum rectangle
        # stride is 2 to the power of the number of downsampling steps (5 in YOLOv5, so 32)
        dw, dh = np.mod(dw, stride), np.mod(dh, stride)  # wh padding
    elif scaleFill:  # stretch
        dw, dh = 0.0, 0.0
        new_unpad = (new_shape[1], new_shape[0])
        ratio = new_shape[1] / shape[1], new_shape[0] / shape[0]  # width, height ratios
    dw /= 2  # divide padding into 2 sides
    dh /= 2  # half on each side (top/bottom, left/right)
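Here is a worked example of the padding step for a 1280x720 image. It is a simplified sketch: compute_padding is a hypothetical helper, it uses Python's % operator in place of np.mod (the two agree for the non-negative values here), and it leaves out the scaleFill branch.

```python
def compute_padding(shape, new_shape=(640, 640), auto=True, stride=32):
    """Return the scaled (width, height) and the per-side padding (dw, dh)."""
    h, w = shape
    r = min(new_shape[0] / h, new_shape[1] / w)
    new_unpad = int(round(w * r)), int(round(h * r))
    dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]
    if auto:
        # keep only the padding needed to reach the next multiple of stride
        dw, dh = dw % stride, dh % stride
    return new_unpad, (dw / 2, dh / 2)


size_auto, pad_auto = compute_padding((720, 1280), auto=True)
size_full, pad_full = compute_padding((720, 1280), auto=False)
print(size_auto, pad_auto)
print(size_full, pad_full)
```

With auto enabled, the 360-pixel height needs a total of 280 % 32 = 24 pixels of padding, 12 on each side, which gives 384 = 12 x 32 rows instead of the full 640.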
Resize, pad, and return:
Call cv2.resize to scale the image, then determine the number of pixels to pad on the top, bottom, left, and right (each is an integer greater than or equal to 0).
Then call cv2.copyMakeBorder to add the border.
Finally, return the processed image, the scale ratios, and the padding (dw, dh).
    if shape[::-1] != new_unpad:  # resize
        im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)  # scale to the unpadded size
    # The -0.1 / +0.1 offsets split an odd total padding into k on one side and k + 1 on the other
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
    return im, ratio, (dw, dh)
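Two small details of this last step can be checked in isolation: how the plus and minus 0.1 rounding splits the padding, and how the returned ratio and (dw, dh) could be used to map a point in the padded image back to the original. Both helpers below (split_padding and to_original) are hypothetical names for illustration and are not functions from the YOLOv5 code base.

```python
def split_padding(d):
    """Split a per-side padding d (possibly ending in .5) into two integer sides."""
    return int(round(d - 0.1)), int(round(d + 0.1))


def to_original(x, y, ratio, pad):
    """Map a point from the letterboxed image back to original-image pixels."""
    (rw, rh), (dw, dh) = ratio, pad
    return (x - dw) / rw, (y - dh) / rh


print(split_padding(12.0))  # even total padding: same on both sides
print(split_padding(7.5))   # odd total padding of 15: one side gets the extra pixel

# Centre of a 640x384 letterboxed image that came from a 1280x720 original
print(to_original(320, 192, (0.5, 0.5), (0.0, 12.0)))
```

The two sides always add up to the total padding, and subtracting the padding before dividing by the ratio recovers coordinates in the original image.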
Criticism and corrections are welcome in the comments. Thank you!