YOLOv5 input stage (II) | CSDN creative punch in
2022-07-03 05:07:00 【TT ya】
I'm a beginner, and I write these posts as study notes to record what I've learned. I hope they also help others who are just starting out, and I'd welcome corrections from more experienced readers ~ If anything here infringes on your rights, let me know and I'll delete it.
One. Adaptive anchor box calculation
For each dataset, YOLOv5 starts from anchor boxes with preset widths and heights. During training, the network outputs prediction boxes based on these initial anchor boxes and compares them with the ground-truth boxes. The difference between the two is computed and back-propagated, the network parameters are updated, and training continues iteratively.
Definition of an anchor box: it is described only by the width and height of its border. At first glance that looks underspecified, because a box of a given size could sit anywhere on the image, giving countless boxes. What is missing is a center point, and the centers are the points of the feature map extracted by the subsequent network. That is why an initial anchor box does not need to specify a center position.
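To make the "size only, center comes from the feature map" idea concrete, here is a minimal sketch. It assumes a 640x640 input, so the stride-8 feature map is an 80x80 grid, and it places each center in the middle of its grid cell. The function name tile_anchors is hypothetical, and this is a simplification for illustration, not YOLOv5's actual box decoding. The three (width, height) pairs are the P3/8 values from yolov5s.yaml.

```python
def tile_anchors(grid_h, grid_w, stride, anchor_sizes):
    """Return (cx, cy, w, h) boxes in input-image pixels for every grid cell."""
    boxes = []
    for row in range(grid_h):
        for col in range(grid_w):
            # Assumption: the center sits in the middle of the grid cell.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            # The same (width, height) pairs are reused at every center.
            for w, h in anchor_sizes:
                boxes.append((cx, cy, w, h))
    return boxes


# P3/8 anchors as (width, height) pairs; 640 / 8 = 80 cells per side.
p3_anchors = [(10, 13), (16, 30), (33, 23)]
boxes = tile_anchors(80, 80, 8, p3_anchors)
print(len(boxes))  # 80 * 80 * 3 = 19200
print(boxes[0])    # (4.0, 4.0, 10, 13)
```

So three sizes are enough to cover the whole image, because the grid supplies the positions.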
The initial values in the yolov5s.yaml file are:
anchors:
- [10,13, 16,30, 33,23] # P3/8
- [30,61, 62,45, 59,119] # P4/16
- [116,90, 156,198, 373,326] # P5/32
In YOLOv5, adaptive anchor box calculation (AutoAnchor) is optional. If you find it does not work well, you can turn it off by adding the "--noautoanchor" option.
The option is defined in train.py:
parser.add_argument('--noautoanchor', action='store_true', help='disable AutoAnchor')
Two. Adaptive image scaling
1. Principle
The images we collect usually come in different sizes, but when they are fed to the network for training, the image size has to be consistent.
If we simply use resize, the image gets distorted, which in turn affects our results.
So a better approach is used: letterbox adaptive image scaling.
Note: with mosaic augmentation on, the training pipeline does not use letterbox adaptive image scaling; it is used in the detect pipeline. (When mosaic is off, for example during validation, the dataloader does call letterbox.)
Training images already come out at a uniform size because mosaic stitches parts of 4 images into one large image of fixed size, so letterbox is not needed there.
For details, see the previous post: YOLOv5 input stage (I): Mosaic data augmentation.
Letterbox adaptive image scaling keeps the aspect ratio and pads the missing area with gray borders to reach the fixed size.
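As a quick worked example of that idea, here is a minimal sketch of the arithmetic only (no image is touched). It assumes a square target and a 1280x720 input picked purely for illustration. The helper name letterbox_geometry is hypothetical and is not part of YOLOv5.

```python
def letterbox_geometry(height, width, target=640):
    """Scale factor, scaled size and total padding for a square target."""
    # One scale factor for both axes keeps the aspect ratio;
    # taking the smaller one guarantees the whole image fits.
    r = min(target / height, target / width)
    new_w, new_h = int(round(width * r)), int(round(height * r))
    # Whatever is left over has to be filled with gray borders.
    pad_w, pad_h = target - new_w, target - new_h
    return r, (new_w, new_h), (pad_w, pad_h)


r, size, pad = letterbox_geometry(720, 1280)
print(r, size, pad)  # 0.5 (640, 360) (0, 280)
```

A plain resize to 640x640 would instead scale the width by 0.5 and the height by about 0.89, which is exactly the distortion letterbox avoids.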
Next, let's go through the code to see how it works.
2. Code analysis
This part is the letterbox function in utils/augmentations.py.
First, get the height and width of the current image. Then make sure the target size new_shape is a (height, width) pair: if a single integer is passed in, it is expanded into a tuple.
import cv2  # both imports sit at the top of utils/augmentations.py
import numpy as np

def letterbox(im, new_shape=(640, 640), color=(114, 114, 114), auto=True, scaleFill=False, scaleup=True, stride=32):
    # Resize and pad image while meeting stride-multiple constraints
    shape = im.shape[:2]  # current shape [height, width]
    if isinstance(new_shape, int):
        # a single int means a square target: expand it to (height, width)
        new_shape = (new_shape, new_shape)
    # Scale ratio (new / old): take the smaller of the two so the whole image fits
    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
    if not scaleup:  # only scale down, do not scale up (for better val mAP)
        r = min(r, 1.0)
Then compute the padding:
First, compute the width and height of the scaled image.
Then compute how much padding is still needed, and take its remainder modulo stride.
Here stride is 2 raised to the number of downsampling steps in the model (I think this is related to the receptive field). YOLOv5 downsamples 5 times, so stride is 2^5 = 32.
Finally, halve the padding, because it is split between top and bottom, and between left and right.
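To check these steps by hand, here is a small sketch of just the modulo and halving arithmetic. It assumes a 1280x720 image already scaled by 0.5 to 640x360 for a 640x640 target, so the full padding is 0 in width and 280 in height. The helper name minimal_padding is hypothetical, not something from YOLOv5.

```python
def minimal_padding(pad_w, pad_h, stride=32):
    """Per-side padding after keeping only the remainder modulo stride."""
    # Dropping whole multiples of stride shrinks the border while the
    # padded size stays a multiple of stride.
    pad_w, pad_h = pad_w % stride, pad_h % stride
    # Split each total evenly between the two opposite sides.
    return pad_w / 2, pad_h / 2


dw, dh = minimal_padding(0, 280)
print(dw, dh)  # 0.0 12.0

# Scaled size plus padding on both sides: 640 x (360 + 2 * 12) = 640 x 384.
final_w, final_h = 640 + 2 * int(dw), 360 + 2 * int(dh)
print(final_w, final_h, final_h % 32)  # 640 384 0
```

So instead of a full 640x640 canvas, the image ends up 640x384, with far less gray border for the network to process.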
    # Compute padding
    ratio = r, r  # width, height ratios
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))  # (width, height) after scaling
    dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding still needed
    if auto:  # minimum rectangle
        # keep only the remainder modulo stride, so the padded size is still
        # a multiple of stride (given that new_shape is one)
        dw, dh = np.mod(dw, stride), np.mod(dh, stride)  # wh padding
    elif scaleFill:  # stretch
        dw, dh = 0.0, 0.0
        new_unpad = (new_shape[1], new_shape[0])
        ratio = new_shape[1] / shape[1], new_shape[0] / shape[0]  # width, height ratios
    dw /= 2  # divide padding into 2 sides
    dh /= 2  # (left/right and top/bottom)
Resize and return:
Call cv2.resize to scale the image, then work out the final number of pixels to pad on the top, bottom, left and right (integers greater than or equal to 0).
Then call cv2.copyMakeBorder to add the padding.
Finally, return the processed image, the scale ratio, and the padding width and height.
    if shape[::-1] != new_unpad:  # resize only if the size actually changes
        im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
    # The -0.1 / +0.1 split an odd total padding into two integers that differ by 1
    # (e.g. dh = 2.5 gives top = 2, bottom = 3); both are always >= 0
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
    return im, ratio, (dw, dh)
Corrections and criticism are welcome in the comment area. Thank you ~
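One follow-up on the return values: the ratio and the per-side padding (dw, dh) are what you need to map a box found on the letterboxed image back to the original image. The sketch below is my own illustration under that assumption. The helper name unletterbox_box is hypothetical and is not a YOLOv5 function, and the example numbers assume a 1280x720 image letterboxed with ratio 0.5 and 12 pixels of padding at the top.

```python
def unletterbox_box(box, ratio, pad):
    """Map (x1, y1, x2, y2) from the letterboxed image to the original image."""
    x1, y1, x2, y2 = box
    rw, rh = ratio   # width and height scale factors, as returned by letterbox
    dw, dh = pad     # per-side padding (left, top), as returned by letterbox
    # Undo the padding first, then undo the scaling.
    return ((x1 - dw) / rw, (y1 - dh) / rh, (x2 - dw) / rw, (y2 - dh) / rh)


box = unletterbox_box((100, 112, 300, 212), (0.5, 0.5), (0.0, 12.0))
print(box)  # (200.0, 200.0, 600.0, 400.0)
```

The order matters: the padding was added after scaling, so it has to be removed before dividing by the ratio.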