Introduction to commonly used mmcv APIs
2022-06-30 05:08:00 【武乐乐~】
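Preface
This post introduces some of the mmcv APIs that are frequently used in mmdet.
1. Prerequisites
mmcv contains a large number of image-processing functions, and the two libraries used most often alongside it are cv2 (OpenCV) and Pillow. This section therefore gives a brief overview of the commonly used APIs of these two libraries.
1.1. Reading images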
import cv2
from PIL import Image, ImageDraw
import matplotlib.pyplot as plt
# image with h > w: (1133, 800, 3)
img_path = '/home/wujian/mmdet-lap/data/coco/val2017/000001.jpg'
img = cv2.imread(img_path)
h,w = img.shape[:2]
print('h:', h, 'w:',w)
img = Image.open(img_path)
w,h = img.size
print('w:', w, 'h:',h)
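Note that cv2 reports the image size as (h, w) through img.shape, whereas PIL reports it as (w, h) through img.size!
1.2. Converting between cv2 and PIL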
import cv2
import numpy as np
from PIL import Image
# cv2 --> pil
img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
# pil --> cv2
image = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR)
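1.3. Converting to PIL for visualization
Coding is usually done in an IDE, so converting to PIL makes visualization more convenient. The code for displaying a PIL image is: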
from PIL import Image
import matplotlib.pyplot as plt
img = Image.open(img_path)  # use Image.open, not the built-in open(); img_path is defined in section 1.1
plt.imshow(img)
plt.show()
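1.4. Saving images with cv2 and PIL
The only thing to watch is that the path you pass is the absolute path of the output file. In both libraries the file extension in the path determines the output format.
cv2.imwrite('abs_path', img)  # img was read with cv2.imread
img.save('abs_path')  # img was read with Image.open()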
2. mmcv
Here is the data-processing pipeline that is commonly used in mmdet:
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
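The pipeline above unpacks `img_norm_cfg`, which this post does not define. As a minimal sketch, assuming the ImageNet mean/std values that mmdet configs commonly use (the author's actual values may differ), the snippet below defines such a dict and shows, for a single pixel, what the Normalize step does. `normalize_pixel` is a hypothetical helper written for illustration; it is not an mmcv function.

```python
# Assumed values: the ImageNet statistics commonly found in mmdet configs.
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53],  # per-channel mean, in RGB order
    std=[58.395, 57.12, 57.375],     # per-channel std, in RGB order
    to_rgb=True,                     # images are loaded as BGR, so convert first
)


def normalize_pixel(bgr_pixel, mean, std, to_rgb=True):
    """Hypothetical helper: normalize one pixel the way Normalize treats every pixel."""
    pixel = list(bgr_pixel)
    if to_rgb:
        pixel = pixel[::-1]  # BGR -> RGB
    return [(p - m) / s for p, m, s in zip(pixel, mean, std)]


# A BGR pixel equal to the mean maps to zero in every channel.
print(normalize_pixel((103.53, 116.28, 123.675), **img_norm_cfg))
```

In the real pipeline the same subtraction and division are applied to the whole image array at once.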
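2.1. Resizing images with Resize
Two images, one with h > w and one with w > h, are used to demonstrate the Resize transform. With keep_ratio=True, mmdet computes a ratio for each side (target length divided by current length), resizes the side with the smaller ratio to its target length, and scales the other side by the same factor. As a result, the size order of h and w after the transform is the same as in the original image.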
import cv2
from PIL import Image, ImageDraw
import matplotlib.pyplot as plt
from mmcv.image import imrescale
# image with h > w: (1133, 800, 3); visualize the first image
img_path = '/home/wujian/mmdet-lap/data/coco/val2017/000001.jpg'
img = cv2.imread(img_path)
h, w = img.shape[:2]
img, new_scale = imrescale(img, scale=(1333, 800), return_scale=True)
print(img.shape)
# cv2 --> pil
img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.imshow(img)
plt.show()
# image with w > h: (800, 1067, 3); visualize the second image
img_path = '/home/wujian/mmdet-lap/data/coco/val2017/000003.jpg'
img = cv2.imread(img_path)
h, w = img.shape[:2]
img, new_scale = imrescale(img, scale=(1333, 800), return_scale=True)
print(img.shape)
# cv2 --> pil
img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.imshow(img)
plt.show()
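The size arithmetic behind the keep-ratio resize can be sketched in pure Python. The function below reflects my understanding of how the output size is derived from a (1333, 800) scale tuple: the larger value bounds the long edge, the smaller value bounds the short edge, and the smaller of the two ratios wins. `keep_ratio_size` is a hypothetical name, and the input sizes are made-up examples rather than the sizes of the author's images.

```python
def keep_ratio_size(old_size, scale):
    """Sketch of the keep-ratio size computation.

    old_size: (w, h) of the input image.
    scale:    (a, b); the larger value bounds the long edge,
              the smaller value bounds the short edge.
    Returns ((new_w, new_h), scale_factor).
    """
    w, h = old_size
    max_long_edge, max_short_edge = max(scale), min(scale)
    # Take the smaller ratio so that neither edge exceeds its limit.
    scale_factor = min(max_long_edge / max(h, w), max_short_edge / min(h, w))
    new_w = int(w * scale_factor + 0.5)  # round to the nearest integer
    new_h = int(h * scale_factor + 0.5)
    return (new_w, new_h), scale_factor


# A 640x480 (w x h) image: the short edge reaches 800 first.
print(keep_ratio_size((640, 480), (1333, 800))[0])   # (1067, 800)
# A 2000x500 image: the long edge is capped at 1333.
print(keep_ratio_size((2000, 500), (1333, 800))[0])  # (1333, 333)
```

Because a single factor is applied to both sides, an image with h > w still has h > w after resizing.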


2.2. Padding images
Building on Resize, the Pad operation pads the width and height so that both become multiples of 32. The full code is as follows:
import cv2
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image, ImageDraw
from mmcv.image import imrescale, impad_to_multiple
# image with h > w: (1133, 800, 3)
img_path = '/home/wujian/mmdet-lap/data/coco/val2017/000001.jpg'
img = cv2.imread(img_path)
h, w = img.shape[:2]
img, new_scale = imrescale(img, scale=(1333, 800), return_scale=True)
print(img.shape)
# cv2 --> pil
img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.imshow(img)
plt.show()
# pad: pil --> cv2, then pad both sides up to a multiple of 32
image = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR)
pad_img = impad_to_multiple(image, divisor=32, pad_val=0)
print(pad_img.shape)
pad_img = Image.fromarray(cv2.cvtColor(pad_img, cv2.COLOR_BGR2RGB))
plt.imshow(pad_img)
plt.show()
# image with w > h: (800, 1067, 3)
img_path = '/home/wujian/mmdet-lap/data/coco/val2017/000003.jpg'
img = cv2.imread(img_path)
h, w = img.shape[:2]
img, new_scale = imrescale(img, scale=(1333, 800), return_scale=True)
print(img.shape)
# cv2 --> pil
img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.imshow(img)
plt.show()
# pad: pil --> cv2, then pad both sides up to a multiple of 32
image = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR)
pad_img = impad_to_multiple(image, divisor=32, pad_val=0)
print(pad_img.shape)
pad_img = Image.fromarray(cv2.cvtColor(pad_img, cv2.COLOR_BGR2RGB))
plt.imshow(pad_img)
plt.show()
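The padded shape can be predicted without running mmcv. The sketch below rounds each side up to the next multiple of the divisor, which is the arithmetic that padding to a multiple of 32 implies. `padded_size` is a hypothetical helper, and the two inputs are the shapes quoted in the comments of the code above.

```python
import math


def padded_size(h, w, divisor=32):
    """Hypothetical helper: smallest (h, w) >= the input with both sides a multiple of divisor."""
    pad_h = math.ceil(h / divisor) * divisor
    pad_w = math.ceil(w / divisor) * divisor
    return pad_h, pad_w


print(padded_size(1133, 800))  # (1152, 800)
print(padded_size(800, 1067))  # (800, 1088)
```

A side that is already a multiple of 32, such as 800, is left unchanged. To my understanding the extra rows and columns are added at the bottom and right and filled with pad_val, so the original pixel coordinates (and therefore the box coordinates) stay valid.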


2.3. Flipping images horizontally
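This section has no code in the original post. For real image arrays, mmcv provides imflip, which accepts direction='horizontal'. As a dependency-free sketch of what a horizontal flip does, the snippet below reverses each row of a toy image stored as a nested list and mirrors a bounding box accordingly. Both helpers and the sample data are hypothetical and written only for illustration.

```python
def flip_horizontal(img):
    """Reverse each row of an image given as a nested list of rows."""
    return [row[::-1] for row in img]


def flip_bbox_horizontal(bbox, img_w):
    """Mirror an (x1, y1, x2, y2) box inside an image of width img_w."""
    x1, y1, x2, y2 = bbox
    # The old right edge becomes the new left edge, and vice versa.
    return (img_w - x2, y1, img_w - x1, y2)


img = [[1, 2, 3],
       [4, 5, 6]]
print(flip_horizontal(img))                         # [[3, 2, 1], [6, 5, 4]]
print(flip_bbox_horizontal((10, 20, 50, 80), 200))  # (150, 20, 190, 80)
```

In the pipeline shown earlier, RandomFlip with flip_ratio=0.5 applies the flip to roughly half of the training samples, and the ground-truth boxes have to be mirrored together with the image.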
Summary
When time permits, I will add a walkthrough of the relevant mmcv source code.