当前位置：网站首页>mmcv常用API介绍

mmcv常用API介绍

2022-06-30 05:08:00 【武乐乐~】

文章目录

前言
1、前置基础知识
2、mmcv
总结

前言

本篇主要对mmdet中经常使用mmcv某些API做介绍。

1、前置基础知识

mmcv中包含了大量图像处理的函数，最常用到的两个库就是cv2和pillow。因此，对这两个库常用的API做下简要介绍。

1.1. 读取图像

import cv2
from PIL import Image, ImageDraw
import matplotlib.pyplot as plt

# h>w的图像: (1133, 800, 3)
img_path = '/home/wujian/mmdet-lap/data/coco/val2017/000001.jpg'

img = cv2.imread(img_path)
h,w = img.shape[:2]
print('h:', h, 'w:',w)

img = Image.open(img_path)
w,h = img.size
print('w:', w, 'h:',h)

注意cv2返回的是图像的h和w，而pil返回的是图像的w和h！！

1.2. cv2和pil相互转化

import cv2
import numpy as np
from PIL import Image
# cv2 --> pil
img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
# pil --> cv2
image = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR)

1.3. 转成pil进行可视化

一般在IDE中进行编码，所以转成PIL更加方便可视化，贴下可视化pil图像代码：

from PIL import Image
import matplotlib.pyplot as plt

img = open(img_path)
plt.imshow(img)
plt.show()

1.4. cv2和pil保存图像

只需注意保存的是绝对路径即可。

cv2.imwrite('abs_path', img) # img是经cv2.imread读取的
img.save('abs_path')      # img 是经 Image.open()读取的

2、mmcv

这里贴下mmdet中常使用的数据集处理字段：

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]

2.1. 变换图像尺寸Resize

分别选取了两张h>w和w>h的图像进行Resize变换，mmdet中变换操作就是让比例较小的一边变成指定的一边，然后另一边进行scale缩放。当然，变换完成后和原始图像的h和w的大小顺序不发生改变。

from PIL import Image, ImageDraw
import matplotlib.pyplot as plt
from mmcv.image import imrescale

# h>w的图像: (1133, 800, 3)，可视化第一张图像
img_path = '/home/wujian/mmdet-lap/data/coco/val2017/000001.jpg'
img = cv2.imread(img_path)
h,w = img.shape[:2]

img, new_scale = imrescale(img, scale=(1333,800), return_scale= True)
print(img.shape)
# cv2 --> pil
img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.imshow(img)
plt.show()

# w>h的图像 :(800, 1067, 3)， 可视化第二张图像
img_path = '/home/wujian/mmdet-lap/data/coco/val2017/000003.jpg'
img = cv2.imread(img_path)
h,w = img.shape[:2]

img, new_scale = imrescale(img, scale=(1333,800), return_scale= True)
print(img.shape)
# cv2 --> pil
img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))

plt.imshow(img)
plt.show()

在这里插入图片描述

2.2. 填充图像

在Resize基础上， Pad操作就是填充宽和高让其两边成为32的倍数。贴下总的代码：

import cv2
from PIL import Image, ImageDraw
import matplotlib.pyplot as plt
from mmcv.image import imrescale

# h>w的图像: (1133, 800, 3)
img_path = '/home/wujian/mmdet-lap/data/coco/val2017/000001.jpg'
img = cv2.imread(img_path)
h,w = img.shape[:2]

img, new_scale = imrescale(img, scale=(1333,800), return_scale= True)
print(img.shape)
# cv2 --> pil
img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.imshow(img)
plt.show()

#pad
from mmcv.image import impad_to_multiple
import numpy as np
# pil --> cv2
image = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR)
pad_img = impad_to_multiple(image, divisor=32, pad_val= 0)
print(pad_img.shape)
pad_img = Image.fromarray(cv2.cvtColor(pad_img, cv2.COLOR_BGR2RGB))
plt.imshow(pad_img)
plt.show()


# w>h的图像 :(800, 1067, 3)
img_path = '/home/wujian/mmdet-lap/data/coco/val2017/000003.jpg'
img = cv2.imread(img_path)
h,w = img.shape[:2]

img, new_scale = imrescale(img, scale=(1333,800), return_scale= True)
print(img.shape)
# cv2 --> pil
img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))

plt.imshow(img)
plt.show()

#pad
from mmcv.image import impad_to_multiple
import numpy as np
# pil --> cv2
image = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR)
pad_img = impad_to_multiple(image, divisor=32, pad_val= 0)
print(pad_img.shape)
pad_img = Image.fromarray(cv2.cvtColor(pad_img, cv2.COLOR_BGR2RGB))
plt.imshow(pad_img)
plt.show()