[PyTorch series] A detailed guide to torchvision, PyTorch's image processing library

2022-07-27 20:04:00 【AiFool】

Transforming and augmenting images

Transforms are common image transformations available in the torchvision.transforms module. They can be chained together using Compose. Most transform classes have a functional equivalent: functional transforms give fine-grained control over the transformation. This is useful if you have to build a more complex transformation pipeline (for example, in the case of segmentation tasks). Most transformations accept both PIL images and tensor images, although some accept only PIL images and some accept only tensors. The conversion transforms can be used to convert to and from PIL images.

Transformations that accept tensor images also accept batches of tensor images. A tensor image is a tensor with shape (C, H, W), where C is the number of channels and H and W are the image height and width. A batch of tensor images is a tensor of shape (B, C, H, W), where B is the number of images in the batch.

The expected range of the values of a tensor image is implicitly defined by the tensor dtype. Tensor images with a floating-point dtype are expected to have values in [0, 1). Tensor images with an integer dtype are expected to have values in [0, MAX_DTYPE], where MAX_DTYPE is the largest value that can be represented in that dtype.

Randomized transformations apply the same transformation to all the images of a given batch, but they produce different transformations across calls. For reproducible transformations across calls, use the functional transforms.

Scriptable Transforms

In order to script the transformations, please use torch.nn.Sequential instead of Compose.

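import torch
from torchvision import transforms

# Chain scriptable transforms with nn.Sequential rather than Compose.
pipeline = torch.nn.Sequential(
    transforms.CenterCrop(10),
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
)
scripted_transforms = torch.jit.script(pipeline)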

Make sure to use only scriptable transformations, i.e. those that work with torch.Tensor and do not require lambda functions or PIL.Image.

Any custom transformations to be used with torch.jit.script should be derived from torch.nn.Module.

Functional Transforms

adjust_brightness(img, brightness_factor)

Adjust the brightness of an image.

adjust_contrast(img, contrast_factor)

Adjust the contrast of an image.

adjust_gamma(img, gamma[, gain])

Perform gamma correction on an image.

adjust_hue(img, hue_factor)

Adjust the hue of an image.

adjust_saturation(img, saturation_factor)

Adjust the color saturation of an image.

adjust_sharpness(img, sharpness_factor)

Adjust the sharpness of an image.

affine(img, angle, translate, scale, shear)

Apply an affine transformation to the image, keeping the image center invariant.

autocontrast(img)

Maximize the contrast of an image by remapping its pixels per channel so that the lowest becomes black and the lightest becomes white.

center_crop(img, output_size)

Crop the given image at the center.

convert_image_dtype(image[, dtype])

Convert a tensor image to the given dtype and scale the values accordingly. This function does not support PIL images.

crop(img, top, left, height, width)

Crop the given image at the specified location and output size.

equalize(img)

Equalize the histogram of an image by applying a non-linear mapping to the input in order to create a uniform distribution of grayscale values in the output.

erase(img, i, j, h, w, v[, inplace])

Erase the input tensor image with a given value.

five_crop(img, size)

Crop the given image into four corners and the central crop.

gaussian_blur(img, kernel_size[, sigma])

Perform Gaussian blurring on the image with the given kernel.

get_dimensions(img)

Return the dimensions of an image as [channels, height, width].

get_image_num_channels(img)

Return the number of channels of an image.

get_image_size(img)

Return the size of an image as [width, height].

hflip(img)

Flip the given image horizontally.

invert(img)

Invert the colors of an RGB/grayscale image.

normalize(tensor, mean, std[, inplace])

Normalize a float tensor image with mean and standard deviation.

pad(img, padding[, fill, padding_mode])

Pad the given image on all sides with the given "pad" value.

perspective(img, startpoints, endpoints[, ...])

Perform a perspective transform of the given image.

pil_to_tensor(pic)

Convert a PIL Image to a tensor of the same type.

posterize(img, bits)

Posterize an image by reducing the number of bits for each color channel.

resize(img, size[, interpolation, max_size, ...])

Resize the input image to the given size.

resized_crop(img, top, left, height, width, size)

Crop the given image and resize it to the desired size.

rgb_to_grayscale(img[, num_output_channels])

Convert an RGB image to a grayscale version of the image.

rotate(img, angle[, interpolation, expand, ...])

Rotate the image by angle.

solarize(img, threshold)

Solarize an RGB/grayscale image by inverting all pixel values above a threshold.

ten_crop(img, size[, vertical_flip])

Generate ten cropped images from the given image.

to_grayscale(img[, num_output_channels])

Convert a PIL image of any mode (RGB, HSV, LAB, etc.) to a grayscale version of the image.

to_pil_image(pic[, mode])

Convert a tensor or an ndarray to a PIL Image.

to_tensor(pic)

Convert a PIL Image or numpy.ndarray to a tensor.

vflip(img)

Flip the given image vertically.

Datasets

Torchvision provides many built-in datasets in the torchvision.datasets module, as well as utility classes for building your own datasets.

Built-in datasets

All datasets are subclasses of torch.utils.data.Dataset, i.e. they implement the __getitem__ and __len__ methods. Hence, they can all be passed to a torch.utils.data.DataLoader, which can load multiple samples in parallel using torch.multiprocessing workers. For example:

import torch
import torchvision

imagenet_data = torchvision.datasets.ImageNet('path/to/imagenet_root/')
# args.nThreads: worker count taken from the script's parsed command-line arguments
data_loader = torch.utils.data.DataLoader(imagenet_data,
                                          batch_size=4,
                                          shuffle=True,
                                          num_workers=args.nThreads)

All datasets have a very similar API. They all have two arguments in common, transform and target_transform, which transform the input and the target respectively. You can also create your own datasets using the provided base classes.

#### Image classification https://pytorch.org/vision/stable/datasets.html#image-classification
#### Image detection or segmentation https://pytorch.org/vision/stable/datasets.html#image-detection-or-segmentation
#### Optical flow https://pytorch.org/vision/stable/datasets.html#optical-flow
#### Video classification https://pytorch.org/vision/stable/datasets.html#video-classification

Base classes that can be used by custom datasets

DatasetFolder(root, loader, ...)
A generic data loader.
ImageFolder(root, transform, ...)
A generic data loader that, by default, expects the images to be arranged in one subdirectory per class under root (for example root/dog/xxx.png and root/cat/123.png).
VisionDataset(root, transforms, transform, ...)
Base class for making datasets that are compatible with torchvision.

To be continued ...

Copyright notice
This article was written by [AiFool]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/208/202207271725514948.html