In-Depth Analysis of MONAI (I): Data and Transforms
2022-06-29 16:43:00 【You have to step over when you encounter difficulties】
Preface
Not much has been going on recently, so I studied MONAI, an excellent medical deep learning framework built on PyTorch. It includes several commonly used deep learning components: Transforms (data loading and augmentation), Loss functions (common loss functions), Network architectures (common medical image segmentation models), Metrics (evaluation functions used during validation), Optimizers, and Data (Dataset and DataLoader). With these components we can define our own model and train it easily. In this article we look at MONAI's Transforms (data augmentation) component and Data component.
Basic usage
First, recall how training data is usually loaded in PyTorch: you define a Dataset class whose __getitem__ method returns one sample with shape [C, H, W, D], where C is the number of channels and H, W, D are height, width, and depth (the depth dimension exists only for 3D data). A DataLoader then calls the Dataset's __getitem__ multiple times and stacks the samples together, returning data with shape [B, C, H, W, D], where B is the batch size.
MONAI follows the same idea when loading data: first define a Dataset, then wrap it in a DataLoader. When defining the Dataset, we can pass in a series of MONAI transforms, such as data loading, random rotation, cropping, flipping, patch extraction, normalization, standardization, conversion to tensor, and so on. These operations all live in the monai.transforms module. As in PyTorch, they can be chained with the monai.transforms.Compose class, so that data flows through the pipeline automatically and less code is needed.
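The shape convention described above can be illustrated without any framework. Below is a minimal sketch in plain NumPy; ToyDataset and collate are made-up names for illustration, not MONAI or PyTorch API:

```python
import numpy as np

# Sketch of the convention: __getitem__ returns one sample shaped
# [C, H, W, D]; batching stacks B samples into [B, C, H, W, D].

class ToyDataset:
    def __init__(self, n_samples, shape=(1, 4, 4, 3)):  # [C, H, W, D]
        self.samples = [np.zeros(shape, dtype=np.float32) for _ in range(n_samples)]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        return self.samples[idx]  # one sample: [C, H, W, D]

def collate(samples):
    # What a DataLoader does conceptually: stack along a new batch axis
    return np.stack(samples, axis=0)  # [B, C, H, W, D]

ds = ToyDataset(4)
batch = collate([ds[i] for i in range(2)])
print(batch.shape)  # (2, 1, 4, 4, 3), i.e. [B, C, H, W, D]
```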
Let's look at a simple example first, using the non-dictionary transforms called directly on the data. Note that data prepared this way cannot be packed by the DataLoader.
from monai import transforms, data

# Define the list of data files
data_list = ["F:/9.4Data/ski10/image/image-001.nii.gz",
             "F:/9.4Data/ski10/image/image-002.nii.gz",
             "F:/9.4Data/ski10/image/image-003.nii.gz",
             "F:/9.4Data/ski10/image/image-004.nii.gz"
             ]

# Define the transform pipeline
train_transform = transforms.Compose([
    transforms.LoadImage(),   # Load the image; the reader is chosen from the file suffix (.nii defaults to the ITK reader)
    transforms.AddChannel(),  # Add a channel dimension; MONAI transforms expect [C, H, W, ...], channel first
    transforms.ToTensor()     # Convert numpy to tensor; unlike torchvision, this does NOT normalize
])
In practice, many augmentation operations must be applied to the image and the label at the same time, such as cropping and rotation. Unlike PyTorch's torchvision.transforms, every transform class in MONAI has a corresponding dictionary-based version whose name ends with d. A dictionary transform takes a dict as input, e.g. {"image": ..., "label": ...}; the keys argument at construction time specifies whether to operate on image, label, or both. Internally the class implements __call__() to perform the transform (see the source code for details). The output is also a dict whose keys match the input keys. Here is a simple example:
from monai import transforms, data

data_list = [
    {"image": "F:/9.4Data/ski10/image/image-001.nii.gz", "label": "F:/9.4Data/ski10/label/labels-001.nii.gz"},
    {"image": "F:/9.4Data/ski10/image/image-002.nii.gz", "label": "F:/9.4Data/ski10/label/labels-002.nii.gz"},
    {"image": "F:/9.4Data/ski10/image/image-003.nii.gz", "label": "F:/9.4Data/ski10/label/labels-003.nii.gz"},
    {"image": "F:/9.4Data/ski10/image/image-004.nii.gz", "label": "F:/9.4Data/ski10/label/labels-004.nii.gz"}
]

train_transformd = transforms.Compose([
    # Load the images; the reader class is chosen from the file suffix
    transforms.LoadImaged(keys=["image", "label"]),
    # Add a channel dimension
    transforms.AddChanneld(keys=["image", "label"]),
    # Crop to the foreground, computed from the label
    transforms.CropForegroundd(keys=["image", "label"], source_key="label", margin=5),
    # Convert to tensor; no normalization here, just a cast to float
    transforms.ToTensord(keys=["image", "label"])
])
Other transforms
See the official documentation (I will add more when I have time).
Here I want to highlight one special transform: transforms.RandCropByPosNegLabeld.
Function: it randomly crops fixed-size patches from the original image according to a given ratio of positive to negative samples. It is useful when positive and negative samples are unbalanced: the crop ratio rebalances them, and the fixed-size patches can be fed directly into the network for training.
The function itself is straightforward. The key point is that, as mentioned above, every dictionary transform takes one dict as input and returns one dict, each dict representing one training sample, but this class can crop several patches from one sample, so it outputs a list of dicts.
This raises a question: how does the list of dicts produced by this transform enter the next transform, whose input should be a dict, not a list?
Looking at the Compose source code, I found that when Compose passes the output of one transform to the next, it checks the type: if it is a list, it loops over the elements and feeds them in one by one; if it is a dict, it feeds it in directly.
Looping over the list this way effectively multiplies the batch size. Practice later confirmed this: with batch_size=2 in the DataLoader and num_samples=4 in transforms.RandCropByPosNegLabeld, each iteration yields batch_size = 2 * 4 = 8.
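This dispatch logic can be sketched in plain Python. The snippet below is a simplified illustration of the idea, not MONAI's actual source; crop4 and tag are made-up stand-ins for RandCropByPosNegLabeld (with num_samples=4) and a later dictionary transform:

```python
# If a transform returns a list of dicts, each subsequent transform is
# applied to every element, so one sample fans out into many patches.

def apply_transform(transform, data):
    # A list output from the previous transform is looped over element
    # by element; a single dict is passed straight through.
    if isinstance(data, (list, tuple)):
        return [transform(item) for item in data]
    return transform(data)

def compose(transform_list, data):
    for t in transform_list:
        data = apply_transform(t, data)
    return data

# Hypothetical transforms for illustration
crop4 = lambda d: [dict(d, patch=i) for i in range(4)]  # one dict in, four dicts out
tag = lambda d: dict(d, tagged=True)                    # ordinary dict-to-dict transform

out = compose([crop4, tag], {"image": "img.nii.gz"})
print(len(out))          # 4 patches from one sample
print(out[0]["tagged"])  # True: later transforms ran on each patch
```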
import torch
from monai import transforms, data

data_list = [
    {"image": "F:/9.4Data/ski10/image/image-001.nii.gz", "label": "F:/9.4Data/ski10/label/labels-001.nii.gz"},
    {"image": "F:/9.4Data/ski10/image/image-002.nii.gz", "label": "F:/9.4Data/ski10/label/labels-002.nii.gz"},
    {"image": "F:/9.4Data/ski10/image/image-003.nii.gz", "label": "F:/9.4Data/ski10/label/labels-003.nii.gz"},
    {"image": "F:/9.4Data/ski10/image/image-004.nii.gz", "label": "F:/9.4Data/ski10/label/labels-004.nii.gz"}
]

train_transformd = transforms.Compose([
    # Load the images; the reader class is chosen from the file suffix
    transforms.LoadImaged(keys=["image", "label"]),
    # Add a channel dimension
    transforms.AddChanneld(keys=["image", "label"]),
    # Crop to the foreground, computed from the label
    transforms.CropForegroundd(keys=["image", "label"], source_key="label", margin=5),
    # Custom normalization (the Uniformd class is defined later in this article)
    Uniformd(keys=["image"]),
    # Crop patches with a positive/negative ratio. If num_samples is not 1, the
    # cropped samples are returned in a list, which the DataLoader later stacks.
    # E.g. with num_samples=4 and DataLoader batch_size=2, each iteration
    # ultimately returns 4 * 2 = 8 samples, i.e. an effective batch_size of 8.
    # spatial_size raises an error if it exceeds the original data size.
    transforms.RandCropByPosNegLabeld(keys=["image", "label"],
                                      label_key="label",
                                      spatial_size=[256, 256, 80],
                                      pos=1,
                                      neg=1,
                                      num_samples=4,
                                      image_key="image"),
    # Scale to a fixed size by interpolation; size_mode="all" does not keep the aspect ratio
    # transforms.Resized(keys=["image", "label"], spatial_size=[256, 256, 100], size_mode="all", mode=["area", "nearest"]),
    # Rescale intensities, e.g. to [0, 1]. Not used here: each sample in the
    # ski10 dataset has a different value range, hence the custom Uniformd above.
    # transforms.ScaleIntensityRanged(keys=["image", "label"],
    #                                 a_min=0, a_max=5000,
    #                                 b_min=0, b_max=1),
    # Convert to tensor; no normalization here, just a cast to float
    transforms.ToTensord(keys=["image", "label"])
])

train_dataset = data.Dataset(data=data_list, transform=train_transformd)
train_dataLoader = data.DataLoader(dataset=train_dataset, batch_size=2, shuffle=True, num_workers=2)
print('Number of training samples:', len(train_dataset))
for batch_data in train_dataLoader:
    image, label = batch_data["image"], batch_data["label"]
    print('image shape:', image.shape, 'label shape:', label.shape, 'max:', torch.max(image), 'min:', torch.min(image))
Custom data reader
In the example above, we read the nii data with transforms.LoadImaged(keys=["image", "label"]), passing in the image file names. But have we ever thought about how the data is actually read internally?
It turns out that this class has a reader parameter, which is the class that actually reads the data; internally, LoadImaged calls the reader to load the file names it is given. So do we have to define the reader class ourselves? No: MONAI already ships with readers. Reading nii or nii.gz files calls the ITKReader class, while png and jpeg files use PILReader.
What if we want to define our own data reader?
The answer is to inherit from data.ImageReader and implement the get_data, read, and verify_suffix methods (see the official documentation for the exact return values). Here I build on ITKReader and add a custom normalization step: it computes the maximum and minimum values and rescales the voxel values to [0, 1]. The code is below; when used, the class is passed in directly as a parameter:
from typing import Optional
import numpy as np
from monai import transforms, data

# Custom reader: override get_data. Note that the reader only sees a nii file;
# it does not know whether it is an image or a label -- that is decided in LoadImaged.
class MyReader(data.ITKReader):
    def __init__(self, channel_dim: Optional[int] = None, series_name: str = "", reverse_indexing: bool = False, series_meta: bool = False, **kwargs):
        super().__init__(channel_dim, series_name, reverse_indexing, series_meta, **kwargs)

    def get_data(self, img):
        image, meta = super().get_data(img)
        image = np.array(image)
        # Only normalize the image (ski10 labels have a maximum value of 4, so
        # an array whose maximum is 4 is treated as a label and left unchanged)
        if np.max(image) != 4:
            max_value, min_value = np.max(image), np.min(image)
            # Min-max normalize to [0, 1]
            image = (image - min_value) / (max_value - min_value)
            # print(np.max(image), np.min(image))
        return image, meta

# Use the custom reader class: pass it to LoadImaged via the reader parameter
transforms.LoadImaged(keys=["image", "label"], reader=MyReader)
Custom transforms
Let's go back to the earlier question: I want to normalize each nii file by its own maximum and minimum values. Besides doing it up front when reading the data, is there another way?
Of course! Defining our own dictionary transform Uniformd for the normalization is even more convenient.
How do we define it?
The official documentation does not say, but from reading the source code: first inherit from the two classes MapTransform and InvertibleTransform, then implement __call__ (the forward transform call) and inverse (maps the transformed data back to the original; rarely useful here).
Note that each dictionary transform in MONAI corresponds to a non-dictionary transform class with the same functionality, so the official implementations directly instantiate the non-dictionary class internally and call it.
For simplicity, my own version just implements the operation as a method directly.
The code is as follows:
from copy import deepcopy
from typing import Optional

import numpy as np
import torch
from monai.transforms import MapTransform, InvertibleTransform

class Uniformd(MapTransform, InvertibleTransform):
    """Min-max normalization."""

    def __init__(
        self,
        keys,
        dtype: Optional[torch.dtype] = None,
        device: Optional[torch.device] = None,
        wrap_sequence: bool = True,
        allow_missing_keys: bool = False,
    ) -> None:
        super().__init__(keys, allow_missing_keys)

    def __call__(self, data):
        d = dict(data)
        for key in self.key_iterator(d):
            self.push_transform(d, key)
            d[key] = self.uniform(d[key])
        return d

    def uniform(self, data):
        # Min-max normalize to [0, 1]
        max_value, min_value = np.max(data), np.min(data)
        data = (data - min_value) / (max_value - min_value)
        return data

    def inverse(self, data):
        d = deepcopy(dict(data))
        for key in self.key_iterator(d):
            # Apply the "inverse" (here the forward operation is simply reused;
            # a true inverse would need the original min/max values)
            d[key] = self.uniform(d[key])
            # Remove the applied transform record
            self.pop_transform(d, key)
        return d
After that, it can be instantiated just like the official dictionary transforms: Uniformd(keys=["image"]).
Summary
After two days of work, many of my doubts are finally resolved. It seems the best learning materials really are the source code and the official website; make good use of them!
MONAI website link