In depth analysis of Monai (I) data and transforms
2022-06-29 16:43:00 【You have to step over when you encounter difficulties】
Preface
I have had some free time recently, so I studied MONAI, an excellent medical deep learning framework built on PyTorch. It includes several commonly used deep learning components: Transforms (data loading and augmentation), Loss functions (common loss functions), Network architectures (common medical image segmentation models), Metrics (evaluation functions for validation), Optimizers, and Data (Dataset and DataLoader). With these components we can define our own model and train it easily. This article covers MONAI's Transforms (augmentation) and Data components.
Basic usage
First, recall how training logic is usually written: you define a Dataset class whose __getitem__ method returns one sample. At this point the data has shape [C, H, W, D], where C is the number of channels and H, W, D are height, width, and depth (the depth dimension only exists for 3D data). A DataLoader then calls the Dataset's __getitem__ multiple times, stacks the samples together, and returns data of shape [B, C, H, W, D], where B is the batch size.
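The contract above can be sketched in a few lines. This is a minimal illustration, not MONAI code: FakeVolumeDataset and simple_batcher are hypothetical stand-ins for a real Dataset and DataLoader.

```python
import numpy as np

# Hypothetical stand-in for a real Dataset: __getitem__ returns one
# sample of shape [C, H, W, D].
class FakeVolumeDataset:
    def __init__(self, n_samples, shape=(1, 4, 4, 3)):  # [C, H, W, D]
        self.n_samples = n_samples
        self.shape = shape

    def __len__(self):
        return self.n_samples

    def __getitem__(self, idx):
        # stand-in for loading a real volume from disk
        return np.full(self.shape, idx, dtype=np.float32)

def simple_batcher(dataset, batch_size):
    """Naive stand-in for DataLoader: call __getitem__ repeatedly and stack."""
    for start in range(0, len(dataset), batch_size):
        idxs = range(start, min(start + batch_size, len(dataset)))
        yield np.stack([dataset[i] for i in idxs])  # [B, C, H, W, D]

ds = FakeVolumeDataset(4)
batch = next(simple_batcher(ds, batch_size=2))
print(batch.shape)  # (2, 1, 4, 4, 3), i.e. [B, C, H, W, D]
```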
MONAI follows the same pattern when loading data: first define a Dataset, then wrap it in a DataLoader. When defining the Dataset we can pass in a chain of MONAI augmentation methods, such as data loading, random rotation, cropping, flipping, patch extraction, normalization, standardization, conversion to tensor, and so on. These operations all live in the monai.transforms module. As in PyTorch, they can be chained with the monai.transforms.Compose class, so the data flows through them automatically and less code is needed.
Let's look at a simple example that calls the non-dictionary augmentation methods directly; note that data transformed this way cannot be batched by the DataLoader.
from monai import transforms

# Define the list of data files
data_list = ["F:/9.4Data/ski10/image/image-001.nii.gz",
             "F:/9.4Data/ski10/image/image-002.nii.gz",
             "F:/9.4Data/ski10/image/image-003.nii.gz",
             "F:/9.4Data/ski10/image/image-004.nii.gz"]

# Define the augmentation pipeline
train_transform = transforms.Compose([
    transforms.LoadImage(),   # load the image; a reader is chosen from the file name, .nii files default to ITK
    transforms.AddChannel(),  # add a channel dimension; MONAI transforms expect channel-first data [C, H, W, ...]
    transforms.ToTensor()     # convert numpy to tensor; unlike torchvision, this does NOT include a normalization step
])
In practice many augmentations must be applied to the image and its label simultaneously, for example cropping and rotation. Unlike PyTorch's torchvision.transforms, every MONAI augmentation class has a dictionary-based counterpart whose name ends in "d". A dictionary transform takes a dict such as {"image": ..., "label": ...} as input; the keys argument of the constructor specifies whether it operates on image, label, or both. Internally the class performs the augmentation in its __call__() method (see the source code for details). The output is also a dict, with the same keys as the input. A simple example:
from monai import transforms, data

data_list = [
    {"image": "F:/9.4Data/ski10/image/image-001.nii.gz", "label": "F:/9.4Data/ski10/label/labels-001.nii.gz"},
    {"image": "F:/9.4Data/ski10/image/image-002.nii.gz", "label": "F:/9.4Data/ski10/label/labels-002.nii.gz"},
    {"image": "F:/9.4Data/ski10/image/image-003.nii.gz", "label": "F:/9.4Data/ski10/label/labels-003.nii.gz"},
    {"image": "F:/9.4Data/ski10/image/image-004.nii.gz", "label": "F:/9.4Data/ski10/label/labels-004.nii.gz"},
]

train_transformd = transforms.Compose([
    # Load the images; the reader class is chosen by default from the file suffix
    transforms.LoadImaged(keys=["image", "label"]),
    # Add a channel dimension
    transforms.AddChanneld(keys=["image", "label"]),
    # Crop to the foreground computed from the label, with a margin of 5 voxels
    transforms.CropForegroundd(keys=["image", "label"], source_key="label", margin=5),
    # Convert to float tensors; no normalization is performed here
    transforms.ToTensord(keys=["image", "label"]),
])
Other augmentation methods
See the official documentation (I will add more when I have time).
Here I want to highlight one special transform: transforms.RandCropByPosNegLabeld.
Function: it randomly crops fixed-size patches from the original image according to a given ratio of positive to negative samples. It is useful when positive and negative samples are unbalanced: this operation rebalances them, and the fixed-size patches can be fed directly into the network for training.
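The pos/neg ratio idea can be sketched as follows. This is a hedged illustration, not MONAI's implementation: choose_centers is a hypothetical helper showing that with pos=1, neg=1 a patch is centred on a foreground voxel with probability pos / (pos + neg) = 0.5.

```python
import random

def choose_centers(pos, neg, num_samples, fg_centers, bg_centers, rng):
    """Pick patch centres: foreground with probability pos/(pos+neg), else background."""
    p_fg = pos / (pos + neg)
    return [rng.choice(fg_centers) if rng.random() < p_fg else rng.choice(bg_centers)
            for _ in range(num_samples)]

rng = random.Random(0)
centers = choose_centers(pos=1, neg=1, num_samples=4,
                         fg_centers=["fg1", "fg2"], bg_centers=["bg1", "bg2"], rng=rng)
print(centers)  # 4 centres, drawn from foreground/background with equal probability
```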
The function itself needs no further explanation. The key point is that, as stated before, every dictionary transform takes a single dict as input and output, each dict representing one training sample. This class, however, can crop several patches, so it outputs a list of multiple dicts, as shown in the figure below.
So here is the question: how do the multiple dicts output by this transform enter the next transform, whose input should be a dict rather than a list?
Reading the Compose source code, I found that when Compose passes the output of one transform to the next, it makes a check: if the output is a list, it loops over the items; if it is a dict, it is passed in directly.
Looping over the input this way effectively multiplies the batch dimension. This later confirmed my idea: with batch_size=2 in the DataLoader and num_samples=4 in transforms.RandCropByPosNegLabeld, each iteration yields batch_size = 2 * 4 = 8.
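The dispatch logic described above can be sketched in pure Python. This is a simplified illustration, not the actual MONAI source; fake_rand_crop and fake_to_tensor are hypothetical transforms standing in for RandCropByPosNegLabeld (with num_samples=2) and ToTensord.

```python
def apply_transform(transform, data):
    # If the previous transform emitted a list, apply the next transform to each item;
    # a single dict is passed through directly.
    if isinstance(data, (list, tuple)):
        return [transform(item) for item in data]
    return transform(data)

def compose(transform_list, data):
    for t in transform_list:
        data = apply_transform(t, data)
    return data

# Mimics RandCropByPosNegLabeld with num_samples=2: one dict in, a list of dicts out
def fake_rand_crop(sample):
    return [{"image": sample["image"] + f"_patch{i}"} for i in range(2)]

def fake_to_tensor(sample):  # dict in, dict out
    return {"image": sample["image"] + "_tensor"}

out = compose([fake_rand_crop, fake_to_tensor], {"image": "img-001"})
print(out)  # a list of 2 dicts, each one cropped and then converted
```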
import torch
from monai import transforms, data

data_list = [
    {"image": "F:/9.4Data/ski10/image/image-001.nii.gz", "label": "F:/9.4Data/ski10/label/labels-001.nii.gz"},
    {"image": "F:/9.4Data/ski10/image/image-002.nii.gz", "label": "F:/9.4Data/ski10/label/labels-002.nii.gz"},
    {"image": "F:/9.4Data/ski10/image/image-003.nii.gz", "label": "F:/9.4Data/ski10/label/labels-003.nii.gz"},
    {"image": "F:/9.4Data/ski10/image/image-004.nii.gz", "label": "F:/9.4Data/ski10/label/labels-004.nii.gz"},
]

train_transformd = transforms.Compose([
    # Load the images; the reader class is chosen by default from the file suffix
    transforms.LoadImaged(keys=["image", "label"]),
    # Add a channel dimension
    transforms.AddChanneld(keys=["image", "label"]),
    # Crop to the foreground computed from the label
    transforms.CropForegroundd(keys=["image", "label"], source_key="label", margin=5),
    # Custom normalization (the Uniformd class defined later in this article)
    Uniformd(keys=["image"]),
    # Crop patches with a balanced foreground/background ratio. If num_samples != 1,
    # the cropped samples are returned in a list, which the DataLoader stitches together:
    # e.g. with num_samples=4 and DataLoader batch_size=2, each iteration returns 4*2=8 samples.
    # spatial_size raises an error if it exceeds the original data size.
    transforms.RandCropByPosNegLabeld(keys=["image", "label"],
                                      label_key="label",
                                      spatial_size=[256, 256, 80],
                                      pos=1,
                                      neg=1,
                                      num_samples=4,
                                      image_key="image"),
    # Resize to a fixed size with interpolation; size_mode='all' does not preserve the aspect ratio
    # transforms.Resized(keys=["image", "label"], spatial_size=[256, 256, 100], size_mode="all", mode=["area", "nearest"]),
    # Rescale intensity values, e.g. to 0-1. Not used here because each sample in the
    # ski10 dataset has a different value range, hence the custom Uniformd above.
    # transforms.ScaleIntensityRanged(keys=["image", "label"],
    #                                 a_min=0, a_max=5000,
    #                                 b_min=0, b_max=1),
    # Convert to float tensors; no normalization is performed here
    transforms.ToTensord(keys=["image", "label"]),
])

train_dataset = data.Dataset(data=data_list, transform=train_transformd)
train_dataLoader = data.DataLoader(dataset=train_dataset, batch_size=2, shuffle=True, num_workers=2)
print('number of training samples:', len(train_dataset))
for batch_data in train_dataLoader:
    image, label = batch_data["image"], batch_data["label"]
    print('image shape:', image.shape, 'label shape:', label.shape,
          'max:', torch.max(image), 'min:', torch.min(image))
Custom data reader
In the example above we read the .nii data by passing the image file names to transforms.LoadImaged(keys=["image", "label"]). But have we thought about how the data is actually read internally?
It turns out this class has a reader parameter: a class that reads the data. Internally, LoadImaged calls the reader on the file names. So do we have to define a Reader class ourselves? No: official ones are provided. Reading .nii or .nii.gz uses the ITKReader class; reading png or jpeg uses PILReader.
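The suffix-based reader selection described above can be sketched as follows. This is a simplified illustration, not MONAI's implementation; pick_reader is a hypothetical helper returning the reader name.

```python
def pick_reader(filename):
    """Choose a reader by file suffix: .nii/.nii.gz -> ITK, .png/.jpg/.jpeg -> PIL."""
    name = filename.lower()
    if name.endswith((".nii", ".nii.gz")):
        return "ITKReader"
    if name.endswith((".png", ".jpg", ".jpeg")):
        return "PILReader"
    raise ValueError(f"no reader registered for {filename}")

print(pick_reader("image-001.nii.gz"))  # ITKReader
print(pick_reader("slice.png"))         # PILReader
```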
What should we do if we want to define our own data reader?
The answer is to inherit from data.ImageReader and implement the get_data, read, and verify_suffix methods (see the official documentation for the exact return values). Here I build on ITKReader and customize a normalization step: it computes the maximum and minimum values and normalizes the voxel values to [0, 1]. The code is below; to use it, pass the class directly as the reader argument:
from typing import Optional

import numpy as np
from monai import transforms, data

# Custom reader overriding get_data. Note that the reader only sees a nii file:
# it does not know whether it is an image or a label; that is decided in LoadImaged.
class MyReader(data.ITKReader):
    def __init__(self, channel_dim: Optional[int] = None, series_name: str = "",
                 reverse_indexing: bool = False, series_meta: bool = False, **kwargs):
        super().__init__(channel_dim, series_name, reverse_indexing, series_meta, **kwargs)

    def get_data(self, img):
        image, meta = super().get_data(img)
        image = np.array(image)
        # Only normalize images: ski10 labels have a maximum value of 4, so skip them
        if np.max(image) != 4:
            max_value, min_value = np.max(image), np.min(image)
            # min-max normalize to 0-1
            image = (image - min_value) / (max_value - min_value)
            # print(np.max(image), np.min(image))
        return image, meta

# Use the custom reader class instead of the default one chosen by file suffix
transforms.LoadImaged(keys=["image", "label"], reader=MyReader)
Custom augmentation transforms
Back to the earlier question: I want to normalize each nii file by its own maximum and minimum values. Besides doing it up front while reading the data, is there another way?
Of course! Defining our own dictionary normalization transform Uniformd is even more convenient.
How do we define it?
The official docs do not say, but reading the source code shows the pattern: inherit from the MapTransform and InvertibleTransform classes, then implement __call__ (the forward augmentation call) and inverse (maps augmented data back to the original; rarely useful here).
Note that since every dictionary transform in MONAI wraps a corresponding non-dictionary transform with the same function, the official implementations instantiate one internally and call it.
For simplicity, I just wrote a method that performs the operation directly.
The code is as follows :
from copy import deepcopy

import numpy as np
from monai.transforms import MapTransform, InvertibleTransform

class Uniformd(MapTransform, InvertibleTransform):
    """Min-max normalize the values of the given keys to [0, 1]."""

    def __init__(self, keys, allow_missing_keys: bool = False) -> None:
        super().__init__(keys, allow_missing_keys)

    def __call__(self, data):
        d = dict(data)
        for key in self.key_iterator(d):
            self.push_transform(d, key)
            d[key] = self.uniform(d[key])
        return d

    def uniform(self, data):
        max_value, min_value = np.max(data), np.min(data)
        # min-max normalize to 0-1
        data = (data - min_value) / (max_value - min_value)
        return data

    def inverse(self, data):
        d = deepcopy(dict(data))
        for key in self.key_iterator(d):
            # apply the "inverse" (here normalization is simply reapplied; a true
            # inverse would need the stored min/max values)
            d[key] = self.uniform(d[key])
            # remove the applied transform record
            self.pop_transform(d, key)
        return d
After that, it can be instantiated and called like an official transform: Uniformd(keys=["image"]).
Summary
After two days of digging, many of my doubts are finally resolved. It seems the best learning materials are the source code and the official website; make good use of them!
MONAI website link