In-Depth Analysis of MONAI (I): Data and Transforms
2022-06-29 16:43:00 【You have to step over when you encounter difficulties】
Preface
Not much has been going on recently, so I studied MONAI, an excellent medical deep learning framework built on PyTorch. It includes several commonly used deep learning components: Transforms (data loading and augmentation), Loss functions (common loss functions), Network architectures (common medical image segmentation models), Metrics (evaluation functions used during validation), Optimizers, and Data (Dataset and DataLoader). With these components we can define our own model and train it easily. In this article we look at MONAI's Transforms (data augmentation) component and Data component.
Basic usage
First, recall how training data is usually loaded in PyTorch: you define a Dataset class whose __getitem__ method returns one sample with shape [C, H, W, D], where C is the number of channels and H, W, D are height, width, and depth (the depth dimension exists only for 3D data). A DataLoader then calls the Dataset's __getitem__ multiple times and stacks the samples together, returning data with shape [B, C, H, W, D], where B is the batch size.
MONAI follows the same idea when loading data: first define a Dataset, then wrap it in a DataLoader. When defining the Dataset, we can pass in a series of MONAI transforms, such as data loading, random rotation, cropping, flipping, patch extraction, normalization, standardization, conversion to tensor, and so on. These operations all live in the monai.transforms module. As in PyTorch, they can be chained with the monai.transforms.Compose class, so that data flows through the pipeline automatically and less code is needed.
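The shape convention described above can be illustrated without any framework. Below is a minimal sketch in plain NumPy; ToyDataset and collate are made-up names for illustration, not MONAI or PyTorch API:

```python
import numpy as np

# Sketch of the convention: __getitem__ returns one sample shaped
# [C, H, W, D]; batching stacks B samples into [B, C, H, W, D].

class ToyDataset:
    def __init__(self, n_samples, shape=(1, 4, 4, 3)):  # [C, H, W, D]
        self.samples = [np.zeros(shape, dtype=np.float32) for _ in range(n_samples)]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        return self.samples[idx]  # one sample: [C, H, W, D]

def collate(samples):
    # What a DataLoader does conceptually: stack along a new batch axis
    return np.stack(samples, axis=0)  # [B, C, H, W, D]

ds = ToyDataset(4)
batch = collate([ds[i] for i in range(2)])
print(batch.shape)  # (2, 1, 4, 4, 3), i.e. [B, C, H, W, D]
```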
Let's look at a simple example first, using the non-dictionary transforms called directly on the data. Note that data prepared this way cannot be packed by the DataLoader.
from monai import transforms, data

# Define the list of data files
data_list = ["F:/9.4Data/ski10/image/image-001.nii.gz",
             "F:/9.4Data/ski10/image/image-002.nii.gz",
             "F:/9.4Data/ski10/image/image-003.nii.gz",
             "F:/9.4Data/ski10/image/image-004.nii.gz"
             ]

# Define the transform pipeline
train_transform = transforms.Compose([
    transforms.LoadImage(),   # Load the image; the reader is chosen from the file suffix (.nii defaults to the ITK reader)
    transforms.AddChannel(),  # Add a channel dimension; MONAI transforms expect [C, H, W, ...], channel first
    transforms.ToTensor()     # Convert numpy to tensor; unlike torchvision, this does NOT normalize
])
In practice, many augmentation operations must be applied to the image and the label at the same time, such as cropping and rotation. Unlike PyTorch's torchvision.transforms, every transform class in MONAI has a corresponding dictionary-based version whose name ends with d. A dictionary transform takes a dict as input, e.g. {"image": ..., "label": ...}; the keys argument at construction time specifies whether to operate on image, label, or both. Internally the class implements __call__() to perform the transform (see the source code for details). The output is also a dict whose keys match the input keys. Here is a simple example:
from monai import transforms, data

data_list = [
    {"image": "F:/9.4Data/ski10/image/image-001.nii.gz", "label": "F:/9.4Data/ski10/label/labels-001.nii.gz"},
    {"image": "F:/9.4Data/ski10/image/image-002.nii.gz", "label": "F:/9.4Data/ski10/label/labels-002.nii.gz"},
    {"image": "F:/9.4Data/ski10/image/image-003.nii.gz", "label": "F:/9.4Data/ski10/label/labels-003.nii.gz"},
    {"image": "F:/9.4Data/ski10/image/image-004.nii.gz", "label": "F:/9.4Data/ski10/label/labels-004.nii.gz"}
]

train_transformd = transforms.Compose([
    # Load the images; the reader class is chosen from the file suffix
    transforms.LoadImaged(keys=["image", "label"]),
    # Add a channel dimension
    transforms.AddChanneld(keys=["image", "label"]),
    # Crop to the foreground, computed from the label
    transforms.CropForegroundd(keys=["image", "label"], source_key="label", margin=5),
    # Convert to tensor; no normalization here, just a cast to float
    transforms.ToTensord(keys=["image", "label"])
])
Other transforms
See the official documentation (I will add more when I have time).
Here I want to highlight one special transform: transforms.RandCropByPosNegLabeld.
Function: it randomly crops fixed-size patches from the original image according to a given ratio of positive to negative samples. It is useful when positive and negative samples are unbalanced: the crop ratio rebalances them, and the fixed-size patches can be fed directly into the network for training.
The function itself is straightforward. The key point is that, as mentioned above, every dictionary transform takes one dict as input and returns one dict, each dict representing one training sample, but this class can crop several patches from one sample, so it outputs a list of dicts.
This raises a question: how does the list of dicts produced by this transform enter the next transform, whose input should be a dict, not a list?
Looking at the Compose source code, I found that when Compose passes the output of one transform to the next, it checks the type: if it is a list, it loops over the elements and feeds them in one by one; if it is a dict, it feeds it in directly.
Looping over the list this way effectively multiplies the batch size. Practice later confirmed this: with batch_size=2 in the DataLoader and num_samples=4 in transforms.RandCropByPosNegLabeld, each iteration yields batch_size = 2 * 4 = 8.
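This dispatch logic can be sketched in plain Python. The snippet below is a simplified illustration of the idea, not MONAI's actual source; crop4 and tag are made-up stand-ins for RandCropByPosNegLabeld (with num_samples=4) and a later dictionary transform:

```python
# If a transform returns a list of dicts, each subsequent transform is
# applied to every element, so one sample fans out into many patches.

def apply_transform(transform, data):
    # A list output from the previous transform is looped over element
    # by element; a single dict is passed straight through.
    if isinstance(data, (list, tuple)):
        return [transform(item) for item in data]
    return transform(data)

def compose(transform_list, data):
    for t in transform_list:
        data = apply_transform(t, data)
    return data

# Hypothetical transforms for illustration
crop4 = lambda d: [dict(d, patch=i) for i in range(4)]  # one dict in, four dicts out
tag = lambda d: dict(d, tagged=True)                    # ordinary dict-to-dict transform

out = compose([crop4, tag], {"image": "img.nii.gz"})
print(len(out))          # 4 patches from one sample
print(out[0]["tagged"])  # True: later transforms ran on each patch
```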
import torch
from monai import transforms, data

data_list = [
    {"image": "F:/9.4Data/ski10/image/image-001.nii.gz", "label": "F:/9.4Data/ski10/label/labels-001.nii.gz"},
    {"image": "F:/9.4Data/ski10/image/image-002.nii.gz", "label": "F:/9.4Data/ski10/label/labels-002.nii.gz"},
    {"image": "F:/9.4Data/ski10/image/image-003.nii.gz", "label": "F:/9.4Data/ski10/label/labels-003.nii.gz"},
    {"image": "F:/9.4Data/ski10/image/image-004.nii.gz", "label": "F:/9.4Data/ski10/label/labels-004.nii.gz"}
]

train_transformd = transforms.Compose([
    # Load the images; the reader class is chosen from the file suffix
    transforms.LoadImaged(keys=["image", "label"]),
    # Add a channel dimension
    transforms.AddChanneld(keys=["image", "label"]),
    # Crop to the foreground, computed from the label
    transforms.CropForegroundd(keys=["image", "label"], source_key="label", margin=5),
    # Custom normalization (the Uniformd class is defined later in this article)
    Uniformd(keys=["image"]),
    # Crop patches with a positive/negative ratio. If num_samples is not 1, the
    # cropped samples are returned in a list, which the DataLoader later stacks.
    # E.g. with num_samples=4 and DataLoader batch_size=2, each iteration
    # ultimately returns 4 * 2 = 8 samples, i.e. an effective batch_size of 8.
    # spatial_size raises an error if it exceeds the original data size.
    transforms.RandCropByPosNegLabeld(keys=["image", "label"],
                                      label_key="label",
                                      spatial_size=[256, 256, 80],
                                      pos=1,
                                      neg=1,
                                      num_samples=4,
                                      image_key="image"),
    # Scale to a fixed size by interpolation; size_mode="all" does not keep the aspect ratio
    # transforms.Resized(keys=["image", "label"], spatial_size=[256, 256, 100], size_mode="all", mode=["area", "nearest"]),
    # Rescale intensities, e.g. to [0, 1]. Not used here: each sample in the
    # ski10 dataset has a different value range, hence the custom Uniformd above.
    # transforms.ScaleIntensityRanged(keys=["image", "label"],
    #                                 a_min=0, a_max=5000,
    #                                 b_min=0, b_max=1),
    # Convert to tensor; no normalization here, just a cast to float
    transforms.ToTensord(keys=["image", "label"])
])

train_dataset = data.Dataset(data=data_list, transform=train_transformd)
train_dataLoader = data.DataLoader(dataset=train_dataset, batch_size=2, shuffle=True, num_workers=2)
print('Number of training samples:', len(train_dataset))
for batch_data in train_dataLoader:
    image, label = batch_data["image"], batch_data["label"]
    print('image shape:', image.shape, 'label shape:', label.shape, 'max:', torch.max(image), 'min:', torch.min(image))
Custom data reader
In the example above, we read the nii data with transforms.LoadImaged(keys=["image", "label"]), passing in the image file names. But have we ever thought about how the data is actually read internally?
It turns out that this class has a reader parameter, which is the class that actually reads the data; internally, LoadImaged calls the reader to load the file names it is given. So do we have to define the reader class ourselves? No: MONAI already ships with readers. Reading nii or nii.gz files calls the ITKReader class, while png and jpeg files use PILReader.
What if we want to define our own data reader?
The answer is to inherit from data.ImageReader and implement the get_data, read, and verify_suffix methods (see the official documentation for the exact return values). Here I build on ITKReader and add a custom normalization step: it computes the maximum and minimum values and rescales the voxel values to [0, 1]. The code is below; when used, the class is passed in directly as a parameter:
from typing import Optional
import numpy as np
from monai import transforms, data

# Custom reader: override get_data. Note that the reader only sees a nii file;
# it does not know whether it is an image or a label -- that is decided in LoadImaged.
class MyReader(data.ITKReader):
    def __init__(self, channel_dim: Optional[int] = None, series_name: str = "", reverse_indexing: bool = False, series_meta: bool = False, **kwargs):
        super().__init__(channel_dim, series_name, reverse_indexing, series_meta, **kwargs)

    def get_data(self, img):
        image, meta = super().get_data(img)
        image = np.array(image)
        # Only normalize the image (ski10 labels have a maximum value of 4, so
        # an array whose maximum is 4 is treated as a label and left unchanged)
        if np.max(image) != 4:
            max_value, min_value = np.max(image), np.min(image)
            # Min-max normalize to [0, 1]
            image = (image - min_value) / (max_value - min_value)
            # print(np.max(image), np.min(image))
        return image, meta

# Use the custom reader class: pass it to LoadImaged via the reader parameter
transforms.LoadImaged(keys=["image", "label"], reader=MyReader)
Custom transforms
Let's go back to the earlier question: I want to normalize each nii file by its own maximum and minimum values. Besides doing it up front when reading the data, is there another way?
Of course! Defining our own dictionary transform Uniformd for the normalization is even more convenient.
How do we define it?
The official documentation does not say, but from reading the source code: first inherit from the two classes MapTransform and InvertibleTransform, then implement __call__ (the forward transform call) and inverse (maps the transformed data back to the original; rarely useful here).
Note that each dictionary transform in MONAI corresponds to a non-dictionary transform class with the same functionality, so the official implementations directly instantiate the non-dictionary class internally and call it.
For simplicity, my own version just implements the operation as a method directly.
The code is as follows:
from copy import deepcopy
from typing import Optional

import numpy as np
import torch
from monai.transforms import MapTransform, InvertibleTransform

class Uniformd(MapTransform, InvertibleTransform):
    """Min-max normalization."""

    def __init__(
        self,
        keys,
        dtype: Optional[torch.dtype] = None,
        device: Optional[torch.device] = None,
        wrap_sequence: bool = True,
        allow_missing_keys: bool = False,
    ) -> None:
        super().__init__(keys, allow_missing_keys)

    def __call__(self, data):
        d = dict(data)
        for key in self.key_iterator(d):
            self.push_transform(d, key)
            d[key] = self.uniform(d[key])
        return d

    def uniform(self, data):
        # Min-max normalize to [0, 1]
        max_value, min_value = np.max(data), np.min(data)
        data = (data - min_value) / (max_value - min_value)
        return data

    def inverse(self, data):
        d = deepcopy(dict(data))
        for key in self.key_iterator(d):
            # Apply the "inverse" (here the forward operation is simply reused;
            # a true inverse would need the original min/max values)
            d[key] = self.uniform(d[key])
            # Remove the applied transform record
            self.pop_transform(d, key)
        return d
After that, it can be instantiated just like the official dictionary transforms: Uniformd(keys=["image"]).
Summary
After two days of work, many of my doubts are finally resolved. It seems the best learning materials really are the source code and the official website; make good use of them!
MONAI website link