[keras] data of 3D u-net source code analysis py
2022-06-13 02:08:00 【liyihao76】
[Keras] 3D U-Net source code analysis: data.py
Original code: zishang33/3DUnetCNN
This article walks through the functions in data.py that were mentioned in the earlier train.py analysis.
The file is quite complicated and loosely organized, so please read with my table of contents and the function calls side by side, jumping to the corresponding subfunction as it is called; otherwise it is easy to forget where you are QAQ
write_data_to_file
write_data_to_file receives a set of training images and writes them into an HDF5 file.
def write_data_to_file(training_data_files, out_file, image_shape, truth_dtype=np.uint8, subject_ids=None,
                       normalize=True, crop=True):
    n_samples = len(training_data_files)
    n_channels = len(training_data_files[0]) - 1
    try:
        hdf5_file, data_storage, truth_storage, affine_storage = create_data_file(out_file,
                                                                                  n_channels=n_channels,
                                                                                  n_samples=n_samples,
                                                                                  image_shape=image_shape)
    except Exception as e:
        # If something goes wrong, delete the incomplete data file
        os.remove(out_file)
        raise e
    write_image_data_to_file(training_data_files, data_storage, truth_storage, image_shape,
                             truth_dtype=truth_dtype, n_channels=n_channels, affine_storage=affine_storage, crop=crop)
    if subject_ids:
        hdf5_file.create_array(hdf5_file.root, 'subject_ids', obj=subject_ids)
    if normalize:
        normalize_data_storage(data_storage)
    hdf5_file.close()
    return out_file
Parameters
- training_data_files: a list of tuples containing the training data files. Within each tuple, the modalities should be listed in the same order, and the last item of each tuple must be the label image (truth). You can see how this parameter is built in my earlier train.py analysis. For example:
[('sub1-T1.nii.gz', 'sub1-T2.nii.gz', 'sub1-truth.nii.gz'), ('sub2-T1.nii.gz', 'sub2-T2.nii.gz', 'sub2-truth.nii.gz')]
- out_file: where the HDF5 file is written
- image_shape: the shape at which images are stored in the HDF5 file
- truth_dtype: defaults to 8-bit unsigned integer
- Return value: the location of the HDF5 file that the image data was written to
The first code block
n_samples = len(training_data_files)
n_channels = len(training_data_files[0]) - 1
As we saw in the train.py analysis, len(training_data_files) is the number of image folders under the preprocessed directory, i.e. the number of samples. Each folder is represented as a tuple, each tuple holds 5 nii files in the form ("t1", "t1ce", "flair", "t2" + "truth"), and the images in every tuple are arranged in the same order. Therefore n_channels = len(training_data_files[0]) - 1 is the number of training modalities, i.e. the channel count: 4. We can take a look at the output:
training_files = fetch_training_data_files()
print(type(training_files))
print(len(training_files))
print(len(training_files[0]) - 1)
# Output
<class 'list'>
30
4
The second code block
try:
    hdf5_file, data_storage, truth_storage, affine_storage = create_data_file(out_file,
                                                                              n_channels=n_channels,
                                                                              n_samples=n_samples,
                                                                              image_shape=image_shape)
except Exception as e:
    # If something goes wrong, delete the incomplete data file
    os.remove(out_file)
    raise e
create_data_file is explained in detail below. It produces four outputs: the hdf5_file handle and three extensible compressed arrays.
The try...except structure here captures the exception (Exception) information, which helps to quickly locate the statement that failed.
Moreover, if something goes wrong, os.remove(out_file) deletes the incomplete data file.
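This cleanup-on-failure pattern can be isolated into a minimal sketch (build_file and the simulated error are hypothetical, for illustration only):

```python
import os

def build_file(out_file):
    # Hypothetical writer that fails partway through, like a failing create_data_file
    try:
        with open(out_file, "w") as f:
            f.write("partial data")
            raise RuntimeError("simulated failure mid-write")
    except Exception:
        # Delete the incomplete file so no broken artifact is left behind
        if os.path.exists(out_file):
            os.remove(out_file)
        raise

try:
    build_file("demo.tmp")
except RuntimeError:
    pass

print(os.path.exists("demo.tmp"))  # False
```

Without the cleanup, a later run could mistake the half-written file for valid data; re-raising preserves the original traceback.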
The third code block
write_image_data_to_file(training_data_files, data_storage, truth_storage, image_shape,
                         truth_dtype=truth_dtype, n_channels=n_channels, affine_storage=affine_storage, crop=crop)
The function write_image_data_to_file() writes the image data into the compressed, extensible arrays created earlier.
Although this looks like a single line, it involves many subfunctions. I explain write_image_data_to_file and its related subfunctions in detail further down, so please jump to that section to follow along.
The fourth code block
if subject_ids:
    hdf5_file.create_array(hdf5_file.root, 'subject_ids', obj=subject_ids)
if normalize:
    normalize_data_storage(data_storage)
subject_ids was not used in my runs, so we will not study it here for now.
As for normalize, the relevant functions are defined as:
def normalize_data(data, mean, std):
    # data: [4, 144, 144, 144]
    data -= mean[:, np.newaxis, np.newaxis, np.newaxis]
    data /= std[:, np.newaxis, np.newaxis, np.newaxis]
    return data

def normalize_data_storage(data_storage):
    means = list()
    stds = list()
    # data_storage: [n_example, 4, 144, 144, 144]
    for index in range(data_storage.shape[0]):
        # [4, 144, 144, 144]
        data = data_storage[index]
        # Compute the mean and standard deviation of each modality separately
        means.append(data.mean(axis=(1, 2, 3)))
        stds.append(data.std(axis=(1, 2, 3)))
    # Average each modality's statistics over all samples: [n_example, 4] ==> [4]
    mean = np.asarray(means).mean(axis=0)
    std = np.asarray(stds).mean(axis=0)
    for index in range(data_storage.shape[0]):
        # Normalize each sample with the dataset-wide mean and standard deviation
        data_storage[index] = normalize_data(data_storage[index], mean, std)
    return data_storage
Together, these functions normalize all modalities of the training-set images, channel by channel.
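The channel-wise broadcasting can be checked with a toy array (shapes shrunk from [4, 144, 144, 144] to an assumed [4, 2, 2, 2] for readability):

```python
import numpy as np

# Toy volume: 4 modality channels, each of shape (2, 2, 2)
data = np.arange(32, dtype=np.float64).reshape(4, 2, 2, 2)

mean = data.mean(axis=(1, 2, 3))  # one value per channel, shape (4,)
std = data.std(axis=(1, 2, 3))    # one value per channel, shape (4,)

# mean[:, np.newaxis, np.newaxis, np.newaxis] has shape (4, 1, 1, 1),
# so each channel is shifted and scaled by its own statistics
normalized = (data - mean[:, np.newaxis, np.newaxis, np.newaxis]) \
    / std[:, np.newaxis, np.newaxis, np.newaxis]

print(normalized.mean(axis=(1, 2, 3)))  # all ~0
print(normalized.std(axis=(1, 2, 3)))   # all ~1
```

Note that normalize_data_storage averages the per-sample means and standard deviations rather than computing pooled statistics over all voxels; with a single sample, as here, the two coincide.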
The fifth code block
hdf5_file.close()
return out_file
Phew, this function is finally over. Remember to close the file before you leave~
create_data_file
def create_data_file(out_file, n_channels, n_samples, image_shape):
    hdf5_file = tables.open_file(out_file, mode='w')
    filters = tables.Filters(complevel=5, complib='blosc')
    data_shape = tuple([0, n_channels] + list(image_shape))
    truth_shape = tuple([0, 1] + list(image_shape))
    data_storage = hdf5_file.create_earray(hdf5_file.root, 'data', tables.Float32Atom(), shape=data_shape,
                                           filters=filters, expectedrows=n_samples)
    truth_storage = hdf5_file.create_earray(hdf5_file.root, 'truth', tables.UInt8Atom(), shape=truth_shape,
                                            filters=filters, expectedrows=n_samples)
    affine_storage = hdf5_file.create_earray(hdf5_file.root, 'affine', tables.Float32Atom(), shape=(0, 4, 4),
                                             filters=filters, expectedrows=n_samples)
    return hdf5_file, data_storage, truth_storage, affine_storage
This function involves a lot of PyTables knowledge; you can refer to the PyTables learning notes.
hdf5_file = tables.open_file(out_file, mode='w')
This creates a new HDF5 file named out_file, opened in write mode.
filters = tables.Filters(complevel=5, complib='blosc')  # Declare the compression type and level
data_shape = tuple([0, n_channels] + list(image_shape))
truth_shape = tuple([0, 1] + list(image_shape))
data_storage = hdf5_file.create_earray(hdf5_file.root, 'data', tables.Float32Atom(), shape=data_shape,
                                       filters=filters, expectedrows=n_samples)
truth_storage = hdf5_file.create_earray(hdf5_file.root, 'truth', tables.UInt8Atom(), shape=truth_shape,
                                        filters=filters, expectedrows=n_samples)
affine_storage = hdf5_file.create_earray(hdf5_file.root, 'affine', tables.Float32Atom(), shape=(0, 4, 4),
                                         filters=filters, expectedrows=n_samples)
Compressed arrays (Compression Array)
HDF5 files can also be compressed; the available compression libraries are blosc, zlib, and lzo. zlib and lzo require additional packages, while blosc ships with PyTables. We define a Filters object to declare the compression method and compression level, and use create_carray to create a compressed array.
Compressed extensible arrays (Compression & Enlargeable Array)
A compressed array's size cannot be changed after initialization, but in practice we often know only the per-item dimensions, not how many items we will store. In that case we need the array to be extensible. HDF5 provides such an interface: one dimension can be made extensible. As with CArray, we must define a Filters object to declare the compression type and level. Most importantly, we set the extensible dimension's size to 0 in the shape: writing 0 there marks that dimension as extensible.
So let's look at the dimensions of our data:
hdf5_file = tables.open_file(config["data_file"], mode='w')
filters = tables.Filters(complevel=5, complib='blosc')
data_shape = tuple([0, n_channels] + list(config["image_shape"]))
truth_shape = tuple([0, 1] + list(config["image_shape"]))
print(data_shape)
print(truth_shape)
# Output
(0, 4, 144, 144, 144)
(0, 1, 144, 144, 144)
As you can see, to build a compressed extensible array, the shape has been adjusted: a leading 0 for the extensible sample dimension, followed by 4 channels for the four training-set modalities.
data_storage = hdf5_file.create_earray(hdf5_file.root, 'data', tables.Float32Atom(), shape=data_shape,
                                       filters=filters, expectedrows=n_samples)
truth_storage = hdf5_file.create_earray(hdf5_file.root, 'truth', tables.UInt8Atom(), shape=truth_shape,
                                        filters=filters, expectedrows=n_samples)
affine_storage = hdf5_file.create_earray(hdf5_file.root, 'affine', tables.Float32Atom(), shape=(0, 4, 4),
                                         filters=filters, expectedrows=n_samples)
return hdf5_file, data_storage, truth_storage, affine_storage
create_earray is the function that creates an extensible (enlargeable) array.
The function returns four outputs: the hdf5_file handle and three extensible compressed arrays.
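The append-along-dimension-0 semantics can be sketched with plain NumPy (an analogy only: a real EArray appends on disk without holding everything in memory, whereas np.concatenate copies the whole array each time; the tiny image_shape is an assumption for readability):

```python
import numpy as np

# Mirror data_shape = (0, n_channels, *image_shape), with image_shape = (2, 2, 2)
storage = np.empty((0, 4, 2, 2, 2), dtype=np.float32)

for _ in range(3):
    sample = np.zeros((4, 2, 2, 2), dtype=np.float32)
    # Like EArray.append: the appended chunk carries a leading axis of length 1,
    # and dimension 0 of the storage grows by one
    storage = np.concatenate([storage, sample[np.newaxis]], axis=0)

print(storage.shape)  # (3, 4, 2, 2, 2)
```

This is exactly why the 0 in data_shape matters: every append consumes one unit of the extensible dimension, and all other dimensions must match exactly.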
write_image_data_to_file
def write_image_data_to_file(image_files, data_storage, truth_storage, image_shape, n_channels, affine_storage,
                             truth_dtype=np.uint8, crop=True):
    for set_of_files in image_files:
        images = reslice_image_set(set_of_files, image_shape, label_indices=len(set_of_files) - 1, crop=crop)
        subject_data = [image.get_data() for image in images]
        add_data_to_storage(data_storage, truth_storage, affine_storage, subject_data, images[0].affine, n_channels,
                            truth_dtype)
    return data_storage, truth_storage
The function write_image_data_to_file() writes the image data into the compressed, extensible arrays created earlier.
for set_of_files in image_files:
Before this loop, fetch_training_data_files() was used to gather the paths of all subfolders, so each iteration receives one tuple of image paths for the different modalities, e.g. ('sub1-T1.nii.gz', 'sub1-T2.nii.gz', 'sub1-flair.nii.gz', 'sub1-t1ce.nii.gz', 'sub1-truth.nii.gz').
images = reslice_image_set(set_of_files, image_shape, label_indices=len(set_of_files) - 1, crop=crop)
This crops the 4 modality images + truth image according to foreground and background.
reslice_image_set is defined as follows:
def reslice_image_set(in_files, image_shape, out_files=None, label_indices=None, crop=False):
    # in_files: ('sub1-T1.nii.gz', 'sub1-T2.nii.gz', 'sub1-flair.nii.gz', 'sub1-t1ce.nii.gz', 'sub1-truth.nii.gz')
    # label_indices: index of the label image (= number of modalities, here 4)
    # Crop the image
    if crop:
        # Returns the range to crop in each dimension: [slice(), slice(), slice()]
        crop_slices = get_cropping_parameters([in_files])
    else:
        crop_slices = None
    # Crop and resize each image in in_files, returning a list of images
    images = read_image_files(in_files, image_shape=image_shape, crop=crop_slices, label_indices=label_indices)
    if out_files:
        for image, out_file in zip(images, out_files):
            image.to_filename(out_file)
        return [os.path.abspath(out_file) for out_file in out_files]
    else:
        return images
subject_data = [image.get_data() for image in images]
This obtains the arrays of the 4 modality images + truth image.
add_data_to_storage(data_storage, truth_storage, affine_storage, subject_data, images[0].affine, n_channels,
                    truth_dtype)
This adds one subject's subject_data; when writing, subject_data is expanded to match the dimensions defined in create_data_file, completing the append into the extensible arrays.
Please jump down to the add_data_to_storage section below; it is explained in more detail there.
return data_storage, truth_storage
After all images have been read and written, the function returns the extensible arrays for the training set and the labels.
add_data_to_storage
def add_data_to_storage(data_storage, truth_storage, affine_storage, subject_data, affine, n_channels, truth_dtype):
    # Add one subject's subject_data, expanding it to the dimensions defined in create_data_file
    # np.asarray: ==> [4, 144, 144, 144]; np.newaxis expands to [1, 4, 144, 144, 144]
    data_storage.append(np.asarray(subject_data[:n_channels])[np.newaxis])
    # np.asarray: ==> [144, 144, 144]; two np.newaxis expand to [1, 1, 144, 144, 144]
    truth_storage.append(np.asarray(subject_data[n_channels], dtype=truth_dtype)[np.newaxis][np.newaxis])
    # np.asarray: ==> [4, 4]; np.newaxis expands to [1, 4, 4]
    affine_storage.append(np.asarray(affine)[np.newaxis])
This function can also be a little dizzying. What it actually does is split the subject_data obtained in the previous step into training data and label data and append them to the extensible arrays data_storage and truth_storage that we created. The problem is that the image data in subject_data does not have the same shape as the extensible arrays we defined earlier, so it cannot be stacked directly with append. What we need to do is reshape the data so that append can write it into the extensible arrays.
Here np.newaxis inserts a new dimension. It looks messy, so let's study an example:
array=np.arange(40)
print(array)
print(array[:2])
array=array.reshape(5,2,2,2)
print(array)
# Output
[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39]
[0 1]
[[[[ 0 1]
[ 2 3]]
[[ 4 5]
[ 6 7]]]
[[[ 8 9]
[10 11]]
[[12 13]
[14 15]]]
[[[16 17]
[18 19]]
[[20 21]
[22 23]]]
[[[24 25]
[26 27]]
[[28 29]
[30 31]]]
[[[32 33]
[34 35]]
[[36 37]
[38 39]]]]
Here I chose 40 numbers and reshaped them into the form (5, 2, 2, 2). This mirrors our subject_data: the first dimension (shape[0]) = 5, i.e. the four training-set modalities plus one truth.
data_for_train = array[:4]
print(data_for_train)
print(data_for_train.shape)
# Output
[[[[ 0 1]
[ 2 3]]
[[ 4 5]
[ 6 7]]]
[[[ 8 9]
[10 11]]
[[12 13]
[14 15]]]
[[[16 17]
[18 19]]
[[20 21]
[22 23]]]
[[[24 25]
[26 27]]
[[28 29]
[30 31]]]]
(4, 2, 2, 2)
array[:4] splits off our training set; shape = (4, 2, 2, 2) means 4 modalities of (2, 2, 2) image data.
data_for_truth = array[4]
print(data_for_truth)
print(data_for_truth.shape)
# Output
[[[32 33]
[34 35]]
[[36 37]
[38 39]]]
(2, 2, 2)
array[4] is our last label image; since it is a single image, shape = (2, 2, 2).
data_for_train=data_for_train[np.newaxis]
print(data_for_train)
print(data_for_train.shape)
# Output
[[[[[ 0 1]
[ 2 3]]
[[ 4 5]
[ 6 7]]]
[[[ 8 9]
[10 11]]
[[12 13]
[14 15]]]
[[[16 17]
[18 19]]
[[20 21]
[22 23]]]
[[[24 25]
[26 27]]
[[28 29]
[30 31]]]]]
(1, 4, 2, 2, 2)
Indexing with [np.newaxis] adds one dimension, matching the extensible array's form (1, 4, 2, 2, 2).
data_for_truth=data_for_truth[np.newaxis][np.newaxis]
print(data_for_truth)
print(data_for_truth.shape)
# Output
[[[[[32 33]
[34 35]]
[[36 37]
[38 39]]]]]
(1, 1, 2, 2, 2)
Applying np.newaxis twice adds two dimensions, matching the extensible array's form (1, 1, 2, 2, 2).
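For reference, indexing with np.newaxis is equivalent to np.expand_dims and to reshaping with a leading 1; all three produce the same shape:

```python
import numpy as np

a = np.zeros((2, 2, 2))
b1 = a[np.newaxis]              # indexing with np.newaxis
b2 = np.expand_dims(a, axis=0)  # the explicit named equivalent
b3 = a.reshape(1, *a.shape)     # reshape with a leading 1

print(b1.shape, b2.shape, b3.shape)  # (1, 2, 2, 2) three times
```

The source sticks with the [np.newaxis] indexing style, which chains naturally for the double expansion of the truth image.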
This example should make the principle of the function clear. Now let's look at the function's actual output:
def add_data_to_storage(data_storage, truth_storage, affine_storage, subject_data, affine, n_channels, truth_dtype):
    print('data_storage change:')
    print((np.asarray(subject_data[:n_channels])).shape)
    data_storage.append(np.asarray(subject_data[:n_channels])[np.newaxis])
    print((np.asarray(subject_data[:n_channels])[np.newaxis]).shape)
    print('truth_storage change:')
    print((np.asarray(subject_data[n_channels])).shape)
    truth_storage.append(np.asarray(subject_data[n_channels], dtype=truth_dtype)[np.newaxis][np.newaxis])
    print((np.asarray(subject_data[n_channels], dtype=truth_dtype)[np.newaxis]).shape)
    print((np.asarray(subject_data[n_channels], dtype=truth_dtype)[np.newaxis][np.newaxis]).shape)
    affine_storage.append(np.asarray(affine)[np.newaxis])

def write_image_data_to_file(image_files, data_storage, truth_storage, image_shape, n_channels, affine_storage,
                             truth_dtype=np.uint8, crop=True):
    for set_of_files in image_files:
        images = reslice_image_set(set_of_files, image_shape, label_indices=len(set_of_files) - 1, crop=crop)
        subject_data = [image.get_data() for image in images]
        add_data_to_storage(data_storage, truth_storage, affine_storage, subject_data, images[0].affine, n_channels,
                            truth_dtype)
    return data_storage, truth_storage

write_image_data_to_file(training_files, data_storage, truth_storage, image_shape=(144, 144, 144),
                         truth_dtype=np.uint8, n_channels=n_channels, affine_storage=affine_storage, crop=True)
The output is:
Reading: data/preprocessed/Pre-operative_TCGA_GBM_NIfTI_and_Segmentations/TCGA-02-0006/t1ce.nii.gz
Reading: data/preprocessed/Pre-operative_TCGA_GBM_NIfTI_and_Segmentations/TCGA-02-0006/flair.nii.gz
Reading: data/preprocessed/Pre-operative_TCGA_GBM_NIfTI_and_Segmentations/TCGA-02-0006/t2.nii.gz
Reading: data/preprocessed/Pre-operative_TCGA_GBM_NIfTI_and_Segmentations/TCGA-02-0006/truth.nii.gz
data_storage change :
(4, 144, 144, 144)
(1, 4, 144, 144, 144)
truth_storage change :
(144, 144, 144)
(1, 144, 144, 144)
(1, 1, 144, 144, 144)
Reading: data/preprocessed/Pre-operative_TCGA_GBM_NIfTI_and_Segmentations/TCGA-02-0033/t1.nii.gz
Reading: data/preprocessed/Pre-operative_TCGA_GBM_NIfTI_and_Segmentations/TCGA-02-0033/t1ce.nii.gz
Reading: data/preprocessed/Pre-operative_TCGA_GBM_NIfTI_and_Segmentations/TCGA-02-0033/flair.nii.gz
Reading: data/preprocessed/Pre-operative_TCGA_GBM_NIfTI_and_Segmentations/TCGA-02-0033/t2.nii.gz
Reading: data/preprocessed/Pre-operative_TCGA_GBM_NIfTI_and_Segmentations/TCGA-02-0033/truth.nii.gz
Reading: data/preprocessed/Pre-operative_TCGA_GBM_NIfTI_and_Segmentations/TCGA-02-0033/t1.nii.gz
Reading: data/preprocessed/Pre-operative_TCGA_GBM_NIfTI_and_Segmentations/TCGA-02-0033/t1ce.nii.gz
Reading: data/preprocessed/Pre-operative_TCGA_GBM_NIfTI_and_Segmentations/TCGA-02-0033/flair.nii.gz
Reading: data/preprocessed/Pre-operative_TCGA_GBM_NIfTI_and_Segmentations/TCGA-02-0033/t2.nii.gz
Reading: data/preprocessed/Pre-operative_TCGA_GBM_NIfTI_and_Segmentations/TCGA-02-0033/truth.nii.gz
data_storage change :
(4, 144, 144, 144)
(1, 4, 144, 144, 144)
truth_storage change :
(144, 144, 144)
(1, 144, 144, 144)
(1, 1, 144, 144, 144)
You can see it matches our example. Now let's look at the contents of our extensible arrays:
print(np.asarray(data_storage).shape)
print(np.asarray(truth_storage).shape)
# Output
(5, 4, 144, 144, 144)
(5, 1, 144, 144, 144)
As you can see, after 5 iterations the data has been stacked 5 times along the first dimension, which is how writing into the extensible arrays is realized.
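Reading the finished file back follows the same PyTables API. Below is a minimal self-contained round trip (requires PyTables with its bundled blosc filter; the temp-directory path and tiny shapes are assumptions for the demo, not part of the original code):

```python
import os
import tempfile

import numpy as np
import tables

path = os.path.join(tempfile.mkdtemp(), "demo.h5")

# Write: an EArray with extensible dimension 0, as in create_data_file
h5 = tables.open_file(path, mode="w")
filters = tables.Filters(complevel=5, complib="blosc")
data = h5.create_earray(h5.root, "data", tables.Float32Atom(),
                        shape=(0, 4, 2, 2, 2), filters=filters, expectedrows=5)
for _ in range(5):
    data.append(np.zeros((1, 4, 2, 2, 2), dtype=np.float32))
h5.close()

# Read back: the five appended samples are stacked along dimension 0
h5 = tables.open_file(path, mode="r")
stored_shape = h5.root.data.shape
print(stored_shape)  # (5, 4, 2, 2, 2)
h5.close()
```

This is the same access pattern the training code relies on later: open the HDF5 file produced by write_data_to_file and index h5.root.data / h5.root.truth by sample.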