Image semantic segmentation in practice: training your own dataset with TensorFlow DeepLabv3+

2022-07-28 16:15:00 【Xiao Wang who sells newspapers】

Contents

Preface
One. Environment configuration
Two. Training process
Three. Testing
Summary

Preface

This article records the process of training DeepLabv3+ on my own dataset.

One. Environment configuration

My environment:

Ubuntu 16.04
Anaconda3
Python 3.5
tensorflow-gpu 1.10.0

Instructions for installing Anaconda3 can be found online.

Two. Training process

1. Clone the code

First, clone the official tensorflow/models repository.

git clone https://github.com/tensorflow/models.git

If the download from GitHub is slow, you can download the repository from a mirror on Gitee (码云) instead.

2. Dataset preparation

Convert to a VOC-format dataset

Annotate the data and produce the mask images. Using the JSON files that labelme generates, convert the data to VOC format, which makes the later conversion to the grayscale format DeepLab needs for training easier.

Clone the labelme project locally:

git clone https://github.com/wkentaro/labelme.git

Go to the directory labelme/examples/semantic_segmentation, which contains a complete conversion example. Following the example, put your own data (the original images and their JSON annotations) into the data_annotated folder and write your own labels.txt. labelme2voc.py can be copied as-is, without changes.
Open a terminal in that directory and run:

python labelme2voc.py data_annotated data_dataset_voc --labels labels.txt

This generates the data_dataset_voc folder. Among the folders it contains:
JPEGImages holds the original images.
SegmentationClassPNG holds the generated label images.
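For reference, here is a minimal sketch of how the labels.txt mentioned above can be produced. As far as I know, labelme's semantic_segmentation example expects the file to start with the two special entries __ignore__ and _background_, followed by one class name per line. The class names class1 and class2 and the helper names are placeholders of mine, not part of labelme.

```python
def build_labels(class_names):
    """Return the lines of a labels.txt for labelme2voc.py.

    Assumption: the script expects "__ignore__" first and "_background_"
    second, as in labelme's semantic_segmentation example.
    """
    return ["__ignore__", "_background_"] + list(class_names)


def write_labels_file(path, class_names):
    """Write labels.txt with one label per line (hypothetical helper)."""
    with open(path, "w") as f:
        f.write("\n".join(build_labels(class_names)) + "\n")


if __name__ == "__main__":
    # "class1" and "class2" are placeholder class names.
    print(build_labels(["class1", "class2"]))
```

Calling write_labels_file("labels.txt", ["class1", "class2"]) in the example directory would create the file that the --labels option points to.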

Convert to grayscale

DeepLab uses single-channel annotation images, i.e. grayscale, in which the pixel values of the classes must be 0, 1, 2, 3, …, n (n+1 classes in total: 1 background class and n target classes). Run remove_gt_colormap.py to convert the masks to the required format, changing the paths as needed.

# from models/research/deeplab/datasets
python remove_gt_colormap.py \
  --original_gt_folder="/media/dell/2T/test/testlabel" \
  --output_dir="/media/dell/2T/test/mask"

original_gt_folder: folder containing the original (color) label images.
output_dir: folder where the converted label images are written.
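After the conversion it can be worth checking that a mask really contains only class ids. The following is a rough sketch of such a check, assuming the mask has already been loaded as a NumPy array (for example with np.array(Image.open(path)) from PIL); unexpected_labels is a hypothetical helper, and 255 is assumed to be the ignore label.

```python
import numpy as np


def unexpected_labels(mask, num_classes, ignore_label=255):
    """Return the sorted pixel values in `mask` that are neither a class id
    (0 .. num_classes - 1) nor the ignore label."""
    values = np.unique(np.asarray(mask))
    allowed = set(range(num_classes)) | {ignore_label}
    return [int(v) for v in values if int(v) not in allowed]


if __name__ == "__main__":
    good = np.array([[0, 1], [2, 255]], dtype=np.uint8)
    bad = np.array([[0, 7], [38, 1]], dtype=np.uint8)
    print(unexpected_labels(good, num_classes=3))  # no stray values
    print(unexpected_labels(bad, num_classes=3))   # stray values 7 and 38
```

An empty list means the mask is in the expected format; any other result usually means the color map was not removed or the class count is wrong.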

Convert to TFRecord

Before creating the TFRecord files, split the dataset into training, validation, and test sets.

The dataset directory structure is as follows:

data
- image
- mask
- index
  - train.txt
  - trainval.txt
  - val.txt
- tfrecord

image: all input images, covering the training, validation, and test sets.
mask: all label (grayscale) images, in one-to-one correspondence with the input images (i.e. image) and with the same file names.
tfrecord: the data in TFRecord format.
train.txt: file names of all training images (without extension).
trainval.txt: file names of the train and val images combined (without extension), following the VOC convention.
val.txt: file names of all held-out images (without extension); this article uses them as the test set.
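The index files can be written by hand, but a small script is less error-prone. The sketch below is one possible way to do it, not the method used for this article: make_splits and write_index are hypothetical helpers, and trainval is assumed to be the union of train and val.

```python
import os
import random


def make_splits(names, num_val, seed=0):
    """Split file names (without extension) into train / val / trainval."""
    names = sorted(names)
    shuffled = names[:]
    random.Random(seed).shuffle(shuffled)  # fixed seed gives a repeatable split
    val = sorted(shuffled[:num_val])
    train = sorted(shuffled[num_val:])
    return {"train": train, "val": val, "trainval": names}


def write_index(splits, index_dir):
    """Write one <split>.txt per split, one file name per line."""
    os.makedirs(index_dir, exist_ok=True)
    for split, split_names in splits.items():
        with open(os.path.join(index_dir, split + ".txt"), "w") as f:
            f.write("\n".join(split_names) + "\n")
```

For example, the names could be collected with [os.path.splitext(n)[0] for n in os.listdir("data/image")] and passed to make_splits with num_val=18.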

Run build_voc2012_data.py, which reads the txt files under index, to convert the data to TFRecord format. Execute the following in a terminal:

# from /../models/research/deeplab/datasets/
python ./build_voc2012_data.py \
  --image_folder="/home/dell/models/research/deeplab/data/image" \
  --semantic_segmentation_folder="/home/dell/models/research/deeplab/data/mask" \
  --list_folder="/home/dell/models/research/deeplab/data/index" \
  --image_format="png" \
  --output_dir="/home/dell/models/research/deeplab/data/tfrecord"

image_folder: directory of the dataset's image files.
semantic_segmentation_folder: directory of the dataset's mask files.
list_folder: directory of the index files that split the dataset into training, validation, and other sets.
image_format: format of the input images; my dataset is in png format.
output_dir: directory where the generated TFRecord files are stored.

3. Code preparation before training

Modify deeplab/datasets/data_generator.py
Add a description of your own dataset around line 100:

_MYDATA = DatasetDescriptor(
    splits_to_sizes={
        'train': 150,  # number of training samples
        'trainval': 168,  # number of train + val samples
        'val': 18,  # number of val samples (used as the test set here)
    },
    num_classes=3,  # two labeled classes plus the background class, so 3 (labeled classes + 1)
    ignore_label=255,  # pixels with this label value are ignored during training; the background itself is label 0
)

Then register the dataset around line 110:

_DATASETS_INFORMATION = {
    'cityscapes': _CITYSCAPES_INFORMATION,
    'pascal_voc_seg': _PASCAL_VOC_SEG_INFORMATION,
    'ade20k': _ADE20K_INFORMATION,
    'mydata': _MYDATA,  # the descriptor created in the previous step
}
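The numbers in splits_to_sizes have to match the index files, so it can help to count them rather than type them. This is a small sketch under the assumption that each index file holds one file name per line; count_names and count_split_sizes are hypothetical helpers.

```python
import os


def count_names(lines):
    """Count the non-empty lines of an index file."""
    return sum(1 for line in lines if line.strip())


def count_split_sizes(index_dir, splits=("train", "trainval", "val")):
    """Return {split: number of file names} for the given index directory."""
    sizes = {}
    for split in splits:
        with open(os.path.join(index_dir, split + ".txt")) as f:
            sizes[split] = count_names(f)
    return sizes
```

Running count_split_sizes on the index directory should give the same three numbers that go into the descriptor above.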

Modify train_utils.py
In train_utils.py, change the exclude_list setting around line 209 so that the logits layer is not loaded from the pre-trained weights:

# Variables that will not be restored.
exclude_list = ['global_step', 'logits']
if not initialize_last_layer:
    exclude_list.extend(last_layers)

4. Main training parameters

train.py and common.py contain all the parameters needed to train the segmentation network.

model_variant: the DeepLab model variant; the available values are listed in core/feature_extractor.py.
- When using mobilenet_v2, set atrous_rates=decoder_output_stride=None.
- When using xception_65 or resnet_v1, set atrous_rates=[6,12,18] (output stride 16) and decoder_output_stride=4.
label_weights: weights for the labels. When the dataset has class imbalance, this variable specifies a weight for each class label; for example, label_weights=[0.1, 0.5] means label 0 has weight 0.1 and label 1 has weight 0.5. If the value is None, all labels have the same weight 1.0.
Tip: I did not change this variable in this training run; I will try setting it next time.
train_logdir: path where checkpoints and logs are stored.
log_steps: interval, in steps, at which log information is output.
save_interval_secs: how often, in seconds, the model file is saved to disk.
optimizer: the optimizer; available values are ['momentum', 'adam']. The default is momentum.
learning_policy: learning rate policy; available values are ['poly', 'step'].
base_learning_rate: base learning rate; the default is 0.0001.
training_number_of_steps: number of training iterations.
train_batch_size: number of images per training batch.
train_crop_size: image size used during training; the default is '513,513'.
tf_initial_checkpoint: the pre-trained model.
initialize_last_layer: whether to initialize the last layer from the pre-trained checkpoint.
last_layers_contain_logits_only: whether only the logits layer is treated as the last layer.
fine_tune_batch_norm: whether to fine-tune the batch norm parameters.
atrous_rates: the default is [6, 12, 18].
output_stride: ratio of input to output spatial resolution; the default is 16.
- For xception_65, if output_stride=8, use atrous_rates=[12, 24, 36].
- If output_stride=16, use atrous_rates=[6, 12, 18].
- For mobilenet_v2, use None.
- Note: different atrous_rates and output_stride values can be used in the training and evaluation stages.
dataset: the segmentation dataset to use; this must match the name under which the dataset was registered.
train_split: which split to train on; the available values are those given when the dataset was registered, such as train or trainval.
dataset_dir: path where the dataset is stored.
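As an illustration of how values for label_weights might be chosen, the sketch below derives weights from inverse class frequency. This is only one common heuristic and an assumption on my part; neither this article nor DeepLab prescribes it, and inverse_frequency_weights is a hypothetical helper.

```python
import numpy as np


def inverse_frequency_weights(masks, num_classes, ignore_label=255):
    """Weight each class by 1 / pixel frequency, scaled so the largest weight is 1.0."""
    counts = np.zeros(num_classes, dtype=np.int64)
    for mask in masks:
        labels = np.asarray(mask).astype(np.int64).ravel()
        labels = labels[labels != ignore_label]  # ignored pixels do not count
        counts += np.bincount(labels, minlength=num_classes)[:num_classes]
    freq = counts / counts.sum()
    # Classes that never occur get weight 0 instead of an infinite weight.
    weights = np.where(counts > 0, 1.0 / np.maximum(freq, 1e-12), 0.0)
    return (weights / weights.max()).tolist()
```

With a single mask [[0, 0, 0, 1]] and two classes, the frequent class 0 gets weight 1/3 and the rare class 1 gets weight 1.0.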
Regarding the training parameters, the following points need attention:

1. Whether to load the pre-trained network weights
If you want to fine-tune the network on another dataset, pay attention to the following parameters:

To use all the weights of the pre-trained network, set initialize_last_layer=True.
To use only the network backbone, set initialize_last_layer=False and last_layers_contain_logits_only=False.
To use all the pre-trained weights except the logits (because your own dataset has a different number of classes; we already configured the code above not to load the logits), set initialize_last_layer=False and last_layers_contain_logits_only=True.
Because my dataset has a different number of classes from the default, I used these parameter values:

--initialize_last_layer=false
--last_layers_contain_logits_only=true
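The three combinations above can be summarized in a few lines of code. This is a simplified sketch of my understanding of the unmodified logic, not the actual DeepLab source: "global_step" and "logits" appear in the exclude_list shown earlier, while other_head_scopes is an illustrative stand-in for the remaining non-backbone layers.

```python
def variables_not_restored(initialize_last_layer,
                           last_layers_contain_logits_only,
                           other_head_scopes=("aspp", "decoder")):
    """Return the variable scopes that are skipped when loading the checkpoint.

    `other_head_scopes` is a placeholder, not the real list of scope names.
    """
    if last_layers_contain_logits_only:
        last_layers = ["logits"]
    else:
        last_layers = ["logits"] + list(other_head_scopes)
    exclude = ["global_step"]
    if not initialize_last_layer:
        exclude.extend(last_layers)
    return exclude


if __name__ == "__main__":
    print(variables_not_restored(True, False))   # everything is restored
    print(variables_not_restored(False, False))  # only the backbone is restored
    print(variables_not_restored(False, True))   # everything except the logits
```

With the two flags used in this article, only the logits are left out, which is what a dataset with a different number of classes needs.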

2. If resources are limited, some suggestions for training on your own dataset:

Set output_stride=16 or even 32 (atrous_rates must be changed accordingly; for example, for output_stride=32, atrous_rates=[3, 6, 9]).
Use as many GPUs as possible (change the num_clones flag) and set train_batch_size as large as possible.
Adjust train_crop_size; it can be set smaller, for example 513x513 (or even 321x321), so that a larger batch size can be used.
Use a smaller network backbone, such as mobilenet_v2.

3. Whether to fine-tune batch norm

When the training batch size train_batch_size is greater than 12 (preferably greater than 16), set fine_tune_batch_norm=True. Otherwise, set fine_tune_batch_norm=False.
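The rules of thumb from points 2 and 3 can be collected into a small helper. This is only a sketch that restates the values quoted above for the xception backbones; suggest_flags is a hypothetical function, not part of DeepLab.

```python
# output_stride -> atrous_rates, as listed above for xception backbones
ATROUS_RATES = {8: [12, 24, 36], 16: [6, 12, 18], 32: [3, 6, 9]}


def suggest_flags(output_stride, train_batch_size):
    """Return matching atrous_rates and a fine_tune_batch_norm setting."""
    if output_stride not in ATROUS_RATES:
        raise ValueError("output_stride must be one of %s" % sorted(ATROUS_RATES))
    return {
        "output_stride": output_stride,
        "atrous_rates": ATROUS_RATES[output_stride],
        # Fine-tune batch norm only when the batch size is greater than 12.
        "fine_tune_batch_norm": train_batch_size > 12,
    }


if __name__ == "__main__":
    print(suggest_flags(output_stride=16, train_batch_size=8))
```

For output_stride=16 and a batch size of 8, it returns atrous_rates [6, 12, 18] with fine_tune_batch_norm set to False.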

5. Pre-trained model

Choose a pre-trained model to suit your situation and download it from: https://github.com/tensorflow/models/blob/master/research/deeplab/g3doc/model_zoo.md
I chose xception_71.

6. Run model_test.py

Test whether the environment is configured correctly.

Add the dependent libraries to PYTHONPATH, from the directory /home/user/models/research/:

#From /home/user/models/research/
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
source ~/.bashrc

Run model_test.py:

# From /home/user/models/research/
python deeplab/model_test.py

7. Training

train.py: the training script. The training parameters must be specified when it is run.

# from /../models/research/
python deeplab/train.py \
    --logtostderr \
    --training_number_of_steps=80000 \
    --train_split="train" \
    --model_variant="xception_71" \
    --atrous_rates=6 \
    --atrous_rates=12 \
    --atrous_rates=18 \
    --output_stride=16 \
    --decoder_output_stride=4 \
    --train_crop_size="321,321" \
    --train_batch_size=8 \
    --fine_tune_batch_norm=False \
    --base_learning_rate=0.01 \
    --dataset="mydata" \
    --tf_initial_checkpoint='/home/dell/models/research/deeplab/backbone/xception_71/model.ckpt' \
    --train_logdir='/home/dell/models/research/deeplab/exp/mydata_train/train' \
    --dataset_dir='/home/dell/models/research/deeplab/data/tfrecord'
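To get a feeling for how the learning rate evolves with these settings, here is a sketch of the 'poly' policy from the parameter list, assuming it is the policy in effect because the command does not set learning_policy. The formula is the usual polynomial decay; the power of 0.9 is an assumption of mine about the default and should be checked against train.py.

```python
def poly_learning_rate(step, base_lr=0.01, total_steps=80000, power=0.9):
    """Polynomial decay: base_lr * (1 - step / total_steps) ** power."""
    step = min(max(step, 0), total_steps)  # clamp to the training range
    return base_lr * (1.0 - step / total_steps) ** power


if __name__ == "__main__":
    for step in (0, 20000, 40000, 60000, 80000):
        print(step, poly_learning_rate(step))
```

The rate starts at base_learning_rate=0.01 and falls to 0 at step 80000, which is worth keeping in mind when resuming training with a larger training_number_of_steps.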

8. Visualization

vis.py: the visualization script.

# from /root/models/research/
python deeplab/vis.py \
    --logtostderr \
    --vis_split="val" \
    --model_variant="xception_71" \
    --atrous_rates=6 \
    --atrous_rates=12 \
    --atrous_rates=18 \
    --output_stride=16 \
    --decoder_output_stride=4 \
    --vis_crop_size="512,512" \
    --dataset="mydata" \
    --colormap_type="pascal" \
    --checkpoint_dir='/home/dell/models/research/deeplab/exp/mydata_train/train/' \
    --vis_logdir='/home/dell/models/research/deeplab/exp/mydata_train/vis/' \
    --dataset_dir='/home/dell/models/research/deeplab/data/tfrecord/' \
    --max_number_of_iterations=1

9. Evaluation

eval.py: the evaluation script. It outputs mIOU, which is used to assess the quality of the model.

# from /root/models/research/
python deeplab/eval.py \
    --logtostderr \
    --eval_split="val" \
    --model_variant="xception_71" \
    --atrous_rates=6 \
    --atrous_rates=12 \
    --atrous_rates=18 \
    --output_stride=16 \
    --decoder_output_stride=4 \
    --eval_crop_size="512,512" \
    --dataset="mydata" \
    --checkpoint_dir='/home/dell/models/research/deeplab/exp/mydata_train/train/' \
    --eval_logdir='/home/dell/models/research/deeplab/exp/mydata_train/eval/' \
    --dataset_dir='/home/dell/models/research/deeplab/data/tfrecord/' \
    --max_number_of_iterations=1
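For readers who want to see what mIOU measures, the sketch below computes it from a confusion matrix with NumPy. It follows the standard definition (per-class intersection over union, averaged over the classes that occur) and is not the code used by eval.py; mean_iou is a hypothetical helper, and pixels labeled 255 are assumed to be ignored.

```python
import numpy as np


def mean_iou(pred, label, num_classes, ignore_label=255):
    """Mean intersection-over-union of two label maps of the same shape."""
    pred = np.asarray(pred, dtype=np.int64).ravel()
    label = np.asarray(label, dtype=np.int64).ravel()
    keep = label != ignore_label  # drop ignored pixels
    pred, label = pred[keep], label[keep]
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    cm = np.bincount(label * num_classes + pred,
                     minlength=num_classes ** 2).reshape(num_classes, num_classes)
    intersection = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - intersection
    present = union > 0  # skip classes absent from both maps
    return float((intersection[present] / union[present]).mean())
```

For label [0, 0, 1, 1, 255] and prediction [0, 1, 1, 1, 0] with two classes, the per-class IoUs are 1/2 and 2/3, so the mean is 7/12.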

10. Check the logs

Use TensorBoard to check the progress of training and evaluation.

tensorboard --logdir=${PATH_TO_LOG_DIRECTORY}
# log directory used in this article
tensorboard --logdir="./train_logs"

For example, you can run:

# view the eval logs
tensorboard --logdir=/home/dell/models/research/deeplab/exp/mydata_train/eval --port 6007

# view the training logs
tensorboard --logdir=/home/dell/models/research/deeplab/exp/mydata_train/train

11. Export the model

During training, model checkpoint files are saved to disk.
The code provides a script (export_model.py) that converts a checkpoint to .pb format.
Under ./models/research/, create a script export_model.sh that runs export_model.py with the main parameters to modify. The code is as follows:

export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
python deeplab/export_model.py \
    --logtostderr \
    --checkpoint_path="/media/dell/2T/models/research/deeplab/exp/mydata_train/train/model.ckpt-$1" \
    --export_path="/media/dell/2T/models/research/deeplab/exp/mydata_train/export/inference_graph-$1.pb" \
    --model_variant="xception_71" \
    --atrous_rates=6 \
    --atrous_rates=12 \
    --atrous_rates=18 \
    --output_stride=16 \
    --decoder_output_stride=4 \
    --num_classes=3 \
    --crop_size=3000 \
    --crop_size=3000 \
    --inference_scales=1.0
# checkpoint_path: checkpoint file saved during training
# export_path: path the model is exported to
# num_classes: number of classes; it should match the value the model was trained with
# crop_size: image size; the default is [513, 513]. The size can be changed, and it affects the output size for test images. If the size is too large, exporting the model may raise an error, which seems to be related to GPU memory size; the larger the size, the longer inference on a test image takes.
# atrous_rates and output_stride may differ from the values used in training. My configuration keeps them the same.
# inference_scales: scales used for inference; the default is [1.0]. Change it to [0.5, 0.75, 1.0, 1.25, 1.5, 1.75] for multi-scale inference.

Run export_model.sh to export the model.

# from /../models/research/
sh export_model.sh 80000
# the number is the step suffix in the checkpoint name

This generates the .pb file in the export directory.

Three. Testing

You can write your own test code. The code below is adapted from other articles for reference, revised and simplified; an example follows:

#!/usr/bin/env python
# coding: utf-8
 
import os

from matplotlib import gridspec
from matplotlib import pyplot as plt
import numpy as np
from PIL import Image
import tensorflow as tf
import scipy  # only needed for the optional, commented-out saving code below
 
LABEL_NAMES = np.asarray(["background", "class1", "class2"])
 
 
class DeepLabModel(object):
    """Class to load deeplab model and run inference."""
 
    INPUT_TENSOR_NAME = "ImageTensor:0"
    OUTPUT_TENSOR_NAME = "SemanticPredictions:0"
    INPUT_SIZE = 321
    FROZEN_GRAPH_NAME = "frozen_inference_graph"
 
    def __init__(self, modelname):
        """Creates and loads pretrained deeplab model."""
        self.graph = tf.Graph()
        graph_def = None
 
        with open(modelname, "rb") as fd:
            graph_def = tf.GraphDef.FromString(fd.read())
 
        if graph_def is None:
            raise RuntimeError("Cannot find inference graph in tar archive.")
 
        with self.graph.as_default():
            tf.import_graph_def(graph_def, name="")
 
        self.sess = tf.Session(graph=self.graph)
 
    def run(self, image):
        """Runs inference on a single image.

        Args:
            image: A PIL.Image object, raw input image.

        Returns:
            resized_image: RGB image resized from original input image.
            seg_map: Segmentation map of `resized_image`.
        """
        width, height = image.size
        resize_ratio = 1.0 * self.INPUT_SIZE / max(width, height)
        target_size = (int(resize_ratio * width), int(resize_ratio * height))
        resized_image = image.convert("RGB").resize(target_size, Image.ANTIALIAS)
        batch_seg_map = self.sess.run(
            self.OUTPUT_TENSOR_NAME,
            feed_dict={self.INPUT_TENSOR_NAME: [np.asarray(resized_image)]},
        )
        seg_map = batch_seg_map[0]
        return resized_image, seg_map
 
 
def create_pascal_label_colormap():
    """Creates a label colormap used in PASCAL VOC segmentation benchmark.

    Returns:
        A Colormap for visualizing segmentation results.
    """
    colormap = np.zeros((256, 3), dtype=int)
    ind = np.arange(256, dtype=int)
 
    for shift in reversed(range(8)):
        for channel in range(3):
            colormap[:, channel] |= ((ind >> channel) & 1) << shift
        ind >>= 3
 
    return colormap
 
 
# Convert a label map to a color image
def label_to_color_image(label):
    """Adds color defined by the dataset colormap to the label.

    Args:
        label: A 2D array with integer type, storing the segmentation label.

    Returns:
        result: A 2D array with floating type. The element of the array is
            the color indexed by the corresponding element in the input label
            to the PASCAL color map.

    Raises:
        ValueError: If label is not of rank 2 or its value is larger than
            color map maximum entry.
    """
    if label.ndim != 2:
        raise ValueError("Expect 2-D input label")
 
    colormap = create_pascal_label_colormap()
 
    if np.max(label) >= len(colormap):
        raise ValueError("label value too large.")
 
    return colormap[label]
 
 
#  Visualization of segmentation results 
def vis_segmentation(image, seg_map, name):
    """Visualizes input image, segmentation map and overlay view."""
    plt.figure(figsize=(15, 5))
    grid_spec = gridspec.GridSpec(1, 4, width_ratios=[6, 6, 6, 1])
 
    plt.subplot(grid_spec[0])
    plt.imshow(image)
    plt.axis("off")
    plt.title("input image")
 
    plt.subplot(grid_spec[1])
    seg_image = label_to_color_image(seg_map).astype(np.uint8)
    plt.imshow(seg_image)
    plt.axis("off")
    plt.title("segmentation map")
 
    plt.subplot(grid_spec[2])
    plt.imshow(image)
    plt.imshow(seg_image, alpha=0.7)
    plt.axis("off")
    plt.title("segmentation overlay")
 
    unique_labels = np.unique(seg_map)
    ax = plt.subplot(grid_spec[3])
    plt.imshow(FULL_COLOR_MAP[unique_labels].astype(np.uint8), interpolation="nearest")
    ax.yaxis.tick_right()
    plt.yticks(range(len(unique_labels)), LABEL_NAMES[unique_labels])
    plt.xticks([], [])
    ax.tick_params(width=0.0)
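    plt.grid(False)

    # savefig fails if the output folder does not exist yet
    os.makedirs("./seg_map_result", exist_ok=True)
    plt.savefig("./seg_map_result/" + name + ".png")
    # plt.show()
    plt.close()  # release the figure so memory does not grow across images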
 
 
FULL_LABEL_MAP = np.arange(len(LABEL_NAMES)).reshape(len(LABEL_NAMES), 1)
FULL_COLOR_MAP = label_to_color_image(FULL_LABEL_MAP)
 
 
def main_test(filepath):
    #  Load model 
    modelname = "./datasets/quekou/export/inference_graph-80000.pb"
    MODEL = DeepLabModel(modelname)
    print("model loaded successfully!")
 
    filelist = os.listdir(filepath)
    for item in filelist:
        print("process image of ", item)
        name = os.path.splitext(item)[0]  # file name without its extension
        original_im = Image.open(os.path.join(filepath, item))
        resized_im, seg_map = MODEL.run(original_im)

        # Plot the input image, segmentation map, and overlay side by side
        vis_segmentation(resized_im, seg_map, name)
 
        #  Save the segmentation results separately 
        # seg_map_name = name + '_seg.png'
        # resized_im_name = name + '_in.png'
        # path = './seg_map_result/'
        # scipy.misc.imsave(path + resized_im_name,resized_im)
        # scipy.misc.imsave(path + seg_map_name,seg_map)
 
 
if __name__ == "__main__":
    filepath = "./datasets/quekou/dataset/JPEGImages/"
    main_test(filepath)
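Besides the plots, it is sometimes useful to know how much of an image each class covers. The sketch below computes this from a seg_map such as the one returned by MODEL.run; class_fractions is a hypothetical helper, and the label names are the placeholders from LABEL_NAMES above.

```python
import numpy as np


def class_fractions(seg_map, label_names):
    """Return {label name: fraction of pixels predicted as that class}."""
    seg_map = np.asarray(seg_map).astype(np.int64)
    counts = np.bincount(seg_map.ravel(), minlength=len(label_names))
    return {name: float(counts[i]) / seg_map.size
            for i, name in enumerate(label_names)}


if __name__ == "__main__":
    demo = np.array([[0, 0], [1, 2]])
    print(class_fractions(demo, ["background", "class1", "class2"]))
```

In main_test this could be called as class_fractions(seg_map, LABEL_NAMES) right after MODEL.run.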

Reference articles:
https://blog.csdn.net/malvas/article/details/90776327
https://blog.csdn.net/ling620/article/details/105635780
https://blog.csdn.net/zong596568821xp/article/details/83350820

Summary

That is all for today. This article records how to train and test on your own dataset with DeepLabv3+ in TensorFlow. Other problems may come up during training; I will add them here later.

Copyright notice
This article was written by [Xiao Wang who sells newspapers]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/196/202207130916113649.html