PSPNet_tensorflow
Important
The code works fine for inference. However, the training code is provided for reference only and is probably useful only for fine-tuning. If you want to train from scratch, you first need to implement a synchronized batch normalization layer to support large-batch training (as described in the paper). It seems that this repo has reproduced it; you can take a look at it.
Introduction
This is a TensorFlow implementation of PSPNet for semantic segmentation on the Cityscapes dataset. The weights were converted from the original Caffe code using the caffe-tensorflow framework.
Update:
News (2018.11.08 updated):
Now you can try PSPNet on your own images online using the ModelDepot live demo!
2018/01/24:
Support evaluation code for ade20k dataset
2018/01/19:
Support inference phase for the ade20k dataset using the pspnet50 model (weights converted from the original author). Use tf.matmul to decode labels, which improves inference speed.
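As a rough illustration of the tf.matmul decoding idea, the sketch below turns a per-pixel argmax class map into an RGB image with a single matrix multiply. The palette here is a random placeholder and the shapes are illustrative; the repo's actual color tables live in its utility code.

```python
import numpy as np
import tensorflow as tf

# Minimal sketch of matmul-based label decoding (TF1-style; placeholder palette).
num_classes = 150                                   # e.g. ade20k has 150 classes
palette = np.random.randint(0, 255, size=(num_classes, 3)).astype(np.float32)
color_table = tf.constant(palette)                  # (num_classes, 3) RGB palette

pred = tf.placeholder(tf.int32, [1, 473, 473])      # per-pixel argmax class indices
onehot = tf.one_hot(pred, depth=num_classes)        # (1, H, W, num_classes)
flat = tf.reshape(onehot, [-1, num_classes])        # (H*W, num_classes)
colored = tf.matmul(flat, color_table)              # (H*W, 3): one matmul, no Python loop
decoded = tf.reshape(colored, [1, 473, 473, 3])     # colorized label map
```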
2017/11/06:
Support different input sizes by padding the input image to (720, 720) when its original size is smaller, then cropping the result back to the original size at the end.
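A minimal sketch of this pad-then-crop idea; the tensor names and `net_fn` are illustrative stand-ins, not the repo's actual functions.

```python
import tensorflow as tf

def pad_and_crop_inference(img, net_fn, crop_h=720, crop_w=720):
    """Pad a small image up to (720, 720), run the network, crop the result back.

    `img` is an (H, W, 3) tensor; `net_fn` stands in for the PSPNet forward pass.
    """
    shape = tf.shape(img)
    h, w = shape[0], shape[1]
    pad_h = tf.maximum(crop_h - h, 0)               # only pad when the image is smaller
    pad_w = tf.maximum(crop_w - w, 0)
    padded = tf.pad(img, [[0, pad_h], [0, pad_w], [0, 0]])
    pred = net_fn(tf.expand_dims(padded, 0))        # (1, padded_h, padded_w) class map
    return pred[:, :h, :w]                          # crop back to the original size
```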
2017/10/27:
Change the BN layer from tf.nn.batch_normalization to tf.layers.batch_normalization in order to support the training phase. Also update the initial model in Google Drive.
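Roughly, the reason for the switch is that tf.layers.batch_normalization maintains moving mean/variance through update ops, which the raw tf.nn.batch_normalization op does not manage for you. A minimal TF1-style sketch (shapes and hyperparameters here are illustrative):

```python
import tensorflow as tf

is_training = tf.placeholder(tf.bool, name='is_training')
x = tf.placeholder(tf.float32, [None, 90, 90, 256])

# Creates gamma/beta plus moving mean/variance, switched by the training flag.
bn = tf.layers.batch_normalization(x, training=is_training, momentum=0.95)

# The moving statistics are only refreshed if the update ops run,
# so they must be grouped with the train op.
loss = tf.reduce_mean(tf.square(bn))
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = tf.train.AdamOptimizer(1e-4).minimize(loss)
```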
Install
Download the restore checkpoint from Google Drive and put it into the model directory. Note: select the checkpoint corresponding to the dataset you want to use.
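A quick way to sanity-check the download before running the scripts, assuming the checkpoint files sit under ./model (a minimal sketch, not part of the repo's code):

```python
import tensorflow as tf

# Check that TensorFlow can see the downloaded checkpoint in ./model.
ckpt = tf.train.get_checkpoint_state('./model')
print(ckpt.model_checkpoint_path if ckpt else 'no checkpoint found in ./model')

# List a few stored variables, e.g. to verify which dataset variant was downloaded.
if ckpt:
    reader = tf.train.NewCheckpointReader(ckpt.model_checkpoint_path)
    for name in sorted(reader.get_variable_to_shape_map())[:5]:
        print(name)
```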
Inference
To get result on your own images, use the following command:
python inference.py --img-path=./input/test.png --dataset cityscapes
Inference time: ~0.6s
Options:
--dataset cityscapes or ade20k
--flipped-eval
--checkpoints /PATH/TO/CHECKPOINT_DIR
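For example, a run on an ade20k image with flipped evaluation and an explicit checkpoint directory might look like the following (the image path is just a placeholder):
python inference.py --img-path=./input/indoor.jpg --dataset ade20k --flipped-eval --checkpoints ./model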
Evaluation
Cityscapes
Evaluated with the single-scale model on the Cityscapes validation dataset.
Method | Accuracy |
---|---|
Without flip | 76.99% |
Flip | 77.23% |
ade20k
Method | Accuracy |
---|---|
Without flip | 40.00% |
Flip | 40.67% |
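The "Flip" numbers come from averaging predictions over the original and the horizontally flipped input. A minimal sketch of that idea, where `score_fn` is a hypothetical stand-in for the PSPNet forward pass rather than the repo's actual API:

```python
import tensorflow as tf

def flipped_scores(image, score_fn):
    """Average score maps of the original and horizontally flipped image."""
    scores = score_fn(image)                            # (1, H, W, num_classes)
    flipped = score_fn(tf.reverse(image, axis=[2]))     # flip along the width axis
    flipped_back = tf.reverse(flipped, axis=[2])        # align with the original
    return 0.5 * (scores + flipped_back)
```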
To reproduce the evaluation results, follow these steps:
- Download the Cityscapes or ADE20k dataset first.
- Change data_dir to your dataset path in evaluate.py: 'data_dir': '/Path/to/dataset'
- Run the following command:
python evaluate.py --dataset cityscapes
List of Args:
--dataset - ade20k or cityscapes
--flipped-eval - Use the flipped evaluation method
--measure-time - Calculate inference time
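For example, to evaluate ade20k with flipped evaluation and timing enabled:
python evaluate.py --dataset ade20k --flipped-eval --measure-time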
Image Result
cityscapes
(Input image / output image example pairs for Cityscapes.)
ade20k
(Input image / output image example pairs for ade20k.)
real world
(Input image / output image example pairs for real-world photos.)
Citation
@inproceedings{zhao2017pspnet,
author = {Hengshuang Zhao and
Jianping Shi and
Xiaojuan Qi and
Xiaogang Wang and
Jiaya Jia},
title = {Pyramid Scene Parsing Network},
booktitle = {Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2017}
}
Scene Parsing through ADE20K Dataset. B. Zhou, H. Zhao, X. Puig, S. Fidler, A. Barriuso and A. Torralba. Computer Vision and Pattern Recognition (CVPR), 2017. (http://people.csail.mit.edu/bzhou/publication/scene-parse-camera-ready.pdf)
@inproceedings{zhou2017scene,
title={Scene Parsing through ADE20K Dataset},
author={Zhou, Bolei and Zhao, Hang and Puig, Xavier and Fidler, Sanja and Barriuso, Adela and Torralba, Antonio},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
year={2017}
}
Semantic Understanding of Scenes through ADE20K Dataset. B. Zhou, H. Zhao, X. Puig, S. Fidler, A. Barriuso and A. Torralba. arXiv:1608.05442. (https://arxiv.org/pdf/1608.05442.pdf)
@article{zhou2016semantic,
title={Semantic understanding of scenes through the ade20k dataset},
author={Zhou, Bolei and Zhao, Hang and Puig, Xavier and Fidler, Sanja and Barriuso, Adela and Torralba, Antonio},
journal={arXiv preprint arXiv:1608.05442},
year={2016}
}