Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

Overview

Mask R-CNN for Object Detection and Segmentation

This is an implementation of Mask R-CNN on Python 3, Keras, and TensorFlow. The model generates bounding boxes and segmentation masks for each instance of an object in the image. It's based on Feature Pyramid Network (FPN) and a ResNet101 backbone.

Instance Segmentation Sample

The repository includes:

  • Source code of Mask R-CNN built on FPN and ResNet101.
  • Training code for MS COCO
  • Pre-trained weights for MS COCO
  • Jupyter notebooks to visualize the detection pipeline at every step
  • ParallelModel class for multi-GPU training
  • Evaluation on MS COCO metrics (AP)
  • Example of training on your own dataset

The code is documented and designed to be easy to extend. If you use it in your research, please consider citing this repository (bibtex below). If you work on 3D vision, you might find our recently released Matterport3D dataset useful as well. This dataset was created from 3D-reconstructed spaces captured by our customers who agreed to make them publicly available for academic use. You can see more examples here.

Getting Started

  • demo.ipynb is the easiest way to start. It shows an example of using a model pre-trained on MS COCO to segment objects in your own images. It includes code to run object detection and instance segmentation on arbitrary images.

  • train_shapes.ipynb shows how to train Mask R-CNN on your own dataset. This notebook introduces a toy dataset (Shapes) to demonstrate training on a new dataset.

  • (model.py, utils.py, config.py): These files contain the main Mask R-CNN implementation.

  • inspect_data.ipynb. This notebook visualizes the different pre-processing steps to prepare the training data.

  • inspect_model.ipynb This notebook goes in depth into the steps performed to detect and segment objects. It provides visualizations of every step of the pipeline.

  • inspect_weights.ipynb This notebook inspects the weights of a trained model and looks for anomalies and odd patterns.

Step by Step Detection

To help with debugging and understanding the model, there are 3 notebooks (inspect_data.ipynb, inspect_model.ipynb, inspect_weights.ipynb) that provide a lot of visualizations and allow running the model step by step to inspect the output at each point. Here are a few examples:

1. Anchor sorting and filtering

Visualizes every step of the first stage Region Proposal Network and displays positive and negative anchors along with anchor box refinement.

2. Bounding Box Refinement

This is an example of final detection boxes (dotted lines) and the refinement applied to them (solid lines) in the second stage.

3. Mask Generation

Examples of generated masks. These then get scaled and placed on the image in the right location.

4. Layer activations

Often it's useful to inspect the activations at different layers to look for signs of trouble (all zeros or random noise).

5. Weight Histograms

Another useful debugging tool is to inspect the weight histograms. These are included in the inspect_weights.ipynb notebook.

6. Logging to TensorBoard

TensorBoard is another great debugging and visualization tool. The model is configured to log losses and save weights at the end of every epoch.

7. Composing the different pieces into a final result
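
As a rough illustration of how a generated mask gets scaled and placed on the image (step 3 above), here is a minimal pure-Python sketch that uses nearest-neighbour scaling. The function name `place_mask`, the (y1, x1, y2, x2) box layout with exclusive upper bounds, and the 0.5 threshold are assumptions made for this example; the repository's own code may resize and threshold differently.

```python
def place_mask(mask, box, image_shape, threshold=0.5):
    """Scale a small soft mask to `box` and paste it into a full-size binary mask.

    mask: 2D list of floats in [0, 1] (a tiny stand-in for the network output).
    box: (y1, x1, y2, x2) in image pixels, with y2 and x2 exclusive.
    image_shape: (height, width) of the target image.
    """
    y1, x1, y2, x2 = box
    box_h, box_w = y2 - y1, x2 - x1
    mask_h, mask_w = len(mask), len(mask[0])
    height, width = image_shape
    full = [[0] * width for _ in range(height)]
    for dy in range(box_h):
        # Nearest-neighbour lookup: map the box row back to a mask row.
        src_y = dy * mask_h // box_h
        for dx in range(box_w):
            src_x = dx * mask_w // box_w
            if mask[src_y][src_x] >= threshold:
                full[y1 + dy][x1 + dx] = 1
    return full


small = [[0.9, 0.1],
         [0.2, 0.8]]
full = place_mask(small, box=(1, 1, 5, 5), image_shape=(6, 6))
for row in full:
    print("".join("#" if v else "." for v in row))
```

In this toy case each cell of the 2x2 mask covers a 2x2 block of output pixels, which hints at why a low-resolution mask stretched over a large box can look blocky.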

Training on MS COCO

We're providing pre-trained weights for MS COCO to make it easier to start. You can use those weights as a starting point to train your own variation on the network. Training and evaluation code is in samples/coco/coco.py. You can import this module in a Jupyter notebook (see the provided notebooks for examples) or you can run it directly from the command line as follows:

# Train a new model starting from pre-trained COCO weights
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=coco

# Train a new model starting from ImageNet weights
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=imagenet

# Continue training a model that you had trained earlier
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=/path/to/weights.h5

# Continue training the last model you trained. This will find
# the last trained weights in the model directory.
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=last

You can also run the COCO evaluation code with:

# Run COCO evaluation on the last trained model
python3 samples/coco/coco.py evaluate --dataset=/path/to/coco/ --model=last

The training schedule, learning rate, and other parameters should be set in samples/coco/coco.py.

Training on Your Own Dataset

Start by reading this blog post about the balloon color splash sample. It covers the process starting from annotating images to training to using the results in a sample application.

In summary, to train the model on your own dataset you'll need to extend two classes:

Config This class contains the default configuration. Subclass it and modify the attributes you need to change.

Dataset This class provides a consistent way to work with any dataset. It allows you to use new datasets for training without having to change the code of the model. It also supports loading multiple datasets at the same time, which is useful if the objects you want to detect are not all available in one dataset.

See examples in samples/shapes/train_shapes.ipynb, samples/coco/coco.py, samples/balloon/balloon.py, and samples/nucleus/nucleus.py.
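
The subclassing pattern described above can be sketched as follows. To keep the example runnable without the library, `Config` and `Dataset` here are small stand-in base classes written for this sketch, not the repository's classes. The attribute names, the `add_class`/`add_image` helpers, and the `ToyConfig`/`ToyDataset` classes with their `load_toy` and `load_class_ids` methods are assumptions modeled loosely on the samples listed above; in real code you would subclass the classes from config.py and utils.py instead.

```python
class Config:
    """Stand-in for the library's base configuration (assumption, not the real class)."""
    NAME = None
    GPU_COUNT = 1
    IMAGES_PER_GPU = 2
    NUM_CLASSES = 1  # background only

    def __init__(self):
        # The effective batch size depends on both settings.
        self.BATCH_SIZE = self.IMAGES_PER_GPU * self.GPU_COUNT


class Dataset:
    """Stand-in for the library's dataset base class (assumption, not the real class)."""

    def __init__(self):
        self.class_info = [{"source": "", "id": 0, "name": "BG"}]
        self.image_info = []

    def add_class(self, source, class_id, class_name):
        self.class_info.append({"source": source, "id": class_id, "name": class_name})

    def add_image(self, source, image_id, path, **kwargs):
        info = {"source": source, "id": image_id, "path": path}
        info.update(kwargs)
        self.image_info.append(info)


class ToyConfig(Config):
    """Override only the attributes that differ from the defaults."""
    NAME = "toy"
    IMAGES_PER_GPU = 1
    NUM_CLASSES = 1 + 2  # background + two hypothetical object classes


class ToyDataset(Dataset):
    """Registers two classes and maps each instance label to a class ID."""

    def load_toy(self, annotations):
        self.add_class("toy", 1, "balloon")
        self.add_class("toy", 2, "kite")
        for image_id, (path, labels) in enumerate(annotations):
            self.add_image("toy", image_id, path, labels=labels)

    def load_class_ids(self, image_id):
        name_to_id = {c["name"]: c["id"] for c in self.class_info}
        return [name_to_id[label] for label in self.image_info[image_id]["labels"]]


config = ToyConfig()
dataset = ToyDataset()
dataset.load_toy([("a.jpg", ["balloon", "kite"]), ("b.jpg", ["kite"])])
print(config.NAME, config.NUM_CLASSES, config.BATCH_SIZE)
print(dataset.load_class_ids(0))
```

The design point is that the model code never needs to know where the images come from: the configuration subclass carries the settings, and the dataset subclass carries the class registry and per-image metadata.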

Differences from the Official Paper

This implementation follows the Mask RCNN paper for the most part, but there are a few cases where we deviated in favor of code simplicity and generalization. These are some of the differences we're aware of. If you encounter other differences, please do let us know.

  • Image Resizing: To support training multiple images per batch we resize all images to the same size. For example, 1024x1024px on MS COCO. We preserve the aspect ratio, so if an image is not square we pad it with zeros. In the paper the resizing is done such that the smallest side is 800px and the largest is trimmed at 1000px.

  • Bounding Boxes: Some datasets provide bounding boxes and some provide masks only. To support training on multiple datasets we opted to ignore the bounding boxes that come with the dataset and generate them on the fly instead. We pick the smallest box that encapsulates all the pixels of the mask as the bounding box. This simplifies the implementation and also makes it easy to apply image augmentations that would otherwise be harder to apply to bounding boxes, such as image rotation.

    To validate this approach, we compared our computed bounding boxes to those provided by the COCO dataset. We found that ~2% of bounding boxes differed by 1px or more, ~0.05% differed by 5px or more, and only 0.01% differed by 10px or more.

  • Learning Rate: The paper uses a learning rate of 0.02, but we found that to be too high: it often causes the weights to explode, especially when using a small batch size. It might be related to differences between how Caffe and TensorFlow compute gradients (sum vs mean across batches and GPUs). Or, maybe the official model uses gradient clipping to avoid this issue. We do use gradient clipping, but don't set it too aggressively. We found that smaller learning rates converge faster anyway so we go with that.
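
Two of the choices above, square resizing with zero padding and deriving a box from a mask, can be sketched in plain Python. The function names, the centering of the padding, the (top, bottom, left, right) padding layout, and the (y1, x1, y2, x2) box convention with exclusive upper bounds are assumptions for illustration, not necessarily what the repository does.

```python
def square_resize_params(height, width, target=1024):
    """Return (scale, new_size, padding) to fit an image in a target x target square.

    The aspect ratio is preserved. The leftover space is zero padding, split
    as evenly as possible between the two sides (centering is an assumption).
    """
    scale = target / max(height, width)
    new_h, new_w = round(height * scale), round(width * scale)
    pad_top = (target - new_h) // 2
    pad_left = (target - new_w) // 2
    padding = (pad_top, target - new_h - pad_top,
               pad_left, target - new_w - pad_left)
    return scale, (new_h, new_w), padding


def mask_to_bbox(mask):
    """Smallest box (y1, x1, y2, x2) containing every nonzero pixel of a 2D mask.

    Upper bounds are exclusive; an empty mask yields (0, 0, 0, 0).
    """
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    if not rows:
        return (0, 0, 0, 0)
    return (rows[0], cols[0], rows[-1] + 1, cols[-1] + 1)


scale, new_size, padding = square_resize_params(480, 640)
print(scale, new_size, padding)

toy_mask = [[0, 0, 0, 0],
            [0, 1, 1, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 0]]
print(mask_to_bbox(toy_mask))
```

Because the box is recomputed from the mask, any augmentation applied to the mask (a rotation, for example) automatically yields a matching box without separate box-transform code.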

Citation

Use this bibtex to cite this repository:

@misc{matterport_maskrcnn_2017,
  title={Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow},
  author={Waleed Abdulla},
  year={2017},
  publisher={Github},
  journal={GitHub repository},
  howpublished={\url{https://github.com/matterport/Mask_RCNN}},
}

Contributing

Contributions to this repository are welcome. Examples of things you can contribute:

  • Speed Improvements, such as rewriting some Python code in TensorFlow or Cython.
  • Training on other datasets.
  • Accuracy Improvements.
  • Visualizations and examples.

You can also join our team and help us build even more projects like this one.

Requirements

Python 3.4, TensorFlow 1.3, Keras 2.0.8 and other common packages listed in requirements.txt.

MS COCO Requirements:

To train or test on MS COCO, you'll also need:

If you use Docker, the code has been verified to work on this Docker container.

Installation

  1. Clone this repository

  2. Install dependencies

    pip3 install -r requirements.txt
  3. Run setup from the repository root directory

    python3 setup.py install
  4. Download pre-trained COCO weights (mask_rcnn_coco.h5) from the releases page.

  5. (Optional) To train or test on MS COCO install pycocotools from one of these repos. They are forks of the original pycocotools with fixes for Python3 and Windows (the official repo doesn't seem to be active anymore).

Projects Using this Model

If you extend this model to other datasets or build projects that use it, we'd love to hear from you.

4K Video Demo by Karol Majek.

Mask RCNN on 4K Video

Images to OSM: Improve OpenStreetMap by adding baseball, soccer, tennis, football, and basketball fields.

Identify sport fields in satellite images

Splash of Color. A blog post explaining how to train this model from scratch and use it to implement a color splash effect.

Balloon Color Splash

Segmenting Nuclei in Microscopy Images. Built for the 2018 Data Science Bowl

Code is in the samples/nucleus directory.

Nucleus Segmentation

Detection and Segmentation for Surgery Robots by the NUS Control & Mechatronics Lab.

Surgery Robot Detection and Segmentation

Reconstructing 3D buildings from aerial LiDAR

A proof of concept project by Esri, in collaboration with Nvidia and Miami-Dade County. It comes with a great write-up and code by Dmitry Kudinov, Daniel Hedges, and Omar Maher.

3D Building Reconstruction

Usiigaci: Label-free Cell Tracking in Phase Contrast Microscopy

A project from Japan to automatically track cells in a microfluidics platform. Paper is pending, but the source code is released.

Characterization of Arctic Ice-Wedge Polygons in Very High Spatial Resolution Aerial Imagery

Research project to understand the complex processes between degradations in the Arctic and climate change. By Weixing Zhang, Chandi Witharana, Anna Liljedahl, and Mikhail Kanevskiy.

Mask-RCNN Shiny

A computer vision class project by HU Shiyu to apply the color pop effect on people with beautiful results.

Mapping Challenge: Convert satellite imagery to maps for use by humanitarian organisations.

Mapping Challenge

GRASS GIS Addon to generate vector masks from geospatial imagery. Based on a Master's thesis by Ondřej Pešek.

GRASS GIS Image

Comments
  • Tensorflow 2.0 compatibility

    Things to do / fix

    • [x] training mode
    • [x] inference mode
    • [x] calling crop_and_resize_v1 (from tensorflow.python.ops.image_ops_impl) with box_ind is deprecated and will be removed in a future version. Instructions for updating: box_ind is deprecated, use box_indices instead
    • [x] metrics
    • [x] all provided samples run (demo, training_shape, balloon)
    • [x] add some tests
    • [x] maybe split PR for better reviewability

    To try this do the following:

    Install Mask_RCNN

    git clone https://github.com/tomgross/Mask_RCNN.git --branch tensorflow-2.0
    cd Mask_RCNN
    bin/pip install --upgrade pip
    bin/pip install -r requirements.txt
    bin/pip install -e .
    

    Run examples

    There are some examples for training and inference mode you can run as notebooks in your setup.

    Run tests

    bin/pip install pytest
    bin/pytest tests
    

    Known issues

    • Tensorflow Version 2.1.0 seems to have troubles as reported here: https://github.com/matterport/Mask_RCNN/pull/1896#issuecomment-570299903 I did not run into troubles with the version 2.0.0 so far. If you encounter issues with 2.1.x please report, help fix or use 2.0.0
    • numpy version 1.18.0 seems to conflict with imgaug. Downgrading to 1.17.4 works for me.
    • If you use load_image_gt in your code/notebook, remove the use_mini_mask argument, since it is now redundant with config. This affects calls such as load_image_gt(dataset, config, image_id, use_mini_mask=False).
    • Older versions of pip don't find the releases of tensorflow 2. I don't know which is the breaking version, but 19.3.1 works.
    opened by tomgross 95
  • Increasing Output Mask Resolution

    My masks continue to come out as relatively blocky and not dealing well with curves and points. For example the attached whale outline. I've turned off mini masks so I'm no longer downsampling those masks in training but does anyone have advice for increasing the output resolution of masks?

    There is this line in config.py: MASK_SHAPE = [28, 28], which I assume is part of the solution, but I'm not sure what else I'll need to change to accommodate a larger mask size, because it says "To change this mask size you also need to change the neural network mask branch."

    Any tips on what I need to alter to get this to put out a much higher resolution mask?

    opened by patrickcgray 57
  • Balloon Sample: What if we want to train for more than one class

    In coco.py, the annotations are in COCO format. In balloon.py, the annotations are created with VIA, which uses a different format from COCO. In shapes.py, no annotations are needed because there is no dataset. I successfully trained, evaluated, and tested all samples and notebooks (coco, balloon, shapes), and I also created my own dataset, annotated with VIA. I could train the model on my own dataset successfully, but only for one class (one category). I need to train for more classes, and I tried to modify the functions load_objects() and load_mask() without success. So please help.

    opened by AliceDinh 57
  • ImportError: No module named 'pycocotools'

    Thanks a lot. I am getting the following error: "ImportError: No module named 'pycocotools'"

    Can you please advise me how to install it, as I could not find it on pip or conda?

    I tried the installation from https://github.com/cocodataset/cocoapi but it did not work either.

    opened by Walid-Ahmed 52
  • ValueError: Tried to convert 'shape' to a tensor and failed. Error: None values not supported.

    I am running the step below in my work environment to reproduce the code (using the Kaggle dataset).

    I have spent some time on it and have not been able to figure it out. Please help me.

    I am using Python 3.6, Keras 2.1.6-tf, and Linux.

    In the "Create model in inference mode" step of the inspect_nucleus_model.ipynb file, I am trying to run the code below (with default settings) and getting the error copied further below.

    with tf.device(DEVICE):
        model = modellib.MaskRCNN(mode="inference", model_dir=LOGS_DIR, config=config)


    ValueError Traceback (most recent call last) /miniconda/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py in _apply_op_helper(self, op_type_name, name, **keywords) 509 as_ref=input_arg.is_ref, --> 510 preferred_dtype=default_dtype) 511 except TypeError as err:

    /miniconda/lib/python3.6/site-packages/tensorflow/python/framework/ops.py in internal_convert_to_tensor(value, dtype, name, as_ref, preferred_dtype, ctx) 1106 if ret is None: -> 1107 ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref) 1108

    /miniconda/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py in _autopacking_conversion_function(v, dtype, name, as_ref) 959 return NotImplemented --> 960 return _autopacking_helper(v, inferred_dtype, name or "packed") 961

    /miniconda/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py in _autopacking_helper(list_or_tuple, dtype, name) 921 elems_as_tensors.append( --> 922 constant_op.constant(elem, dtype=dtype, name=str(i))) 923 return gen_array_ops.pack(elems_as_tensors, name=scope)

    /miniconda/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py in constant(value, dtype, shape, name, verify_shape) 195 tensor_util.make_tensor_proto( --> 196 value, dtype=dtype, shape=shape, verify_shape=verify_shape)) 197 dtype_value = attr_value_pb2.AttrValue(type=tensor_value.tensor.dtype)

    /miniconda/lib/python3.6/site-packages/tensorflow/python/framework/tensor_util.py in make_tensor_proto(values, dtype, shape, verify_shape) 423 if values is None: --> 424 raise ValueError("None values not supported.") 425 # if dtype is provided, forces numpy array to be the type

    ValueError: None values not supported.

    During handling of the above exception, another exception occurred:

    ValueError Traceback (most recent call last) /miniconda/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py in _apply_op_helper(self, op_type_name, name, **keywords) 523 observed = ops.internal_convert_to_tensor( --> 524 values, as_ref=input_arg.is_ref).dtype.name 525 except ValueError as err:

    /miniconda/lib/python3.6/site-packages/tensorflow/python/framework/ops.py in internal_convert_to_tensor(value, dtype, name, as_ref, preferred_dtype, ctx) 1106 if ret is None: -> 1107 ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref) 1108

    /miniconda/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py in _autopacking_conversion_function(v, dtype, name, as_ref) 959 return NotImplemented --> 960 return _autopacking_helper(v, inferred_dtype, name or "packed") 961

    /miniconda/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py in _autopacking_helper(list_or_tuple, dtype, name) 921 elems_as_tensors.append( --> 922 constant_op.constant(elem, dtype=dtype, name=str(i))) 923 return gen_array_ops.pack(elems_as_tensors, name=scope)

    /miniconda/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py in constant(value, dtype, shape, name, verify_shape) 195 tensor_util.make_tensor_proto( --> 196 value, dtype=dtype, shape=shape, verify_shape=verify_shape)) 197 dtype_value = attr_value_pb2.AttrValue(type=tensor_value.tensor.dtype)

    /miniconda/lib/python3.6/site-packages/tensorflow/python/framework/tensor_util.py in make_tensor_proto(values, dtype, shape, verify_shape) 423 if values is None: --> 424 raise ValueError("None values not supported.") 425 # if dtype is provided, forces numpy array to be the type

    ValueError: None values not supported.

    During handling of the above exception, another exception occurred:

    ValueError Traceback (most recent call last) in () 2 print (LOGS_DIR) 3 with tf.device(DEVICE): ----> 4 model = modellib.MaskRCNN(mode = "inference", model_dir = LOGS_DIR, config = config)

    /project/bioinformatics/Rajaram_lab/s183574/myCopy/Nuclei-Counting-Segmentation/mrcnn/model.py in __init__(self, mode, config, model_dir) 1839 self.model_dir = model_dir 1840 self.set_log_dir() -> 1841 self.keras_model = self.build(mode=mode, config=config) 1842 1843 def build(self, mode, config):

    /project/bioinformatics/Rajaram_lab/s183574/myCopy/Nuclei-Counting-Segmentation/mrcnn/model.py in build(self, mode, config) 2040 config.POOL_SIZE, config.NUM_CLASSES, 2041 train_bn=config.TRAIN_BN, -> 2042 fc_layers_size=config.FPN_CLASSIF_FC_LAYERS_SIZE) 2043 2044 # Detections

    /project/bioinformatics/Rajaram_lab/s183574/myCopy/Nuclei-Counting-Segmentation/mrcnn/model.py in fpn_classifier_graph(rois, feature_maps, image_meta, pool_size, num_classes, train_bn, fc_layers_size) 954 # Reshape to [batch, num_rois, NUM_CLASSES, (dy, dx, log(dh), log(dw))] 955 s = K.int_shape(x) --> 956 mrcnn_bbox = KL.Reshape((s[1], num_classes, 4), name="mrcnn_bbox")(x) 957 958 return mrcnn_class_logits, mrcnn_probs, mrcnn_bbox

    /miniconda/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py in __call__(self, inputs, *args, **kwargs) 686 687 if not in_deferred_mode: --> 688 outputs = self.call(inputs, *args, **kwargs) 689 if outputs is None: 690 raise ValueError('A layer's call method should return a Tensor '

    /miniconda/lib/python3.6/site-packages/tensorflow/python/keras/layers/core.py in call(self, inputs) 438 def call(self, inputs): 439 return array_ops.reshape(inputs, --> 440 (array_ops.shape(inputs)[0],) + self.target_shape) 441 442 def get_config(self):

    /miniconda/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py in reshape(tensor, shape, name) 6195 if _ctx is None or not _ctx._eager_context.is_eager: 6196 _, _, _op = _op_def_lib._apply_op_helper( -> 6197 "Reshape", tensor=tensor, shape=shape, name=name) 6198 _result = _op.outputs[:] 6199 _inputs_flat = _op.inputs

    /miniconda/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py in _apply_op_helper(self, op_type_name, name, **keywords) 526 raise ValueError( 527 "Tried to convert '%s' to a tensor and failed. Error: %s" % --> 528 (input_name, err)) 529 prefix = ("Input '%s' of '%s' Op has type %s that does not match" % 530 (input_name, op_type_name, observed))

    ValueError: Tried to convert 'shape' to a tensor and failed. Error: None values not supported.

    opened by ShibaPrasad 32
  • How should training be changed when using image augmentation?

    I've got a nice augmentation setup using the great imgaug library that was chosen for this project but I'm not certain how I should change my training parameters given this augmentation.

    I'm training basically using the same example from the imgaug documentation with some minor changes:

    import imgaug.augmenters as iaa

    seq = iaa.Sometimes(0.833, iaa.Sequential([
        iaa.Fliplr(0.5), # horizontal flips
        iaa.Crop(percent=(0, 0.1)), # random crops
        # Small gaussian blur with random sigma between 0 and 0.5.
        # But we only blur about 50% of all images.
        iaa.Sometimes(0.5,
            iaa.GaussianBlur(sigma=(0, 0.5))
        ),
        # Strengthen or weaken the contrast in each image.
        iaa.ContrastNormalization((0.75, 1.5)),
        # Add gaussian noise.
        # For 50% of all images, we sample the noise once per pixel.
        # For the other 50% of all images, we sample the noise per pixel AND
        # channel. This can change the color (not only brightness) of the
        # pixels.
        iaa.AdditiveGaussianNoise(loc=0, scale=(0.0, 0.05*255), per_channel=0.5),
        # Make some images brighter and some darker.
        # In 20% of all cases, we sample the multiplier once per channel,
        # which can end up changing the color of the images.
        iaa.Multiply((0.8, 1.2), per_channel=0.2),
        # Apply affine transformations to each image.
        # Scale/zoom them, translate/move them, rotate them and shear them.
        iaa.Affine(
            scale={"x": (0.8, 1.2), "y": (0.8, 1.2)},
            translate_percent={"x": (-0.2, 0.2), "y": (-0.2, 0.2)},
            rotate=(-25, 25),
            shear=(-8, 8)
        )
    ], random_order=True)) # apply augmenters in random order
    

    So given that this code basically runs these augmentations randomly for 5 out of 6 images should I increase my steps per epoch x6? Should I increase my num of epochs x6? How does the augmentation actually get applied during training?

    I see it is applied in load_image_gt() which is called in the data_generator() but I'm not totally clear what should change in my training regimen based on these augmentations.
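
To illustrate the behaviour described in this post, here is a minimal sketch of on-the-fly augmentation applied with probability 5/6 each time an image is loaded. The `load_image` and `run_epoch` names and the string stand-in for an augmented image are hypothetical; the sketch only assumes, as noted above, that augmentation happens inside the loading step used by the data generator.

```python
import random


def load_image(image, augment_prob, rng):
    """Return the image, augmented with probability `augment_prob`."""
    if rng.random() < augment_prob:
        return image + "*"  # "*" marks an augmented copy in this toy example
    return image


def run_epoch(images, augment_prob, rng):
    """Load every image once; augmentation is decided again on each load."""
    return [load_image(img, augment_prob, rng) for img in images]


rng = random.Random(0)
images = ["img%d" % i for i in range(6000)]
first = run_epoch(images, 5 / 6, rng)
second = run_epoch(images, 5 / 6, rng)

augmented = sum(1 for img in first if img.endswith("*"))
print(len(first), augmented / len(first))
```

Under this assumption the number of images loaded per epoch stays the same; what changes from epoch to epoch is which images are augmented and, with real augmenters, which random variant of each image is seen.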

    opened by patrickcgray 32
  • Train own Dataset by taking actual images

    Hi,

    Thanks a lot for the awesome repository.

    I went through the train_shapes file, which describes how to train on our own dataset.

    But everything there is done with randomly generated data. Could you explain the same process using actual images that have ground-truth mask, class, and bounding-box information?

    Regards, Pirag

    opened by pirahagvp 31
  • Systematic analysis of the multi GPU problem

    EDIT: Actually there is no problem when using multiple GPUs. Several other people and I were merely confused about the time/step values that are output during training. We mixed them up with time/image, whereas they are time/batch, and the batch size is proportional to both GPU_COUNT and IMAGES_PER_GPU. Therefore it is perfectly normal for the time/step to increase by a factor <= 2 when increasing GPU_COUNT from 1 to 2.
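
The point in the EDIT above can be checked with simple arithmetic. This sketch assumes the batch size is GPU_COUNT × IMAGES_PER_GPU, as described; the function name and the timings are made up for illustration.

```python
def seconds_per_image(seconds_per_step, gpu_count, images_per_gpu):
    """Convert a reported time/step into time/image.

    One step processes one batch of gpu_count * images_per_gpu images.
    """
    batch_size = gpu_count * images_per_gpu
    return seconds_per_step / batch_size


# Hypothetical timings: the step gets slower with two GPUs...
one_gpu = seconds_per_image(1.0, gpu_count=1, images_per_gpu=2)
two_gpus = seconds_per_image(1.5, gpu_count=2, images_per_gpu=2)
# ...but each image is still processed in less time.
print(one_gpu, two_gpus)
```

With these made-up numbers the time/step grows by a factor of 1.5 while the time/image drops, which is the situation the EDIT describes as normal.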

    Hello everybody, along with others (@waleedka, @ericj974, @schmidje, @liangbo-1, @dil, @kmh4321, @zgxsin, @YubinXie, @pieterbl86), I'm searching for a solution to the problem that using multiple GPUs can have a severely negative impact on training speed (see #589, #676, #708, #710). There are, however, users who claim that they are running or have run configurations that don't have this problem. Therefore I'd like to run a poll to see how many users are affected by this problem.

    Poll: I already encountered a configuration, with which I was able to train significantly faster on multiple GPUs than on a single GPU.

    Obviously, the poll alone does not help too much. Therefore, I'd like to gather working and non-working configurations. Please include the following information in your post:

    • Did it work?
    • Number of GPUs (GPU_COUNT)
    • Number of images per GPU (IMAGES_PER_GPU)
    • utilized script (e.g. train_shapes.ipynb)
    • GPU model(s)
    • OS
    • CUDA version
    • cuDNN version
    • Driver version
    • Mask-RCNN version (URL of the repo and hash specifying the state of the repo)
    • Tensorflow version
    • Keras version
    • approx. speed up

    Thank you very much for your help.

    opened by maxfrei750 30
  • Training cause ERROR:root:Error processing image

    I am training on my own dataset and used the VGG Image Annotator (VIA) to segment the images, but it causes an error when I run

    model.train(dataset_train, dataset_val, 
                learning_rate=config.LEARNING_RATE, 
                epochs=1, 
                layers='heads')
    

    Here's the traceback:

    Starting at epoch 0. LR=0.001
    
    Checkpoint Path: C:\Users\Instructor\Desktop\Mask_RCNN\samples\landing\logs\landing20180604T1145\mask_rcnn_landing_{epoch:04d}.h5
    Selecting layers to train
    fpn_c5p5               (Conv2D)
    fpn_c4p4               (Conv2D)
    fpn_c3p3               (Conv2D)
    fpn_c2p2               (Conv2D)
    fpn_p5                 (Conv2D)
    fpn_p2                 (Conv2D)
    fpn_p3                 (Conv2D)
    fpn_p4                 (Conv2D)
    In model:  rpn_model
        rpn_conv_shared        (Conv2D)
        rpn_class_raw          (Conv2D)
        rpn_bbox_pred          (Conv2D)
    mrcnn_mask_conv1       (TimeDistributed)
    mrcnn_mask_bn1         (TimeDistributed)
    mrcnn_mask_conv2       (TimeDistributed)
    mrcnn_mask_bn2         (TimeDistributed)
    mrcnn_class_conv1      (TimeDistributed)
    mrcnn_class_bn1        (TimeDistributed)
    mrcnn_mask_conv3       (TimeDistributed)
    mrcnn_mask_bn3         (TimeDistributed)
    mrcnn_class_conv2      (TimeDistributed)
    mrcnn_class_bn2        (TimeDistributed)
    mrcnn_mask_conv4       (TimeDistributed)
    mrcnn_mask_bn4         (TimeDistributed)
    mrcnn_bbox_fc          (TimeDistributed)
    mrcnn_mask_deconv      (TimeDistributed)
    mrcnn_class_logits     (TimeDistributed)
    mrcnn_mask             (TimeDistributed)
    c:\users\instructor\appdata\local\programs\python\python35\lib\site-packages\tensorflow\python\ops\gradients_impl.py:100: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
      "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
    Epoch 1/1
      7/100 [=>............................] - ETA: 9:47 - loss: 2.7469 - rpn_class_loss: 0.1161 - rpn_bbox_loss: 1.0056 - mrcnn_class_loss: 1.1608 - mrcnn_bbox_loss: 0.2483 - mrcnn_mask_loss: 0.2161     
    ERROR:root:Error processing image {'path': 'C:\\Users\\Instructor\\Desktop\\Mask_RCNN\\samples\\landing\\dataset\\train\\flying 12.jpg', 'polygons': [{'name': 'polygon', 'all_points_x': [2, 102, 175, 214, 238, 275, 328, 391, 413, 435, 489, 519, 556, 551, 538, 557, 599, 629, 643, 652, 658, 685, 710, 711, 716, 721, 733, 759, 778, 797, 804, 808, 813, 840, 852, 864, 856, 856, 876, 883, 882, 870, 856, 832, 813, 788, 758, 746, 718, 705, 701, 710, 725, 715, 726, 734, 716, 713, 713, 686, 671, 655, 636, 630, 610, 607, 598, 614, 614, 610, 613, 1, 2], 'all_points_y': [660, 668, 677, 670, 644, 613, 610, 598, 587, 602, 588, 556, 524, 503, 473, 447, 456, 446, 428, 399, 368, 353, 342, 322, 294, 271, 258, 243, 241, 235, 208, 190, 173, 167, 181, 195, 207, 224, 230, 254, 296, 316, 347, 338, 337, 325, 325, 340, 368, 392, 415, 448, 456, 467, 485, 498, 511, 533, 548, 570, 569, 573, 588, 608, 615, 636, 646, 663, 691, 708, 718, 721, 660]}, {'name': 'polygon', 'all_points_x': [607, 612, 613, 597, 606, 609, 628, 634, 650, 667, 682, 714, 714, 713, 735, 714, 723, 711, 700, 705, 718, 755, 781, 854, 882, 882, 875, 853, 852, 863, 837, 810, 796, 752, 721, 707, 655, 642, 627, 595, 555, 536, 552, 486, 431, 411, 382, 325, 269, 211, 170, -1, -1, 1279, 1277, 607], 'all_points_y': [720, 693, 663, 644, 638, 617, 608, 586, 575, 566, 569, 548, 531, 511, 495, 464, 456, 447, 414, 388, 368, 327, 324, 346, 295, 254, 232, 222, 207, 196, 166, 171, 234, 244, 268, 341, 367, 428, 444, 458, 446, 471, 523, 588, 602, 587, 600, 609, 612, 669, 676, 659, 2, 2, 720, 720]}], 'width': 1280, 'id': 'flying 12.jpg', 'source': 'landing', 'height': 720}
    Traceback (most recent call last):
      File "c:\users\instructor\appdata\local\programs\python\python35\lib\site-packages\mask_rcnn-2.1-py3.5.egg\mrcnn\model.py", line 1695, in data_generator
        use_mini_mask=config.USE_MINI_MASK)
      File "c:\users\instructor\appdata\local\programs\python\python35\lib\site-packages\mask_rcnn-2.1-py3.5.egg\mrcnn\model.py", line 1210, in load_image_gt
        mask, class_ids = dataset.load_mask(image_id)
      File "<ipython-input-4-071b0f25d21e>", line 66, in load_mask
        mask[rr, cc, i] = 1
    IndexError: index 720 is out of bounds for axis 0 with size 720
     15/100 [===>..........................] - ETA: 8:15 - loss: 2.1900 - rpn_class_loss: 0.1246 - rpn_bbox_loss: 1.0832 - mrcnn_class_loss: 0.6205 - mrcnn_bbox_loss: 0.1760 - mrcnn_mask_loss: 0.1857
    ERROR:root:Error processing image {'path': 'C:\\Users\\Instructor\\Desktop\\Mask_RCNN\\samples\\landing\\dataset\\train\\flying 22.jpg', 'polygons': [{'name': 'polygon', 'all_points_x': [1086, 1279, 1278, 198, 279, 422, 419, 381, 372, 144, 153, 281, 196, 0, 1, 815, 805, 849, 900, 891, 932, 994, 1086], 'all_points_y': [716, 718, 2, 1, 112, 109, 128, 129, 149, 151, 111, 114, 1, 1, 719, 718, 665, 641, 593, 557, 576, 639, 716]}, {'name': 'polygon', 'all_points_x': [154, 143, 371, 379, 418, 419, 154], 'all_points_y': [113, 147, 149, 129, 128, 111, 113]}, {'name': 'polygon', 'all_points_x': [811, 808, 842, 874, 902, 893, 934, 987, 1086, 809, 811], 'all_points_y': [717, 663, 644, 616, 594, 555, 578, 635, 717, 722, 717]}], 'width': 1280, 'id': 'flying 22.jpg', 'source': 'landing', 'height': 720}
    Traceback (most recent call last):
      File "c:\users\instructor\appdata\local\programs\python\python35\lib\site-packages\mask_rcnn-2.1-py3.5.egg\mrcnn\model.py", line 1695, in data_generator
        use_mini_mask=config.USE_MINI_MASK)
      File "c:\users\instructor\appdata\local\programs\python\python35\lib\site-packages\mask_rcnn-2.1-py3.5.egg\mrcnn\model.py", line 1210, in load_image_gt
        mask, class_ids = dataset.load_mask(image_id)
      File "<ipython-input-4-071b0f25d21e>", line 66, in load_mask
        mask[rr, cc, i] = 1
    IndexError: index 720 is out of bounds for axis 0 with size 720
     27/100 [=======>......................] - ETA: 6:36 - loss: 2.2996 - rpn_class_loss: 0.1214 - rpn_bbox_loss: 1.0273 - mrcnn_class_loss: 0.4554 - mrcnn_bbox_loss: 0.3613 - mrcnn_mask_loss: 0.3342
    ERROR:root:Error processing image {'path': 'C:\\Users\\Instructor\\Desktop\\Mask_RCNN\\samples\\landing\\dataset\\train\\flying 06.jpg', 'polygons': [{'name': 'polygon', 'all_points_x': [381, 430, 423, 412, 355, 305, 283, 290, 320, 364, 408, 464, 482, 489, 520, 584, 623, 682, 734, 783, 818, 818, 1014, 1039, 1027, 1035, 1057, 1078, 1081, 1091, 1078, 1050, 1012, 820, 812, 828, 815, 842, 848, 860, 894, 889, 861, 852, 881, 902, 1003, 1281, 1275, 1, 3, 381], 'all_points_y': [718, 684, 670, 661, 692, 662, 635, 592, 555, 522, 505, 501, 467, 442, 433, 447, 428, 430, 426, 403, 417, 417, 282, 252, 223, 205, 208, 204, 240, 264, 311, 318, 283, 419, 441, 479, 512, 555, 553, 585, 606, 627, 634, 661, 678, 669, 721, 719, 1, 3, 719, 718]}, {'name': 'polygon', 'all_points_x': [383, 434, 423, 410, 355, 302, 281, 290, 319, 361, 406, 465, 480, 488, 517, 555, 582, 619, 680, 737, 772, 785, 819, 810, 822, 827, 810, 823, 836, 849, 857, 882, 891, 887, 858, 852, 883, 901, 1003, 619, 598, 578, 560, 533, 522, 507, 494, 490, 383], 'all_points_y': [716, 682, 667, 660, 689, 656, 630, 591, 553, 522, 507, 501, 471, 440, 435, 444, 445, 430, 431, 427, 409, 402, 417, 436, 461, 479, 513, 533, 553, 553, 585, 600, 603, 626, 632, 659, 677, 668, 719, 716, 688, 715, 693, 688, 702, 694, 708, 719, 716]}, {'name': 'polygon', 'all_points_x': [1021, 1014, 1035, 1050, 1074, 1082, 1086, 1078, 1079, 1077, 1056, 1035, 1026, 1037, 1030, 1021], 'all_points_y': [270, 284, 302, 314, 306, 283, 260, 240, 216, 203, 206, 203, 217, 245, 264, 270]}], 'width': 1280, 'id': 'flying 06.jpg', 'source': 'landing', 'height': 720}
    Traceback (most recent call last):
      File "c:\users\instructor\appdata\local\programs\python\python35\lib\site-packages\mask_rcnn-2.1-py3.5.egg\mrcnn\model.py", line 1695, in data_generator
        use_mini_mask=config.USE_MINI_MASK)
      File "c:\users\instructor\appdata\local\programs\python\python35\lib\site-packages\mask_rcnn-2.1-py3.5.egg\mrcnn\model.py", line 1210, in load_image_gt
        mask, class_ids = dataset.load_mask(image_id)
      File "<ipython-input-4-071b0f25d21e>", line 66, in load_mask
        mask[rr, cc, i] = 1
    IndexError: index 720 is out of bounds for axis 0 with size 720
     45/100 [============>.................] - ETA: 4:57 - loss: 2.3798 - rpn_class_loss: 0.1041 - rpn_bbox_loss: 0.9614 - mrcnn_class_loss: 0.3517 - mrcnn_bbox_loss: 0.5000 - mrcnn_mask_loss: 0.4627
    ERROR:root:Error processing image {'path': 'C:\\Users\\Instructor\\Desktop\\Mask_RCNN\\samples\\landing\\dataset\\train\\flying 05.jpg', 'polygons': [{'name': 'polygon', 'all_points_x': [307, 421, 423, 414, 314, 298, 291, 293, 287, 280, 288, 280, 270, 276, 281, 307, 340, 381, 418, 467, 476, 473, 549, 607, 690, 705, 731, 788, 795, 808, 802, 808, 824, 816, 806, 817, 846, 847, 853, 875, 889, 894, 888, 893, 889, 855, 929, 1023, 1061, 1016, 1282, 1281, 4, 3, 307], 'all_points_y': [718, 660, 645, 636, 677, 653, 624, 641, 623, 614, 607, 589, 525, 504, 468, 441, 424, 433, 469, 454, 426, 407, 393, 383, 402, 397, 412, 392, 390, 401, 418, 419, 437, 455, 479, 509, 522, 540, 559, 577, 582, 585, 597, 597, 602, 624, 653, 698, 717, 715, 717, -1, -1, 718, 718]}, {'name': 'polygon', 'all_points_x': [291, 314, 407, 411, 421, 421, 422, 365, 330, 306, 478, 472, 478, 485, 485, 484, 483, 484, 488, 492, 496, 506, 520, 526, 533, 549, 558, 568, 576, 588, 602, 603, 668, 737, 775, 856, 1063, 850, 886, 882, 892, 891, 853, 849, 846, 830, 817, 810, 805, 818, 821, 809, 809, 801, 809, 801, 795, 786, 774, 735, 708, 690, 606, 559, 475, 479, 474, 464, 416, 377, 341, 308, 279, 274, 270, 270, 276, 278, 278, 289, 277, 290, 291], 'all_points_y': [640, 672, 640, 634, 642, 650, 658, 690, 706, 720, 719, 706, 704, 704, 698, 694, 688, 680, 675, 668, 660, 660, 668, 656, 650, 654, 668, 676, 690, 669, 678, 692, 699, 686, 680, 719, 716, 623, 602, 609, 576, 591, 560, 541, 523, 518, 511, 493, 477, 453, 438, 417, 415, 418, 403, 395, 391, 394, 396, 412, 398, 401, 385, 394, 406, 423, 441, 458, 471, 432, 424, 439, 472, 504, 521, 534, 592, 595, 595, 606, 616, 626, 640]}, {'name': 'polygon', 'all_points_x': [586, 576, 567, 553, 546, 534, 526, 521, 507, 498, 486, 487, 477, 476, 480, 586], 'all_points_y': [717, 690, 679, 667, 656, 654, 656, 670, 662, 662, 678, 699, 705, 704, 719, 717]}], 'width': 1280, 'id': 'flying 05.jpg', 'source': 'landing', 'height': 720}
    Traceback (most recent call last):
      File "c:\users\instructor\appdata\local\programs\python\python35\lib\site-packages\mask_rcnn-2.1-py3.5.egg\mrcnn\model.py", line 1695, in data_generator
        use_mini_mask=config.USE_MINI_MASK)
      File "c:\users\instructor\appdata\local\programs\python\python35\lib\site-packages\mask_rcnn-2.1-py3.5.egg\mrcnn\model.py", line 1210, in load_image_gt
        mask, class_ids = dataset.load_mask(image_id)
      File "<ipython-input-4-071b0f25d21e>", line 66, in load_mask
        mask[rr, cc, i] = 1
    IndexError: index 1280 is out of bounds for axis 1 with size 1280
     71/100 [====================>.........] - ETA: 2:37 - loss: 2.2839 - rpn_class_loss: 0.0836 - rpn_bbox_loss: 0.9078 - mrcnn_class_loss: 0.2698 - mrcnn_bbox_loss: 0.5224 - mrcnn_mask_loss: 0.5003
    ERROR:root:Error processing image {'path': 'C:\\Users\\Instructor\\Desktop\\Mask_RCNN\\samples\\landing\\dataset\\train\\flying 07.jpg', 'polygons': [{'name': 'polygon', 'all_points_x': [380, 429, 422, 411, 354, 304, 282, 293, 324, 367, 405, 467, 482, 491, 516, 586, 626, 682, 741, 786, 817, 817, 1013, 1038, 1026, 1034, 1056, 1077, 1080, 1090, 1077, 1049, 1011, 819, 811, 827, 814, 841, 847, 859, 885, 888, 860, 851, 880, 901, 1002, 1280, 1276, 4, 2, 380], 'all_points_y': [718, 684, 670, 661, 692, 662, 635, 595, 565, 530, 513, 510, 478, 447, 441, 454, 436, 435, 435, 409, 417, 417, 282, 252, 223, 205, 208, 204, 240, 264, 311, 318, 283, 419, 441, 479, 512, 555, 553, 585, 607, 627, 634, 661, 678, 669, 721, 719, 3, 5, 719, 718]}, {'name': 'polygon', 'all_points_x': [386, 437, 426, 408, 357, 305, 284, 293, 322, 364, 409, 468, 483, 491, 520, 558, 585, 622, 683, 740, 775, 788, 822, 813, 825, 830, 813, 826, 839, 852, 860, 885, 884, 890, 861, 855, 886, 904, 1006, 622, 601, 581, 563, 536, 525, 510, 497, 493, 386], 'all_points_y': [724, 690, 675, 664, 693, 664, 638, 599, 561, 530, 515, 509, 479, 448, 443, 452, 453, 438, 439, 435, 417, 410, 425, 444, 469, 487, 521, 541, 561, 561, 593, 608, 613, 634, 640, 667, 685, 676, 727, 724, 696, 723, 701, 696, 710, 702, 716, 727, 724]}, {'name': 'polygon', 'all_points_x': [1021, 1014, 1035, 1050, 1074, 1082, 1086, 1078, 1079, 1077, 1056, 1035, 1026, 1037, 1030, 1021], 'all_points_y': [270, 284, 302, 314, 306, 283, 260, 240, 216, 203, 206, 203, 217, 245, 264, 270]}], 'width': 1280, 'id': 'flying 07.jpg', 'source': 'landing', 'height': 720}
    Traceback (most recent call last):
      File "c:\users\instructor\appdata\local\programs\python\python35\lib\site-packages\mask_rcnn-2.1-py3.5.egg\mrcnn\model.py", line 1695, in data_generator
        use_mini_mask=config.USE_MINI_MASK)
      File "c:\users\instructor\appdata\local\programs\python\python35\lib\site-packages\mask_rcnn-2.1-py3.5.egg\mrcnn\model.py", line 1210, in load_image_gt
        mask, class_ids = dataset.load_mask(image_id)
      File "<ipython-input-4-071b0f25d21e>", line 66, in load_mask
        mask[rr, cc, i] = 1
    IndexError: index 720 is out of bounds for axis 0 with size 720
    IndexError                                Traceback (most recent call last)
    <ipython-input-11-83fb3ae74319> in <module>()
          6             learning_rate=config.LEARNING_RATE,
          7             epochs=1,
    ----> 8             layers='heads')
    
    c:\users\instructor\appdata\local\programs\python\python35\lib\site-packages\mask_rcnn-2.1-py3.5.egg\mrcnn\model.py in train(self, train_dataset, val_dataset, learning_rate, epochs, layers, augmentation)
       2326             max_queue_size=100,
       2327             workers=workers,
    -> 2328             use_multiprocessing=True,
       2329         )
       2330         self.epoch = max(self.epoch, epochs)
    
    c:\users\instructor\appdata\local\programs\python\python35\lib\site-packages\keras\legacy\interfaces.py in wrapper(*args, **kwargs)
         89                 warnings.warn('Update your `' + object_name +
         90                               '` call to the Keras 2 API: ' + signature, stacklevel=2)
    ---> 91             return func(*args, **kwargs)
         92         wrapper._original_function = func
         93         return wrapper
    
    c:\users\instructor\appdata\local\programs\python\python35\lib\site-packages\keras\engine\training.py in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
       2192                 batch_index = 0
       2193                 while steps_done < steps_per_epoch:
    -> 2194                     generator_output = next(output_generator)
       2195 
       2196                     if not hasattr(generator_output, '__len__'):
    
    c:\users\instructor\appdata\local\programs\python\python35\lib\site-packages\mask_rcnn-2.1-py3.5.egg\mrcnn\model.py in data_generator(dataset, config, shuffle, augment, augmentation, random_rois, batch_size, detection_targets)
       1693                 load_image_gt(dataset, config, image_id, augment=augment,
       1694                               augmentation=augmentation,
    -> 1695                               use_mini_mask=config.USE_MINI_MASK)
       1696 
       1697             # Skip images that have no instances. This can happen in cases
    
    c:\users\instructor\appdata\local\programs\python\python35\lib\site-packages\mask_rcnn-2.1-py3.5.egg\mrcnn\model.py in load_image_gt(dataset, config, image_id, augment, augmentation, use_mini_mask)
       1208     # Load image and mask
       1209     image = dataset.load_image(image_id)
    -> 1210     mask, class_ids = dataset.load_mask(image_id)
       1211     original_shape = image.shape
       1212     image, window, scale, padding, crop = utils.resize_image(
    
    <ipython-input-4-071b0f25d21e> in load_mask(self, image_id)
         64             # Get indexes of pixels inside the polygon and set them to 1
         65             rr, cc = skimage.draw.polygon(p['all_points_y'], p['all_points_x'])
    ---> 66             mask[rr, cc, i] = 1
         67 
         68         # Return mask, and array of class IDs of each instance. Since we have
    
    IndexError: index 720 is out of bounds for axis 0 with size 720
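
The logged annotations contain points on or beyond the image border (for example y values of 720 and 721 in a 720-pixel-high image, and x values of -1), which is consistent with the IndexError above. One possible workaround, sketched here under that assumption, is to clamp the polygon points to the valid pixel range before rasterizing them. `clamp_polygon` is a hypothetical helper, not part of the repository. `skimage.draw.polygon` also accepts a `shape` argument that clips the returned indices, which may be the simpler option inside `load_mask`.

```python
def clamp_polygon(all_points_x, all_points_y, width, height):
    """Clamp polygon vertices so every index is valid for a (height, width) mask."""
    xs = [min(max(x, 0), width - 1) for x in all_points_x]
    ys = [min(max(y, 0), height - 1) for y in all_points_y]
    return xs, ys


# A few out-of-range values of the kind that appear in the log above.
xs, ys = clamp_polygon([-1, 1279, 1281, 607], [659, 2, 720, 721],
                       width=1280, height=720)
print(xs)  # [0, 1279, 1279, 607]
print(ys)  # [659, 2, 719, 719]
```

Clamping slightly changes the polygon shape at the border, so fixing the annotations themselves is preferable when that is practical.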
    
    opened by jmarrr 27
  • error: operands could not be broadcast together with shapes (1280,1280,4) (3,)

    My training has gone fine, but I'm getting the following error when I try to infer from an image. I'm using the balloon template to apply a splash image. Any help would be very much appreciated.

    image                    shape: (3000, 4000, 4)       min:    0.00000  max:  255.00000  uint8
    Traceback (most recent call last):
      File "test.py", line 374, in <module>
        video_path=args.video)
      File "test.py", line 238, in detect_and_color_splash
        r = model.detect([image], verbose=1)[0]
      File "/scratch/df525/Mask_RCNN-master/mrcnn/model.py", line 2457, in detect
        molded_images, image_metas, windows = self.mold_inputs(images)
      File "/scratch/df525/Mask_RCNN-master/mrcnn/model.py", line 2356, in mold_inputs
        molded_image = mold_image(molded_image, self.config)
      File "/scratch/df525/Mask_RCNN-master/mrcnn/model.py", line 2757, in mold_image
        return images.astype(np.float32) - config.MEAN_PIXEL
    ValueError: operands could not be broadcast together with shapes (1280,1280,4) (3,) 
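
The shapes in the message suggest that the input image has four channels (for example RGBA), while the mean pixel that is subtracted has three values. A minimal sketch of one way to handle this, assuming the fourth channel is an alpha channel that can be discarded, follows. `to_rgb` is a hypothetical helper, and the `MEAN_PIXEL` values are only illustrative.

```python
import numpy as np

# Illustrative 3-value mean pixel; the real value comes from the model config.
MEAN_PIXEL = np.array([123.7, 116.8, 103.9])


def to_rgb(image):
    """Return a 3-channel image: replicate grayscale, drop a 4th (alpha) channel."""
    if image.ndim == 2:
        return np.stack([image] * 3, axis=-1)
    if image.shape[-1] == 4:
        return image[..., :3]
    return image


rgba = np.zeros((8, 8, 4), dtype=np.uint8)

# Subtracting from the 4-channel array would raise the broadcast ValueError.
molded = to_rgb(rgba).astype(np.float32) - MEAN_PIXEL
print(molded.shape)  # (8, 8, 3)
```

Calling such a helper on the image before `model.detect([image])` would keep the mean subtraction well defined.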
    
    
    opened by doingRsrch 26
  • How do I interpret evaluation results on my custom dataset?

    After training, I ran the evaluation code on 500 images and received the following results. Can anyone help me understand these results? Any help is much appreciated.

    Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.061
     Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.075
     Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.059
     Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
     Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
     Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.094
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.099
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.099
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.099
     Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
     Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.149
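
As background for reading this table: each row counts a detection as correct only if its intersection over union (IoU) with a ground-truth object reaches the stated threshold. `IoU=0.50:0.95` averages over thresholds from 0.50 to 0.95 in steps of 0.05, and the small, medium, and large rows split objects by pixel area. The sketch below shows the IoU test for a pair of boxes. `box_iou` is a hypothetical helper and the boxes are made-up values.

```python
def box_iou(a, b):
    """IoU of two boxes given as (y1, x1, y2, x2)."""
    inter_h = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


gt = (0, 0, 10, 10)    # ground-truth box
pred = (0, 2, 10, 12)  # prediction shifted 2 pixels to the right
iou = box_iou(gt, pred)

# This prediction is a match at IoU=0.50 but a miss at IoU=0.75.
print(round(iou, 3), iou >= 0.50, iou >= 0.75)
```

Under that reading, AP values that are low at every threshold, as in the table above, mean that few detections match a ground-truth object even under the loosest IoU=0.50 criterion.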
    
    opened by Suvi-dha 24
  • Update model.py

    opened by Ben-geo 0
  • How to use ScanNet dataset for instance segmentation?

    Hi,

    Because I need to train a model suitable for indoor instance segmentation, I need to use the ScanNet dataset. To this end, I downloaded scannet_frames_25k as a data set, and its structure is as follows:

    scannet_frames_25k/
        scene0000_00/
            color/
            depth/
            instance/
            label/
            pose/
            intrinsics_color.txt
            intrinsics_depth.txt
        scene0000_01/
        ...
    

    The rgb images are stored in the color folder, and the semantic segmentation images are stored in the label folder.

    But if I want to do instance segmentation, I must have annotation files, so the question is, how do I process this ScanNet dataset to get a training/val set and annotation files available for this Mask_RCNN?
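
One possible direction, sketched under assumptions the post does not confirm: if each image in the instance folder stores an integer instance ID per pixel, with 0 meaning "no instance", it can be split into the boolean mask stack of shape [height, width, instance count] that a `load_mask` method in this repository returns. `instance_image_to_masks` is a hypothetical helper, and the toy array stands in for a decoded image.

```python
import numpy as np


def instance_image_to_masks(instance_img, background_id=0):
    """Split an instance-ID image into one boolean mask per instance."""
    ids = [int(i) for i in np.unique(instance_img) if i != background_id]
    if not ids:
        # No instances: return an empty stack with the right height and width.
        return np.zeros(instance_img.shape + (0,), dtype=bool), ids
    masks = np.stack([instance_img == i for i in ids], axis=-1)
    return masks, ids


# Toy 2x3 "instance image" with two instances (IDs 1 and 2).
toy = np.array([[0, 1, 1],
                [2, 2, 0]])
masks, ids = instance_image_to_masks(toy)
print(masks.shape, ids)  # (2, 3, 2) [1, 2]
```

The class ID of each instance would still have to be looked up separately, for example from the matching pixels in the label folder; that step is not covered here.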

    opened by ydzat 0
  • KeyError: 'names'

    I am facing this error when I try to run train(model). I am sure there is no error in my syntax. Please help me!

    KeyError                                  Traceback (most recent call last)
    in
    ----> 3 train(model)

    in train(model)
          3 # Training dataset.
          4 dataset_train = CustomDataset()
    ----> 5 dataset_train.load_custom("C:\Users\User\OneDrive - Universiti Teknologi MARA\FYP\skin burn\MaskRCNN-main\Dataset", "train")
          6 dataset_train.prepare()
          7

    in load_custom(self, dataset_dir, subset)
         45 # shape_attributes (see json format above)
         46 polygons = [r['shape_attributes'] for r in a['regions']]
    ---> 47 objects = [s['region_attributes']['names'] for s in a['regions']]
         48 print("objects:",objects)
         49 #name_dict = {"laptop": 1,"tab": 2,"phone": 3}

    in (.0)
         45 # shape_attributes (see json format above)
         46 polygons = [r['shape_attributes'] for r in a['regions']]
    ---> 47 objects = [s['region_attributes']['names'] for s in a['regions']]
         48 print("objects:",objects)
         49 #name_dict = {"laptop": 1,"tab": 2,"phone": 3}

    KeyError: 'names'
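
A KeyError on 'names' means that at least one region's `region_attributes` dictionary has no key with that exact name, for instance because the attribute was given a different name in the annotation tool. A minimal sketch for checking which keys are actually present follows. The sample annotation and the `region_label` helper are hypothetical.

```python
def region_label(region, preferred_key="names"):
    """Return the label of one region, with a clearer error if the key is absent."""
    attrs = region.get("region_attributes", {})
    if preferred_key in attrs:
        return attrs[preferred_key]
    raise KeyError("region_attributes has keys {} but not {!r}".format(
        sorted(attrs), preferred_key))


# Hypothetical annotation whose attribute is called "name", not "names".
regions = [
    {"shape_attributes": {"name": "polygon"},
     "region_attributes": {"name": "burn"}},
]

# List every attribute key that occurs, to see what the JSON really contains.
available = sorted({key for r in regions for key in r["region_attributes"]})
print(available)  # ['name']
```

If the printed keys differ from 'names', either the lookup in `load_custom` or the attribute name in the annotation file needs to change so that the two agree.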

    opened by NazriHariz 0
  • TensorBoard callback with histogram = 1 gives the following error!

    TypeError                                 Traceback (most recent call last)
    in
          6 learning_rate=config.LEARNING_RATE,
          7 epochs=5,
    ----> 8 layers='heads')

    ~/project/try2/Mask_RCNN/mrcnn/model.py in train(self, train_dataset, val_dataset, learning_rate, epochs, layers, augmentation, custom_callbacks, no_augmentation_sources)
       2381 max_queue_size=100,
       2382 workers=workers,
    -> 2383 use_multiprocessing=True,
       2384 )
       2385 self.epoch = max(self.epoch, epochs)

    ~/anaconda3/envs/try2/lib/python3.7/site-packages/keras/legacy/interfaces.py in wrapper(*args, **kwargs)
         89 warnings.warn('Update your ' + object_name + ' call to the ' +
         90 'Keras 2 API: ' + signature, stacklevel=2)
    ---> 91 return func(*args, **kwargs)
         92 wrapper._original_function = func
         93 return wrapper

    ~/anaconda3/envs/try2/lib/python3.7/site-packages/keras/engine/training.py in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
       1416 use_multiprocessing=use_multiprocessing,
       1417 shuffle=shuffle,
    -> 1418 initial_epoch=initial_epoch)
       1419
       1420 @interfaces.legacy_generator_methods_support

    ~/anaconda3/envs/try2/lib/python3.7/site-packages/keras/engine/training_generator.py in fit_generator(model, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
         92 else:
         93 callback_model = model
    ---> 94 callbacks.set_model(callback_model)
         95 callbacks.set_params({
         96 'epochs': epochs,

    ~/anaconda3/envs/try2/lib/python3.7/site-packages/keras/callbacks.py in set_model(self, model)
         52 def set_model(self, model):
         53 for callback in self.callbacks:
    ---> 54 callback.set_model(model)
         55
         56 def on_epoch_begin(self, epoch, logs=None):

    ~/anaconda3/envs/try2/lib/python3.7/site-packages/keras/callbacks.py in set_model(self, model)
        847 else:
        848 tf.compat.v1.summary.histogram('{}_out'.format(layer.name),
    --> 849 layer.output)
        850 self.merged = tf.summary.merge_all()
        851

    ~/anaconda3/envs/try2/lib/python3.7/site-packages/tensorflow/python/summary/summary.py in histogram(name, values, collections, family)
        177 default_name='HistogramSummary') as (tag, scope):
        178 val = _gen_logging_ops.histogram_summary(
    --> 179 tag=tag, values=values, name=scope)
        180 _summary_op_util.collect(val, collections, [_ops.GraphKeys.SUMMARIES])
        181 return val

    ~/anaconda3/envs/try2/lib/python3.7/site-packages/tensorflow/python/ops/gen_logging_ops.py in histogram_summary(tag, values, name)
        327 # Add nodes to the TensorFlow graph.
        328 _, _, _op = _op_def_lib._apply_op_helper(
    --> 329 "HistogramSummary", tag=tag, values=values, name=name)
        330 _result = _op.outputs[:]
        331 _inputs_flat = _op.inputs

    ~/anaconda3/envs/try2/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py in _apply_op_helper(self, op_type_name, name, **keywords)
        624 _SatisfiesTypeConstraint(base_type,
        625 _Attr(op_def, input_arg.type_attr),
    --> 626 param_name=input_name)
        627 attrs[input_arg.type_attr] = attr_value
        628 inferred_from[input_arg.type_attr] = input_name

    ~/anaconda3/envs/try2/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py in _SatisfiesTypeConstraint(dtype, attr_def, param_name)
         58 "allowed values: %s" %
         59 (param_name, dtypes.as_dtype(dtype).name,
    ---> 60 ", ".join(dtypes.as_dtype(x).name for x in allowed_list)))
         61
         62

    TypeError: Value passed to parameter 'values' has DataType bool not in list of allowed values: float32, float64, int32, uint8, int16, int8, int64, bfloat16, uint16, float16, uint32, uint64

    opened by manaswakchaure 0
  • fix(sec): upgrade keras to 2.6.0rc3

    What happened?

    There is 1 security vulnerability found in keras 2.0.8

    What did I do?

    Upgrade keras from 2.0.8 to 2.6.0rc3 for vulnerability fix

    What did you expect to happen?

    Ideally, no insecure libs should be used.

    The specification of the pull request

    PR Specification from OSCS

    opened by vvsd 0
Releases(v2.1)
  • v2.1(Mar 19, 2018)

    This release adds:

    • The Balloon Color Splash sample, along with dataset and trained weights.
    • Convert the last prediction layer from Python to TensorFlow operations.
    • Automatic download of COCO weights and dataset.
    • Fixes for running on Windows.

    Thanks to everyone who made this possible with fixes and pull requests.

    Note: COCO weights are not updated in this release. Continue to use the .h5 file from release 2.0.

    Source code(tar.gz)
    Source code(zip)
    balloon_dataset.zip(36.94 MB)
    mask_rcnn_balloon.h5(244.00 MB)
  • v2.0(Nov 26, 2017)

    This release includes updates to improve training and accuracy, and a new MS COCO trained model.

    • Remove unnecessary dropout layer
    • Reduce anchor stride from 2 to 1
    • Increase ROI training mini batch to 200 per image
    • Improve computing proposal positive:negative ratio
    • Updated COCO training schedule
    • Add --logs param to coco.py to set logging directory
    • Bug Fix: exclude BN weights from L2 regularization
    • Use mean (rather than sum) of L2 regularization for a smoother loss in TensorBoard
    • Better compatibility with Python 2.7

    The new MS COCO trained weights improve the accuracy compared to the previous weights. These are the evaluation results on the minival dataset:

    Evaluate annotation type *bbox*
     Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.347
     Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.544
     Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.377
     Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.163
     Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.390
     Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.486
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.295
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.424
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.433
     Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.214
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.481
     Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.601
    
    Evaluate annotation type *segm*
     Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.296
     Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.510
     Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.306
     Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.128
     Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.330
     Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.430
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.258
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.369
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.376
     Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.173
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.417
     Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.538
    

    Big thanks to everyone who contributed to this repo. Names are in the commit history.

    Source code(tar.gz)
    Source code(zip)
    mask_rcnn_coco.h5(245.62 MB)
Owner
Matterport, Inc
Code for the ECCV2020 paper "A Differentiable Recurrent Surface for Asynchronous Event-Based Data"

A Differentiable Recurrent Surface for Asynchronous Event-Based Data Code for the ECCV2020 paper "A Differentiable Recurrent Surface for Asynchronous

Marco Cannici 21 Oct 05, 2022
Fbone (Flask bone) is a Flask (Python microframework) starter/template/bootstrap/boilerplate application.

Fbone (Flask bone) is a Flask (Python microframework) starter/template/bootstrap/boilerplate application.

Wilson 1.7k Dec 30, 2022
Harmonic Memory Networks for Graph Completion

HMemNetworks Code and documentation for Harmonic Memory Networks, a series of models for compositionally assembling representations of graph elements

mlalisse 0 Oct 27, 2021
Semantic Segmentation of images using PixelLib with help of Pascalvoc dataset trained with Deeplabv3+ framework.

CARscan- Approach 1 - Segmentation of images by detecting contours. It failed because in images with elements along with cars were also getting detect

Padmanabha Banerjee 5 Jul 29, 2021
Oscar and VinVL

Oscar: Object-Semantics Aligned Pre-training for Vision-and-Language Tasks VinVL: Revisiting Visual Representations in Vision-Language Models Updates

Microsoft 938 Dec 26, 2022
This is the repo of the manuscript "Dual-branch Attention-In-Attention Transformer for speech enhancement"

DB-AIAT: A Dual-branch attention-in-attention transformer for single-channel SE

Guochen Yu 68 Dec 16, 2022
Example of a Quantum LSTM

Example of a Quantum LSTM

Riccardo Di Sipio 36 Oct 31, 2022
Hand tracking demo for DIY Smart Glasses with a remote computer doing the work

CameraStream This is a demonstration that streams the image from smartglasses to a pc, does the hand recognition on the remote pc and streams the proc

Teemu Laurila 20 Oct 13, 2022
Image-to-Image Translation with Conditional Adversarial Networks (Pix2pix) implementation in keras

pix2pix-keras Pix2pix implementation in keras. Original paper: Image-to-Image Translation with Conditional Adversarial Networks (pix2pix) Paper Author

William Falcon 141 Dec 30, 2022
PyTorch implementation of Graph Convolutional Networks in Feature Space for Image Deblurring and Super-resolution, IJCNN 2021.

GCResNet PyTorch implementation of Graph Convolutional Networks in Feature Space for Image Deblurring and Super-resolution, IJCNN 2021. The code will

11 May 19, 2022
Isaac Gym Reinforcement Learning Environments

NVIDIA Omniverse 714 Jan 08, 2023
Official PyTorch implementation of Rainbow Memory (CVPR 2021)

Rainbow Memory: Continual Learning with a Memory of Diverse Samples

Clova AI Research 91 Dec 17, 2022
A developer interface for creating Chat AIs for the Chai app.

ChaiPy A developer interface for creating Chat AIs for the Chai app. Usage Local development A quick start guide is available here, with a minimal exa

Chai 28 Dec 28, 2022
Implementation of "Semi-supervised Domain Adaptive Structure Learning"

Semi-supervised Domain Adaptive Structure Learning - ASDA This repo contains the source code and dataset for our ASDA paper. Illustration of the propo

3 Dec 13, 2021
PyTorch implementation of our NeurIPS 2021 paper, "Revisiting Discriminator in GAN Compression: A Generator-discriminator Cooperative Compression Scheme"

Revisiting Discriminator in GAN Compression: A Generator-discriminator Cooperative Compression Scheme (NeurIPS2021) (Link) Overview Prerequisites Linu

Shaojie Li 34 Mar 31, 2022
Software that can generate photos from paintings, turn horses into zebras, perform style transfer, and more.

CycleGAN PyTorch | project page | paper Torch implementation for learning an image-to-image translation (i.e. pix2pix) without input-output pairs, for

Jun-Yan Zhu 11.5k Dec 30, 2022
Import Python modules from dicts and JSON formatted documents.

Paker Paker is a module for importing Python packages/modules from dictionaries and JSON-formatted documents. It was inspired by httpimporter. Important

Wojciech Wentland 1 Sep 07, 2022
🛠️ Tools for Transformers compression using Lightning ⚡

Bert-squeeze is a repository aiming to provide code to reduce the size of Transformer-based models or decrease their latency at inference time.

Jules Belveze 66 Dec 11, 2022
The world's simplest facial recognition api for Python and the command line

Face Recognition You can also read a translated version of this file in Chinese 简体中文版 or in Korean 한국어 or in Japanese 日本語. Recognize and manipulate fa

Adam Geitgey 46.9k Jan 03, 2023
Code associated with the paper "Towards Understanding the Data Dependency of Mixup-style Training".

Mixup-Data-Dependency Code associated with the paper "Towards Understanding the Data Dependency of Mixup-style Training". Running Alternating Line Exp

Muthu Chidambaram 0 Nov 11, 2021