bethgelab / siamese-mask-rcnn

Siamese Mask R-CNN model for one-shot instance segmentation

License: Other

Python 49.37% Jupyter Notebook 50.63%
object-detection instance-segmentation one-shot-instance-segmentation one-shot-learning few-shot-learning

siamese-mask-rcnn's Introduction

Siamese Mask R-CNN

This is the official implementation of Siamese Mask R-CNN from One-Shot Instance Segmentation. It is based on the Mask R-CNN implementation by Matterport.

The repository includes:

  • Source code of Siamese Mask R-CNN
  • Training code for MS COCO
  • Evaluation on MS COCO metrics (AP)
  • Training and evaluation of one-shot splits of MS COCO
  • Training code to reproduce the results from the paper
  • Pre-trained weights for ImageNet
  • Pre-trained weights for all models from the paper
  • Code to evaluate all models from the paper
  • Code to generate result figures

One-Shot Instance Segmentation

One-shot instance segmentation can be summed up as: given a query image and a reference image showing an object of a novel category, we seek to detect and segment all instances of the corresponding category (in the figure above, 'person' on the left, 'car' on the right). Note that no ground truth annotations of reference categories are used during training. This type of visual search task creates new challenges for computer vision algorithms, as methods from metric and few-shot learning have to be incorporated into the notoriously hard tasks of object identification and segmentation. Siamese Mask R-CNN extends Mask R-CNN, a state-of-the-art object detection and segmentation system, with a Siamese backbone and a matching procedure to perform this type of visual search.

Installation

  1. Clone this repository.
  2. Prepare the COCO dataset as described below.
  3. Run the install_requirements.ipynb notebook to install all relevant dependencies.

Requirements

Linux, Python 3.4+, Tensorflow, Keras 2.1.6, cython, scikit_image 0.13.1, h5py, imgaug and opencv_python
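
For a quick sanity check of an existing environment against the two pinned versions above, something along these lines can help; the assertions are illustrative only, and install_requirements.ipynb remains the authoritative setup path:

import keras
import skimage

# Check the two explicitly pinned dependencies from the requirements list.
assert keras.__version__ == '2.1.6', keras.__version__
assert skimage.__version__ == '0.13.1', skimage.__version__
print('Keras and scikit-image match the pinned versions')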

Prepare COCO dataset

The model requires MS COCO and the CocoAPI to be added to /data.

cd data
git clone https://github.com/cocodataset/cocoapi.git

It is recommended to symlink the dataset root of MS COCO.

ln -s $PATH_TO_COCO$/coco coco

If unsure, follow the instructions of the Matterport Mask R-CNN implementation.
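
Once the dataset and CocoAPI are in place (and the CocoAPI's PythonAPI has been built so that pycocotools is importable), a quick check along these lines confirms the layout; the annotation path assumes the standard COCO 2017 directory structure under the symlinked root:

from pycocotools.coco import COCO

# Load the 2017 validation annotations from the symlinked dataset root.
coco = COCO('data/coco/annotations/instances_val2017.json')
print(len(coco.getCatIds()), 'categories,', len(coco.getImgIds()), 'images')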

Get pretrained weights

Get the pretrained weights from the releases menu and save them to /checkpoints.

Training

To train Siamese Mask R-CNN on MS COCO, simply follow the instructions in the training.ipynb notebook. There are two model configs available: a small one, which runs on a single GPU with 12 GB of memory, and a large one, which needs 4 GPUs with 12 GB of memory each. The large config is the one used in our experiments.

To reproduce our results and train the models reported in the paper, run the notebooks provided in experiments. These models need 4 GPUs with 12 GB of memory each.

Our models are trained on the COCO 2017 training set, from which we remove the last 3000 images for validation.
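
For reference, the notebooks set up the data roughly as follows; this is a minimal sketch using the IndexedCocoDataset helper from lib/utils, with all 80 COCO categories active:

import numpy as np
from lib import utils as siamese_utils

COCO_DATA = 'data/coco/'
train_classes = np.array(range(1, 81))  # all 80 COCO category ids

coco_train = siamese_utils.IndexedCocoDataset()
coco_train.load_coco(COCO_DATA, "train", year="2017")
coco_train.prepare()
coco_train.build_indices()  # per-category image indices for reference sampling
coco_train.ACTIVE_CLASSES = train_classes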

Evaluation

To evaluate and visualize a model's results, run the evaluation.ipynb notebook. Make sure to use the same config as was used for training the model.

To evaluate the models reported in the paper run the evaluation notebook provided in experiments. Each model will be evaluated 5 times to compensate for the stochastic effects introduced by randomly choosing the reference instances. The final result is the mean of those five runs.
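
Averaging over runs can be done with a small helper along these lines; mean_of_runs is a hypothetical name, not part of the repository:

import numpy as np

def mean_of_runs(evaluate_fn, n_runs=5):
    # Run a stochastic evaluation n_runs times and average the scores,
    # smoothing out the randomness of reference-instance sampling.
    scores = [evaluate_fn() for _ in range(n_runs)]
    return float(np.mean(scores)), float(np.std(scores))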

We use the COCO 2017 validation set for testing and the last 3000 images from the training set for validation.

Model description

Siamese Mask R-CNN is designed as a minimal variation of Mask R-CNN which can perform the visual search task described above. For more details please read the paper.
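
The core change is the matching step between query and reference features. The sketch below is a plain NumPy rendering of the idea behind l1_distance_graph in lib/model.py: the pooled reference embedding is applied across the query feature map, and the elementwise L1 difference is concatenated onto the query features:

import numpy as np

def l1_match(P, T):
    # P: (H, W, C) query feature map; T: (C,) pooled reference embedding.
    diff = np.abs(P - T[None, None, :])        # broadcast T over H and W
    return np.concatenate([P, diff], axis=-1)  # (H, W, 2C) matched features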

Citation

If you use this repository or want to reference our work please cite our paper:

@article{michaelis_one-shot_2018,
    title = {One-Shot Instance Segmentation},
    author = {Michaelis, Claudio and Ustyuzhaninov, Ivan and Bethge, Matthias and Ecker, Alexander S.},
    year = {2018},
    journal = {arXiv},
    url = {http://arxiv.org/abs/1811.11507}
}

siamese-mask-rcnn's People

Contributors

ivust, michaelisc

siamese-mask-rcnn's Issues

Understanding about "NUM_CLASSES" and "IMAGE_META_SIZE" in config.py

I'm sorry to bother you again. I can't grasp the meaning of "NUM_CLASSES", especially why it's set to "1 + 1" in train.ipynb but initialized as "1" in config.py. Besides, I can't make sense of the computation of "IMAGE_META_SIZE" in "self.IMAGE_META_SIZE = 1 + 3 + 3 + 4 + 1 + self.NUM_CLASSES".
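
For context, in the Matterport Mask R-CNN code this repository builds on, compose_image_meta packs the per-image metadata into one flat vector, which is where that sum comes from; NUM_CLASSES = 1 + 1 presumably reflects that the Siamese head only distinguishes "matches the reference" from background rather than the 80 COCO categories:

# Components of the image meta vector (Matterport Mask R-CNN):
# image_id (1) + original_image_shape (3) + image_shape (3)
# + window y1, x1, y2, x2 (4) + scale (1) + active_class_ids (NUM_CLASSES)
NUM_CLASSES = 1 + 1  # background + "matches the reference"
IMAGE_META_SIZE = 1 + 3 + 3 + 4 + 1 + NUM_CLASSES
print(IMAGE_META_SIZE)  # 14 for this config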

Multiple runs on training not evaluation

Hi, thanks for the great work!

I have a question regarding the reported results. In the paper you mention performing 5 runs for the evaluation. Did you also test the randomness coming from the training procedure itself, since it is stochastic as well? You randomly sample the support set and query image, so some variability is expected. I am wondering whether training reached stable results across different random seeds.

Thanks

error when loading dataset

Hi @michaelisc, there is an error while loading the dataset, could you please take a look?

ssh://wangtao@deep41:22/home/wangtao/anaconda2/envs/tensorflow_/bin/python -u /home/wangtao/.pycharm_helpers/pydev/pydevd.py --multiproc --qt-support=auto --client '0.0.0.0' --port 37213 --file /home/wangtao/prj/siamese_mrcnn/train.py
pydev debugger: process 15874 is connecting

Connected to pydev debugger (build 181.4892.64)
Using TensorFlow backend.


loading annotations into memory...
Done (t=26.20s)
creating index...
index created!
Traceback (most recent call last):
  File "/home/wangtao/.pycharm_helpers/pydev/pydevd.py", line 1664, in <module>
    main()
  File "/home/wangtao/.pycharm_helpers/pydev/pydevd.py", line 1658, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/home/wangtao/.pycharm_helpers/pydev/pydevd.py", line 1068, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/home/wangtao/prj/siamese_mrcnn/train.py", line 35, in <module>
    coco_train.build_indices()
  File "/home/wangtao/prj/siamese_mrcnn/lib/utils.py", line 328, in build_indices
    self.category_image_index = IndexedCocoDataset._build_category_image_index(self.image_category_index)
TypeError: unbound method _build_category_image_index() must be called with IndexedCocoDataset instance as first argument (got list instance instead)
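
A hedged note on the error above: the "unbound method must be called with ... instance as first argument" wording only exists in Python 2, while this repository requires Python 3.4+; the command line shown points at an anaconda2 environment. In Python 3 the same class-level call is just a plain function call, as this minimal reproduction shows:

class Demo:
    def f(x):          # no self: usable as a plain function in Python 3
        return x

print(Demo.f([1, 2]))  # works in Python 3; raises the unbound-method TypeError in Python 2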

Regarding testing the model

I have trained the model using train.ipynb on a subset of the COCO dataset (i.e. only the first 10 categories). How do I test the model on an image which doesn't belong to the COCO dataset? I want to explicitly give a reference image and a query image. Please help me with this.

The problem occurred in train.ipynb

# Create model object in inference mode.

model = siamese_model.SiameseMaskRCNN(mode="training", model_dir=MODEL_DIR, config=config)
ValueError Traceback (most recent call last)
in ()
1 # Create model object in inference mode.
----> 2 model = siamese_model.SiameseMaskRCNN(mode="training", model_dir=MODEL_DIR, config=config)

~/disk/ZSD/new_version_one_shot_instance_seg/lib/Mask_RCNN/mrcnn/model.py in init(self, mode, config, model_dir)
1830 self.model_dir = model_dir
1831 self.set_log_dir()
-> 1832 self.keras_model = self.build(mode=mode, config=config)
1833
1834 def build(self, mode, config):

~/disk/ZSD/new_version_one_shot_instance_seg/lib/model.py in build(self, mode, config)
294 # CHANGE: add siamese distance copmputation
295 # Combine FPs using L1 distance
--> 296 P2 = l1_distance_graph(IP2, TP2, feature_maps = 3*self.config.FPN_FEATUREMAPS//2, name='P2')
    297 P3 = l1_distance_graph(IP3, TP3, feature_maps = 3*self.config.FPN_FEATUREMAPS//2, name='P3')
    298 P4 = l1_distance_graph(IP4, TP4, feature_maps = 3*self.config.FPN_FEATUREMAPS//2, name='P4')

~/disk/ZSD/new_version_one_shot_instance_seg/lib/model.py in l1_distance_graph(P, T, feature_maps, name)
85 T = KL.Lambda(lambda x: K.expand_dims(K.expand_dims(x, axis=1), axis=1))(T)
86 # T = KL.Lambda(lambda x: K.tile(T, [1, int(P.shape[1]), int(P.shape[2]), 1]))(T)
---> 87 L1 = KL.Subtract()([P, T])
88 L1 = KL.Lambda(lambda x: K.abs(x))(L1)
89 D = KL.Concatenate()([P, L1])#KL.Concatenate()([P, T, L1])

~/disk/anaconda3/envs/py36/lib/python3.6/site-packages/keras/engine/topology.py in call(self, inputs, **kwargs)
600
601 # Actually call the layer, collecting output(s), mask(s), and shape(s).
--> 602 output = self.call(inputs, **kwargs)
603 output_mask = self.compute_mask(inputs, previous_mask)
604

~/disk/anaconda3/envs/py36/lib/python3.6/site-packages/keras/layers/merge.py in call(self, inputs)
144 return y
145 else:
--> 146 return self._merge_function(inputs)
147
148 def compute_output_shape(self, input_shape):

~/disk/anaconda3/envs/py36/lib/python3.6/site-packages/keras/layers/merge.py in _merge_function(self, inputs)
241 'on exactly 2 inputs')
242 if inputs[0]._keras_shape != inputs[1]._keras_shape:
--> 243 raise ValueError('Subtract layer should be called '
244 'on inputs of the same shape')
245 return inputs[0] - inputs[1]

ValueError: Subtract layer should be called on inputs of the same shape

PRE_NMS_LIMIT

I got an error while running first evaluate.ipynb and then train.ipynb: PRE_NMS_LIMIT is not defined in model.py. The original Mask R-CNN implementation sets it to 6000. You should add the parameter to both files to avoid the error.
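
A minimal fix along these lines (the value 6000 matches the original Mask R-CNN config; the subclass name here is illustrative):

from lib import config as siamese_config

class PatchedConfig(siamese_config.Config):
    # ROIs kept after tf.nn.top_k and before non-maximum suppression,
    # as in the original Mask R-CNN implementation.
    PRE_NMS_LIMIT = 6000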

Loading weights

Hello!
Very nice paper :)
I am trying to use the model for my own images so I loaded the pretrained weights but ran into an issue: here is the code adapted from evaluate.ipynb

%load_ext autoreload
%autoreload 2
%matplotlib inline
#%load_ext line_profiler

import tensorflow.compat.v1 as tf
tf.logging.set_verbosity(tf.logging.INFO)
sess_config = tf.ConfigProto()

import sys
import os

COCO_DATA = 'data/coco/'
MASK_RCNN_MODEL_PATH = 'lib/Mask_RCNN/'

if MASK_RCNN_MODEL_PATH not in sys.path:
    sys.path.append(MASK_RCNN_MODEL_PATH)

from samples.coco import coco
from mrcnn import utils
from mrcnn import model as modellib
from mrcnn import visualize

from lib import utils as siamese_utils
from lib import model as siamese_model
from lib import config as siamese_config

import time
import datetime
import random
import numpy as np
import skimage.io
import imgaug
import pickle
import matplotlib.pyplot as plt
from collections import OrderedDict

# Root directory of the project
ROOT_DIR = os.getcwd()

# Directory to save logs and trained model
MODEL_DIR = os.path.join(ROOT_DIR, "logs")

# Select checkpoint
checkpoint = 'checkpoints/small_siamese_mrcnn_0160.h5'

class SmallEvalConfig(siamese_config.Config):
    # Set batch size to 1 since we'll be running inference on
    # one image at a time. Batch size = GPU_COUNT * IMAGES_PER_GPU
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1
    NUM_CLASSES = 1 + 1
    NAME = 'coco'
    EXPERIMENT = 'evaluation'
    CHECKPOINT_DIR = 'checkpoints/'
    NUM_TARGETS = 1

config = SmallEvalConfig()

model = siamese_model.SiameseMaskRCNN(mode="inference", model_dir=MODEL_DIR, config=config)
model.load_checkpoint(checkpoint, training_schedule=train_schedule)  # train_schedule defined elsewhere in the notebook

When I load the checkpoint I get the following error:


ValueError Traceback (most recent call last)
Cell In[50], line 2
1 model = siamese_model.SiameseMaskRCNN(mode="inference", model_dir=MODEL_DIR, config=config)
----> 2 model.load_checkpoint(checkpoint, training_schedule=train_schedule)
3 model

File ~/siamese-mask-rcnn/lib/model.py:847, in SiameseMaskRCNN.load_checkpoint(self, weights_path, training_schedule, verbose)
845 self.set_trainable(".*")
846 # load weights
--> 847 self.load_weights(weights_path, by_name=True)
848 self.epoch = epoch_index

File ~/siamese-mask-rcnn/lib/Mask_RCNN/mrcnn/model.py:2115, in MaskRCNN.load_weights(self, filepath, by_name, exclude)
2112 layers = filter(lambda l: l.name not in exclude, layers)
2114 if by_name:
-> 2115 hdf5_format.load_weights_from_hdf5_group(f, layers)
2116 else:
2117 hdf5_format.load_weights_from_hdf5_group(f, layers)

File ~/.local/lib/python3.8/site-packages/tensorflow/python/keras/saving/hdf5_format.py:705, in load_weights_from_hdf5_group(f, layers)
702 weight_values = preprocess_weights_for_loading(
703 layer, weight_values, original_keras_version, original_backend)
704 if len(weight_values) != len(symbolic_weights):
--> 705 raise ValueError('Layer #' + str(k) + ' (named "' + layer.name +
706 '" in the current model) was found to '
707 'correspond to layer ' + name + ' in the save file. '
708 'However the new layer ' + layer.name + ' expects ' +
709 str(len(symbolic_weights)) +
710 ' weights, but the saved weights have ' +
711 str(len(weight_values)) + ' elements.')
712 weight_value_tuples += zip(symbolic_weights, weight_values)
713 backend.batch_set_value(weight_value_tuples)

ValueError: Layer #13 (named "mrcnn_class_logits" in the current model) was found to correspond to layer mrcnn_class_bn1 in the save file. However the new layer mrcnn_class_logits expects 2 weights, but the saved weights have 4 elements

TypeError: unhashable type: 'ListWrapper' when trying to train the model

Hello. Could you please help me figure out this error? It occurs when I run this training command:

for epochs, parameters in train_schedule.items():
    print("")
    print("training layers {} until epoch {} with learning_rate {}".format(parameters["layers"], 
                                                                          epochs, 
                                                                          parameters["learning_rate"]))
    model.train(coco_train, coco_val, 
                learning_rate=parameters["learning_rate"], 
                epochs=epochs, 
                layers=parameters["layers"])

After that, I get the following output and error:

training layers heads until epoch 1 with learning_rate 0.02

Starting at epoch 0. LR=0.02

Checkpoint Path: C:\Users\supha\Desktop\FIBO\intern\Onboard\siamese-mask-rcnn-master - Copy\logs\siamese_mrcnn_small_coco_example\siamese_mrcnn_{epoch:04d}.h5
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [11], in <cell line: 1>()
      2 print("")
      3 print("training layers {} until epoch {} with learning_rate {}".format(parameters["layers"], 
      4                                                                       epochs, 
      5                                                                       parameters["learning_rate"]))
----> 6 model.train(coco_train, coco_val, 
      7             learning_rate=parameters["learning_rate"], 
      8             epochs=epochs, 
      9             layers=parameters["layers"])

File ~\Desktop\FIBO\intern\Onboard\siamese-mask-rcnn-master - Copy\lib\model.py:711, in SiameseMaskRCNN.train(self, train_dataset, val_dataset, learning_rate, epochs, layers, augmentation)
    709 modellib.log("Checkpoint Path: {}".format(self.checkpoint_path))
    710 self.set_trainable(layers)
--> 711 self.compile(learning_rate, self.config.LEARNING_MOMENTUM)
    713 # Work-around for Windows: Keras fails on Windows when using
    714 # multiprocessing workers. See discussion here:
    715 # https://github.com/matterport/Mask_RCNN/issues/13#issuecomment-353124009
    716 if os.name is 'nt':

File ~\Desktop\FIBO\intern\Onboard\siamese-mask-rcnn-master - Copy\lib\model.py:540, in SiameseMaskRCNN.compile(self, learning_rate, momentum)
    538         continue
    539     loss = (tf.reduce_mean(input_tensor=layer.output, keepdims=True) * self.config.LOSS_WEIGHTS.get(name, 1.))
--> 540     self.keras_model.add_loss(loss)
    542 # Add L2 Regularization
    543 # Skip gamma and beta weights of batch normalization layers.
    544 reg_losses = [
    545     keras.regularizers.l2(self.config.WEIGHT_DECAY)(w) / tf.cast(tf.size(w), tf.float32)
    546     for w in self.keras_model.trainable_weights
    547     if 'gamma' not in w.name and 'beta' not in w.name]

File ~\Desktop\FIBO\intern\Onboard\.venv\lib\site-packages\keras\engine\base_layer_v1.py:1054, in Layer.add_loss(self, losses, inputs)
   1052 for symbolic_loss in symbolic_losses:
   1053   if getattr(self, '_is_graph_network', False):
-> 1054     self._graph_network_add_loss(symbolic_loss)
   1055   else:
   1056     # Possible a loss was added in a Layer's `build`.
   1057     self._losses.append(symbolic_loss)

File ~\Desktop\FIBO\intern\Onboard\.venv\lib\site-packages\keras\engine\functional.py:908, in Functional._graph_network_add_loss(self, symbolic_loss)
    906 new_nodes.extend(add_loss_layer.inbound_nodes)
    907 new_layers.append(add_loss_layer)
--> 908 self._insert_layers(new_layers, new_nodes)

File ~\Desktop\FIBO\intern\Onboard\.venv\lib\site-packages\keras\engine\functional.py:851, in Functional._insert_layers(self, layers, relevant_nodes)
    848     self._nodes_by_depth[depth].append(node)
    850 # Insert layers and update other layer attrs.
--> 851 layer_set = set(self._self_tracked_trackables)
    852 deferred_layers = []
    853 for layer in layers:

File ~\Desktop\FIBO\intern\Onboard\.venv\lib\site-packages\tensorflow\python\training\tracking\data_structures.py:668, in ListWrapper.__hash__(self)
    665 def __hash__(self):
    666   # List wrappers need to compare like regular lists, and so like regular
    667   # lists they don't belong in hash tables.
--> 668   raise TypeError("unhashable type: 'ListWrapper'")

TypeError: unhashable type: 'ListWrapper'

I run this code with TensorFlow 2.5.0 and Keras 2.8.0, and I tried changing parts of the loss function code, but it's still the same error.
Thanks for the help!

Prediction time during testing

Hi @mbethge,
Thank you for sharing your great work.
I just downloaded your pre-trained model and ran some tests on my own dataset.
The first thing I noticed is that prediction is quite slow, more than 7 s per image.
What is your prediction time per image?

Training error

@michaelisc Hello Michael, I am trying to run the training part of your work, but I am getting this error:

[screenshot of the error attached]

I verified that the number of workers is 0 and I also set multiprocessing = False, but I still have the problem.
Can you show me how to deal with it?

issue re-running

The links provided for training.ipynb and evaluation.ipynb are not working. Could you please fix that?

I am running this repo on Google Colab. After the installation and preparation, which one of these should I run?

  1. https://github.com/bethgelab/siamese-mask-rcnn/blob/master/experiments/train_parallel_coco_full.ipynb
  2. https://github.com/bethgelab/siamese-mask-rcnn/blob/master/experiments/train_parallel_mrcnn_coco_full.ipynb
  3. https://github.com/bethgelab/siamese-mask-rcnn/blob/master/train.ipynb

And what are their differences?

shape error

I am not able to get past the following error, even though I created a conda environment and installed all the required packages:

ValueError: Dimension 3 in both shapes must be equal, but are 512 and 1024. Shapes are [7,7,384,512] and [7,7,384,1024]. for 'Assign_362' (op: 'Assign') with input shapes: [7,7,384,512], [7,7,384,1024].
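
A hedged note: assign-shape mismatches like this usually mean the model was built with a config that does not match the checkpoint being loaded (for example, the small config with large-model weights, which differ in FPN_FEATUREMAPS). Dumping the config before loading makes the mismatch easy to spot:

# Print the full active config, as the notebooks do, before load_checkpoint.
config.display()
print('FPN_FEATUREMAPS =', config.FPN_FEATUREMAPS)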

Under-segmentation for close items

First of all, thank you for the great work!

I encounter a problem of under-segmentation for instances sitting next to each other; in particular, when segmenting a pack of boxes, the algorithm considers all the boxes a single instance.

Is there a way around this if instances are not separated by background? I know the size of the box in pixel space, is there a way I can utilize this information to achieve the correct segmentation? I tried modifying the size and aspect ratio of the anchor boxes but didn't see positive changes.

Thanks in advance!

Problem with custom dataset

Hi everyone,
I am trying to train the Siamese model with a custom dataset (comprising three classes) and I used the trained weight file (mask_rcnn_coco.h5). dataset_train and dataset_val are saved in JSON format as in the Mask R-CNN repository.
But I received the error about the image shapes below.
How can I reshape the image size to fit this model?
Thank you!

This is the code of the training part.

# Training
if __name__ == '__main__':
    dataset_dir = os.path.join(ROOT_DIR, "shapes")

    dataset_train = shapesDataset()
    dataset_train.load_shapes(dataset_dir, "train")
    dataset_train.prepare()

    # Validation dataset
    dataset_val = shapesDataset()
    dataset_val.load_shapes(dataset_dir, "val")
    dataset_val.prepare()

    config = shapesConfig()
    config.display()

    # Create model object in training mode.
    model = siamese_model.SiameseMaskRCNN(mode="training", model_dir=MODEL_DIR, config=config)

    # Select weights file to load
    init_with = "coco"
    if init_with == "coco":
        model.load_weights(COCO_WEIGHTS_PATH, by_name=True,
                           exclude=["mrcnn_class_logits", "mrcnn_bbox_fc",
                                    "mrcnn_bbox", "mrcnn_mask"])
    elif init_with == "last":
        model.load_weights(model.find_last(), by_name=True)
    elif init_with == "imagenet":
        model.load_weights(model.get_imagenet_weights(), by_name=True)

    start_train = time.time()
    model.train(dataset_train, dataset_val,
                learning_rate=config.LEARNING_RATE,
                epochs=30,
                layers='heads')

    history = model.keras_model.history.history
    epochs = range(1, len(next(iter(history.values()))) + 1)

    plt.figure()
    plt.plot(epochs, history["loss"], label="Train loss")
    plt.plot(epochs, history["val_loss"], label="Valid loss")
    plt.title('Train loss and Valid loss', fontsize=12, fontweight='bold')
    plt.xlabel('Number of Epoch', fontsize=10)
    plt.ylabel('Loss value', fontsize=10)
    plt.legend(fontsize=10)
    plt.savefig('loss.png')
    plt.show()

    best_epoch = np.argmin(history["val_loss"])
    print("Best Epoch:", best_epoch + 1, history["val_loss"][best_epoch])

    end_train = time.time()
    minutes = round((end_train - start_train) / 60, 2)
    print(f'Training took {minutes} minutes')
The error:

ValueError: Dimension 2 in both shapes must be equal, but are 384 and 256. Shapes are [3,3,384,512] and [3,3,256,512]. for 'Assign' (op: 'Assign') with input shapes: [3,3,384,512], [3,3,256,512].

Small typo

In lib/model.py at line 377:

target_rois = KL.Lambda(lambda x: modellig.norm_boxes_graph(
                    x, K.shape(input_image)[1:3]))(input_rois)

modellig is supposed to be modellib
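
The corrected line, for reference:

target_rois = KL.Lambda(lambda x: modellib.norm_boxes_graph(
                    x, K.shape(input_image)[1:3]))(input_rois)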

Problem encountered when trying to run train.ipynb

Running the Dataset section

coco_train = siamese_utils.IndexedCocoDataset()
coco_train.load_coco(COCO_DATA, "train", year="2017")
coco_train.prepare()
coco_train.build_indices()
coco_train.ACTIVE_CLASSES = train_classes

coco_val = siamese_utils.IndexedCocoDataset()
coco_val.load_coco(COCO_DATA, "val", year="2017")
coco_val.prepare()
coco_val.build_indices()
coco_val.ACTIVE_CLASSES = train_classes

it says I lack "active_classes":

loading annotations into memory...
Done (t=16.04s)
creating index...
index created!


AttributeError Traceback (most recent call last)
in ()
2 coco_train = siamese_utils.IndexedCocoDataset()
3 # coco_train.set_active_classes(train_classes)
----> 4 coco_train.load_coco(COCO_DATA, "train", year="2017")
5 coco_train.prepare()
6 coco_train.build_indices()

/mnt/Disk1/liangzh/siamese-mask-rcnn/lib/Mask_RCNN/samples/coco/coco.py in load_coco(self, dataset_dir, subset, year, class_ids, class_map, return_coco, auto_download, subsubset)
118 # # All classes
119 # class_ids = sorted(coco.getCatIds())
--> 120 if len(self.active_classes) == 0:
121 # All classes
122 class_ids = sorted(coco.getCatIds())

AttributeError: 'IndexedCocoDataset' object has no attribute 'active_classes'

Thank you!

Dimension mismatch error when evaluating the retrained COCO model, starting from epoch 2

Hi,

I trained on COCO from your pretrained ImageNet weights and then used the trained model for evaluation. It works fine with the first epoch's model; however, starting from the second epoch's model, it shows an error in
model = siamese_model.SiameseMaskRCNN(mode="inference", model_dir=MODEL_DIR, config=config)
model.load_checkpoint(checkpoint, training_schedule=train_schedule)
that:
Dimension 0 in both shapes must be equal, but are 1 and 7. Shapes are [1,1,512,256] and [7,7,3,64]. for 'Assign_394' (op: 'Assign') with input shapes: [1,1,512,256], [7,7,3,64].

If we denote the epoch 1 and 2 models as 'siamese_mrcnn_0001.h5' and 'siamese_mrcnn_0002.h5', I checked the shapes of the h5py datasets of both models and they are exactly the same. May I get some suggestions about this error? Thanks

One-Shot Detection - input_target shape Error

Hi! Amazing work and very nice codebase overall. I enjoyed checking the architecture.

I tried testing the model on the "small" configuration with a single query image and a reference, loading them from cv2:

Basically:

class_img = cv2.imread(f"./data/ligilog/LigiLog-100/classes/images/{sample_class_id}.jpg")
class_img = cv2.cvtColor(class_img, cv2.COLOR_BGR2RGB)
image_img = cv2.imread(f"./data/ligilog/LigiLog-100/src/images/{sample_image_id}.jpg")
image_img = cv2.cvtColor(image_img, cv2.COLOR_BGR2RGB)
model.detect([class_img], [image_img], verbose=3, random_detections=False)[0]

and I'm finding the following issue:

ValueError                                Traceback (most recent call last)
<ipython-input-53-29daa0ddecdf> in <module>
----> 1 model.detect([class_img], [image_img], verbose=3, random_detections=False)[0]
      2 # model.detect([np.reshape(class_img, tuple([1] + list(class_img.shape)))], [image_img], verbose=2, random_detections=False)[0]

~/osod/siamese-mask-rcnn/lib/model.py in detect(self, targets, images, verbose, random_detections, eps)
    769         # CHANGE: Use siamese detection model
    770         detections, _, _, mrcnn_mask, _, _, _ =\
--> 771             self.keras_model.predict([molded_images, image_metas, molded_targets, anchors], verbose=2)
    772         if random_detections:
    773             # Randomly shift the detected boxes

~/anaconda3/envs/aws_neuron_tensorflow_p36/lib/python3.6/site-packages/keras/engine/training.py in predict(self, x, batch_size, verbose, steps)
   1162                              'argument.')
   1163         # Validate user data.
-> 1164         x, _, _ = self._standardize_user_data(x)
   1165         if self.stateful:
   1166             if x[0].shape[0] > batch_size and x[0].shape[0] % batch_size != 0:

~/anaconda3/envs/aws_neuron_tensorflow_p36/lib/python3.6/site-packages/keras/engine/training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, check_array_lengths, batch_size)
    755             feed_input_shapes,
    756             check_batch_axis=False,  # Don't enforce the batch size.
--> 757             exception_prefix='input')
    758 
    759         if y is not None:

~/anaconda3/envs/aws_neuron_tensorflow_p36/lib/python3.6/site-packages/keras/engine/training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix, check_last_layer_shape)
    129                         ': expected ' + names[i] + ' to have ' +
    130                         str(len(shape)) + ' dimensions, but got array '
--> 131                         'with shape ' + str(data_shape))
    132                 if not check_batch_axis:
    133                     data_shape = data_shape[1:]

ValueError: Error when checking input: expected input_target to have 5 dimensions, but got array with shape (1, 57, 266, 3)

I saw that the input_target shape is a function of config.NUM_TARGETS and config.TARGET_SHAPE; however, I tried playing with those two values and got no solution.

Could you point me at the change I'd have to do in the configuration for this to be solved?

Thanks!
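
A hedged guess at the cause, based on the error above: the input expects five dimensions (batch, NUM_TARGETS, height, width, channels), so detect() appears to want a stack of reference crops per query image rather than a single (h, w, 3) array. Something along these lines, with NUM_TARGETS = 1 in the config; resizing the crop to config.TARGET_SHAPE may also be required:

import numpy as np

# Stack the single reference crop: (1, h, w, 3) == (NUM_TARGETS, h, w, 3).
target_stack = np.stack([class_img])
results = model.detect([target_stack], [image_img], verbose=0)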

How to use the pretrained weights?

Hi @michaelisc, thanks for sharing the work! I'm wondering how to use the released pretrained weights. They are named like large_siamese_mrcnn_coco_i1_0160.h5, but the checkpoint path in the eval notebook is like
'/home/wangtao/prj/siamese_mrcnn/experiments/logs/parallel_coco_full/siamese_mrcnn_0320.h5'. So should I download, rename, and put the checkpoint into ./experiments/logs/parallel_coco_full/?

Problem with mask predictions

Hi!
I am wondering how to run inference on the unseen classes. In evaluate.ipynb, I notice that you comment out coco_nopascal_classes:

# train_classes = coco_nopascal_classes
train_classes = np.array(range(1,81))

More specifically, how do I train a model on the COCO non-pascal classes and evaluate the model on the pascal classes?
Thank you!

Typo

In evaluate.ipynb, checkpoints/large_siamese_mrcnn_coco_full_0320.h is missing a 5 at the end:

# Select checkpoint
if model_size == 'small':
    checkpoint = 'checkpoints/small_siamese_mrcnn_0160.h5'
elif model_size == 'large':
    checkpoint = 'checkpoints/large_siamese_mrcnn_coco_full_0320.h'

Understanding the evaluation output

Hi, thanks for the interesting work!

I am trying to reproduce the mAP50 reported in the paper, but I have trouble understanding the evaluation output.
I downloaded the pretrained model large_siamese_mrcnn_coco_full_0320.h5 and ran experiments/evaluate-experiments.ipynb.
After the first run of evaluation on all classes, the output result is:

Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.221
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.357
Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.238
Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.118
Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.216
Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.310
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.201
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.377
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.395
Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.217
Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.416
Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.570

How do I interpret this result, and how does it relate to the results reported in the paper (Siamese Mask R-CNN: 35.7% mAP50 for detection and 33.4% for instance segmentation)?

Support for Tensorflow 2.4+

Hello! Do you think there will be updated code that supports TensorFlow 2.4+, so that we can run this code on the latest CUDA version? Right now there are too many compatibility issues.

Error while Evaluation

ValueError: Dimension 3 in both shapes must be equal, but are 512 and 1024. Shapes are [7,7,384,512] and [7,7,384,1024]. for 'Assign_3992' (op: 'Assign') with input shapes: [7,7,384,512], [7,7,384,1024].

Out of memory

When I try to run train.ipynb on my server, it runs out of memory and stops at around epoch 40. My server has about 11 GB of memory. How can I deal with this so the program runs successfully?

Performance is not as good as Mask RCNN when training on a small custom dataset

I have a small dataset of fewer than 200 images. They show different items, but I labeled them only as foreground instances and background. I fine-tuned Mask R-CNN and Siamese Mask R-CNN using the exact same config, but Mask R-CNN shows significantly better performance. The reference images I supplied during testing are crops of the instances used for training, so I don't understand what makes Siamese Mask R-CNN perform worse. I will appreciate any suggestions, thanks!

Test.ipynb

Hello, can you please share a testing ipynb file with two or more reference images, so that we can easily understand the beauty of this paper and method? Right now it's hard to figure out how to use the pre-trained weights on a custom dataset for testing or for video segmentation.

Results

@michaelisc Hello, when testing your pretrained weights, I am getting very weird results:

[screenshot of the results attached]

I also tested with arbitrary reference images, and the detected results were nearly the same. It seems the network doesn't care much about what the reference image is.

I think this is a big problem.

Cannot load data successfully

I put train2017, val2017 and annotations under data/coco. The error info:

loading annotations into memory...
Done (t=15.71s)
creating index...
index created!
Training data is loaded !
loading annotations into memory...
Done (t=23.77s)
creating index...
index created!
Traceback (most recent call last):
  File "python_train.py", line 62, in <module>
    coco_val.build_indices()
  File "/person/hello/siamese-mask-rcnn-master/lib/utils.py", line 328, in build_indices
    self.category_image_index = IndexedCocoDataset._build_category_image_index(self.image_category_index)
  File "/person/hello/siamese-mask-rcnn-master/lib/utils.py", line 353, in _build_category_image_index
    for category in range(max(image_category_index)[0]+1):
ValueError: max() arg is an empty sequence

Please help.

When I run evaluate.ipynb, it throws this error

ValueError: Error when checking : expected input_target to have shape (5, 96, 96, 3) but got array with shape (1, 96, 96, 3)
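
A hedged note: the expected shape (5, 96, 96, 3) suggests the model was built with NUM_TARGETS = 5 while only a single reference was supplied. Matching the config to the number of references, as in the SmallEvalConfig quoted earlier, is one way out:

from lib import config as siamese_config

class EvalConfig(siamese_config.Config):
    NUM_TARGETS = 1  # build the model for a single reference image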

Modifications for training on custom dataset

Hey, I have a custom dataset with 25 classes (IDs 1, 2, 3, 5, ..., 26). I changed line 87 in samples/coco/coco.py to "NUM_CLASSES = 1 + 25" and the second cell in train.ipynb to "train_classes = np.array(range(1,26))". Now I am getting the error "max() arg is an empty sequence" when running the third cell in train.ipynb.

Which modifications are necessary for training this model on a custom dataset? Thanks!
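
A hedged debugging sketch: in lib/utils.py, "max() arg is an empty sequence" fires when image_category_index is empty, i.e. none of the loaded images matched an annotated category. Printing what actually loaded narrows it down (attribute names follow the Matterport Dataset base class):

# After load_coco(...) and prepare(), before build_indices():
print(len(coco_train.image_info), 'images loaded')
print('class ids:', sorted(c['id'] for c in coco_train.class_info))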

Problem occurred in evaluate.ipynb

When I run this part of the code:

# Load and evaluate model

# Create model object in inference mode.
model = siamese_model.SiameseMaskRCNN(mode="inference", model_dir=MODEL_DIR, config=config)
model.load_checkpoint(checkpoint, training_schedule=train_schedule)

# Evaluate only active classes
active_class_idx = np.array(coco_val.ACTIVE_CLASSES) - 1

# Evaluate on the validation set
print('starting evaluation ...')
siamese_utils.evaluate_dataset(model, coco_val, coco_object, eval_type="bbox",
                               dataset_type='coco', limit=0, image_ids=None,
                               class_index=active_class_idx, verbose=1)

it gives me "ValueError: Subtract layer should be called on inputs of the same shape."
However, it runs successfully when I convert evaluate.ipynb into an evaluate.py file.
