
Comments (17)

michaelisc commented on May 23, 2024

We did not evaluate performance on Pascal VOC. We did measure performance on COCO using the VOC classes as a split but in the end decided to go with the four balanced splits reported in the paper because we think that this is the "cleaner" solution.

from siamese-mask-rcnn.

michaelisc commented on May 23, 2024

A quick check did not turn up any obvious bugs in the model. I'll have to check some of the older commits to figure out where the problem originated. That may take a while.

michaelisc commented on May 23, 2024

You will have to define the COCO classes that have a 1:1 correspondence in Pascal VOC, exclude them, and set the remaining classes as active classes:

import numpy as np

# COCO class indices (1..80) that have a 1:1 counterpart in Pascal VOC
pascal_classes = [1,2,3,4,5,6,7,9,15,16,17,18,19,20,40,57,58,59,61,63]
no_pascal_classes = [i for i in range(1,81) if i not in pascal_classes]
train_classes = np.array(no_pascal_classes)
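
As a quick sanity check on the split above, the following sketch verifies that the two class lists are disjoint and together cover all 80 indices. It assumes the classes are indexed 1 to 80 as in the snippet; nothing else is taken from the repository.

```python
# Sanity check for the Pascal / non-Pascal split of the 80 COCO class indices.
pascal_classes = [1, 2, 3, 4, 5, 6, 7, 9, 15, 16, 17, 18, 19, 20,
                  40, 57, 58, 59, 61, 63]
no_pascal_classes = [i for i in range(1, 81) if i not in pascal_classes]

# 20 classes are held out, 60 remain for training.
print(len(pascal_classes), len(no_pascal_classes))

# The two sets must not overlap and must cover indices 1..80 exactly.
assert set(pascal_classes).isdisjoint(no_pascal_classes)
assert set(pascal_classes) | set(no_pascal_classes) == set(range(1, 81))
```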

To evaluate on Pascal VOC you will have to implement a corresponding dataloader and evaluation function.
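
For orientation, here is a minimal sketch of what the core of such an evaluation function might compute: AP at an IoU threshold of 0.5 for a single class on a single image, using boxes. The function names and the greedy matching rule are illustrative assumptions, not the repository's evaluation code, and a real evaluation would aggregate over all images and classes (for example with the official COCO or VOC tools).

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def average_precision_50(detections, gt_boxes, iou_thr=0.5):
    """AP at one IoU threshold; detections is a list of (score, box)."""
    if not gt_boxes:
        return 0.0
    matched, flags = set(), []
    # Visit detections from most to least confident.
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_j = 0.0, None
        for j, gt in enumerate(gt_boxes):
            if j in matched:
                continue  # each ground-truth box can be matched only once
            iou = box_iou(box, gt)
            if iou > best_iou:
                best_iou, best_j = iou, j
        is_tp = best_j is not None and best_iou >= iou_thr
        if is_tp:
            matched.add(best_j)
        flags.append(is_tp)
    # Precision and recall after each detection.
    precisions, recalls, tp = [], [], 0
    for k, flag in enumerate(flags, start=1):
        tp += flag
        precisions.append(tp / k)
        recalls.append(tp / len(gt_boxes))
    # Make precision non-increasing in recall (all-points interpolation).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
dets = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (20, 20, 30, 31))]
print(round(average_precision_50(dets, gt), 4))
```

For instance segmentation the same procedure applies with mask IoU in place of box IoU.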

A note: Interestingly, the Pascal classes seem to be easier for the model, as one-shot performance on the Pascal classes in COCO is better than on the four symmetric splits we use in the paper.

jimmy15923 commented on May 23, 2024

Hi @michaelisc !

Thanks for your reply!

Did you also report the AP50 on Pascal VOC for a model trained on the non-VOC COCO classes in your paper? If not, what result did you get in your experiment? I am trying to implement this experiment and would like to know what a reasonable AP50 should be. Thanks a lot!

NCTU-VRDL commented on May 23, 2024

Hi @michaelisc!

Did you provide the learning curve of your model? I have tried to train on the non-VOC COCO dataset and the performance is still low after 100 epochs. I used the same large config as this one to train the model. Thanks!

michaelisc commented on May 23, 2024

@NCTU-VRDL the learning rate decay step after 120 epochs is very important for performance. I'd guess this is the problem but let me know if the performance is still low a few epochs after that.

NCTU-VRDL commented on May 23, 2024

> @NCTU-VRDL the learning rate decay step after 120 epochs is very important for performance. I'd guess this is the problem but let me know if the performance is still low a few epochs after that.

The performance is still poor after 140 epochs of training. You can see that the training loss is still high in the learning curve below.
[image: learning curve]

You can find my train.py here, but it is almost the same as the notebook you provide. The only difference is that I use two V100 GPUs with 6 images per GPU to train this model.

Could you please check the code? Many thanks!

import tensorflow as tf
tf.logging.set_verbosity(tf.logging.INFO)
sess_config = tf.ConfigProto()

import sys
import os

COCO_DATA = 'data/coco'
MASK_RCNN_MODEL_PATH = 'lib/Mask_RCNN/'

if MASK_RCNN_MODEL_PATH not in sys.path:
    sys.path.append(MASK_RCNN_MODEL_PATH)
    
from samples.coco import coco
from mrcnn import utils
from mrcnn import model as modellib
from mrcnn import visualize
    
from lib import utils as siamese_utils
from lib import model as siamese_model
from lib import config as siamese_config
   
import time
import datetime
import random
import numpy as np
import skimage.io
import imgaug
import pickle
import matplotlib.pyplot as plt
from collections import OrderedDict

# Root directory of the project
ROOT_DIR = os.getcwd()

# Directory to save logs and trained model
MODEL_DIR = os.path.join(ROOT_DIR, "logs")

# train_classes = coco_nopascal_classes
pascal_classes = [1,2,3,4,5,6,7,9,15,16,17,18,19,20,40,57,58,59,61,63]
no_pascal_classes = [i for i in range(1,81) if i not in pascal_classes]
train_classes = np.array(no_pascal_classes)

# Load COCO/train dataset
coco_train = siamese_utils.IndexedCocoDataset()
coco_train.load_coco(COCO_DATA, subset="train", subsubset="train", year="2017")
coco_train.prepare()
coco_train.build_indices()
coco_train.ACTIVE_CLASSES = train_classes

# Load COCO/val dataset
coco_val = siamese_utils.IndexedCocoDataset()
coco_val.load_coco(COCO_DATA, subset="train", subsubset="val", year="2017")
coco_val.prepare()
coco_val.build_indices()
coco_val.ACTIVE_CLASSES = train_classes


# ### Model
class LargeTrainConfig(siamese_config.Config):
    # Batch size = GPU_COUNT * IMAGES_PER_GPU
    GPU_COUNT = 2
    IMAGES_PER_GPU = 6 # batch size 12 = 2 GPUs x 6 images per GPU
    NUM_CLASSES = 1 + 1
    NAME = 'large_coco_novoc'
    EXPERIMENT = 'example'
    CHECKPOINT_DIR = 'checkpoints/'
    # Reduced image sizes
    TARGET_MAX_DIM = 192
    TARGET_MIN_DIM = 150
    IMAGE_MIN_DIM = 800
    IMAGE_MAX_DIM = 1024
    # Reduce model size
    FPN_CLASSIF_FC_LAYERS_SIZE = 1024
    FPN_FEATUREMAPS = 256
    # Reduce number of rois at all stages
    RPN_ANCHOR_STRIDE = 1
    RPN_TRAIN_ANCHORS_PER_IMAGE = 256
    POST_NMS_ROIS_TRAINING = 2000
    POST_NMS_ROIS_INFERENCE = 1000
    TRAIN_ROIS_PER_IMAGE = 200
    DETECTION_MAX_INSTANCES = 100
    MAX_GT_INSTANCES = 100
    # Adapt NMS Threshold
    DETECTION_NMS_THRESHOLD = 0.5
    # Adapt loss weights
    LOSS_WEIGHTS = {'rpn_class_loss': 2.0, 
                    'rpn_bbox_loss': 0.1, 
                    'mrcnn_class_loss': 2.0, 
                    'mrcnn_bbox_loss': 0.5, 
                    'mrcnn_mask_loss': 1.0}


config = LargeTrainConfig()
config.display()

# Create model object in training mode.
model = siamese_model.SiameseMaskRCNN(mode="training", model_dir=MODEL_DIR, config=config)

train_schedule = OrderedDict()
train_schedule[1] = {"learning_rate": config.LEARNING_RATE, "layers": "heads"}
train_schedule[120] = {"learning_rate": config.LEARNING_RATE, "layers": "all"}
train_schedule[160] = {"learning_rate": config.LEARNING_RATE/10, "layers": "all"}

model.load_imagenet_weights(pretraining='imagenet_687')
    
for epochs, parameters in train_schedule.items():
    print("")
    print("training layers {} until epoch {} with learning_rate {}".format(parameters["layers"], 
                                                                          epochs, 
                                                                          parameters["learning_rate"]))
    model.train(coco_train, coco_val, 
                learning_rate=parameters["learning_rate"], 
                epochs=epochs, 
                layers=parameters["layers"])
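
To make the schedule in the script above explicit, here is a small sketch that maps an epoch number to the layers and learning rate in effect. It assumes, as in the Matterport Mask R-CNN code this project builds on, that the `epochs` argument of `model.train` is the total epoch count reached at the end of each phase. `BASE_LR` and `phase_for_epoch` are hypothetical names, and the value 0.02 is a placeholder for `config.LEARNING_RATE`, not the actual setting.

```python
from collections import OrderedDict

BASE_LR = 0.02  # hypothetical stand-in for config.LEARNING_RATE


def build_schedule(base_lr):
    """Same three phases as the training script: key = last epoch of the phase."""
    schedule = OrderedDict()
    schedule[1] = {"learning_rate": base_lr, "layers": "heads"}
    schedule[120] = {"learning_rate": base_lr, "layers": "all"}
    schedule[160] = {"learning_rate": base_lr / 10, "layers": "all"}
    return schedule


def phase_for_epoch(epoch, schedule):
    """Return the settings used while training the given 1-based epoch."""
    for end_epoch, params in schedule.items():
        if epoch <= end_epoch:
            return params
    raise ValueError("epoch {} is beyond the end of the schedule".format(epoch))


schedule = build_schedule(BASE_LR)
for epoch in (1, 2, 100, 120, 121, 160):
    params = phase_for_epoch(epoch, schedule)
    print(epoch, params["layers"], params["learning_rate"])
```

Under this reading, a model inspected at epoch 100 has not yet seen the learning rate decay, which only takes effect from epoch 121 onward.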

michaelisc commented on May 23, 2024

What is the performance when you evaluate? Loss and AP are not always that correlated because the model uses a mixture of 5 loss functions. In my experience the classifier loss, for example, would barely drop, yet the model would still classify somewhat decently in the end (it is still the weak spot though).
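
To illustrate why a single loss curve can be hard to read, the sketch below combines five loss terms using the `LOSS_WEIGHTS` from the config posted earlier in the thread. It assumes the total loss is a plain weighted sum of the terms (regularization ignored), and the raw loss values are invented for illustration; they are not measurements from any run.

```python
# Weights copied from the LargeTrainConfig posted in this thread.
LOSS_WEIGHTS = {
    "rpn_class_loss": 2.0,
    "rpn_bbox_loss": 0.1,
    "mrcnn_class_loss": 2.0,
    "mrcnn_bbox_loss": 0.5,
    "mrcnn_mask_loss": 1.0,
}


def weighted_terms(raw_losses, weights=LOSS_WEIGHTS):
    """Scale each raw loss term by its configured weight."""
    return {name: weights[name] * value for name, value in raw_losses.items()}


# Hypothetical raw loss values, for illustration only.
raw = {
    "rpn_class_loss": 0.05,
    "rpn_bbox_loss": 0.5,
    "mrcnn_class_loss": 0.40,
    "mrcnn_bbox_loss": 0.4,
    "mrcnn_mask_loss": 0.45,
}

terms = weighted_terms(raw)
total = sum(terms.values())
for name, value in terms.items():
    print("{:18s} {:.2f} ({:.0%} of total)".format(name, value, value / total))
```

With these made-up numbers the classifier term accounts for half of the total, so a plateau in that one term keeps the summed loss high even while the other terms improve. Plotting the terms separately is usually more informative than the sum.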

NCTU-VRDL commented on May 23, 2024

When I use the pre-trained "large_siamese_mrcnn_coco_i4_0160.h5" weights for inference on the VOC dataset, I get 0.58 AP50. That makes sense, since "i4" may contain some of the same classes as VOC. But I get only 0.04 AP50 when using the model trained with the script above.

michaelisc commented on May 23, 2024

Is this detection or instance segmentation performance?

NCTU-VRDL commented on May 23, 2024

> Is this detection or instance segmentation performance?

Instance segmentation performance.

michaelisc commented on May 23, 2024

Can you test object detection? You are now the second person to report this and it seems as if something broke in the segmentation training.

And can you visualize some of the predictions? Maybe this gives some hints at what is going on.

NCTU-VRDL commented on May 23, 2024

> A quick check did not turn up any obvious bugs in the model. I'll have to check some of the older commits to figure out where the problem originated. That may take a while.

Maybe you can try using the pre-trained weights as initialization to train a new model. I also trained on the LVIS dataset, using "coco_full_0320.h5" as initialization, but the performance gets much worse after 100 epochs of training: AP50 (segm) drops from 0.24 to 0.16, where 0.24 is what the "coco_full_0320.h5" model achieves without any fine-tuning.

michaelisc commented on May 23, 2024

I first want to check if training with an older commit of the repo works. After all, I did run the experiments that led to the pre-trained checkpoints at some point, so something must have changed in between. But as I said, this may take a while.

NCTU-VRDL commented on May 23, 2024

> Can you test object detection? You are now the second person to report this and it seems as if something broke in the segmentation training.
>
> And can you visualize some of the predictions? Maybe this gives some hints at what is going on.

Hi! I think the problem comes from segmentation! I evaluated object detection and got 0.19 AP50, much better than segmentation. You can see the visualization below.
[image: visualization of predictions]

michaelisc commented on May 23, 2024

That's what I suspected. Won't be easy to fix as it is something that goes wrong during training and only for the mask branch (object detection and the pre-trained model work).
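
The gap between detection and segmentation AP50 can be illustrated with a toy case: a predicted mask can have exactly the right bounding box and still overlap the ground-truth mask very little. The sketch below uses sets of pixel coordinates on a 10x10 grid; the shapes are invented for illustration and say nothing about what the actual predictions look like.

```python
def mask_iou(a, b):
    """IoU of two masks given as sets of (row, col) pixels."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def bounding_box(mask):
    """Tight box around a mask as (row_min, col_min, row_max, col_max), max exclusive."""
    rows = [r for r, _ in mask]
    cols = [c for _, c in mask]
    return (min(rows), min(cols), max(rows) + 1, max(cols) + 1)


# Ground truth: upper-left triangle. Prediction: lower-right triangle.
gt_mask = {(r, c) for r in range(10) for c in range(10) if r + c < 10}
pred_mask = {(r, c) for r in range(10) for c in range(10) if r + c >= 9}

# Both masks span the full grid, so the boxes are identical (box IoU of 1.0).
print(bounding_box(gt_mask), bounding_box(pred_mask))
# The masks only share the anti-diagonal, far below the 0.5 threshold.
print(mask_iou(gt_mask, pred_mask))
```

Such a prediction counts as a true positive for detection AP50 but as a false positive for segmentation AP50, which is consistent with a fault that is limited to the mask branch.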

bominn commented on May 23, 2024

Hi @NCTU-VRDL
I am also trying to train on the non-VOC COCO dataset and run inference on the VOC dataset. Could you share your object detection result on the VOC dataset? Thanks a lot!
