meta-faster-r-cnn's Introduction

Meta-Faster-R-CNN

This repo contains the official PyTorch implementation for the AAAI 2022 Oral paper: 'Meta Faster R-CNN: Towards Accurate Few-Shot Object Detection with Attentive Feature Alignment' (paper).

Highlights

  • Our model is a natural extension of Faster R-CNN to the few-shot scenario, using prototype-based metric learning.
  • Our meta-learning-based models achieve strong few-shot object detection performance without fine-tuning.
  • Our model retains the knowledge of the base classes by learning a separate Faster R-CNN detection head for them.

Installation

Our codebase is built upon detectron2. You only need to install detectron2 following their instructions.

Please note that we used detectron2 0.2.1 in this project. Newer versions of detectron2 may raise errors.
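
For reference, a minimal install sketch (an example only, assuming Linux with CUDA 10.2 and PyTorch 1.6; pick the wheel index matching your CUDA/PyTorch combination from detectron2's install page):

python3 -m pip install torch==1.6.0 torchvision==0.7.0
python3 -m pip install detectron2==0.2.1 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu102/torch1.6/index.html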

Data Preparation

  • We evaluate our model on two FSOD benchmarks, PASCAL VOC and MSCOCO, following the previous work TFA.
  • Please prepare the original PASCAL VOC and MSCOCO datasets, along with the few-shot datasets following TFA, in the folders ./datasets/pascal_voc and ./datasets/coco respectively.
  • Please run the scripts in ./datasets/coco and ./datasets/pascal_voc step by step to generate the support images for both the many-shot base classes (used during meta-training) and the few-shot classes (used during few-shot fine-tuning); see the sketch after this list.
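
For MSCOCO, for example, the numbered scripts are run in order. The two names below are quoted from the issues further down this page; run any remaining numbered scripts in ./datasets/coco the same way:

python3 datasets/coco/2_gen_support_pool.py
python3 datasets/coco/3_gen_support_pool_any_shot_novel_class.py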

Model training and evaluation on MSCOCO

  • We have three training stages: first meta-training, then training the base-class detection head, and finally few-shot fine-tuning.
  • During meta-training, we have three training steps. First, we train the baseline model following FewX. Then we add the full feature fusion network in both the Meta-RPN and the Meta-Classifier, and finally we add the proposed attentive feature alignment. The training script is
sh scripts/meta_training_coco_resnet101_multi_stages.sh

After meta-training, the model is evaluated directly on the novel classes without fine-tuning.

  • We train a separate Faster R-CNN detection head for the base classes, reusing the shared feature backbone from the first stage. The training script is
sh scripts/faster_rcnn_with_fpn_coco_base_classes_branch.sh
  • We perform 1/2/3/5/10/30-shot fine-tuning over the novel classes after the three-step meta-training, using exactly the same few-shot datasets as TFA. The training script is
sh scripts/few_shot_finetune_coco_resnet101.sh

Model training and evaluation on PASCAL VOC

  • We evaluate our model on the same three splits as TFA.
  • As with MSCOCO, we have three training stages, and three training steps during meta-training.
  • The training scripts for VOC split1 are
sh scripts/meta_training_pascalvoc_split1_resnet101_multi_stages.sh
sh scripts/faster_rcnn_with_fpn_pascalvoc_split1_base_classes_branch.sh
sh scripts/few_shot_finetune_pascalvoc_split1_resnet101.sh
  • The training scripts for VOC split2 are
sh scripts/meta_training_pascalvoc_split2_resnet101_multi_stages.sh
sh scripts/faster_rcnn_with_fpn_pascalvoc_split2_base_classes_branch.sh
sh scripts/few_shot_finetune_pascalvoc_split2_resnet101.sh
  • The training scripts for VOC split3 are
sh scripts/meta_training_pascalvoc_split3_resnet101_multi_stages.sh
sh scripts/faster_rcnn_with_fpn_pascalvoc_split3_base_classes_branch.sh
sh scripts/few_shot_finetune_pascalvoc_split3_resnet101.sh

Model Zoo

We provide meta-trained models over the base classes for both the MSCOCO dataset and the three splits of the VOC dataset. The models are available via Google Drive and Tencent Weiyun.
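
If the training entry points follow detectron2's standard argument conventions (an assumption; fsod_train_net.py is built on detectron2's default trainer), a downloaded checkpoint can likely be evaluated without training via the --eval-only flag. The config path below is one quoted in the issues; the weights path is a placeholder:

python3 fsod_train_net.py --num-gpus 1 --eval-only --config-file configs/fsod/meta_training_coco_resnet101_stage_1.yaml MODEL.WEIGHTS path/to/downloaded_model.pth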

Citing Meta-Faster-R-CNN

If you use this work in your research or wish to refer to the baseline results published here, please use the following BibTeX entries:

@inproceedings{han2022meta,
  title={Meta faster r-cnn: Towards accurate few-shot object detection with attentive feature alignment},
  author={Han, Guangxing and Huang, Shiyuan and Ma, Jiawei and He, Yicheng and Chang, Shih-Fu},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={36},
  number={1},
  pages={780--789},
  year={2022}
}
@inproceedings{fan2020few,
  title={Few-shot object detection with attention-RPN and multi-relation detector},
  author={Fan, Qi and Zhuo, Wei and Tang, Chi-Keung and Tai, Yu-Wing},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={4013--4022},
  year={2020}
}
@inproceedings{wang2020frustratingly,
  title={Frustratingly simple few-shot object detection},
  author={Wang, Xin and Huang, Thomas E and Darrell, Trevor and Gonzalez, Joseph E and Yu, Fisher},
  booktitle={Proceedings of the 37th International Conference on Machine Learning},
  pages={9919--9928},
  year={2020}
}

Acknowledgement

This repo is developed based on FewX, TFA and detectron2. Thanks for their wonderful codebases.


meta-faster-r-cnn's Issues

Training methods

I have a question about the VOC datasets: they are divided into three fixed (not random) splits. I am renting a server online; can I rent three graphics cards of the same model and train the three splits separately, one per card, to obtain the results? Is this okay?

Question about how to design dataset

I would like to ask again: my dataset (a remote-sensing dataset) has many small targets, and the size difference between objects is large. (After inspecting the VOC dataset, I also found that its targets occupy a larger proportion of the original image, which differs from my dataset.) I chose novel classes containing both smaller and larger objects, for robustness. But during meta-training, no matter how I adjust the parameters, the base-class mAP is normal while the novel-class mAP is almost 0. Perhaps it is not a parameter problem: my support size is 320x320 by default, so should I reduce this value to better support meta-training? What would you suggest to solve this problem?

Run evaluation only?

Is there a way to run the model in evaluation mode only using the meta-trained model provided in the 'Model Zoo' section? The instructions in the readme only describe how to train the model from scratch. I am specifically interested in performing the evaluation for the VOC dataset.

Can't find how to generate final_split_voc_{}_shot_instances_train2014.json

Dear author, when I ran the 3_gen_support_pool_any_shot_novel_class.py file for the coco dataset, I found that the missing new_annotations/final_split_voc_{}_shot_instances_train2014.json file prevented me from generating the relevant support set. I have also carefully examined the other code and have not found the place where this file is generated. Could you tell me where this file is generated? Thank you!

query_cls = query_cls[0] IndexError: list index out of range

Dear author, when I run the training script you mentioned, CUDA_VISIBLE_DEVICES=3 python3 fsod_train_net.py --num-gpus 1 --dist-url auto --config-file configs/fsod/meta_training_coco_resnet101_stage_1.yaml, it reports the error
query_cls = query_cls[0]
IndexError: list index out of range
I see that you also print debug output at the query_cls = query_cls[0] line. How can I resolve this error?

Question about "FloatingPointError: loss bacame infinite or NaN at iteration"

Thank you very much for your patient answer before. When using my own dataset (a public dataset), I often hit "FloatingPointError: loss became infinite or NaN at iteration" in the first stage of training (the FewX baseline). The problem does not occur every time, so could it be due to random sampling? I prepared all my data with the code provided in your paper and the cited prepare_voc_few_shot.py, in the same format you suggested; the dataset is probably just small (800 images). I used four V100 GPUs with IMS_PER_BATCH=8 (two per GPU) for the first phase of meta-training, and even after lowering the learning rate to 0.00001 the error still occurs. After checking the FewX issues, I also tried upgrading detectron2 from 0.2.1 to 0.3, but the problem persists. I should add that the problem does not occur when I train with the default iteration numbers (15000, 20000), but in the second step of the first stage (adding the fusion network into the Meta-RPN and Meta-Classifier) the same error occurs again, and adjusting the learning rate does not solve it. Do you have any advice?

NaN error during training

Dear author, thanks for your great work. Currently I am trying to run your code but it always reports a NaN error; the following is the error traceback. Could you have a look? Thanks in advance!

proposals = self.predict_proposals(

File "/home/sjk/anaconda3/envs/chpy/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
return func(*args, **kwargs)
File "/home/1Tm2/CH/Meta-Faster-R-CNN/meta_faster_rcnn/modeling/fsod/fsod_rpn.py", line 523, in predict_proposals
return find_top_rpn_proposals(
File "/home/sjk/anaconda3/envs/chpy/lib/python3.8/site-packages/detectron2/modeling/proposal_generator/proposal_utils.py", line 103, in find_top_rpn_proposals
raise FloatingPointError(
FloatingPointError: Predicted boxes or scores contain Inf/NaN. Training has diverged.
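
One generic detectron2 mitigation for this kind of divergence (a standard solver option in detectron2 0.2+, not a fix confirmed by the author) is to enable gradient clipping through command-line config overrides, in addition to lowering the learning rate:

python3 fsod_train_net.py --num-gpus 4 --dist-url auto --config-file configs/fsod/meta_training_coco_resnet101_stage_1.yaml SOLVER.CLIP_GRADIENTS.ENABLED True SOLVER.CLIP_GRADIENTS.CLIP_TYPE value SOLVER.CLIP_GRADIENTS.CLIP_VALUE 1.0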

ValueError: a must be greater than 0 unless no samples are taken

When I train on my own dataset, there is an error:

File "/home/whb/anaconda3/envs/meta-frcnn1/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/whb/anaconda3/envs/meta-frcnn1/lib/python3.7/site-packages/detectron2/data/common.py", line 41, in getitem
data = self._map_func(self._dataset[cur_idx])
File "/home/whb/kunyan/Meta-Faster-R-CNN-main/meta_faster_rcnn/data/dataset_mapper_pascal_voc.py", line 129, in call
support_images, support_bboxes, support_cls = self.generate_support(dataset_dict)
File "/home/whb/kunyan/Meta-Faster-R-CNN-main/meta_faster_rcnn/data/dataset_mapper_pascal_voc.py", line 267, in generate_support
other_cls = self.support_df.loc[(~self.support_df['category_id'].isin(used_category_id)), 'category_id'].drop_duplicates().sample().tolist()[0]
File "/home/whb/anaconda3/envs/meta-frcnn1/lib/python3.7/site-packages/pandas/core/generic.py", line 5365, in sample
locs = rs.choice(axis_length, size=n, replace=replace, p=weights)
File "mtrand.pyx", line 909, in numpy.random.mtrand.RandomState.choice
ValueError: a must be greater than 0 unless no samples are taken

help please
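
A likely cause, inferred from the traceback rather than confirmed by the author: when every category in the support pool already appears in used_category_id, the filtered DataFrame is empty and pandas' sample() raises exactly this ValueError. A minimal Python reproduction with a hypothetical guard:

import pandas as pd

# Reproduces the failing pattern from dataset_mapper_pascal_voc.py:generate_support().
# If all categories are already "used" (e.g. a custom dataset with very few classes),
# the filtered frame is empty and .sample() raises:
# "ValueError: a must be greater than 0 unless no samples are taken"
support_df = pd.DataFrame({"category_id": [1, 2, 3]})
used_category_id = [1, 2, 3]

remaining = support_df.loc[
    ~support_df["category_id"].isin(used_category_id), "category_id"
].drop_duplicates()

if remaining.empty:
    # hypothetical guard: with no unused categories left, negative sampling cannot proceed
    print("No unused categories; the dataset may have too few classes for negative support sampling.")
else:
    other_cls = remaining.sample().tolist()[0]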

datasets/coco small vs origin

Dear author, what is the difference between the "small" version in the gen_support_pool_with_small_ .py file and the original gen_support_pool_with. py in datasets/coco/? I would love to hear from you.

Questions about how to generate vocsplit folder

After looking at the dataset-preparation section of your code, I would like to ask: if I want to do few-shot work when preparing my own dataset (in VOC format), do I need to generate a vocsplit folder? Should I generate this folder according to my own needs? I don't see any part of the source code that generates a vocsplit. Since I am new to few-shot object detection, I don't have a thorough understanding of this kind of dataset division, so I would like to know whether I need to generate my own vocsplit folder and what its contents should be. From the code I presume this folder stores txt files with target information for different shots and different categories, but I'm still not sure how exactly it is generated.

Questions about VOC dataset partitioning

Your work is interesting and inspiring. After reading your paper, I would like to ask whether the three ways of choosing novel classes for the VOC dataset are based on your own experience, or intended to enrich the experimental results, etc. Also, when dealing with my own dataset (already converted to VOC format), is it possible to define the division of novel classes in my own way, e.g., to use only one custom split instead of three?

I met some errors

Hello, I've been trying to use your code as a reference, but unfortunately I've encountered some issues that prevent it from running successfully. I appreciate your work, and I was wondering if it would be possible to ask you a few questions about the code. I understand you may be busy, but any assistance you could provide would be greatly appreciated. Thank you!
[screenshot of the error not included]
Question: Why is this situation occurring, and how can I resolve it?

Questions about loss_cls is Nan

When debugging the code, I suspect that I misunderstand some parameters, which leads to loss_cls being NaN in the first iteration. In "stage_1.yaml", what is the role of the _BASE_ config? I see its contents are in "Base-FSOD-C4.yaml", which sets some FsodRCNN parameters. But since the first stage trains the Meta-RPN and Meta-Classifier, shouldn't FsodRCNN belong to the second stage? If so, why is there a new configuration file "faster_rcnn_with_fpn_pascalvoc_split1_base_classes_branch.yaml" for the second stage? I also found that in "Base-FSOD-C4.yaml" the DATASETS entry points to the coco dataset (and the file only appears alongside the VOC configs). I'm a little confused by these parameter settings, and by num_classes being set to 1 in ROI_HEADS.
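
Background on the _BASE_ key, for anyone hitting the same confusion: in detectron2, a YAML config that declares _BASE_ inherits every value from the referenced file, and its own keys override the inherited ones, so stage_1.yaml layers its stage-specific settings on top of Base-FSOD-C4.yaml (including DATASETS, unless overridden). A minimal sketch of the merge mechanics using a stock detectron2 config (generic detectron2 API; this repo's custom keys would additionally need its own config registration):

from detectron2 import model_zoo
from detectron2.config import get_cfg

cfg = get_cfg()
# merge_from_file resolves the _BASE_ chain recursively (here Base-RCNN-C4.yaml),
# then applies the child file's overrides on top of the inherited values.
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_C4_1x.yaml"))
print(cfg.MODEL.ROI_HEADS.NUM_CLASSES)  # the final, merged value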

About code

Hi author! When will you upload the complete code, please? I'm very interested in your work.

Target Visualization

"Hello, I'm sorry to bother you. Could you please provide the source code for visualizing the affinity matrix for alignment and foreground attention masks?"

Dear author, I would like to ask what may cause the following problems in the process of generating support sets when running your code:

python datasets/coco/2_gen_support_pool.py
[u'info', u'licenses', u'images', u'annotations', u'categories']
loading annotations into memory...
Done (t=5.84s)
creating index...
index created!
0
('path:', '/home/lthpc/Annotation/CH/Meta-Faster-R-CNN-main/datasets/coco/support/trainval2014/COCO_train2014_000000262146.jpg')
/home/lthpc/Annotation/CH/Meta-Faster-R-CNN-main/datasets/coco/support/trainval2014/COCO_train2014_000000262146.jpg
('img.shape', None)
Traceback (most recent call last):
File "datasets/coco/2_gen_support_pool.py", line 267, in
support_df = main()
File "datasets/coco/2_gen_support_pool.py", line 243, in main
support_img, support_box = crop_support(im, bbox)
File "datasets/coco/2_gen_support_pool.py", line 51, in crop_support
image_shape = img.shape[:2]# h, w
AttributeError: 'NoneType' object has no attribute 'shape'
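
A likely cause, inferred from the ('img.shape', None) line in the log rather than confirmed by the author: cv2.imread() returns None instead of raising when the file at the printed path is missing or unreadable, and the None only fails later inside crop_support(). A small diagnostic sketch (the path below is copied from the log above):

import os
import cv2

img_path = "datasets/coco/support/trainval2014/COCO_train2014_000000262146.jpg"
if not os.path.exists(img_path):
    # the COCO train2014 images were probably not copied/linked into datasets/coco yet
    raise FileNotFoundError(img_path)
img = cv2.imread(img_path)  # returns None on failure instead of raising
if img is None:
    raise IOError("unreadable or corrupted image: " + img_path)
print(img.shape[:2])  # h, w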

Memory requirements question

Hello!
I am trying to apply your project to my own few-shot dataset, and I was wondering what GPU memory requirements you would recommend.
As I understand it, the meta-trained models you published for download should later be fine-tuned with novel data. Do you have suggestions on the minimum memory required and the batch size to use for fine-tuning?

Thanks!

When I run the scripts/faster_rcnn_with_fpn_coco_base_classes_branch.sh script, I always encounter the following problem:

Specifically, I run the following command: CUDA_VISIBLE_DEVICES=4 python3 faster_rcnn_train_net.py --num-gpus 1 --dist-url auto --config-file configs/fsod/faster_rcnn_with_fpn_coco_base_classes_branch.yaml 2>&1 | tee log/faster_rcnn_with_fpn_coco_base_classes_branch.txt

The code runs smoothly at first, but an error is reported partway through training. Could you see what the reason is? Have you encountered it?
[05/21 16:11:43 d2.utils.events]: eta: 1 day, 8:04:06 iter: 2659 total_loss: 0.4677 loss_cls: 0.18 loss_box_reg: 0.1562 loss_rpn_cls: 0.0641 loss_rpn_loc: 0.05643 time: 1.1096 last_time: 1.0202 data_time: 0.1139 last_data_time: 0.1274 lr: 0.02 max_mem: 13627M
[05/21 16:12:04 d2.utils.events]: eta: 1 day, 8:01:23 iter: 2679 total_loss: 0.4508 loss_cls: 0.1935 loss_box_reg: 0.1532 loss_rpn_cls: 0.04843 loss_rpn_loc: 0.05691 time: 1.1092 last_time: 1.0267 data_time: 0.1128 last_data_time: 0.1354 lr: 0.02 max_mem: 13627M
[05/21 16:12:25 d2.utils.events]: eta: 1 day, 7:59:51 iter: 2699 total_loss: 0.4921 loss_cls: 0.1895 loss_box_reg: 0.1568 loss_rpn_cls: 0.05576 loss_rpn_loc: 0.0615 time: 1.1086 last_time: 1.1702 data_time: 0.1112 last_data_time: 0.1801 lr: 0.02 max_mem: 13627M
ERROR [05/21 16:12:45 d2.engine.train_loop]: Exception during training:
Traceback (most recent call last):
File "/home/user3/gyq/Meta-Faster-R-CNN-main/detectron2/detectron2/engine/train_loop.py", line 155, in train
self.run_step()
File "/home/user3/gyq/Meta-Faster-R-CNN-main/detectron2/detectron2/engine/defaults.py", line 494, in run_step
self._trainer.run_step()
File "/home/user3/gyq/Meta-Faster-R-CNN-main/detectron2/detectron2/engine/train_loop.py", line 297, in run_step
data = next(self._data_loader_iter)
File "/home/user3/gyq/Meta-Faster-R-CNN-main/detectron2/detectron2/data/common.py", line 291, in iter
for d in self.dataset:
File "/home/user3/miniconda3/envs/gyq/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 530, in next
data = self._next_data()
File "/home/user3/miniconda3/envs/gyq/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1204, in _next_data
return self._process_data(data)
File "/home/user3/miniconda3/envs/gyq/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1250, in _process_data
data.reraise()
File "/home/user3/miniconda3/envs/gyq/lib/python3.8/site-packages/torch/_utils.py", line 457, in reraise
raise exception
OSError: Caught OSError in DataLoader worker process 2.
Original Traceback (most recent call last):
File "/home/user3/miniconda3/envs/gyq/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
data = fetcher.fetch(index)
File "/home/user3/miniconda3/envs/gyq/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 32, in fetch
data.append(next(self.dataset_iter))
File "/home/user3/gyq/Meta-Faster-R-CNN-main/detectron2/detectron2/data/common.py", line 258, in iter
yield self.dataset[idx]
File "/home/user3/gyq/Meta-Faster-R-CNN-main/detectron2/detectron2/data/common.py", line 95, in getitem
data = self._map_func(self._dataset[cur_idx])
File "/home/user3/gyq/Meta-Faster-R-CNN-main/detectron2/detectron2/utils/serialize.py", line 26, in call
return self._obj(*args, **kwargs)
File "/home/user3/gyq/Meta-Faster-R-CNN-main/detectron2/detectron2/data/dataset_mapper.py", line 154, in call
image = utils.read_image(dataset_dict["file_name"], format=self.image_format)
File "/home/user3/gyq/Meta-Faster-R-CNN-main/detectron2/detectron2/data/detection_utils.py", line 185, in read_image
return convert_PIL_to_numpy(image, format)
File "/home/user3/gyq/Meta-Faster-R-CNN-main/detectron2/detectron2/data/detection_utils.py", line 76, in convert_PIL_to_numpy
image = image.convert(conversion_format)
File "/home/user3/.local/lib/python3.8/site-packages/PIL/Image.py", line 934, in convert
self.load()
File "/home/user3/.local/lib/python3.8/site-packages/PIL/ImageFile.py", line 251, in load
raise OSError(
OSError: image file is truncated (9 bytes not processed)

[05/21 16:12:45 d2.engine.hooks]: Overall training speed: 2717 iterations in 0:50:11 (1.1082 s / it)
[05/21 16:12:45 d2.engine.hooks]: Total training time: 0:50:44 (0:00:33 on hooks)
[05/21 16:12:45 d2.utils.events]: eta: 1 day, 7:59:04 iter: 2719 total_loss: 0.4697 loss_cls: 0.1902 loss_box_reg: 0.1503 loss_rpn_cls: 0.05803 loss_rpn_loc: 0.05316 time: 1.1082 last_time: 1.0643 data_time: 0.1117 last_data_time: 0.0886 lr: 0.02 max_mem: 13627M
/home/user3/miniconda3/envs/gyq/lib/python3.8/site-packages/torch/functional.py:568: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/native/TensorShape.cpp:2228.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
Traceback (most recent call last):
File "faster_rcnn_train_net.py", line 80, in
launch(
File "/home/user3/gyq/Meta-Faster-R-CNN-main/detectron2/detectron2/engine/launch.py", line 84, in launch
main_func(*args)
File "faster_rcnn_train_net.py", line 74, in main
return trainer.train()
File "/home/user3/gyq/Meta-Faster-R-CNN-main/detectron2/detectron2/engine/defaults.py", line 484, in train
super().train(self.start_iter, self.max_iter)
File "/home/user3/gyq/Meta-Faster-R-CNN-main/detectron2/detectron2/engine/train_loop.py", line 155, in train
self.run_step()
File "/home/user3/gyq/Meta-Faster-R-CNN-main/detectron2/detectron2/engine/defaults.py", line 494, in run_step
self._trainer.run_step()
File "/home/user3/gyq/Meta-Faster-R-CNN-main/detectron2/detectron2/engine/train_loop.py", line 297, in run_step
data = next(self._data_loader_iter)
File "/home/user3/gyq/Meta-Faster-R-CNN-main/detectron2/detectron2/data/common.py", line 291, in iter
for d in self.dataset:
File "/home/user3/miniconda3/envs/gyq/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 530, in next
data = self._next_data()
File "/home/user3/miniconda3/envs/gyq/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1204, in _next_data
return self._process_data(data)
File "/home/user3/miniconda3/envs/gyq/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1250, in _process_data
data.reraise()
File "/home/user3/miniconda3/envs/gyq/lib/python3.8/site-packages/torch/_utils.py", line 457, in reraise
raise exception
OSError: Caught OSError in DataLoader worker process 2.
Original Traceback (most recent call last):
File "/home/user3/miniconda3/envs/gyq/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
data = fetcher.fetch(index)
File "/home/user3/miniconda3/envs/gyq/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 32, in fetch
data.append(next(self.dataset_iter))
File "/home/user3/gyq/Meta-Faster-R-CNN-main/detectron2/detectron2/data/common.py", line 258, in iter
yield self.dataset[idx]
File "/home/user3/gyq/Meta-Faster-R-CNN-main/detectron2/detectron2/data/common.py", line 95, in getitem
data = self._map_func(self._dataset[cur_idx])
File "/home/user3/gyq/Meta-Faster-R-CNN-main/detectron2/detectron2/utils/serialize.py", line 26, in call
return self._obj(*args, **kwargs)
File "/home/user3/gyq/Meta-Faster-R-CNN-main/detectron2/detectron2/data/dataset_mapper.py", line 154, in call
image = utils.read_image(dataset_dict["file_name"], format=self.image_format)
File "/home/user3/gyq/Meta-Faster-R-CNN-main/detectron2/detectron2/data/detection_utils.py", line 185, in read_image
return convert_PIL_to_numpy(image, format)
File "/home/user3/gyq/Meta-Faster-R-CNN-main/detectron2/detectron2/data/detection_utils.py", line 76, in convert_PIL_to_numpy
image = image.convert(conversion_format)
File "/home/user3/.local/lib/python3.8/site-packages/PIL/Image.py", line 934, in convert
self.load()
File "/home/user3/.local/lib/python3.8/site-packages/PIL/ImageFile.py", line 251, in load
raise OSError(
OSError: image file is truncated (9 bytes not processed)
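
A common workaround for this specific Pillow error (a general Pillow behavior, not repo-specific advice): one truncated or corrupted JPEG in the training set aborts the DataLoader worker. Either re-download the offending image, or tell Pillow to tolerate truncated files, e.g. near the top of the training entry point:

from PIL import ImageFile

# Load truncated images instead of raising "OSError: image file is truncated";
# the missing trailing bytes are filled in by Pillow.
ImageFile.LOAD_TRUNCATED_IMAGES = True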

boxes visualization

Hello, sorry to disturb you: could I ask you for the source code that draws the final predicted boxes in your visualizations?

The inference scripts

Hello, I only found the training scripts in the README. Does this mean that inference is performed automatically after training?

Questions about support pool generation code

Hello, thank you for your work. I am confused by the crop_support() function in datasets/pascal_voc/2_gen_support_pool.py used during support pool generation. What is the meaning of the context_pixel variable? And how should variables like context_pixel and target_size be set when transferring your code to other datasets?
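
For readers with the same question, a generic illustration of what context padding around a support box usually means in FSOD pipelines (a hypothetical crop_with_context(), not this repo's actual crop_support() implementation): the ground-truth box is expanded by context_pixel on each side before cropping, then resized to target_size:

import cv2
import numpy as np

def crop_with_context(img: np.ndarray, box, context_pixel: int = 16, target_size: int = 320):
    # Expand box = (x1, y1, x2, y2) by context_pixel on every side,
    # clip to the image bounds, crop, and resize to target_size x target_size.
    h, w = img.shape[:2]
    x1, y1, x2, y2 = [int(v) for v in box]
    x1 = max(x1 - context_pixel, 0)
    y1 = max(y1 - context_pixel, 0)
    x2 = min(x2 + context_pixel, w)
    y2 = min(y2 + context_pixel, h)
    crop = img[y1:y2, x1:x2]
    return cv2.resize(crop, (target_size, target_size))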

Questions about “RuntimeError: mat 1 dim 1 must match mat 2 dim 0”

Sorry to bother you again in your busy schedule. When I run the first-stage code fsod_train_net.py, line 583 of fsod_fast_rcnn.py, out_fc = F.relu(self.fc_1(cat_fc), inplace=True), reports a shape-mismatch error. Debugging shows that the actual input size reaching the linear layer fc_1 is 940962*2, while its required input dimension is 4096. Should I flatten it with a reshape operation here, and then map the result to dimension 4096 with a newly added linear layer before feeding the original fc_1?

problems about training on a single card

Hi, thanks for your work. I have a 16 GB GPU, and in order to run the training process I changed support_shot from 30 to 20 and the batch size from 8 to 2. But I don't know whether it would be better to reduce support_shot further and increase the batch size. Apart from that, do I need to change the learning rate and the number of iterations? I'd appreciate any help you can give!
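
A common detectron2 convention relevant here (the linear scaling rule, not author guidance for this repo): when SOLVER.IMS_PER_BATCH shrinks from 8 to 2, scale SOLVER.BASE_LR down by the same 4x factor and stretch the schedule by 4x. Sketched as command-line overrides with placeholder values:

python3 fsod_train_net.py --num-gpus 1 --config-file <config.yaml> SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR <original_lr/4> SOLVER.MAX_ITER <original_iter*4> SOLVER.STEPS <original_steps*4>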

Problem: data cannot be obtained!

Hello author, I ran the code from the paper and encountered some issues. I have tried to solve them but have not been successful. Can you help me take a look? The problems are shown in the screenshots below:
[three screenshots not included]

about GPU

Hello, I am interested in your work, and I would like to know what type of GPUs you used for training. Could 4 x 3090 GPUs do this work?

About test datasets on pascal_voc

No such file or directory: 'datasets/pascal_voc/VOC2007/ImageSets/Main/test.txt'

When I try to run the script, I get the above error. How can I get the test.txt for VOC2007?

Questions about ap50 of novel data in meta-testing

I would like to ask about the metrics on the novel-class test data in the first of the three training stages. My current understanding: step 1-1 is the FewX baseline trained on base data; step 1-2 introduces class prototypes and adds the Meta-RPN and Meta-Classifier; step 1-3 addresses the spatial misalignment problem. What AP50 should the novel-class data reach when testing after steps 1-1, 1-2, and 1-3? When I use my own dataset, the novel-class AP50 in all three steps is basically close to 0. (In Table 7 of your paper there are AP50 values for meta-training on the base classes and meta-testing on the novel classes; is this metric the average AP50 when testing on the novel-class data?) And since the novel-class data is truly not involved in meta-training, why can meta-testing on the novel classes achieve such good results?

could you help me this problem?

File "/data/sam/Meta-Faster-R-CNN-main/meta_faster_rcnn/modeling/fsod/fsod_fast_rcnn.py", line 112, in fsod_fast_rcnn_inference_single_image
scores = scores.reshape(cls_num, box_num).permute(1, 0)
RuntimeError: shape '[20, 645]' is invalid for input of size 12912

I want to run my own dataset on your work.

Hello, thank you very much for your work, but I have a question. I want to run my own dataset, which is in VOC format, with your code. How can I do that? Could you give me an operation manual?

The performance for 30-shot on COCO

Hi, I am trying to reproduce your method on COCO. I use 8 GPUs and keep the other default parameters. I can achieve similar performance before fine-tuning. After fine-tuning, I achieve 12.2 AP on 10-shot and 15.2 AP on 30-shot. The 30-shot performance is far lower than your reported 16.6 AP. Is there anything in the config file that needs to be changed? Thanks.

about data?

Thank you for your work. I would like to train on my own dataset. May I ask what the dataset structure should look like? I don't quite understand the explanation in the README. Thank you for your reply.

Questions about backbone on meta-training

Thank you for your previous patient answer. After studying your work for some time, I have a question: why use Faster R-CNN C4 rather than Faster R-CNN FPN for meta-training? Is it because of memory consumption? Can the meta-training backbone be changed to Faster R-CNN FPN?

Hello! No such file or directory: './datasets/coco/train_support_df.pkl'. Could you tell me how to generate train_support_df.pkl?

Traceback (most recent call last):
File "fsod_train_net.py", line 211, in
args=(args,),
File "/home/ch/anaconda3/envs/ch/lib/python3.7/site-packages/detectron2/engine/launch.py", line 82, in launch
main_func(*args)
File "fsod_train_net.py", line 197, in main
trainer = Trainer(cfg)
File "/home/ch/anaconda3/envs/ch/lib/python3.7/site-packages/detectron2/engine/defaults.py", line 378, in init
data_loader = self.build_train_loader(cfg)
File "fsod_train_net.py", line 55, in build_train_loader
mapper = DatasetMapperWithSupportCOCO(cfg)
File "/home/ch/Meta-Faster-R-CNN/meta_faster_rcnn/data/dataset_mapper_coco.py", line 99, in init
self.support_df = pd.read_pickle("./datasets/coco/train_support_df.pkl")
File "/home/ch/anaconda3/envs/ch/lib/python3.7/site-packages/pandas/io/pickle.py", line 169, in read_pickle
f, fh = get_handle(fp_or_buf, "rb", compression=compression, is_text=False)
File "/home/ch/anaconda3/envs/ch/lib/python3.7/site-packages/pandas/io/common.py", line 499, in get_handle
f = open(path_or_buf, mode)
FileNotFoundError: [Errno 2] No such file or directory: './datasets/coco/train_support_df.pkl'

Dear author, thank you for your work. I would like to use your framework, but I cannot find the text files under the log folder

I can't find the following text files:
log/meta_training_pascalvoc_split1_resnet101_stage_1.txt
log/meta_training_pascalvoc_split1_resnet101_stage_2.txt
log/meta_training_pascalvoc_split1_resnet101_stage_3.txt
log/meta_training_coco_resnet101_stage_1.txt
log/meta_training_coco_resnet101_stage_2.txt
log/meta_training_coco_resnet101_stage_3.txt
log/1shot_finetune_pascalvoc_split1_resnet101.txt
log/faster_rcnn_with_fpn_coco_base_classes_branch.txt
...
If you could provide them to me, I would be very grateful! Thank you!
