hdmnet's Introduction

Hierarchical Dense Correlation Distillation for Few-Shot Segmentation

📢 Thanks for your interest in our work! This is the official implementation of our CVPR 2023 paper "Hierarchical Dense Correlation Distillation for Few-Shot Segmentation". We have also released the corresponding models.

Abstract: Few-shot semantic segmentation (FSS) aims to form class-agnostic models segmenting unseen classes with only a handful of annotations. Previous methods limited to the semantic feature and prototype representation suffer from coarse segmentation granularity and train-set overfitting. In this work, we design Hierarchically Decoupled Matching Network (HDMNet) mining pixel-level support correlation based on the transformer architecture. The self-attention modules are used to assist in establishing hierarchical dense features, as a means to accomplish the cascade matching between query and support features. Moreover, we propose a matching module to reduce train-set overfitting and introduce correlation distillation leveraging semantic correspondence from coarse resolution to boost fine-grained segmentation. Our method performs decently in experiments. We achieve 50.0% mIoU on COCO dataset one-shot setting and 56.0% on five-shot segmentation, respectively.

Get Started

📘 Environment

  • python == 3.8.13
  • torch == 1.12.1
  • torchvision == 0.13.1
  • cuda == 11.0
  • mmcv-full == 1.6.1
  • mmsegmentation == 0.27.0

Please refer to the guidelines in MMSegmentation v0.27.0.
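
Version mismatches are a common source of import errors with this stack, so it can help to compare the installed packages against the pins listed above before training. The following is a minimal sketch, not part of the repository: the helper names (`installed_version`, `find_mismatches`) are hypothetical, and it assumes the packages were installed under the PyPI distribution names `torch`, `torchvision`, `mmcv-full`, and `mmsegmentation`.

```python
from importlib import metadata

# Versions pinned in the environment list above (distribution names are assumed).
PINNED = {
    "torch": "1.12.1",
    "torchvision": "0.13.1",
    "mmcv-full": "1.6.1",
    "mmsegmentation": "0.27.0",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package that deviates from its pin to a (wanted, found) pair."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        # Local build tags such as "+cu113" are ignored for the comparison.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(PINNED).items():
        print(f"{name}: expected {wanted}, found {found}")
```

If the script prints nothing, the four installed packages match the pinned versions.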

📝Dataset

Please download the following datasets:

We follow PFENet's list generation and have uploaded the data lists. You can download them directly and put them into the ./lists directory.

💥 Please make sure that you uncomment the data list generation sections and generate the base annotations when running the code for the first time. For more details, refer to util/get_mulway_base_data.py and util/dataset.py.

Models

We follow the same procedure as BAM for the pre-trained backbones and place them in the ./initmodel directory. We have also uploaded the fully trained models for the COCO dataset for your convenience. For the Pascal dataset, you can retrain the models directly since the training time is less than 10 hours.

To reproduce the results reported in our paper, you can simply download the corresponding models and run the test script. However, we still highly recommend that you retrain the model. Please note that experimental results may vary with different environments and settings. We sometimes get mIoU results up to 1.0% higher than those reported in the paper. However, it is still acceptable to compare your results only with those reported in the paper. Good luck! 😄😄

Scripts

  • First, update the configurations in ./config for training or testing

  • Train script

sh train.sh [exp_name] [dataset] [GPUs]

# Example (split0 | COCO dataset | 4 GPUs for training):
# sh train.sh split0 coco 4

  • Test script

sh test.sh [exp_name] [dataset] [GPUs]

# Example (split0 | COCO dataset | 1 GPU for testing):
# sh test.sh split0 coco 1

References

This repository owes its existence to the exceptional contributions of other projects:

Many thanks to their invaluable contributions.

BibTeX

If you find our work and this repository useful, please consider giving a star ⭐ and a citation 📚.

@article{peng2023hierarchical,
  title={Hierarchical Dense Correlation Distillation for Few-Shot Segmentation},
  author={Peng, Bohao and Tian, Zhuotao and Wu, Xiaoyang and Wang, Chenyao and Liu, Shu and Su, Jingyong and Jia, Jiaya},
  journal={arXiv preprint arXiv:2303.14652},
  year={2023}
}

hdmnet's Issues

--arch

Hello, I get the error: NameError: name 'BAM' is not defined. What causes it, and what role does the --arch argument play in your code?

The code is inconsistent with the paper description

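Hello, I am very interested in your paper, but while reading the paper and the code I found several inconsistencies and would like to ask about them. For example:

  1. For the Correlation module in Fig. 3, Sec. 3.3 says that matching query and support features with cross-attention (scaled dot-product) leads to overfitting and weak generalization, so cosine similarity is used instead, and Fig. 3 shows the same. However, the implementation in model/Transformer.py still uses cross-attention.
  2. For Correlation Distillation in Fig. 3, Eq. (9) in Sec. 3.3 describes distillation between the correlations of adjacent levels, but the implementation in model/HDMNet.py distills from the support image mask (teacher) to the correlation at every level.
    In addition, some parts of the paper are not described clearly enough. For example:
  3. You actually use the base learner + meta learner structure from BAM, but the main text mentions only once, in the implementation details, that PSPNet serves as the base learner. The method section does not mention it at all.
  4. The code feeds the support image, support mask, and query image into the pre-trained feature extractor, but Sec. 3.2 does not mention this operation.
  5. The model's loss formulas, etc.
    I would like to know:
  6. Is the released code the latest version? If not, could it be updated soon?
  7. Could you clarify the points above that I am unsure about?
    Thank you.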
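
For readers unfamiliar with the distinction raised in point 1, the two matching scores can be written down in a few lines. This is an illustrative sketch in plain Python, not the code in model/Transformer.py. The function names and the toy feature vectors are assumptions made for the example.

```python
import math


def dot(u, v):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))


def scaled_dot_product(q, k):
    """Attention-style score q.k / sqrt(d); it grows with feature magnitude."""
    return dot(q, k) / math.sqrt(len(q))


def cosine_similarity(q, k, eps=1e-8):
    """Cosine score q.k / (|q||k|); it depends only on the feature direction."""
    return dot(q, k) / (math.sqrt(dot(q, q)) * math.sqrt(dot(k, k)) + eps)


def correlation(query_feats, support_feats, score):
    """Dense correlation: one row per query pixel, one column per support pixel."""
    return [[score(q, k) for k in support_feats] for q in query_feats]


query = [[1.0, 0.0], [0.0, 2.0]]    # two toy query pixels, feature dim 2
support = [[3.0, 0.0], [0.0, 0.5]]  # two toy support pixels

print(correlation(query, support, scaled_dot_product))
print(correlation(query, support, cosine_similarity))
```

Rescaling a query feature changes its scaled dot-product scores but leaves its cosine scores essentially unchanged, which is the property the question attributes to the paper's choice of cosine similarity.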

"align_corners" in resize function

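Thank you for the excellent work!
I would like to ask: the code alternates between mmseg's resize and F.interpolate for bilinear interpolation. However, align_corners is set to False in mmseg's resize, while it is set to True in F.interpolate. Is there a specific reason for this?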
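
The two settings map output positions to input positions differently, so mixing them gives slightly different values for the same resize. The sketch below re-implements the 1D linear case in plain Python to show the effect. It follows the coordinate mapping described in the PyTorch documentation as I understand it, does not call torch or mmseg, and uses a hypothetical function name `resize_linear_1d`.

```python
import math


def resize_linear_1d(values, out_size, align_corners):
    """Linearly resample a 1D list to out_size samples."""
    in_size = len(values)
    out = []
    for dst in range(out_size):
        if align_corners:
            # First and last samples of input and output coincide.
            scale = (in_size - 1) / (out_size - 1) if out_size > 1 else 0.0
            src = dst * scale
        else:
            # Samples are treated as cell centres; negative positions are clamped.
            src = max((dst + 0.5) * in_size / out_size - 0.5, 0.0)
        i0 = min(int(math.floor(src)), in_size - 1)
        i1 = min(i0 + 1, in_size - 1)
        w = src - i0
        out.append((1 - w) * values[i0] + w * values[i1])
    return out


signal = [0.0, 10.0]
print(resize_linear_1d(signal, 4, align_corners=True))
print(resize_linear_1d(signal, 4, align_corners=False))
```

With align_corners=True the four outputs are evenly spaced between 0 and 10. With align_corners=False the interior samples shift to 2.5 and 7.5, so feature maps resized with different settings are not pixel-aligned.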

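Large discrepancy when reproducing Pascal

Hello, thank you for your excellent work.
When reproducing the results on the Pascal dataset, I get results that differ considerably from the paper. What could be the reason?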

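About the data

Hello, I am new to few-shot segmentation and am not sure how to handle the data you use. Could you explain how your data was merged? Is it VOC2012 only, or does VOC2012 need to be combined with SBD?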

how to pretrain backbone

Hi, I'm very interested in your work. If I want to run experiments on a dataset other than COCO and Pascal, do I need to pretrain the backbone?

Performance of the baseline

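Dear HDMNet authors,
Hello! Your work on HDMNet is excellent, and thank you very much for open-sourcing the code. I have a small question: the first row of Table 4 (the ablation table) in the paper reports the performance of your baseline. According to Sec. 4.3, the baseline is built similarly to CyCTR, except that the cycle-consistency module is removed and the number of self- and cross-attention layers differs. Yet the baseline far outperforms CyCTR (44.7 vs. 40.3) and also exceeds the best results of many other current methods. What is the main reason for the significant performance gap between your baseline and CyCTR?
I look forward to your reply.
Thank you again for your contribution to the community.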

results without BAM ensemble

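Hello, thank you for open-sourcing the code.

However, your work uses the same data as BAM, and BAM's data processing relies on a trick; see https://github.com/chunbolang/BAM/issues/45
Both works now use this trick when comparing with previous work, which is unfair.
If you used the same data as HSNet, would your method still reach SOTA?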

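Implementation details

Hello, I found that the implementation of the correlation module seems to differ from the description in the paper.
According to the paper, the inputs to the correlation module should be query_feat and the supp_feat processed with supp_mask.
However, your code does not include this step; instead, supp_mask is applied to the attn matrix.
Why is that? Or did I simply fail to find it?

One more small question: previous work seems to have abandoned masking the supp_samples at the input and instead applies the mask to supp_feat. Your method goes back to the former approach. Why is that? Are there any comparative experimental results?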

About batch size

Hello,
The paper says the batch size for COCO training is 6, but if we follow the README and train with 4 GPUs in distributed data parallel, isn't the actual batch size 4*6=24?
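
The arithmetic in this question depends on whether the configured batch size is meant per GPU or as a global total, which the question leaves open. A minimal sketch of both readings follows. The function and parameter names are hypothetical, and the sketch does not describe how train.py actually interprets its configuration.

```python
def effective_batch_size(configured, num_gpus, configured_is_per_gpu):
    """Total samples per optimizer step across all data-parallel processes."""
    if configured_is_per_gpu:
        return configured * num_gpus
    return configured


def per_gpu_batch_size(global_batch, num_gpus):
    """Per-process batch when a global batch is split evenly across GPUs."""
    if global_batch % num_gpus != 0:
        raise ValueError("global batch is not divisible by the number of GPUs")
    return global_batch // num_gpus


# Reading 1: 6 samples on each of 4 GPUs.
print(effective_batch_size(6, 4, configured_is_per_gpu=True))   # 24
# Reading 2: 6 is already the global total.
print(effective_batch_size(6, 4, configured_is_per_gpu=False))  # 6
```

Checking how the training script builds its DataLoader (for example, whether it divides the configured value by the number of GPUs) settles which reading applies.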

weights

Hello. Thank you for sharing the code.
I want to use your code for the segmentation of 2 classes using resnet50. When I run the code, it loads the best.pth file, but this file was prepared for 16 classes, so I get an error when segmenting 2 classes. How can I use that file for 2 classes? Thanks

Questions about the lists

Hello, sorry to bother you again.
Both train and val contain data_list_x.txt and sub_class_file_list_x.txt. What is the difference between the two?

mmcv version

Could you please tell me which versions of mmcv and mmseg are required? I get an error when running the code, which seems to be a version problem.
The error is:
ModuleNotFoundError: No module named 'mmcv.cnn.utils.weight_init'

I tried two versions of mmcv, 2.0.0 and 0.4.3, and got errors with both.

Using mmsegmentation in the code.

Hi, thank you for sharing this excellent work. I would like to know whether this project is built on mmsegmentation. I have cloned this repository as well as mmsegmentation, and I now want to proceed with my custom dataset. Could you point out where in the code I should change the dataset?

Regards

About the permissions of the shared coco weights

Hi @Pbihao, thanks for your great work! Could you please check whether the pretrained COCO weights have been shared with the correct permissions? It seems that we cannot access them without a CUHK email account. Thank you very much!

code

Hello, we are very interested in your work. May I ask when the training and testing code can be released? Thanks!

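About fss_list

Dear authors, hello. While reproducing your experiments, I found that the COCO entries in your fss_list differ from BAM's. Did you use a different procedure from BAM to obtain fss_list? If so, what is it?

About training time

Hello, do you have statistics on the total training time? Could you share how long each experiment takes to train?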

Unable to access google drive for backbones as well as lists and other resources

Hi,
Congratulations on your excellent work. A few hours ago I was able to access the resources, but now I get an error. I would also like to run the code as is to get the base model results before trying my custom dataset and further extensions. I keep getting the following error:

Traceback (most recent call last):
  File "train.py", line 586, in <module>
    main()
  File "train.py", line 218, in main
    loss_train, mIoU_train, mAcc_train, allAcc_train = train(train_loader, val_loader, model, optimizer, epoch)
  File "train.py", line 389, in train
    allAcc = sum(intersection_meter.sum) / (sum(target_meter.sum) + 1e-10)
TypeError: 'int' object is not iterable

Any idea what I am doing wrong here?
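
The TypeError in this traceback is what Python raises when sum() receives a plain integer instead of a sequence. One possible reading, offered as an assumption rather than a diagnosis, is that the meter was never updated because the training loader yielded no batches (for example, when the data lists are missing), so its running sum was still the initial integer 0. The sketch below reproduces that situation with a hypothetical AverageMeter. The repository's own class may behave differently.

```python
class AverageMeter:
    """Hypothetical per-class running-sum meter used only for this illustration."""

    def __init__(self):
        self.sum = 0    # stays a plain int until the first update
        self.count = 0

    def update(self, per_class_counts):
        if self.count == 0:
            self.sum = [0] * len(per_class_counts)
        self.sum = [s + c for s, c in zip(self.sum, per_class_counts)]
        self.count += 1


def overall_accuracy(intersection_meter, target_meter):
    """Guarded version of the failing line: fail early with a clearer message."""
    if intersection_meter.count == 0 or target_meter.count == 0:
        raise RuntimeError("no batches were processed; check data lists and dataset paths")
    return sum(intersection_meter.sum) / (sum(target_meter.sum) + 1e-10)


empty = AverageMeter()
try:
    sum(empty.sum)  # same failure mode as in the traceback
except TypeError as err:
    print("reproduced:", err)

inter, target = AverageMeter(), AverageMeter()
inter.update([8, 2])
target.update([10, 10])
print(overall_accuracy(inter, target))
```

If this is the cause, printing len(train_loader) before the training loop would show zero batches.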

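Question about label_b and label

Hello, thank you very much for open-sourcing such outstanding work.
While running the code I ran into a problem: I do not know what format the label_b read in dataset.py should have. How does it differ from the ordinary label? Should this label be one-hot encoded, and if so, how can it be saved as an image?
I look forward to your reply!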

The DATA List -----Done

Can I get the lists from a public drive such as Google Drive? I cannot access them at the moment.
Thanks

Train on my own dataset

Hello. Thank you so much for sharing your interesting work. I want to use this code for soil erosion binary segmentation. My segmentation task doesn't have base classes and just has 0 and 1 (erosion and non-erosion). Now I have two questions. 1: Is this code able to do this segmentation? 2: If yes, should I use just one split?
Thank you in advance.
Hadi

Training logs and model weights.

Hi, I'm very interested in your work.
Would it be convenient for you to share the model weights or training logs on Pascal dataset?
