patnet's Issues

Detail of TFI

Hello, I'm interested in your work, but I have two questions:

  1. The article does not specify the number of fine-tuning iterations for TFI.
  2. I noticed that in finetune_reference only one KL variable is calculated, but this KL variable is not connected to the weights of self.reference_layer. How can kl_loss be used to update reference_layer? (See the sketch below for what I mean.)

Could you please provide more details? Thank you so much!
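For reference, a minimal sketch of how a KL loss can drive updates of a reference layer through an optimizer; the shapes, the nn.Linear stand-in, and the target distribution are illustrative assumptions, not code from the repository:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    reference_layer = nn.Linear(512, 512)   # stand-in for self.reference_layer
    optimizer = torch.optim.Adam(reference_layer.parameters(), lr=1e-3)

    feats = torch.randn(8, 512)                           # illustrative support features
    target = torch.softmax(torch.randn(8, 512), dim=-1)   # illustrative target distribution

    for _ in range(50):   # the number of TFI iterations is exactly the open question
        log_pred = F.log_softmax(reference_layer(feats), dim=-1)
        kl_loss = F.kl_div(log_pred, target, reduction='batchmean')
        optimizer.zero_grad()
        kl_loss.backward()   # gradients reach reference_layer only because the
        optimizer.step()     # KL term depends on its output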

Question about the TAFT module

Thanks for your work!
A regression loss is used in TAFT to train the transformation matrix, but there is no such loss in your work.
Does the PATM module not need to be trained? Why not?

Could you please provide example code and a launch command to few-shot fine-tune the model?

Hello authors,

Could you please provide example code and a launch command to few-shot fine-tune the model?
As I understand it, the correct way to use your code is:

  1. run train.py on PASCAL;
  2. fine-tune on the few-shot support set (is the code for this step missing? I believe it means calling the finetune_reference() function at Line 139 in patnet.py; see the sketch below);
  3. run test.py on the query set of the target domain.

Am I correct?
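A sketch of what the missing step 2 might look like per test episode; finetune_reference exists in patnet.py, but the batch contents and the predict_mask name are illustrative assumptions:

    # hypothetical step 2, run once per episode before evaluating the query
    for batch in dataloader_test:                      # batch carries support + query
        model.module.finetune_reference(batch)         # adapt on the support set
        pred_mask = model.module.predict_mask(batch)   # then predict on the query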

Thank you.

Could you provide the processed DeepGlobe dataset?

According to the preprocess.py you provided, the number of processed images is 9,175 instead of 5,666.
I don't know why that is. To facilitate comparison with your work, I hope you can provide the processed dataset.

Question about "6.3 Implementation Details"

Thanks for your work!
In Section 6.3, the paper states: "An Adam optimizer is used to fine-tune PATM, with a learning rate of 1e-3 for Deepglobe and ISIC, 5e-5 for Chest X-ray and FSS-1000."

Does the "cross-domain" result reported in the paper come after fine-tuning on the other dataset? That is, you do not directly use the weights trained on PASCAL VOC on each dataset of the proposed benchmark?

Question about FSS-1000

File "..\PATNet-master\data\fss.py", line 83, in sample_episode
class_sample = self.categories.index(query_name.split('/')[-2])
IndexError: list index out of range

Thanks for your work!
After preparing the validation data, I ran train.py and got the error above. I'm not sure where things went wrong and would appreciate your help.
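The failing line assumes the second-to-last path component of query_name is the category name, so the error suggests the on-disk layout differs from what fss.py expects. A tiny illustration of the assumption it makes (the example path is hypothetical):

    # fss.py expects paths like '.../FSS-1000/<category>/<image>.jpg',
    # so that this recovers the category folder:
    query_name = 'FSS-1000/ab_wheel/1.jpg'   # illustrative path
    category = query_name.split('/')[-2]     # -> 'ab_wheel'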

Some details of the paper

Thanks for your excellent work!
Where can I find the supplementary material?
What should "--niter" be on the PASCAL VOC dataset? The default in the code is 2000.
Can the one-shot segmentation results be visualized?
Thanks!

Question about PATM

Thanks for your work!
In Section 5.1, you say: "Since the domain-agnostic metric space is constant, it will be much easier for the downstream segmentation modules to make predictions in such a stable space."
I wonder why selecting such a stable space is useful.
I noticed that the TAFT module uses the same idea; however, TAFT also does not explain why it works.

An error occurs when running on the source data

Excellent work! I hit an error when training the model on the source data; the log looks like this:

Backbone # param.: 23561205
Learnable # param.: 2580968
Total # param.: 26142173

available GPUs: 1

Total (trn) images are : 13680
Total (val) images are : 0
[Epoch: 00] [Batch: 0001/0684] L: 0.68586 Avg L: 0.68586 mIoU: 0.04 | FB-IoU: 37.67
[Epoch: 00] [Batch: 0051/0684] L: 0.40346 Avg L: 0.52482 mIoU: 0.00 | FB-IoU: 38.23
[Epoch: 00] [Batch: 0101/0684] L: 0.37764 Avg L: 0.46712 mIoU: 12.09 | FB-IoU: 45.07
[Epoch: 00] [Batch: 0151/0684] L: 0.46177 Avg L: 0.44117 mIoU: 21.00 | FB-IoU: 50.16
[Epoch: 00] [Batch: 0201/0684] L: 0.46565 Avg L: 0.42391 mIoU: 25.59 | FB-IoU: 53.12
[Epoch: 00] [Batch: 0251/0684] L: 0.39822 Avg L: 0.41749 mIoU: 28.23 | FB-IoU: 54.64
[Epoch: 00] [Batch: 0301/0684] L: 0.30663 Avg L: 0.41056 mIoU: 30.69 | FB-IoU: 55.98
[Epoch: 00] [Batch: 0351/0684] L: 0.31297 Avg L: 0.39998 mIoU: 32.92 | FB-IoU: 57.27
[Epoch: 00] [Batch: 0401/0684] L: 0.40371 Avg L: 0.39329 mIoU: 34.65 | FB-IoU: 58.27
[Epoch: 00] [Batch: 0451/0684] L: 0.30678 Avg L: 0.38822 mIoU: 35.85 | FB-IoU: 59.02
[Epoch: 00] [Batch: 0501/0684] L: 0.33376 Avg L: 0.38439 mIoU: 36.85 | FB-IoU: 59.63
[Epoch: 00] [Batch: 0551/0684] L: 0.49764 Avg L: 0.38038 mIoU: 37.92 | FB-IoU: 60.25
[Epoch: 00] [Batch: 0601/0684] L: 0.37989 Avg L: 0.37663 mIoU: 38.79 | FB-IoU: 60.69
[Epoch: 00] [Batch: 0651/0684] L: 0.34453 Avg L: 0.37386 mIoU: 39.91 | FB-IoU: 61.22

*** Training [@epoch 00] Avg L: 0.37309 mIoU: 40.20 FB-IoU: 61.40 ***

Traceback (most recent call last):
File "train.py", line 97, in <module>
val_loss, val_miou, val_fb_iou = train(epoch, model, dataloader_val, optimizer, training=False)
File "train.py", line 27, in train
for idx, batch in enumerate(dataloader):
File "/home/jifanfan/anaconda3/envs/torch/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
data = self._next_data()
File "/home/jifanfan/anaconda3/envs/torch/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 475, in _next_data
data = self._dataset_fetcher.fetch(index) # may raise StopIteration
File "/home/jifanfan/anaconda3/envs/torch/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/jifanfan/anaconda3/envs/torch/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/jifanfan/PATNet/data/pascal.py", line 32, in __getitem__
idx %= len(self.img_metadata) # for testing, as n_images < 1000
ZeroDivisionError: integer division or modulo by zero

I think the reason is that the validation data cannot be loaded, and the following code leads to this error (since self.fold for the validation data is 4):

    img_metadata = []
    if self.split == 'trn':  # For training, read image-metadata of "the other" folds
        for fold_id in range(self.nfolds):
            if fold_id == self.fold:  # Skip validation fold
                continue
            img_metadata += read_metadata(self.split, fold_id)
    elif self.split == 'val':  # For validation, read image-metadata of "current" fold
        if self.fold != 4:
            img_metadata = read_metadata(self.split, self.fold)
    else:
        raise Exception('Undefined split %s: ' % self.split)

I use PyTorch 1.7 + CUDA 11.0.
Thank you for your time.
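A possible workaround (my assumption, not the authors' fix) is to skip the validation pass whenever the validation split is empty, as it is for fold 4:

    # in train.py, guard the validation call against an empty val split
    if len(dataloader_val.dataset) > 0:
        val_loss, val_miou, val_fb_iou = train(epoch, model, dataloader_val,
                                               optimizer, training=False)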

Question about ISIC dataset

Sorry to bother you. The raw ISIC dataset is just images with serial numbers, but it seems you have categorized the ISIC dataset into three folders named 1, 2, and 3. Could you please provide the related code? Thanks!

Dataset Structure of different dataset

Thank you for your work and continued support.

How should I structure my own datasets so they work with minimal code changes? Could you please explain?

No corresponding papers can be found

Dear author, thank you very much for providing the code for your paper. The title of your paper is very exciting and attractive to me.

However, I cannot find your paper anywhere online. I hope you can provide the paper itself or a link to it.

Thank you very much.

supplementary material

Hello author, could you please provide me with the address of the supplementary material for the paper? Thank you very much!

Wrong initialization for `dataloader_val` in `train.py`

This line initializes dataloader_val.

As I understand it, train.py is for training on the Pascal dataset, so we should pass the first argument of build_dataloader as args.benchmark and the 4th argument as 0 (an int zero instead of the string '0')?

My configuration for running train.py is:

    dataloader_val = FSSDataset.build_dataloader(args.benchmark, args.bsz, args.nworker, 0, 'val')  # shouldn't be 'fss'?

Is my configuration correct?

By the way, is it reasonable to set the number of validation images to 346? Isn't that too small to validate the model? Is there other reference material we could consult?

Thanks for your reply.

Argument "--niter"

Excuse me, is "niter" set to 2000 in your experiments? That can take a long time to train.

    parser.add_argument('--niter', type=int, default=2000)

Code and paper seem inconsistent

Thanks for releasing the code!
I found that the implementation of the transformation matrix and the paper's description seem to be inconsistent.
In your code, the matrix is implemented as:

    P = torch.matmul(torch.pinverse(C), R)

but in your paper, the matrix is calculated as P = R C^+ (the original issue shows the paper's equation as a screenshot).

Accordingly, the code should be:

    P = torch.matmul(R, torch.pinverse(C))

It really confuses me. Looking forward to your reply!
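A quick shape check (with arbitrary illustrative dimensions) shows the two orderings are genuinely different operations, not just a notational difference:

    import torch

    C = torch.randn(5, 3)                         # illustrative shapes only
    R = torch.randn(5, 3)
    P_code = torch.matmul(torch.pinverse(C), R)   # C^+ R -> shape (3, 3)
    P_paper = torch.matmul(R, torch.pinverse(C))  # R C^+ -> shape (5, 5)
    print(P_code.shape, P_paper.shape)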

A detail about training and testing

Do you use a single trained model and test it on all four target datasets, or do you train a separate model for each target dataset?
Thanks.

some detail about finetuning

I wonder how M_hat is obtained (the original issue shows the relevant equation as a screenshot). The decoder outputs a feature map of size [b, 2, h, w], but M_hat in the paper seems to have size [b, 1, h, w].
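One common way to bridge that gap, offered purely as a guess at the intent (the paper does not confirm it): take the foreground channel of the softmaxed logits.

    import torch

    logits = torch.randn(2, 2, 64, 64)   # decoder output: [b, 2, h, w]
    probs = logits.softmax(dim=1)        # per-pixel class probabilities
    m_hat = probs[:, 1:2, :, :]          # foreground probability: [b, 1, h, w]
    print(m_hat.shape)                   # torch.Size([2, 1, 64, 64])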

About deepglobe and isic dataset

Can you provide more detailed guidance on how to properly load deepglobe and isic datasets?

  1. isic
     Is it correct to classify the ISIC training images into three categories according to class_id.csv?
  2. deepglobe
     After running preprocess.py, I obtain the chunked and filtered images and labels.

How do I use your code to load the processed data?
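Based on the paths referenced in deepglobe.py (os.path.join(self.base_path, cat, 'test', 'origin') and the corresponding 'groundtruth' path), the loader appears to expect a layout like the following; this is inferred from the code, not documented:

    # on-disk layout inferred from deepglobe.py (an assumption):
    #
    # <datapath>/deepglobe/
    #     <category>/            # one folder per category
    #         test/
    #             origin/        # image chunks
    #             groundtruth/   # mask chunks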

Question about deepglobe dataset

Thanks for your work!
I have some questions about the deepglobe dataset:

  1. You got 5,666 deepglobe images after filtering the single-class images and the 'unknown' class, as stated in your paper, but I got 9,174 images following preprocess.py. Why is that? Will it affect the results?
  2. It seems that you cut each image into 36 pieces, not 6 pieces.
  3. It seems that preprocess.py is incomplete. When testing the model, we need data paths like ann_path = os.path.join(self.base_path, query_name.split('/')[-4], 'test', 'groundtruth') or img_paths = sorted([path for path in glob.glob('%s/*' % os.path.join(self.base_path, cat, 'test', 'origin'))]), but preprocess.py creates no paths like 'test', 'origin', or 'groundtruth'. What should I do here?

Thanks!

Question about dataset

Hi, sorry to bother you.
I have some questions about the deepglobe and isic datasets. I downloaded and processed them, but it didn't work.
The datasets seem to be classified into various folders, but I don't know how to classify them.
Could you share the deepglobe and isic datasets exactly as you used them in the paper? Thank you!

Best regards

How to freeze randomness during testing for reproducibility

Hello, I have a question. When testing on the four benchmarks, how can I freeze randomness for reproducibility? Should I set the parameter shuffle=False? And how did you split the test data for these four benchmark datasets?
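The repo calls utils.fix_randseed in test.py; a minimal sketch of what such a helper typically does, written from scratch here as an assumption rather than copied from the repository:

    import random

    import numpy as np
    import torch

    def fix_randseed(seed=0):
        # freeze all common RNG sources for reproducible evaluation
        random.seed(seed)
        np.random.seed(seed)
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False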

PIL.UnidentifiedImageError: cannot identify image file

Hello,
I am trying to run your code but I get this error:

  File "/usr/local/lib/python3.9/dist-packages/PIL/Image.py", line 3030, in open
    raise UnidentifiedImageError(
PIL.UnidentifiedImageError: cannot identify image file '../VOCdevkit/VOC2012/JPEGImages/2010_004045.jpg'

  1. Is there any problem with the location of the "VOCdevkit" folder? I put it in the same folder where I cloned your repository.
  2. Is there a pretrained model available?
  3. Could you please explain how to run your code with a different training dataset?

Thank you very much!

An error occurred when running train.py.

Great job! Could you please help me with this?

Backbone # param.: 23561205
Learnable # param.: 2580968
Total # param.: 26142173

available GPUs: 8

Total (trn) images are : 13680
[Epoch: 00] [Batch: 0001/0684] L: 0.68105 Avg L: 0.68105 mIoU: 0.00 | FB-IoU: 33.25
[Epoch: 00] [Batch: 0051/0684] L: 0.43134 Avg L: 0.49445 mIoU: 0.00 | FB-IoU: 37.79
[Epoch: 00] [Batch: 0101/0684] L: 0.35394 Avg L: 0.45320 mIoU: 6.28 | FB-IoU: 41.73
[Epoch: 00] [Batch: 0151/0684] L: 0.34036 Avg L: 0.43587 mIoU: 18.09 | FB-IoU: 48.10
[Epoch: 00] [Batch: 0201/0684] L: 0.29183 Avg L: 0.42005 mIoU: 24.74 | FB-IoU: 51.72
[Epoch: 00] [Batch: 0251/0684] L: 0.36972 Avg L: 0.40563 mIoU: 28.04 | FB-IoU: 53.82
[Epoch: 00] [Batch: 0301/0684] L: 0.37294 Avg L: 0.39855 mIoU: 30.98 | FB-IoU: 55.46
[Epoch: 00] [Batch: 0351/0684] L: 0.29182 Avg L: 0.39139 mIoU: 33.15 | FB-IoU: 56.81
[Epoch: 00] [Batch: 0401/0684] L: 0.43611 Avg L: 0.38447 mIoU: 34.84 | FB-IoU: 57.90
[Epoch: 00] [Batch: 0451/0684] L: 0.36160 Avg L: 0.37885 mIoU: 36.57 | FB-IoU: 58.90
[Epoch: 00] [Batch: 0501/0684] L: 0.32845 Avg L: 0.37381 mIoU: 37.72 | FB-IoU: 59.63
[Epoch: 00] [Batch: 0551/0684] L: 0.39814 Avg L: 0.37052 mIoU: 38.67 | FB-IoU: 60.23
[Epoch: 00] [Batch: 0601/0684] L: 0.27711 Avg L: 0.36641 mIoU: 39.54 | FB-IoU: 60.77
[Epoch: 00] [Batch: 0651/0684] L: 0.39666 Avg L: 0.36396 mIoU: 39.98 | FB-IoU: 61.05

*** Training [@epoch 00] Avg L: 0.36293 mIoU: 40.47 FB-IoU: 61.37 ***

Traceback (most recent call last):
File "/home/liuxuanchen/Develop/cross-domain/PATNet/train.py", line 105, in <module>
val_loss, val_miou, val_fb_iou = train(epoch, model, dataloader_val, optimizer, training=False)
File "/home/liuxuanchen/Develop/cross-domain/PATNet/train.py", line 47, in train
average_meter.write_result('Training' if training else 'Validation', epoch)
File "/home/liuxuanchen/Develop/cross-domain/PATNet/common/logger.py", line 61, in write_result
loss_buf = torch.stack(self.loss_buf)
RuntimeError: stack expects a non-empty TensorList
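This looks like the same root cause as the earlier ZeroDivisionError report: the validation split for fold 4 is empty, so nothing is accumulated and torch.stack receives an empty list. A possible guard (my assumption, not the authors' fix):

    # in common/logger.py, at the top of write_result: bail out when nothing
    # was accumulated (e.g. the validation split is empty)
    if len(self.loss_buf) == 0:
        return
    loss_buf = torch.stack(self.loss_buf)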

The performance of PATNet without test-time fine-tuning

I used the code and dataset you provided to reproduce PATNet. Without fine-tuning, the gap on the DeepGlobe dataset is fairly large. Below is what I reproduced; inference was averaged over 5 different random seeds.
The performance on the other datasets is quite similar, but deepglobe differs substantially. I wonder if there is an error in the dataset provided?
[result screenshots for deepglobe, fss-1000, isic, and chest x-ray]
For comparison, this is the ablation on test-time fine-tuning from your paper:
[screenshot of the paper's ablation table]

Can't find Task-adaptive Fine-tuning Inference (TFI)

Hello, Author.

When reproducing the fine-tuning stage, I could not find TFI in the code you provided, but I did find the finetune_reference function defined in patnet.py.

So I tried to add it to test.py, but it didn't work. I think this may be because pred_mask cannot require gradients.
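If test.py wraps evaluation in torch.no_grad(), that would explain the gradient problem; a sketch of how the calls might be wrapped (predict_mask is an illustrative name; only finetune_reference is from the repo):

    # re-enable gradients just for the adaptation step
    with torch.enable_grad():
        model.module.finetune_reference(batch)        # gradient-based TFI works here
    with torch.no_grad():
        pred_mask = model.module.predict_mask(batch)  # inference stays gradient-free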

I hope you can answer my questions or add the complete code, thank you!

Question about random seeds

Thank you very much for open-sourcing your work!

In the paper you write: "For each evaluation, we average the mean-IoU of 5 runs [44] with different random seeds."
But test.py fixes the seed with utils.fix_randseed(0).
How were the 5 results obtained? Do you change utils.fix_randseed(0) to utils.fix_randseed(None) and then run test.py 5 times?
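One plausible reading, shown as a sketch (run_test is an illustrative stand-in for the evaluation entry point):

    # average mean-IoU over 5 explicit seeds instead of one fixed seed
    mious = []
    for seed in range(5):
        utils.fix_randseed(seed)   # the repo's own seeding helper
        mious.append(run_test())   # run_test: illustrative evaluation call
    print(sum(mious) / len(mious))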

Problem running the program on GPUs other than cuda:0

When I try to run the program on a GPU other than cuda:0, for example cuda:4, it reports an error: tensors were found on both cuda:0 and cuda:4.
[screenshots of the settings and the resulting error]
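A common workaround for this class of error, offered as a guess (the usual cause is a tensor created on the default device somewhere in the code): restrict device visibility so the chosen GPU becomes cuda:0.

    import os

    # must run before any CUDA call; makes physical GPU 4 appear as cuda:0
    os.environ['CUDA_VISIBLE_DEVICES'] = '4'
    # equivalently, from the shell: CUDA_VISIBLE_DEVICES=4 python test.py ...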

Preprocessing of Deepglobe Dataset

Hi Shuo,
It seems that you classified images in Deepglobe into six category subfolders.
Could you please release the code for that step?
Thank you very much.

About the transformation

I find that using the transformation you proposed makes model convergence much slower.
I wonder why, and whether you see any way to accelerate convergence when integrating the transformation module into an FSS model.

Preprocessing of Deepglobe

Dear Author!
Thank you for providing the code, but I encountered some problems reproducing your article.

Regarding the preprocessing of the Deepglobe dataset described in the paper ("to increase the number of testing images and reduce the size of images, we cut each image into 6 pieces. As the categories labeled in this dataset have no regular shape, the cutting operation has little effect on the segmentation. After filtering the single class images and the 'unknown' class, we get 5,666 images to report the results and each image has 408 × 408 pixels"): I cannot reproduce it.

I hope you can supplement the code; it would help me learn a lot from your article.
Thank you!

Question about Deepglobe Dataset

Thanks for your work!
It seems that both images and masks are expected to be divided into 6 categories in deepglobe.py, e.g.:

    img_paths = sorted([path for path in glob.glob('%s/*' % os.path.join(self.base_path, cat, 'test', 'origin'))])  # for images
    ann_path = os.path.join(self.base_path, query_name.split('/')[-4], 'test', 'groundtruth')  # for masks

But only the masks are divided into 6 categories by preprocess.py.
How can I divide the images into the corresponding categories?
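One hypothetical way to mirror the mask categorization for the images, assuming each image chunk shares its filename stem with its mask chunk (the directories and the .jpg/.png naming are assumptions, not the authors' convention):

    import glob
    import os
    import shutil

    base = 'datasets/deepglobe'                  # illustrative paths
    raw_images_dir = 'datasets/deepglobe_chunks'

    for cat in sorted(os.listdir(base)):
        gt_dir = os.path.join(base, cat, 'test', 'groundtruth')
        img_dir = os.path.join(base, cat, 'test', 'origin')
        os.makedirs(img_dir, exist_ok=True)
        for mask_path in glob.glob(os.path.join(gt_dir, '*')):
            stem = os.path.splitext(os.path.basename(mask_path))[0]
            src = os.path.join(raw_images_dir, stem + '.jpg')   # assumed naming
            if os.path.exists(src):
                shutil.copy(src, os.path.join(img_dir, stem + '.jpg'))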

Question about the training

Thanks for your work!
I have a question I would like you to answer.

Training configuration:
  8 x 2080 Ti GPUs
  batch size = 32
  nworker = 12 (0, 2, and 8 were also tested)

When I train HSNet, it is fast: GPU utilization is high (about 50%) and an epoch completes in a few minutes.
But training is particularly slow when I run your code with the same configuration: GPU utilization fluctuates a lot (from 1% to 50%), and it takes 90 minutes to train one epoch!
Moreover, even when training your network on a single 2080 Ti GPU, GPU utilization fluctuates greatly.
What could be the reason for this?
Thanks!

A few questions about fine-tuning

Thanks for your work!
Where does the query image used for fine-tuning come from? Also, for all the tested datasets, do you use the Pascal dataset as the training set and FSS as the validation set?

KL loss function at the testing stage

Hi, I have one more question, about the KL loss function at the testing stage. In your paper there is an unfrozen stage that uses a KL loss between the support and query test data, but in your code (test.py) loss=None. So where is the KL loss function imported and used? Thanks in advance for your help.

about training

Thanks for your work!
I'm getting the following reported error while training:

Traceback (most recent call last):
File "..\PATNet-master\train.py", line 119, in <module>
val_loss, val_miou, val_fb_iou = train(epoch, model, dataloader_val, optimizer, training=False)
File "..\PATNet-master\train.py", line 55, in train
average_meter.write_result('Training' if training else 'Validation', epoch)
File "..\PATNet-master\common\logger.py", line 77, in write_result
loss_buf = torch.stack(self.loss_buf)
RuntimeError: stack expects a non-empty TensorList

I have no idea why this happens at the end of training an epoch, when training should continue to the next epoch.
