
FCBFormer's People

Contributors: cvml-uclan, esandml


FCBFormer's Issues

Couldn't reproduce the results

Hi.

I found your work very interesting. Medical image segmentation is a new topic for me, and I am planning to start some work in this area.

I tried to reproduce the results for both datasets, for example training on Kvasir-SEG and testing on Kvasir-SEG using the train/test split you provided. However, I failed to reproduce the results reported in the paper. I tried running with and without pre-trained weights, and I re-trained twice, but the results I obtain are still 2-3% lower than those reported in the paper. I was using torch 1.10.0 and CUDA 11.3.

Could you let me know about any special tricks, library versions, or strategies I should follow to reproduce the results?

Thank you.
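For reference, one common source of run-to-run variance is unseeded randomness and non-deterministic cuDNN kernels. A minimal sketch of the usual PyTorch mitigations (general practice, not confirmed to be what this repo uses):

```python
import random

import numpy as np
import torch

def set_seed(seed: int = 42) -> None:
    """Seed every RNG that commonly affects a PyTorch training run."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Deterministic cuDNN kernels trade speed for reproducibility;
    # without these flags, GPU runs are not bit-for-bit repeatable.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

set_seed(42)
```

Even with seeding, a 2-3% gap usually points elsewhere (pre-trained weights, input resolution, or evaluation protocol), so confirming the exact library versions with the authors is still worthwhile.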

dataset label question

In the dataset you mention, the mask pixel values are greater than 1, even though there is only one class.

I thought the mask image pixels would only take the values 0 and 1.

Is this expected? Could you explain?
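For what it's worth, binary masks are often stored as 8-bit grayscale images with foreground = 255 (and resizing can introduce intermediate values), so a common fix is to threshold at load time. A minimal sketch, assuming PIL-style loading and the 0/255 convention (neither is confirmed from this repo):

```python
import numpy as np
import torch
from PIL import Image

def load_binary_mask(path: str) -> torch.Tensor:
    """Load a grayscale mask and threshold it to {0, 1}."""
    mask = np.array(Image.open(path).convert("L"), dtype=np.float32)
    # Treat any bright pixel as foreground; this also cleans up
    # intermediate values introduced by interpolation during resizing.
    mask = (mask > 127).astype(np.float32)
    return torch.from_numpy(mask).unsqueeze(0)  # shape (1, H, W)
```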

dataloader

When using the CVC dataset, a TypeError: Input image tensor permitted channel values are [3], but found 1 error occurs, likely because different datasets require different dataloaders. Could you upload the dataloader for the CVC dataset?
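Until a CVC-specific dataloader is available, one workaround is to force three channels when loading, since images that decode as single-channel will trip a transform that expects RGB. A minimal sketch, assuming PIL loading and a torchvision pipeline (the 352x352 size is an assumption; match it to the repo's config):

```python
from PIL import Image
import torchvision.transforms as T

transform = T.Compose([
    T.Resize((352, 352)),  # assumed input size; use the repo's setting
    T.ToTensor(),
])

def load_rgb(path: str):
    # convert("RGB") replicates a single grayscale channel into three,
    # avoiding "permitted channel values are [3], but found 1".
    return transform(Image.open(path).convert("RGB"))
```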

Loss function

Hi.
Regarding the Dice loss: I understand that DiceLoss = 1 - DiceScore, but the DiceScore in your code (Figure 2) doesn't match the one proposed in V-Net (Figure 1): m1 and m2 are not squared in the denominator. I reckon m2 may be left unsquared because its values are either 0 or 1 (or extremely close, up to floating-point error), so squaring doesn't change them. But m1 is a predicted probability map with values in [0, 1], so squaring DOES change the probabilities to some extent, and the derivative may change significantly as well. So why aren't m1 and m2 (or at least m1) squared?

[Figure 1: the Dice loss proposed in V-Net]

[Figure 2: the code implementation; the highlighted line doesn't match the denominator in V-Net's Dice loss]
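To make the comparison concrete, here are the two formulations side by side as a generic PyTorch sketch (not the repository's exact code):

```python
import torch

def dice_loss_vnet(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """V-Net formulation: m1 and m2 are squared in the denominator."""
    m1, m2 = pred.flatten(1), target.flatten(1)
    inter = (m1 * m2).sum(dim=1)
    denom = (m1 ** 2).sum(dim=1) + (m2 ** 2).sum(dim=1)
    return 1 - (2 * inter + eps) / (denom + eps)

def dice_loss_unsquared(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Unsquared variant: plain sums in the denominator."""
    m1, m2 = pred.flatten(1), target.flatten(1)
    inter = (m1 * m2).sum(dim=1)
    denom = m1.sum(dim=1) + m2.sum(dim=1)
    return 1 - (2 * inter + eps) / (denom + eps)
```

For hard {0, 1} values the two agree, but for soft probabilities in (0, 1) the denominators, and hence the gradients with respect to m1, differ, which is exactly the point raised above.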

PVTv2: The conflict between the paper and the code

Hi, I have searched and found no relevant results.
I printed the architecture of the TB class (Models/models.py) and found that in the get_pyramid() method, the pyramid list consists of only 3 feature maps, F1, F2, F3 (the append happens 3 times), instead of the 4 feature maps proposed in your paper. It also seems that self.backbone[10] is, I guess, the missing F4. As a result, there are only 3 emphasized feature maps after the LE module. This affects the forward() method: at the first concatenation step of the SFA, instead of concatenating F4_emph and F3_emph as in the paper, F3_emph is concatenated with itself, which seems odd.
Is my understanding correct? If so, why isn't F4 used?
Thank you very much!

[Image: printed architecture of TB]

[Image: forward() method]

PS: I have read whai362's PVT repo, and in the PyramidVisionTransformer class, F4 is not used by default either, but I don't understand why.
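One quick way to check this empirically is to print what get_pyramid() returns. A sketch, assuming the class and method referenced above are importable as shown (constructor arguments and input size are guesses, so adjust to the repo):

```python
import torch
from Models.models import TB  # import path as referenced in this issue

tb = TB().eval()
x = torch.randn(1, 3, 352, 352)  # assumed input size
with torch.no_grad():
    pyramid = tb.get_pyramid(x)

print(f"{len(pyramid)} feature maps returned")
for i, fmap in enumerate(pyramid, start=1):
    print(f"F{i}: {tuple(fmap.shape)}")  # 3 entries would confirm F4 is dropped
```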

Error: C^

Search before asking: I have searched and found no similar bug report.

Hi guys! Thanks for the great project.
When I followed this guide to train, I got an error:

[Image: error traceback]

I have no idea what is causing this error. Please help me.

Encoder experiments

Hello!
I have read some of the relevant papers cited in your references, one of which is Stepwise Feature Fusion: Local Guides Global, the inspiration for PLD+. In that paper, a number of encoder-decoder pairs are evaluated, and as Table 4 (in the image below) shows, PLD performs best with the MiT encoder (MiT is the SegFormer encoder). So I wondered: given that PLD was improved into PLD+, have you tried the MiT-B3 encoder (which has a parameter count similar to PVTv2-B3, at 45.2M), and has it been a good match for PLD+?

[Image: Table 4 from Stepwise Feature Fusion: Local Guides Global]

Thank you very much!
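As a side note, the parameter comparison is easy to reproduce. A sketch using timm (the model name pvt_v2_b3 is assumed to exist in your timm version; SegFormer's MiT encoders are typically distributed through other libraries, so they are not queried here):

```python
import timm

def params_in_millions(model) -> float:
    """Total parameter count, in millions."""
    return sum(p.numel() for p in model.parameters()) / 1e6

# Assumed timm model name; compare against MiT-B3's reported 45.2M.
encoder = timm.create_model("pvt_v2_b3", pretrained=False)
print(f"pvt_v2_b3: {params_in_millions(encoder):.1f}M parameters")
```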
