Comments (5)
The numbers I get after training with the Swin-L backbone and a crop size of 640 x 640:
mIoU,fwIoU,mACC,pACC
50.8082,72.8344,64.6854,83.2033
PQ,SQ,RQ,PQ_th,SQ_th,RQ_th,PQ_st,SQ_st,RQ_st
45.6353,81.1724,54.4800,45.3905,82.1440,54.6043,46.1248,79.2292,54.2314
AP,AP50,AP75,APs,APm,APl (Task: segm)
31.3344,49.2562,32.7532,12.5777,34.9892,49.0925
AP,AP50,AP75,APs,APm,APl (Task: bbox)
0.0000,0.0000,0.0000,0.0000,0.0000,0.0000
In this repo and the paper, I see that you report the following numbers:
PQ,AP,mIoU
49.8,35.9,57.0
My question is: how can one close the gap with the reported numbers? Specifically, is the mIoU reported at the end of training the same as the single-scale (s.s.) mIoU listed on GitHub?
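For reference, this is how I line the rows above up against the reported numbers (a quick parsing sketch, nothing repo-specific; the "reported" values are copied from the README/paper):

```python
# Quick sketch: turn the pasted "header row / value row" pairs into a dict
# so the gap to the reported numbers is easy to see.
def parse_pairs(text):
    lines = [l.strip() for l in text.strip().splitlines() if l.strip()]
    results = {}
    for header, values in zip(lines[::2], lines[1::2]):
        keys = [k.strip() for k in header.split(",")]
        vals = [float(v) for v in values.split(",")]
        results.update(dict(zip(keys, vals)))
    return results

mine = parse_pairs("""
mIoU,fwIoU,mACC,pACC
50.8082,72.8344,64.6854,83.2033
PQ,SQ,RQ
45.6353,81.1724,54.4800
AP,AP50,AP75
31.3344,49.2562,32.7532
""")
reported = {"PQ": 49.8, "AP": 35.9, "mIoU": 57.0}  # from the README/paper
for k, v in reported.items():
    if k in mine:
        print(f"{k}: mine={mine[k]:.1f} reported={v} gap={v - mine[k]:+.1f}")
```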
I hope the authors take the issue of reproducibility seriously. I would like to ask for the release of the training log for this experiment, as well as for the Cityscapes experiment with the Swin-L backbone.
P.S.: My environment and all dependencies exactly follow what you recommend.
Hi @achen46, thank you for your interest in our work.
Please share your logs and exact details of your environment (GPU architecture and model, CUDA toolkit version, PyTorch, Torchvision, Detectron2, and NATTEN versions + their compiled CUDA versions) so we can help you. That is the first piece of information any issue on an open-source repository requires. Simply stating that it "does not work" while "exactly following the instructions" does not help.
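For reference, most of this can be dumped with detectron2's environment collector; a minimal sketch (assumes detectron2 is installed; the NATTEN import is optional and may be absent in Swin-only setups):

```python
# Collect the environment details requested above.
import torch, torchvision
from detectron2.utils.collect_env import collect_env_info

print(collect_env_info())  # GPU model/arch, CUDA, PyTorch, detectron2 versions, etc.
print("torch:", torch.__version__, "| compiled CUDA:", torch.version.cuda)
print("torchvision:", torchvision.__version__)
try:
    import natten  # optional; only needed for the DiNAT backbones
    print("natten:", natten.__version__)
except ImportError:
    print("natten: not installed")
```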
We ran an experiment with a fresh clone of the same code (this GitHub repo) that you are having issues with and got the following numbers: PQ: 50.5, AP: 36.2, mIoU (s.s./m.s.): 56.6/57.6 (trained yesterday, on 03/05/2023). These results are in line with, and on PQ and AP better than, the numbers reported in our CVPR paper: PQ: 49.8, AP: 35.9, mIoU (s.s./m.s.): 57.0/57.7 (trained 7 months ago, on 08/14/2022), for which we ran only three runs and reported the best numbers.
You can find the WandB logs for the original and reproduced runs here: WandB logs. We also share the training log with step-wise loss values and our environment setup details for your reference, to help with your experiments.
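If you want to compare your run against ours step by step, here is a minimal sketch assuming detectron2's default JSONWriter output (a metrics.json with one JSON object per logged iteration; the paths below are placeholders):

```python
# Compare step-wise training loss between two runs using metrics.json
# (one JSON object per logged iteration; paths are placeholders).
import json

def load_loss(path, key="total_loss"):
    curve = {}
    with open(path) as f:
        for line in f:
            rec = json.loads(line)
            if key in rec and "iteration" in rec:
                curve[rec["iteration"]] = rec[key]
    return curve

ours = load_loss("reference_run/metrics.json")
yours = load_loss("your_run/metrics.json")
for it in sorted(set(ours) & set(yours)):
    print(f"iter {it:>7d}: ours={ours[it]:.3f} yours={yours[it]:.3f}")
```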
Thanks for providing the logs. Regarding the questions you raised, my environment is as follows:
sys.platform linux
Python 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:10) [GCC 10.3.0]
numpy 1.22.2
detectron2 0.6 @/opt/conda/lib/python3.8/site-packages/detectron2
Compiler GCC 9.4
CUDA compiler not available
DETECTRON2_ENV_MODULE
PyTorch 1.13.0a0+d0d6b1f @/opt/conda/lib/python3.8/site-packages/torch
PyTorch debug build False
GPU available Yes
GPU 0,1,2,3,4,5,6,7 NVIDIA A100-SXM4-40GB (arch=8.0)
Driver version 515.65.01
CUDA_HOME /usr/local/cuda
TORCH_CUDA_ARCH_LIST 5.2 6.0 6.1 7.0 7.5 8.0 8.6 9.0+PTX
Pillow 9.0.1
torchvision 0.14.0a0 @/opt/conda/lib/python3.8/site-packages/torchvision
torchvision arch flags 5.2, 6.0, 6.1, 7.0, 7.5, 8.0, 8.6, 9.0
fvcore 0.1.5.post20221221
iopath 0.1.9
cv2 3.4.11
There are discrepancies between my environment and the one listed in this repository. Also, I did not install NATTEN, since my experiments only concern Swin (I commented out DiNAT, etc.).
I would like to ask whether you would be able to reproduce your paper results with newer Detectron2 and PyTorch versions, especially Detectron2.
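For what it's worth, this is the quick check I now run before training to spot such discrepancies; the expected pins are my assumption of what the repo's installation instructions recommend, so treat them as placeholders:

```python
# Sketch: compare installed package versions against assumed recommended pins.
import torch, torchvision, detectron2

expected = {
    "torch": "1.10.1",       # assumed from the repo's install docs
    "torchvision": "0.11.2", # assumed companion release for torch 1.10.1
    "detectron2": "0.6",
}
installed = {
    "torch": torch.__version__,
    "torchvision": torchvision.__version__,
    "detectron2": detectron2.__version__,
}
for pkg, want in expected.items():
    have = installed[pkg]
    status = "OK" if have.startswith(want) else "MISMATCH"
    print(f"{pkg:12s} installed={have:22s} expected~{want:8s} {status}")
print("compiled CUDA (torch):", torch.version.cuda)
```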
This is concerning because at least two independent users cannot reproduce the results or achieve similar benchmarks. I don't believe a resolution was even reached in #14; it was simply closed due to user inactivity.
From another angle, your paper results should not depend solely on a particular environment (especially one considerably older than current Detectron2 and PyTorch versions), and the effort to make them reproducible with newer versions is critical. Again, we are not discussing getting exact numbers, but rather numbers closer to your reported ones (e.g., 50 mIoU vs. 57 is a huge discrepancy).
Hi @achen46, if you (or the other user who did not follow up) can provide the training log for your results (PQ: 45.6, AP: 31.3, mIoU: 50.8), I will help take a look. It seems something must have gone wrong on your end to get these numbers; hundreds of people have used our code, and your reported case is rare.
Also, you did not follow our repo's instructions, and we are in fact already using the latest released versions. First, we use Detectron2, not Detectron, and Detectron2 v0.6 is still the latest official release (since Nov 2021). Moreover, Detectron2 v0.6 officially supports only up to PyTorch 1.10.1, so it is only sensible to use the compatible versions of these packages together. There is still an open PR adding support for newer PyTorch versions to Detectron2. I plan to upgrade our repo to PyTorch 2.0 (released two weeks ago) once Detectron2 is ready.
I will close this issue but feel free to open it again with your logs if you still have problems.