da-vsn's Issues

Some questions about data loading

Hi,
This is very enlightening work! @xing0047 @Dayan-Guan
I would like to ask a question.

When I use ./TPS/tps/scripts/train.py to load SynthiaSeq or ViperSeq data and debug the code, I observe the following behavior. I printed some variables inside __getitem__(), with shuffle set to False and batch_size=cfg.TRAIN.BATCH_SIZE_SOURCE set to 1 in source_loader = data.DataLoader():

  1. Although batch_size=1, four images and their corresponding previous frames are loaded at once, instead of one image and its previous frame.

  2. The four loaded images arrive in a scrambled order, e.g. 2-1-3-4 rather than 1-2-3-4, which seems to contradict the shuffle setting.

Could you kindly clear up my confusion? Thank you very much!

The print code is as follows:


The print results are as follows (the order differs on each run):

---index--- 1
---index--- 0
---index--- 2
img_file tps/data/SynthiaSeq/SEQS-04-DAWN/rgb/000002.png
label_file tps/data/SynthiaSeq/SEQS-04-DAWN/label/000002.png
---index--- 3
img_file tps/data/SynthiaSeq/SEQS-04-DAWN/rgb/000001.png
label_file tps/data/SynthiaSeq/SEQS-04-DAWN/label/000001.png
img_file tps/data/SynthiaSeq/SEQS-04-DAWN/rgb/000003.png
label_file tps/data/SynthiaSeq/SEQS-04-DAWN/label/000003.png
img_file tps/data/SynthiaSeq/SEQS-04-DAWN/rgb/000004.png
label_file tps/data/SynthiaSeq/SEQS-04-DAWN/label/000004.png
image_kf tps/data/SynthiaSeq/SEQS-04-DAWN/rgb/000003.png
image_kf tps/data/SynthiaSeq/SEQS-04-DAWN/rgb/000002.png
image_kf tps/data/SynthiaSeq/SEQS-04-DAWN/rgb/000001.png
image_kf tps/data/SynthiaSeq/SEQS-04-DAWN/rgb/000000.png
label_kf tps/data/SynthiaSeq/SEQS-04-DAWN/label/000003.png
label_kf tps/data/SynthiaSeq/SEQS-04-DAWN/label/000002.png
label_kf tps/data/SynthiaSeq/SEQS-04-DAWN/label/000001.png
label_kf tps/data/SynthiaSeq/SEQS-04-DAWN/label/000000.png
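The interleaved output above is what one would expect if the DataLoader is created with num_workers > 0 (an assumption here; the repo's config may set this): each worker prefetches items and calls __getitem__ concurrently, so prints from inside __getitem__ interleave, while the batches themselves are still yielded in index order when shuffle=False. A stdlib-only sketch of the same effect (not the repo's code, no PyTorch required):

```python
import concurrent.futures
import random
import time

def getitem(index):
    # Stands in for Dataset.__getitem__: it runs in a worker, so the
    # *call* order (and any prints inside it) can interleave.
    time.sleep(random.random() * 0.01)
    return index

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    # map() dispatches indices to 4 workers concurrently but still
    # yields results in submission order -- analogous to a DataLoader
    # with shuffle=False and num_workers=4.
    batches = list(pool.map(getitem, range(8)))

print(batches)  # always [0, 1, 2, 3, 4, 5, 6, 7]
```

So the out-of-order `---index---` prints do not violate shuffle=False: only the side effects inside __getitem__ are unordered, not the returned batches.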

Dataset Size

Dear authors,

Thank you for your work. I have a question about your implementation.

In your paper you describe in the section "Datasets" that Cityscapes has 2975 sequences with 30 frames each, VIPER 134k frames and SYNTHIA-Seq 8000 frames.

However, your implementation indicates that a much smaller number of unique frames was used for training, i.e. 13k for VIPER and 849 for SYNTHIA-Seq. Could you please elaborate on the reasoning behind this decision, particularly since the difference between VIPER and SYNTHIA is also significant?

Thanks for clarifying!

Regarding Synthia-Seq Dataset

I really enjoyed reading your work.
I have a question regarding the SYNTHIA-Seq dataset.
In the paper you mention using 8000 synthesized video frames, but on GitHub SYNTHIA-Seq DAWN contains only 850 images. Could you please clarify this discrepancy?
Thank you.

Question on Synthia-seq dataset

Dear authors,

Thank you for your great work. I have several questions about the synthia-seq->cityscape-seq adaptation.
The first is about the scale of the training data. Compared with the VIPER dataset, SYNTHIA-Seq seems to contain only one labeled video, with 850 frames in total. Is that true?
The second is that 11 classes are reported in Table 4, but the SYNTHIA-Seq dataloader uses 12 classes, so I am not sure whether the fence class is considered during adaptation or not.

self.id_to_trainid = {3: 0, 4: 1, 2: 2, 5: 3, 7: 4, 15: 5, 9: 6, 6: 7, 1: 8, 10: 9, 11: 10, 8: 11,}

Thanks in advance for your help!
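For reference, counting the train IDs in the mapping quoted above confirms 12 classes; if Table 4 reports 11, one train ID would have to be ignored at evaluation time (which ID corresponds to fence is an assumption here, not confirmed by the source). A quick check:

```python
# Mapping copied from the synthia-seq dataloader quoted above.
id_to_trainid = {3: 0, 4: 1, 2: 2, 5: 3, 7: 4, 15: 5,
                 9: 6, 6: 7, 1: 8, 10: 9, 11: 10, 8: 11}

train_ids = set(id_to_trainid.values())
print(len(train_ids))  # 12 classes in the dataloader

# If one class (assumed here to be fence, train ID 11) is excluded
# at evaluation, 11 classes remain, matching Table 4.
evaluated = train_ids - {11}
print(len(evaluated))  # 11
```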

In train_video_UDA.py, line 251, trg_prob_warp = warp_bilinear(trg_prob, trg_flow_warp): the image flips, but the optical flow does not flip

Hello!
I really enjoy reading your work!
However, I ran into a problem while running train_video_UDA.py.

In line 251, trg_prob_warp = warp_bilinear(trg_prob, trg_flow_warp): the variable trg_prob is the prediction for trg_img_b_wk, and trg_img_b_wk is obtained from trg_img_b by flipping it with a certain probability, but trg_flow_warp does not appear to be flipped.
Consider the case where trg_img_b_wk has been flipped while trg_flow_warp has not: then trg_prob_warp and trg_img_d_st do not seem semantically consistent, because the image was flipped but the optical flow was not, even though trg_pl is flipped in lines 256-258.
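The consistency this question points at can be checked with a tiny 1-D integer warp: if an image is horizontally flipped, the flow must be flipped spatially and its horizontal component negated before warping. A minimal sketch (pure Python, integer displacements; this is not the repo's warp_bilinear):

```python
def warp(img, flow):
    # Backward warp in 1-D: out[x] = img[x + flow[x]].
    return [img[x + flow[x]] for x in range(len(img))]

def flip(seq):
    # Horizontal flip.
    return list(reversed(seq))

def flip_flow(flow):
    # Flipping the image requires flipping the flow spatially
    # AND negating its horizontal component.
    return [-v for v in reversed(flow)]

img = [10, 20, 30, 40, 50, 60]
flow = [1, 2, -1, 0, -2, -1]

# Warping the flipped image with the correctly transformed flow
# equals flipping the warp of the original image:
assert warp(flip(img), flip_flow(flow)) == flip(warp(img, flow))

# Using the *untransformed* flow on the flipped image breaks the
# equivalence -- the inconsistency described in the question:
assert warp(flip(img), flow) != flip(warp(img, flow))
```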


Could you please provide 'estimated_optical_flow' for training DA-VSN

Hi @Dayan-Guan , thank you for open-sourcing your work!

I am trying to follow this work. Training DA-VSN from scratch requires the optical flows (for the 3 datasets used in your paper) estimated by FlowNet2. However, the instructions in your README only cover evaluation. I see from recent issues that you have provided code and more instructions for training, but the code appears to be incomplete, so I cannot generate the optical flows with it.

Could you please provide your generated optical flows for all 3 datasets used in your paper? It would save us time. Alternatively, could you take another look at the provided 'Code_for_optical_flow_estimation' so that it is runnable for generating the optical flows on our own?

Thanks in advance!

Regards

Details of SYNTHIA-Seq dataset

Hi author, I have downloaded SYNTHIA-Seq, but I found that it contains 'Stereo_Left' and 'Stereo_Right' folders, each with 'Omni_B', 'Omni_F', 'Omni_L' and 'Omni_R' subfolders. I wonder which one is used for training.

Optical flow for training

Thanks for your great work! I want to train DA-VSN, but I don't know how to obtain Estimated_optical_flow_Viper_train and Estimated_optical_flow_Cityscapes-Seq_train. I could not find details about the optical flow in the README or the paper.

Optical flow is not used for propagating

Hi, author. I have two questions.
The first is that you do not actually use the flow to propagate the previous frame to the current frame; you only use it as a constraint so that pixels appearing in both the current frame (cf) and the key frame (kf) are retained. This seems unreasonable.
I refined the code using Resample2d to warp kf to cf, but the result improved only a little.

The second question is that I tried training DA-VSN three times on a 1080Ti and a 2080Ti following the settings you gave, but I only get 46 mIoU, which is 2 points lower than yours.

The 'estimated_optical_flow' you provided seems inaccessible

Hello @Dayan-Guan , thank you for open-sourcing your work!

I initially tried to run optical flow estimation myself to get the flow files needed for training, but it turned out to be complicated.
I would like to experiment with the pre-computed optical flow first, but unfortunately I cannot access the link you gave:

https://drive.google.com/drive/folders/1i_-yw9rS7-aa7Cn5ilIMbkUKwr1JpUFA?usp=sharing

When I try to access this link it gives me a 404 error. Could you provide a new access link?
