
tengfei-wang / implicit-internal-video-inpainting

245 stars · 14 watchers · 37 forks · 34.83 MB

[ICCV 2021]: IIVI: Internal Video Inpainting by Implicit Long-range Propagation

Home Page: https://tengfei-wang.github.io/Implicit-Internal-Video-Inpainting/

Python 100.00%
video-processing video-inpainting video-editing object-removal deeplearning computer-vision image-inpainting deep-learning

implicit-internal-video-inpainting's People

Contributors

ken-ouyang · tengfei-wang


implicit-internal-video-inpainting's Issues

GPU out of memory when setting ambiguity_loss or stabilization_loss to True

Hi,
I am trying to run your code with the ambiguity loss and stabilization loss enabled, but I hit a GPU out-of-memory error.

May I ask what batch size you used in the experiments with these losses, and which GPU model (and how many GPUs) you used to train?

Many thanks!
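Until the authors confirm their settings, a common mitigation is lowering the batch size in train.yml and letting TensorFlow allocate GPU memory incrementally rather than all at once. A minimal sketch of the latter (standard TF 2.x API, nothing repo-specific):

```python
import tensorflow as tf

# Grow GPU memory on demand instead of reserving it all at startup;
# this avoids some allocator-related OOMs (peak usage is unchanged).
for gpu in tf.config.experimental.list_physical_devices("GPU"):
    tf.config.experimental.set_memory_growth(gpu, True)
```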

Virtualenv users support

Hi there. Thank you very much for the code! Could you generate a requirements.txt file for venv users? It would give users with newer RTX 30XX cards more control over the CUDA version, and better performance. Thank you very much in advance!
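Until an official file lands, a rough sketch of what requirements.txt might contain, inferred from the TensorFlow 2.x versions discussed in the issues below; every pin here is a guess to be replaced by the maintainers' actual pip freeze output:

```
tensorflow>=2.0,<2.5   # the code targets TF 2.x (see the train_dist.py issue)
numpy
pyyaml                 # train.yml / test.yml configs
opencv-python          # assumed for frame I/O
```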

About the pipeline

Hi, thanks for your released code.

Training and testing a single video takes very long; is there a way to quickly test an arbitrary input? The "bmx-trees" example takes several hours to finish.

Thanks.

Reduce time for inference

Since I must retrain for every video I want to inpaint, which parameters do you recommend changing to reduce training time? Four hours is too long; I would like to get under one hour. Thanks a lot in advance for the help and the code!

PNG version of our uncompressed results and segmentation results

Hi,

Many thanks for publishing the code for this nice work; I am very interested in it.

Could you please share

  1. the PNG version of results
  2. the segmentation results
  3. inpainting results with only the first frame segmentation mask

for all the videos in the DAVIS dataset?

Many thanks!

Resolution of the saved result images

Hello, thank you for the code. I can train and test on the bmx-trees data, but the resolution of the result images (320, 600) differs from the input (480, 854). train.yml and test.yml have the parameter img_shapes: [320, 600], but when I change it to img_shapes: [480, 854] and retrain, the following error occurs:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [5,480,856,3] vs. [5,480,854,1] [Op:Mul].
How can I save the inpainted images at the same resolution as the input?
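A likely explanation (a guess, not confirmed by the authors): 856 is 854 rounded up to a multiple of 8, so the image seems to be padded internally to match the network's downsampling stride while the mask is not. A workaround sketch is to pad frame and mask identically before the model and crop the output back; the stride of 8 is an assumption:

```python
import numpy as np

def pad_to_multiple(x, multiple=8):
    """Zero-pad an (H, W, C) array so H and W become multiples of
    `multiple` (assumed network stride); also return the original size."""
    h, w = x.shape[:2]
    ph = (multiple - h % multiple) % multiple
    pw = (multiple - w % multiple) % multiple
    return np.pad(x, ((0, ph), (0, pw), (0, 0))), (h, w)

# Pad frame and mask the same way, run the model, then crop the output:
# frame_p, (h, w) = pad_to_multiple(frame)
# mask_p, _ = pad_to_multiple(mask)
# output = model(frame_p, mask_p)[:h, :w]
```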

Dataset directory for training

@ken-ouyang @Tengfei-Wang Thank you for releasing the codebase.
Could you please share what the exact data directory should look like for training your model on a video?
I am training on a random YouTube video. Do I need to create separate frames and mask directories, where each file in the mask folder corresponds to an image file in the frames directory?
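Not an authoritative answer, but the DAVIS-style layout used by the repo's examples suggests exactly that: one mask per frame with matching basenames. A quick sanity check, with hypothetical paths:

```python
import os

# Assumed layout (check against the paths in train.yml):
#   data/my_video/frames/00000.png, 00001.png, ...
#   data/my_video/masks/00000.png,  00001.png, ...
frames_dir, masks_dir = "data/my_video/frames", "data/my_video/masks"

frames = sorted(os.path.splitext(f)[0] for f in os.listdir(frames_dir))
masks = sorted(os.path.splitext(m)[0] for m in os.listdir(masks_dir))
assert frames == masks, "every frame needs a mask with the same basename"
```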

multi-GPUs - only using vram, not processing

Hi Tengfei Wang, this is amazing research and many thanks for sharing the code. Very interesting results!

I was able to reproduce some results and really like the workflow you built around a CNN instead of optical flow; it seems to handle perspective shifts and backgrounds better (still playing with it).
The mask dilation also makes total sense.

My question is about using multiple GPUs to speed up training. I made the changes below:

In train.py I uncommented the mirrored_strategy = tf.distribute.MirroredStrategy() line, and commented out os.environ["CUDA_VISIBLE_DEVICES"] = FLAGS.GPU_ID.

With that, training appears to use both GPUs, but GPU 0 shows CUDA compute activity while GPU 1 only allocates VRAM and never seems to do any processing.

I also saw commented-out @tf.function decorators further down, but I am not sure whether to uncomment those lines. I also found #dist_full_ds = mirrored_strategy and tried uncommenting it, but the second GPU behaves the same: VRAM allocated, no processing.

Is that correct behavior?

Thank you Tengfei Wang and once again, amazing research.
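For what it's worth, creating a MirroredStrategy by itself does not split work across GPUs: the model and optimizer must be built under strategy.scope(), the dataset has to be distributed, and each step launched through the strategy. A generic TF 2.x sketch (build_model, compute_loss, and dataset are placeholders, not names from this repo):

```python
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()

with strategy.scope():
    model = build_model()                      # placeholder
    optimizer = tf.keras.optimizers.Adam(1e-4)

# Shard batches across replicas; `dataset` is a placeholder tf.data.Dataset.
dist_ds = strategy.experimental_distribute_dataset(dataset)

@tf.function
def train_step(batch):
    with tf.GradientTape() as tape:
        loss = compute_loss(model, batch)      # placeholder
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))

for batch in dist_ds:
    strategy.run(train_step, args=(batch,))    # experimental_run_v2 on TF <= 2.1
```

If only some of these pieces are enabled, the second GPU typically just holds memory without doing compute, which matches the behavior described above.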

Single-GPU inference after multi-GPU training

Hello, thank you for the code. I can now train the model with multiple GPUs, but when I run inference on a single GPU, the following error occurs:
tensorflow.python.framework.errors_impl.NotFoundError: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for exp/logs/city_day/0_3/checkpoint_200000
If I train with multiple GPUs, do I also have to run inference with multiple GPUs?
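The NotFoundError means no checkpoint files exist at that exact path, rather than a single-GPU/multi-GPU incompatibility; MirroredStrategy checkpoints restore fine on one GPU as long as the variables match. It is worth checking what was actually written to disk (the directory below is taken from the error message):

```python
import tensorflow as tf

ckpt_dir = "exp/logs/city_day/0_3"           # directory from the error message
print(tf.train.latest_checkpoint(ckpt_dir))  # None => nothing was saved here

# If this prints None, point the test config at the directory that actually
# contains the checkpoint_* files from the multi-GPU run.
```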

What License?

Hi,
thanks for the great work! I was wondering which license the repo is published under.

Thanks!

4K pipeline and performance

Hello,

Great work. I would like to test the 4K pipeline.
Could you provide a sample or some hints on how long it took to train?
Thank you in advance.
Barnabas

What "Mask Propagation from A Single Frame" usage?

  1. How many annotation files must I provide?
  2. Do I need to provide a corresponding mask for each of them?
  3. Why are the annotation images red and green, and what is the difference?
  4. How many frames must I provide per video?
  5. Must the video be free of fast object motion?

Error when using train_dist.py

With TensorFlow 2.4:
AttributeError: 'MirroredStrategy' object has no attribute 'experimental_run_v2'
so I need to downgrade to TensorFlow 2.0.

With TensorFlow 2.0:
ImportError: cannot import name 'keras_tensor' from 'tensorflow.python.keras.engine' (/home/ivdai/anaconda3/envs/IIVI/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/__init__.py)
so I need to upgrade to TensorFlow 2.4.

funny
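The two errors pull in opposite directions because Strategy.experimental_run_v2 was renamed to Strategy.run in TF 2.2 and removed by 2.4, while the keras_tensor module does not exist in TF 2.0. One way to stay on TF 2.4 is patching the call site in train_dist.py; the exact call shown here is an assumption:

```python
# In train_dist.py, replace
#     strategy.experimental_run_v2(train_step, args=(batch,))
# with the TF >= 2.2 name:
strategy.run(train_step, args=(batch,))

# Or keep one script compatible with both old and new TF 2.x:
run_fn = getattr(strategy, "run", None) or strategy.experimental_run_v2
run_fn(train_step, args=(batch,))
```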

About test

Hi, I'm a little confused about this method.
If I want to test 10 different videos, do I need to train 10 separate models, each taking about 4 hours? Or can I just train one model and apply it to other videos?

Question about masks

Hi, thanks for the repo!
Can you explain: do I have to create the mask for each video frame manually, or is there an automation tool for that?
Thanks!

I don't understand your network

Hi,
Thanks for providing the code. Looking at it, I see that you train on one video and then run inference on that same video. That seems tricky to me: shouldn't a CNN be trained on multiple videos and then evaluated on different ones? Training and testing on the same video will of course give good results.

Let me check my understanding of the model: the input video has a foreground object and its mask, you supply another mask for augmentation, and after training the output video has the foreground removed and the background re-synthesized. Am I right?

Can I train your model on multiple videos and then run inference on a different one? For example, if I train on 10 different videos and then infer on an unseen video, what happens at inference? And how would I set up training on multiple videos?

About PyTorch

Hello, thank you very much for sharing your work. Is there a PyTorch version?
