j-min / adversarial_video_summary

Unofficial PyTorch Implementation of SUM-GAN from "Unsupervised Video Summarization with Adversarial LSTM Networks" (CVPR 2017)

Python 100.00%
video summarization gan vae pytorch

adversarial_video_summary's Introduction

Adversarial_Video_Summary

PyTorch Implementation of SUM-GAN

Changes from Original paper

  • Video feature extractor
    • GoogleNet pool5 (1024) => ResNet-101 pool5 (2048)
    • Followed by linear projection to 500-dim
  • Stable GAN Training
    • Discriminator's learning rate: 1e-5 (Others: 1e-4)
    • Freeze the discriminator's parameters for the first 15 steps of every epoch.
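
As a rough illustration of the two stabilisation changes listed above, the sketch below records the stated learning rates and decides, per step, whether the discriminator would be updated. The group names, the helper names, and the assumption that "first 15 steps" means zero-indexed steps 0 to 14 are all hypothetical; this is not taken from the repository's code.

```python
# Learning rates as stated in the list above.
LEARNING_RATES = {"discriminator": 1e-5, "others": 1e-4}

# Number of steps per epoch during which the discriminator is kept fixed.
DISCRIMINATOR_FREEZE_STEPS = 15


def discriminator_is_trainable(step_in_epoch, freeze_steps=DISCRIMINATOR_FREEZE_STEPS):
    """Return True once the freeze period of the current epoch is over.

    Assumes steps are counted from 0 within each epoch.
    """
    return step_in_epoch >= freeze_steps


def updates_for_epoch(n_steps):
    """List which (hypothetically named) parameter groups are updated at each step."""
    schedule = []
    for step in range(n_steps):
        groups = ["summarizer_encoder", "decoder"]
        if discriminator_is_trainable(step):
            groups.append("discriminator")
        schedule.append(groups)
    return schedule


schedule = updates_for_epoch(20)
```

With 20 steps per epoch, the first 15 entries of `schedule` omit the discriminator and the remaining 5 include it.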

Model figures

Model figure 1

Model figure 2

Model figure 3

Algorithm

Algorithm figure

adversarial_video_summary's People

Contributors

j-min

adversarial_video_summary's Issues

How can I get change points using KTS?

I tried to get change points using the KTS code,
but I couldn't get proper change points.

If someone has obtained change points using KTS, could you please help me?

The worst github repository you can work on.

If you are planning to use this repo, don't waste your time; just skip it.

  • Literally zero support from the author.
  • Not a single issue has been closed, and not one useful piece of information has been posted.
  • The project assumes you have a dataset that cannot be found anywhere.
  • A complete waste of time.

What is the purpose of uploading this project as open source and making it public if no one can contribute to it?

data

Can you tell me where you downloaded the dataset? I tried to run your code but could not get it to run. Could I have your contact information?

about the score of every frame

I note that the sLSTM outputs a score in [0, 1] for every frame, so how can I decide whether a frame is a key frame?
Would it be better for the output to be binary, i.e. in {0, 1}?
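
One way to turn per-frame scores in [0, 1] into a key-frame decision is to post-process them, for example with a fixed threshold or by keeping only the highest-scoring fraction of frames. The sketch below shows both; the function names, the 0.5 threshold, and the budget value are illustrative assumptions and are not taken from this repository.

```python
def select_keyframes_by_threshold(scores, threshold=0.5):
    """Return indices of frames whose score is at least `threshold`."""
    return [i for i, score in enumerate(scores) if score >= threshold]


def select_keyframes_top_k(scores, budget=0.15):
    """Return indices of the highest-scoring frames, in temporal order.

    `budget` is the fraction of frames to keep; at least one frame is kept.
    """
    k = max(1, int(len(scores) * budget))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# Made-up scores for a ten-frame clip.
scores = [0.1, 0.9, 0.4, 0.75, 0.2, 0.6, 0.05, 0.3, 0.8, 0.55]

by_threshold = select_keyframes_by_threshold(scores)
top_k = select_keyframes_top_k(scores, budget=0.2)
```

Keeping the scores continuous and selecting afterwards leaves the summary length adjustable without retraining, which a hard {0, 1} output would not.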

How to extract video features and the number of seq_len?

Hi, in this code the video features are extracted using ResNet, but I don't know whether the features are extracted frame by frame, or what the value of seq_len is. Is the 2048-dimensional feature of each video frame saved as its own HDF5 file, or are the features of all frames of a video saved in a single HDF5 file? Could you give me some instructions on how to extract the features of all the original video frames and save them in an HDF5 file? Thank you very much.

Inclusion of DPP loss as summary length regularization doesn't help in quality summarization

The authors propose a DPP loss for diversity regularization in their model. Determinantal point processes (DPPs) are a way of sampling a diverse subset of points from a set of points. This is an essential extension without which the model implementation is incomplete. I am willing to help with this. Could you open a ticket listing the things to do and add it to the README? Cheers
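
To make the diversity argument concrete, here is a minimal pure-Python sketch of the standard L-ensemble DPP probability, P(Y) = det(L_Y) / det(L + I), with the loss taken as -log P(Y). The similarity kernel is made up for illustration and the helper names are hypothetical; how the paper or this repository builds the kernel from frame features is not shown here.

```python
import itertools
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def dpp_probability(kernel, subset):
    """P(Y) = det(L_Y) / det(L + I) for an L-ensemble DPP."""
    n = len(kernel)
    l_y = [[kernel[i][j] for j in subset] for i in subset]
    l_plus_i = [[kernel[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                for i in range(n)]
    return determinant(l_y) / determinant(l_plus_i)


# Made-up kernel: frames 0 and 1 are near-duplicates, frame 2 is different.
kernel = [[1.0, 0.9, 0.1],
          [0.9, 1.0, 0.1],
          [0.1, 0.1, 1.0]]

p_redundant = dpp_probability(kernel, [0, 1])
p_diverse = dpp_probability(kernel, [0, 2])
dpp_loss_diverse = -math.log(p_diverse)

# The probabilities of all subsets (including the empty one) sum to 1.
total_probability = sum(
    dpp_probability(kernel, list(subset))
    for size in range(len(kernel) + 1)
    for subset in itertools.combinations(range(len(kernel)), size)
)
```

The subset of dissimilar frames gets the higher probability, and therefore the lower loss, which is the behaviour a diversity regularizer is meant to reward.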

log dir AttributeError: 'PosixPath' object has no attribute 'split'

There is no split attribute...
ipdb> logdir
PosixPath('/content/data1/jmcho/SUM_GAN/360airballoon')
ipdb> n
--Return--
None

/content/Adversarial_Video_Summary/utils.py(13)__init__()
12 import ipdb; ipdb.set_trace()
---> 13 super(TensorboardWriter, self).__init__(logdir)
14 self.logdir = self.file_writer.get_logdir()

ipdb> n
AttributeError: 'PosixPath' object has no attribute 'split'

/content/Adversarial_Video_Summary/solver.py(68)build()
67 import ipdb; ipdb.set_trace()
---> 68 self.writer = TensorboardWriter(self.config.log_dir)
69

ipdb> n
--Return--
None

/content/Adversarial_Video_Summary/solver.py(68)build()
67 import ipdb; ipdb.set_trace()
---> 68 self.writer = TensorboardWriter(self.config.log_dir)
69

ipdb> self.config.log_dir.split()
*** AttributeError: 'PosixPath' object has no attribute 'split'

ipdb> type(self.config.log_dir)
<class 'pathlib.PosixPath'>
ipdb> dir(self.config.log_dir)
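
The session above suggests that something downstream calls the string method `split` on the log directory, which a `pathlib` path does not have. The sketch below reproduces that failure with a hypothetical stand-in function (`legacy_init` is not the real writer) and shows that converting the path with `str()` avoids it. Passing `str(self.config.log_dir)` to `TensorboardWriter` would presumably work the same way, but that is an assumption, not something verified against the repository.

```python
from pathlib import PurePosixPath


def legacy_init(logdir):
    """Hypothetical stand-in for code that expects `logdir` to be a str."""
    return logdir.split("/")


# Path taken from the debugger output above.
log_dir = PurePosixPath("/content/data1/jmcho/SUM_GAN/360airballoon")

# Passing the path object directly fails, as in the traceback.
try:
    legacy_init(log_dir)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# Converting to str first gives the callee the type it expects.
parts = legacy_init(str(log_dir))
```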

The original video features also need to be fed into eLSTM and dLSTM

Based on the paper, the original video features also need to be fed into the eLSTM and dLSTM, and the result is then fed to the discriminator (cLSTM). But this implementation seems to feed the original features directly into the discriminator after a linear_compress layer. Is this a bug?

poor results applying video summarization on BDD100 dataset

Screenshot 2019-10-23 at 14 10 35

I am trying to apply the network to the BDD100 dataset, which consists of driving videos. Here c_loss is -GAN_loss.

According to the paper, we should:

  1. For learning {θs, θe}, minimize (Lreconst + Lprior + Lsparsity). ==> s_e_epoch
  2. For learning θd, minimize (Lreconst + LGAN). ==> d_epoch
  3. For learning θc, maximize LGAN, i.e. minimize c_loss = -LGAN. ==> c_epoch

But I am seeing the behaviour shown above. What could be the problem?
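
For reference, the three objectives listed in this issue can be written out as plain arithmetic on the individual loss terms. The sketch below only combines scalar values and makes the sign convention for c_loss explicit; the function name, the dictionary keys, and the example numbers are hypothetical and do not come from the repository.

```python
def sum_gan_objectives(reconst, prior, sparsity, gan):
    """Combine the individual loss terms into the three quantities to minimize.

    reconst, prior, sparsity, gan: scalar values of Lreconst, Lprior,
    Lsparsity and LGAN for the current batch.
    """
    return {
        # Step 1: update {θs, θe}.
        "s_e_loss": reconst + prior + sparsity,
        # Step 2: update θd.
        "d_loss": reconst + gan,
        # Step 3: update θc; maximizing LGAN is minimizing its negative.
        "c_loss": -gan,
    }


# Made-up values, chosen only to show the arithmetic.
losses = sum_gan_objectives(reconst=0.5, prior=0.25, sparsity=0.125, gan=2.0)
```

With this convention a falling c_loss corresponds to a rising LGAN, which is worth keeping in mind when reading the training curves.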

model train help

Hi, thank you for the code. I tried it, but it failed. Could you share a pre-trained model or give more guidance on training?

Version Specifications

Hello and thanks for the code!

Is it possible to add in README the version specifications for the packages used? If they exist, it seems that I've missed them.
Packages like "Pillow" rarely break code in their version upgrades, but the same doesn't seem to happen for deep learning libraries (keras and tensorflow break backwards-compatibility very often).
I'm mostly interested in the "torch" and "torchvision" versions used.

Thanks!
Alex
