
vsummarize's Introduction

Pythonic Video summarization

Sick of NLP text summarization? Presenting: video summarization!

We take advantage of the timestamp references in YouTube comments to generate a shorter, summarized video from any YouTube video.

A glance:

>>> import vsummarize

>>> data = vsummarize.summarize('http://youtube.com/watch?v=...', output='shorter.mp4')

After this command, the summarized .mp4 is in the output path you provided. We also return some metadata in case you need it.

Timestamps from YouTube comments are included in that metadata because our algorithm generates summaries from the comments and their timestamps.

>>> print(data.hot_clips)
[('0:12', '0:16'), ..., ('12:31', '13:01')]

>>> print(data.timestamps)
['0:12', '0:12', '0:14', ..., '12:31']

>>> print(data.duration)  # in seconds
65
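
The library's actual algorithm is not documented in this README. As a rough sketch of the general idea only, the snippet below pulls timestamps out of comment text and groups nearby ones into clips. The function names, the regular expression, and the max_gap, min_mentions, and padding values are all assumptions made for illustration, not vsummarize's real internals.

```python
import re

# Matches M:SS, MM:SS, or H:MM:SS timestamps inside free-form comment text.
TIMESTAMP_RE = re.compile(r"\b(?:(\d{1,2}):)?(\d{1,2}):(\d{2})\b")


def extract_seconds(comment):
    """Return every timestamp found in a comment, converted to seconds."""
    found = []
    for hours, minutes, seconds in TIMESTAMP_RE.findall(comment):
        found.append(int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds))
    return found


def group_hot_clips(timestamps, max_gap=5, min_mentions=2, padding=2):
    """Group nearby timestamps into (start, end) clips, in seconds.

    Hypothetical heuristic: a run of timestamps no more than max_gap seconds
    apart counts as "hot" if it holds at least min_mentions timestamps.
    Each clip ends padding seconds after the last mention in its run.
    """
    clips = []
    run = []
    for t in sorted(timestamps):
        if run and t - run[-1] > max_gap:
            # The gap is too large, so close the current run.
            if len(run) >= min_mentions:
                clips.append((run[0], run[-1] + padding))
            run = []
        run.append(t)
    if len(run) >= min_mentions:
        clips.append((run[0], run[-1] + padding))
    return clips


comments = ["lol 0:12", "0:12 best part", "0:14 wow", "skip to 12:31"]
timestamps = [t for c in comments for t in extract_seconds(c)]
hot_clips = group_hot_clips(timestamps)
```

In this toy input, three comments point at roughly the same moment, so they form one clip, while the lone 12:31 mention is dropped for having too few mentions.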

A demo:

I tested this software on an hour-long Obama speech.

Original video (59 minutes): http://www.youtube.com/watch?v=hed1nP9X7pI

Summarized video (3:30 minutes): http://www.youtube.com/watch?v=aDYDN9lsSHg

Often (even in my product, www.shorten.tv) you just want a list of hot video clips rather than a new summarized .mp4, because generating the video is resource intensive.

To do this, simply omit the output parameter.

>>> data = vsummarize.summarize('http://youtube.com/watch?v=...')

>>> print(data.hot_clips)
[('0:12', '0:16'), ..., ('12:31', '13:01')]

The summarized .mp4 has NOT been generated. We just retrieved the metadata describing what the summary would contain.

To actually generate a summary, we use ffmpeg + moviepy to stitch the video together from the .hot_clips sequences above.
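
For a sense of what that stitching step can look like, here is a minimal sketch. It is not the library's own code: to_seconds and stitch are hypothetical helpers, and the moviepy calls assume the moviepy 1.x API (newer releases renamed some of them).

```python
def to_seconds(stamp):
    """Convert 'M:SS' or 'H:MM:SS' into a number of seconds."""
    total = 0
    for part in stamp.split(":"):
        total = total * 60 + int(part)
    return total


def stitch(source_path, hot_clips, output_path):
    """Cut each (start, end) pair out of source_path and join them in order.

    Assumes moviepy 1.x and ffmpeg are installed.
    """
    # Imported lazily so to_seconds() works without moviepy installed.
    from moviepy.editor import VideoFileClip, concatenate_videoclips

    video = VideoFileClip(source_path)
    pieces = [
        video.subclip(to_seconds(start), to_seconds(end))
        for start, end in hot_clips
    ]
    concatenate_videoclips(pieces).write_videofile(output_path)
    video.close()
```

Given a locally downloaded source file, a call such as stitch('original.mp4', data.hot_clips, 'shorter.mp4') would write the joined clips to shorter.mp4. The file names here are placeholders.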

Features

  • .mp4 video summarization
  • YouTube comment timestamp extraction
  • YouTube video hot timestamp extraction
  • YouTube video hot sub-clip extraction

Get it now

Because I use both OSX and Ubuntu, I have clear instructions for setting up this project on both platforms. For other platforms, I can't guarantee anything beyond installation advice.

We use moviepy, the Python video manipulation library, which in turn depends on ffmpeg.

Be sure you have pip.

The installation instructions are as follows:

OSX:

$ brew install ffmpeg

Ubuntu:

$ curl -O http://ffmpeg.gusari.org/static/64bit/ffmpeg.static.64bit.2014-02-28.tar.gz
$ tar -zxvf ffmpeg.static.64bit.2014-02-28.tar.gz
$ sudo mv ffmpeg ffprobe /usr/local/bin/.
$ rm ffmpeg.static.64bit.2014-02-28.tar.gz

And lastly, don't forget to install vSummarize itself!

$ pip install vsummarize

This app uses the Google gdata API. My personal API keys live in a file named settings.py, which I've removed from this repo for obvious reasons. In its place I've included a file called rename_to_settings.py with two API key values for you to fill out. Rename that file to settings.py once you are finished!

Warning

Because this is such a resource-intensive task and library (especially if you use the summarized .mp4 generation feature), you may notice .mp4 generation failing on a few videos with an OS memory exception. This means you don't have enough RAM for ffmpeg to fork the processes that split your video into sub-chunks.
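
One way to cope is to fall back to the metadata-only call when rendering fails. The sketch below is only an illustration: summarize_with_fallback is a hypothetical helper, and catching OSError and MemoryError is an assumption about how the failure surfaces, not documented library behavior.

```python
def summarize_with_fallback(summarize, url, output):
    """Try full .mp4 generation; on a memory error, return metadata only.

    `summarize` is expected to behave like vsummarize.summarize.
    Returns (data, rendered), where rendered says whether the .mp4 was written.
    """
    try:
        return summarize(url, output=output), True
    except (OSError, MemoryError):
        # Assumed failure mode: fall back to the cheaper metadata-only call.
        return summarize(url), False
```

Usage would look like data, rendered = summarize_with_fallback(vsummarize.summarize, 'http://youtube.com/watch?v=...', 'shorter.mp4'), after which data.hot_clips is available either way.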

License

Authored and maintained by Lucas Ou-Yang. Shoutout to Zulko for helping code and giving advice on some parts of this project.

We use moviepy and ffmpeg for video manipulation. We also use Google's YouTube API. Please feel free to email me if you run into issues or would just like to talk about the future of this library!
