vidsum's Introduction

Generate summary of any video

Generate a summary of any video through its subtitles.

This is a community-driven approach to video summarization by the OpenGenus community.

Installing vidsum

In order to install vidsum, simply clone the repository to a local directory. You can do this by running the following commands:

$ git clone https://github.com/OpenGenus/vidsum.git

$ cd vidsum/code

Please note that vidsum requires a number of Python packages, which are listed in requirements.txt.

If you do not have these packages installed, you can install them by running this command:

$ pip install -r requirements.txt

Usage

To generate a summary of a video file sample.mp4 with subtitle file subtitle.srt:

python sum.py -i sample.mp4 -s subtitle.srt

To summarize a YouTube video from its url:

python sum.py -u <url>

If you want to keep the downloaded YouTube video and subtitles:

python sum.py -u <url> -k

Future developments

For future developments of this approach, see the Wiki and check out the other approaches.

Contributions

All contributions are welcome. Please see COMMIT_TEMPLATE.md before making pull requests to this repository. See all contributors here.

vidsum's People

Contributors

adichat, bhaveshan, codebu5ter, grimd34th, guptarohit, ilimugur, jamesmco, lwgray, shriakhilc, subkrish, turnrdev, vipul-sharma20

vidsum's Issues

CONTRIBUTING.md - Instructions on how to contribute to Vidsum

To help contributors get started more easily, I think there should be a CONTRIBUTING.md file. It would be a general guide on how to contribute to Vidsum.

This file could contain information on:

  1. Code of Conduct
  2. Style guide (e.g. PEP 8)
  3. Info on committing & pull requests

Here is an example template of a CONTRIBUTING.md file

Implementing this would address issue #13

Add docstrings to functions

Overall:
A docstring should be included at the top of the code to summarize the project. @AdiChat, this might be a good one for you to do.

Specifically (see example below; this could be done by other contributors):

  1. Docstrings should be added to each function.
  2. The docstring in each function must include each argument's type with a short description.
  3. The docstring in each function must include a return type.

    def is_cow(sound):
        """Check whether the given sound is the one a cow makes.

        Args:
            sound (str): sound the animal makes
        Returns:
            bool: True if the sound is 'moo', False otherwise.
        """
        # The comparison already yields a bool, so no if/else is needed.
        return sound == 'moo'

Error on running python sum.py

This is a(n):

  • New feature
  • Update to an existing feature
  • Error
  • Proposal to the Repository

Details:

Traceback (most recent call last):
  File "sum.py", line 9, in <module>
    import pysrt
ImportError: No module named pysrt

I have installed pysrt, but this error still occurs.

Pytube Exception: Multiple videos met this criteria

This is based on the assumption that the previous issues #1, #2 and #4 have been fixed (#5).
The error means that the mp4 video you are trying to download is available in multiple resolutions, and you have to decide which one you would like to get.

Command:

python sum.py -u https://www.youtube.com/watch?v=xRJCOz3AfYY

Error:

raise MultipleObjectsReturned('Multiple videos met this criteria.')
pytube.exceptions.MultipleObjectsReturned: Multiple videos met this criteria.

Solution:

Filter the videos by type (mp4) and pick the highest available resolution:
yt.filter('mp4')[-1]
Explanation

Access is denied

This is a(n):

  • New feature
  • Update to an existing feature
  • Error
  • Proposal to the Repository

Details:

  File "C:\Python27\lib\site-packages\moviepy\video\io\ffmpeg_reader.py", line 259, in ffmpeg_parse_infos
    proc.terminate()
  File "C:\Python27\lib\subprocess.py", line 1002, in terminate
    _subprocess.TerminateProcess(self._handle, 1)
WindowsError: [Error 5] Access is denied
[sum.py] Remove the original files

index on pypi

Add the relevant code to allow vidsum to be indexed on PyPI.
This would let users install it with pip or easy_install (pip install vidsum) and use it as an independent program, e.g. vidsum <Video url>

Srt file encoding related errors

I tried running the program as described in the README with a sample .avi file and a .srt file (both English).
The first error I faced was a UnicodeDecodeError, the error log of which is as follows:

Traceback (most recent call last):
  File "sum.py", line 155, in <module>
    get_summary(args.video_file, args.subtitles_file)
  File "sum.py", line 110, in get_summary
    regions = find_summary_regions(subtitles, 60, "english")
  File "sum.py", line 72, in find_summary_regions
    srt_file = pysrt.open(srt_filename)
  File "C:\Python34\lib\site-packages\pysrt\srtfile.py", line 153, in open
    new_file.read(source_file, error_handling=error_handling)
  File "C:\Python34\lib\site-packages\pysrt\srtfile.py", line 181, in read
    self.extend(self.stream(source_file, error_handling=error_handling))
  File "C:\Python34\lib\collections\__init__.py", line 1016, in extend
    self.data.extend(other)
  File "C:\Python34\lib\site-packages\pysrt\srtfile.py", line 204, in stream
    for index, line in enumerate(chain(source_file, '\n')):
  File "C:\Python34\lib\codecs.py", line 707, in __next__
    return next(self.reader)
  File "C:\Python34\lib\codecs.py", line 638, in __next__
    line = self.readline()
  File "C:\Python34\lib\codecs.py", line 551, in readline
    data = self.read(readsize, firstline=True)
  File "C:\Python34\lib\codecs.py", line 497, in read
    newchars, decodedbytes = self.decode(data, self.errors)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc1 in position 39: invalid start byte

After checking out the pysrt readme, I realized it could be because of an encoding mismatch. Running the file with pysrt.open(srt_filename, encoding='iso-8859-1') fixed this error. One easy way to fix this is to first use the chardet module to detect the encoding, and then pass that to pysrt.
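The detect-then-open idea above could be sketched as follows. This version uses only the standard library and simply tries a short list of candidate encodings in order, which is a cruder stand-in for chardet-based detection. The helper names guess_encoding and detect_srt_encoding and the candidate list are assumptions made for illustration; they are not part of vidsum or pysrt.

```python
def guess_encoding(raw_bytes, candidates=("utf-8", "iso-8859-1")):
    """Return the first candidate encoding that decodes raw_bytes cleanly.

    Hypothetical helper: a simple substitute for chardet detection.
    Returns None if no candidate fits.
    """
    for encoding in candidates:
        try:
            raw_bytes.decode(encoding)
        except UnicodeDecodeError:
            continue  # this candidate cannot decode the data, try the next
        return encoding
    return None


def detect_srt_encoding(srt_filename):
    """Read the subtitle file as raw bytes and guess its encoding."""
    with open(srt_filename, "rb") as srt_file:
        return guess_encoding(srt_file.read())
```

The result could then be passed on as pysrt.open(srt_filename, encoding=detect_srt_encoding(srt_filename)). Note that iso-8859-1 accepts any byte sequence, so it only works as a last resort and may still produce wrong characters for files in other encodings.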

Next, I also faced a LookupError because the NLTK tokenizer could not find punkt. I recommend adding this as a requirement, along with the youtube_dl library, which was needed when the -u flag was used. Are you developing the file primarily on Linux? That might explain why you didn't notice the need, since most of these might be pre-installed there.

I can make the necessary changes and send a pull request in a couple of hours. Will that be fine?

Doesn't load from YouTube

I ran python sum.py -u "https://www.youtube.com/watch?v=ZlXta87qyxU" as suggested in your wiki.
The error I get is:

usage: Watch videos quickly [-h] -i VIDEO_FILE -s SUBTITLES_FILE [-u URL]
Watch videos quickly: error: the following arguments are required: -i/--video-file, -s/--subtitles-file

Download subtitles of a YouTube video

Download the subtitles of a YouTube video as video.srt

To achieve this, use youtube-dl with the --write-auto-sub option
The solution should go at Line 124

Other solutions are warmly welcomed as well.
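One possible way to wire this up is to call the youtube-dl command line from Python. The sketch below only builds and runs the command; the function names are hypothetical, and --skip-download is added on the assumption that only the subtitles are wanted at this step. The file name and subtitle format that youtube-dl writes may differ from video.srt, so a rename or conversion step may still be needed.

```python
import subprocess


def build_subtitle_command(url):
    """Build a youtube-dl command that fetches auto-generated subtitles only."""
    return [
        "youtube-dl",
        "--write-auto-sub",  # option named in the issue above
        "--skip-download",   # assumption: skip the video, fetch subtitles only
        url,
    ]


def download_subtitles(url):
    """Run youtube-dl; raises CalledProcessError if the command fails."""
    subprocess.run(build_subtitle_command(url), check=True)
```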

Commit Template

I have noticed that several of the commits are short on information about "what, why, or how?" To make the reviewer's life easier, maybe we could have either/or

Make parameters optional

Make parameters video-file and subtitle-file optional.

Line 107 : parser.add_argument('-i', '--video-file', help="Input video file", required=True) should become parser.add_argument('-i', '--video-file', help="Input video file")

Same goes with Line 108

This will solve issue #1
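A minimal sketch of what the relaxed argument handling might look like is shown below. The program name and the -i, -s and -u flags come from the usage message quoted in issue #1; the --url long option, the help texts for -s and -u, the parse_args wrapper and the final validation step are assumptions added here so that the script still rejects a call with neither a URL nor a complete pair of files.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line with -i and -s no longer marked as required."""
    parser = argparse.ArgumentParser(prog="Watch videos quickly")
    parser.add_argument('-i', '--video-file', help="Input video file")
    parser.add_argument('-s', '--subtitles-file', help="Input subtitle file")
    parser.add_argument('-u', '--url', help="Video url")  # long name assumed
    args = parser.parse_args(argv)

    # Assumed check: accept either a URL or both local files.
    has_files = args.video_file is not None and args.subtitles_file is not None
    if args.url is None and not has_files:
        parser.error("provide -u, or both -i and -s")
    return args
```

With this in place, python sum.py -u <url> is accepted on its own, while python sum.py -i sample.mp4 without -s still exits with a usage error.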

WindowsError: Can't remove file

Expected Behavior

We expect sum.py to delete original mp4 and srt files unless otherwise specified

Current Behavior

Traceback (most recent call last):
  File "sum.py", line 252, in <module>
    os.remove(movie_filename)
WindowsError: [Error 32] The process cannot access the file because it is being used by another process: u'1.mp4'

Possible Solution

I do not know 😞

Steps to Reproduce the Problem

  1. Git clone vidsum onto a Windows machine
  2. pip install -r requirements.txt
  3. python sum.py -u https://www.youtube.com/watch?v=xRJCOz3AfYY

Detailed Description

Upon running sum.py on a Windows 10 platform, the script is interrupted with a WindowsError stating that the files cannot be removed because one of them is still in use.

Specifications

  • Platform: Windows 10
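One workaround that could be tried is to retry the deletion a few times before giving up. The sketch below assumes Python 3, where this Windows error surfaces as PermissionError; remove_with_retry and its parameters are hypothetical names, not existing vidsum code. If the file is held open by the script itself (for example by a video clip object that was never closed), retrying will not help and the handle has to be released first.

```python
import os
import time


def remove_with_retry(path, attempts=5, delay=0.5):
    """Try to delete path, retrying while it is reported as in use.

    Returns True if the file is gone afterwards, False otherwise.
    """
    for attempt in range(attempts):
        try:
            os.remove(path)
            return True
        except FileNotFoundError:
            return True  # nothing left to delete
        except PermissionError:
            # File is still locked; wait before the next attempt.
            if attempt < attempts - 1:
                time.sleep(delay)
    return False
```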

Discussion: Video Summary

How can we improve the video summary? Can you provide URLs of videos that generate good summaries?

AttributeError: 'NoneType' object has no attribute 'keys'

C:\Users\Einstein\Downloads\pavan_proj\vidsum\code>python sum.py -u https://www.youtube.com/watch?v=PT2_F-1esPk

[nltk_data] Downloading package punkt to
[nltk_data] C:\Users\Einstein\AppData\Roaming\nltk_data...
[nltk_data] Package punkt is already up-to-date!
[youtube] PT2_F-1esPk: Downloading webpage
[download] 1.mp4 has already been downloaded
[download] 100% of 21.26MiB
Traceback (most recent call last):
  File "sum.py", line 246, in <module>
    movie_filename, subtitle_filename = download_video_srt(url)
  File "sum.py", line 216, in download_video_srt
    subtitle_language = list(subtitle_info.keys())[0]
AttributeError: 'NoneType' object has no attribute 'keys'
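The traceback shows that subtitle_info is None when line 216 runs, presumably because no subtitles were found for this video (an assumption; the log does not say why). A guard along the following lines would turn the crash into a case the caller can handle. The function name first_subtitle_language is hypothetical.

```python
def first_subtitle_language(subtitle_info):
    """Return the first subtitle language key, or None if there is none.

    subtitle_info is expected to be a dict keyed by language code,
    but may be None or empty when no subtitles are available.
    """
    if not subtitle_info:
        return None
    return next(iter(subtitle_info))
```

The caller could then check for None and report that the video has no subtitles instead of failing with an AttributeError.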

Proposal to the Repository

This is a Proposal to the Repository:

Details:
It seems like this application only uses the subtitles to generate the summary.
Is there nothing that analyses each frame of the video for keywords and creates a summary based on them?

Summarize a downloaded YouTube video

Call the function get_summary() on the downloaded YouTube video and subtitles file namely 1.mp4 and 1.(lang).srt (where lang refers to the specific language of the subtitle)

This code snippet will follow after Line 158

This will complete the step to summarize any YouTube video given its url

If you need any help, let us know.

Wiki needs updating

I've made a few updates and pushed them to the wiki section on my fork. They can be pulled here and compared against your wiki. I hope this helps.
