Videogrep

Videogrep searches through dialog in video files (using .srt subtitle tracks or pocketsphinx transcriptions) and makes supercuts based on what it finds.

New!

Videogrep now has an experimental graphical interface (Mac only). Download it here: http://saaaam.s3.amazonaws.com/VideoGrep.app.zip

Requirements

Install with pip

pip install videogrep

Install ffmpeg with Ogg/Vorbis support. If you're on a Mac with Homebrew, you can install ffmpeg with:

brew install ffmpeg --with-libvpx --with-libvorbis

(OPTIONAL) Install pocketsphinx for word-level transcriptions. On a Mac:

brew tap watsonbox/cmu-sphinx
brew install --HEAD watsonbox/cmu-sphinx/cmu-sphinxbase
brew install --HEAD watsonbox/cmu-sphinx/cmu-sphinxtrain # optional
brew install --HEAD watsonbox/cmu-sphinx/cmu-pocketsphinx

How to use it

The most basic use:

videogrep --input path/to/video_or_folder --search 'search phrase'

You can put any regular expression in the search phrase.
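
As a rough illustration of what a regular-expression search over dialog means, here is a minimal sketch in Python. It is not videogrep's actual implementation: the `SUBTITLES` data and the `find_matching_lines` helper are hypothetical, and the sketch is case-sensitive, which may or may not match videogrep's behavior.

```python
import re

# Hypothetical subtitle lines as (start_seconds, end_seconds, text) tuples.
SUBTITLES = [
    (1.0, 2.5, "I never said that."),
    (3.0, 4.2, "She said it twice."),
    (5.0, 6.0, "Nothing to report."),
]


def find_matching_lines(subtitles, pattern):
    """Return the subtitle entries whose text matches the regular expression."""
    compiled = re.compile(pattern)
    # re.search matches anywhere in the line, not only at the start.
    return [entry for entry in subtitles if compiled.search(entry[2])]


# Whole-word match for "said"; each hit keeps its timestamps for cutting.
matches = find_matching_lines(SUBTITLES, r"\bsaid\b")
for start, end, text in matches:
    print(f"{start:.1f}-{end:.1f}: {text}")
```

Each matching entry carries its start and end time, which is the information a supercut needs in order to cut the corresponding clip.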

You can also search for part-of-speech tags using Pattern. See the Pattern-Search documentation for some details about how this works, and the Penn Treebank tag set for a list of usable part-of-speech tags. For example, the following will search for every line of dialog that contains an adjective (JJ) followed by a singular noun (NN):

videogrep --input path/to/video_or_folder --search 'JJ NN' --search-type pos

You can also do a hypernym search, which essentially searches for words that fit into a specific category. The following, for example, will search for any line of dialog that references a liquid (like water, coffee, beer, etc.):

videogrep --input path/to/video_or_folder --search 'liquid' --search-type hyper

NOTE: videogrep requires the subtitle track and the video file to have the exact same name, up to the extension. For example, my_movie.mp4 and my_movie.srt will work, but my_movie.mp4 and my_movie_subtitle.srt will not.
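
If you want to check a folder before running videogrep, the naming rule above can be sketched in a few lines of Python. The `is_matching_subtitle` helper is hypothetical and only mirrors the rule as stated in this note; it is not part of videogrep.

```python
from pathlib import Path


def is_matching_subtitle(video_path, subtitle_path):
    """Return True if the subtitle shares the video's name up to the extension."""
    video, subtitle = Path(video_path), Path(subtitle_path)
    return (
        subtitle.suffix.lower() == ".srt"      # must be an .srt file
        and video.parent == subtitle.parent    # same folder
        and video.stem == subtitle.stem        # same name without the extension
    )


print(is_matching_subtitle("my_movie.mp4", "my_movie.srt"))
print(is_matching_subtitle("my_movie.mp4", "my_movie_subtitle.srt"))
```

The first call reports a match and the second does not, which corresponds to the two examples in the note.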

Options

videogrep can take a number of options:

--input / -i

Video or subtitle file, or folder containing multiple files

--output / -o

Name of the file to generate. By default this is "supercut.mp4".

--search / -s

Search term

--search-type / -st

Type of search you want to perform. There are six options:

  • re: regular expression (this is the default).
  • pos: part of speech search (uses pattern.search). For example 'JJ NN' would return all lines of dialog that contain an adjective followed by a noun.
  • hyper: hypernym search. For example 'body parts' grabs all lines of dialog that reference a body part.
  • word: extract individual words - for multiple words use the '|' symbol (requires pocketsphinx).
  • franken: create a "frankenstein" sentence (requires pocketsphinx)
  • fragment: multiple words with allowed wildcards like 'blue *' (requires pocketsphinx)

--max-clips / -m

Maximum number of clips to use for the supercut

--demo / -d

Show the search results without making the supercut

--randomize / -r

Randomize the order of the clips
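
To make the combined effect of --max-clips and --randomize concrete, here is a minimal Python sketch. The `select_clips` helper and its `seed` parameter are hypothetical. The sketch also assumes that shuffling happens before the list is cut down to the maximum; the order in which videogrep applies the two options is not documented here, so treat that as an assumption.

```python
import random


def select_clips(clips, max_clips=None, randomize=False, seed=None):
    """Optionally shuffle the clips, then keep at most max_clips of them."""
    selected = list(clips)  # copy so the caller's list is left untouched
    if randomize:
        # Assumption: shuffle first, truncate second.
        random.Random(seed).shuffle(selected)
    if max_clips is not None:
        selected = selected[:max_clips]
    return selected


# Hypothetical clips as (start_seconds, end_seconds) pairs.
clips = [(1.0, 2.5), (3.0, 4.2), (5.0, 6.0), (7.5, 9.0)]
print(select_clips(clips, max_clips=2))
print(select_clips(clips, max_clips=2, randomize=True, seed=42))
```

Passing a seed makes the shuffled order repeatable, which is handy when comparing two runs of the same search.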

--padding / -p

Padding in milliseconds to add to the start and end of each clip
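
The arithmetic behind padding can be sketched as follows. The `pad_clip` helper is hypothetical, clip times are assumed to be in seconds, and the clamp at zero is an assumption of this sketch (how videogrep treats a clip near the start or end of a file is not stated here).

```python
def pad_clip(start, end, padding_ms=0):
    """Widen a (start, end) clip, given in seconds, by padding_ms on both sides."""
    padding = padding_ms / 1000.0  # milliseconds to seconds
    # Assumption: never let the padded start go below the beginning of the file.
    return max(0.0, start - padding), end + padding


# A clip from 1.0 s to 2.5 s with 500 ms of padding on each side.
print(pad_clip(1.0, 2.5, padding_ms=500))
```

With 500 ms of padding, a clip running from 1.0 s to 2.5 s becomes one running from 0.5 s to 3.0 s.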

--transcribe / -tr

Transcribe the video using audiogrep/pocketsphinx. You must install pocketsphinx first!

--use-transcript / -t

Use the pocketsphinx transcript rather than a subtitle file for searching. If this is enabled you can do word-level searches.

Use it as a module

from videogrep import videogrep

videogrep('path/to/your/files', 'output_file_name.mp4', 'search_term', 'search_type')

The videogrep module accepts the same parameters as the command line script. To see the usage check out the source.
