pythocr's Introduction

A Python program to OCR videos.

Adapted (aka shamelessly stolen) from the bash-based video OCR tool YoloCR (https://git.clapity.eu/Id/YoloCR).

It uses VapourSynth, ffmpeg and Tesseract, along with some Python modules.

Requirements

Install

Python, ffmpeg and vspipe (vapoursynth) should be in the PATH
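
A quick way to verify the PATH requirement is a small check with the standard library. This is only a sketch: the executable names below are assumptions taken from the sentence above (your Python may be called `python3`, for example), and `find_missing_tools` is a hypothetical helper, not part of pythoCR.

```python
import shutil

# Executable names assumed from the requirement above; adjust them if your
# installation uses different names (e.g. "python3" instead of "python").
REQUIRED_TOOLS = ("python", "ffmpeg", "vspipe")


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

If a tool is reported as missing, add its install directory to the PATH or, for Tesseract and vspipe, pass the binary location with --tesseract-path / --vapoursynth-path.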

Install the VapourSynth plugins:

  • For Windows:
    1. Unzip the contents of dependecies/Win/vapoursynth_dep.zip into your VapourSynth plugin folder
    2. Unzip the contents of dependecies/Win/python_dep.zip into your Python site-packages folder
  • For Linux:

Install the Python prerequisites with pip:

$ pip3 install colorama configargparse pyEnchant numpy opencv-python tqdm
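
To confirm that the packages are visible to your interpreter, something like the following can help. It is a minimal sketch: `find_missing_packages` is a hypothetical helper, and the import names are the usual ones for these packages (pyEnchant is imported as `enchant`, opencv-python as `cv2`).

```python
import importlib.util

# pip package name -> module name used in "import ...".
PACKAGE_TO_MODULE = {
    "colorama": "colorama",
    "configargparse": "configargparse",
    "pyEnchant": "enchant",
    "numpy": "numpy",
    "opencv-python": "cv2",
    "tqdm": "tqdm",
}


def find_missing_packages(mapping=PACKAGE_TO_MODULE):
    """Return the pip names whose module cannot be located by the interpreter."""
    return [
        package
        for package, module in mapping.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = find_missing_packages()
    if missing:
        print("Missing packages: pip3 install " + " ".join(missing))
    else:
        print("All Python prerequisites are importable.")
```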

Installation

Clone this repository.

That's it, you're done.

How to use

$ python3 pythoCR.py --help
usage: PythoCR [-h] [--version] [-c CONFIG] [-l language] [-wd folder]
               [-o folder] [--log-level level] [--ass-style style]
               [-rr path to regex-replace json] [-hcr char,replace]
               [--sub-format format] [--mode mode] [--vpy vpy_file]
               [--threads number] [--auto-same-sub-threshold number]
               [--same-sub-threshold number] [--no-spellcheck] [-t] [-d]
               [--tesseract-path path to tesseract binary]
               [--vapoursynth-path path to vspipe binary]
               path [path ...]

Filters a video and extracts subtitles as srt or ass. Args that start with
'--' (eg. --version) can also be set in a config file (specified via -c).
Config file syntax allows: key=value, flag=true, stuff=[a,b,c] (for details,
see syntax at https://goo.gl/R74nmi). If an arg is specified in more than one
place, then commandline values override config file values which override
defaults.

positional arguments:
  path                  path to a video

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
  -c CONFIG, --config CONFIG
                        path to configuration file
  -l language, --lang language
                        Select the language of the subtitles (default: fra)
  -wd folder, --work-dir folder
                        Directory where I will put all my temporary stuff
                        (default ./temp)
  -o folder, --output-dir folder
                        Directory where I will put all my released stuff
                        (default ./output)
  --log-level level     Set the logging level (default INFO)
  --ass-style style     ASS style to use if sub-format is ass (default:
                        Verdana 60)
  -rr path to regex-replace json, --regex-replace path to regex-replace json
                        List of regex/replace for automatic correction
  -hcr char,replace, --heuristic-char-replace char,replace
                        List of char/replace for heuristic correction
  --sub-format format   Set the outputed subtitles format (default: srt)
  --mode mode           Set the processing mode. "filter" to only start the
                        filtering jobs, "ocr" to process already filtered
                        videos, "full" for both. (default: full)
  --vpy vpy_file        vapoursynth file to use for filtering (required for
                        "filter only" and "full" modes
  --threads number      Number of threads the script will use (default:
                        automatic detection)
  --auto-same-sub-threshold number
                        Percentage of comparison to assert that two lines of
                        subtitles are automatically the same (default: 95%)
  --same-sub-threshold number
                        Percentage of comparison to assert that two lines of
                        subtitles are the same (default: 80%)
  --no-spellcheck       Deactivate the function which tries to replace
                        allegedly bad characters using spellcheck (it will
                        make the "heurist_char_replace" option of the
                        userconfig useless)
  -t, --timid           Activate timid mode (it will ask for user input when
                        some corrections are not automatically approved)
  -d, --delay           Delay correction after every video is processed
  --tesseract-path path to tesseract binary
                        The path to call tesseract (default: tesseract)
  --vapoursynth-path path to vspipe binary
                        The path to call vapoursynth (default: vspipe)
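
The two thresholds (--auto-same-sub-threshold and --same-sub-threshold) are easier to picture with an example. The sketch below is an illustration only: how pythoCR actually compares subtitle lines is not described in this README, so `difflib.SequenceMatcher` is used here as a stand-in similarity measure, and `classify_pair` with its three labels is a hypothetical helper. Only the default values (95% and 80%) come from the help text.

```python
from difflib import SequenceMatcher

AUTO_SAME_THRESHOLD = 95.0  # default of --auto-same-sub-threshold
SAME_THRESHOLD = 80.0       # default of --same-sub-threshold


def similarity_percent(a, b):
    """Similarity of two strings as a percentage (stand-in metric, see above)."""
    return SequenceMatcher(None, a, b).ratio() * 100.0


def classify_pair(a, b, auto=AUTO_SAME_THRESHOLD, same=SAME_THRESHOLD):
    """Place a pair of OCR'd lines relative to the two thresholds.

    "auto-same": at or above the automatic threshold
    "same":      between the two thresholds
    "different": below the lower threshold
    """
    score = similarity_percent(a, b)
    if score >= auto:
        return "auto-same"
    if score >= same:
        return "same"
    return "different"


if __name__ == "__main__":
    pairs = [
        ("Hello world", "Hello world"),
        ("Hello world", "Hello w0rld"),
        ("Hello world", "Goodbye"),
    ]
    for left, right in pairs:
        print(left, "|", right, "->", classify_pair(left, right))
```

With this stand-in metric, a single misread character in a short line (such as "o" read as "0") falls between the two default thresholds, which is the kind of case the lower threshold is there to catch.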

So, to process the video /myVideos/vid01.mp4, the command would be:

$ python3 pythoCR.py -c <myconfig> --vpy <myvpy> /myVideos/vid01.mp4
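
If you prefer to launch the script from Python (for example to process several videos from another tool), the command can be assembled as a list and handed to `subprocess`. This is a sketch under assumptions: `build_command` and `run_pythocr` are hypothetical helpers, and the config and vpy file names in the example are placeholders for your own files. Only the -c and --vpy options and the positional paths are taken from the usage text above.

```python
import subprocess
import sys


def build_command(videos, config=None, vpy=None, script="pythoCR.py"):
    """Build the argument list for one pythoCR run over one or more videos."""
    command = [sys.executable, script]
    if config is not None:
        command += ["-c", config]
    if vpy is not None:
        command += ["--vpy", vpy]
    command += list(videos)
    return command


def run_pythocr(videos, config=None, vpy=None, script="pythoCR.py"):
    """Run pythoCR and raise CalledProcessError if it exits with an error."""
    return subprocess.run(build_command(videos, config, vpy, script), check=True)


if __name__ == "__main__":
    # Placeholder file names; replace them with your own config and vpy files.
    command = build_command(
        ["/myVideos/vid01.mp4"], config="myconfig.conf", vpy="myfilter.vpy"
    )
    print(" ".join(command))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.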
