Coder Social home page Coder Social logo

sub-ocr's Introduction

sub-ocr

ocr and optional image merge

Bulk merge images vertically. Highly accurate Chinese OCR using Baidu. Optional programs to install if you want to create subtitles: VideoSubFinder, SubtitleEdit

Installation

pip install baidu-aip

Normal Accuracy (50,000 quota) VS High Accuracy (500 quota)

basicGeneral is normal accuracy. basicAccurate is high accuracy. Only use one of them.

result = client.basicGeneral(image)
result = client.basicAccurate(image)

Usage example

Use moveandmerge.py to bulk merge images vertically. FYI, ImageMerge.exe can be used on its own. It works by looking inside "Old Images" then "Merged Images".

What is the difference between baiduhigh.py and baiduocr.py? Use baiduhigh.py only if the images are saved in the timestamp filename format from VideoSubFinder. Results are saved in multiple text files. Then the srt is created based on the timestamp filenames. Use baiduocr.py for everything else including merged images. Results are saved in a single text file.

Obtain your API_KEY (requires phone #)

Instructions in Chinese https://ai.baidu.com/ai-doc/REFERENCE/Ck3dwjgn3 My instructions: Follow method 2 from here https://www.infinityfolder.com/how-to-create-a-baidu-account-from-outside-china/. Skip the adding email part. Then click the blue button from here https://ai.baidu.com/tech/ocr/general. Then https://imgur.com/a/v6PIjUP. Edit as of 2020, new Baidu account creation has been disabled for overseas users...

Complete guide on how to extract subtitles with VideoSubFinder

Download VideoSubFinder. Go to \VideoSubFinder_4.30_x64\Release_x64\settings, open general.cfg in Notepad++. Change min_sum_color_diff = 800 (high number reduces amount of blank images), vedges_points_line_error = 0.15 (the default is 0.3 which works for most videos but sometimes it misses subtitles so I use 0.15) (low number will capture all subtitles with short gaps, but more duplicate images) Make a box big enough for two lines of subtitles. Run search. https://i.imgur.com/TYOKmu5.jpg. After scanning, RGBImages folder is now full.

METHOD 1: the most automated without merging images. But it quickly consumes your 500 image quota which resets at midnight.

python baiduhigh.py

Please use your own APP_ID, API_KEY, SECRET_KEY.

After all images have been ocr'd. The srt is automatically created.

Cleaning Up the Subtitles and Retiming

  1. Find blank lines and delete them. 2. Open Subtitle Edit > Tools > Merge Lines With Same Text > any number between 0 and 888 milliseconds. To retime, synchronization > Adjust time > Selected lines and forward. Btw mpv is a useful video player because by pressing ctrl right arrow, it will take you to the next subtitle. Aegisub is another good software.

Translating from Chinese to English In my opinion, https://www.deepl.com/translator is the most accurate. Open Subtitle Edit > File > Export as txt. Copy and paste into Deepl. Copy and paste the English results in another txt file. Open Subtitle Edit, drag and drop the English txt file. File > Import time codes > select the srt. If the number of timestamps and lines don't match, then there are blank lines somewhere.

METHOD 2: requires additional effort and more software. A really long picture still counts as one images so it's unlikely to go over your quota.

  1. Create empty sub in VideoSubFinder
  2. Bulk crop images with IrfanView.
  3. Merge with moveandmerge.py
  4. OCR with baiduocr.py (all results saved in a single txt file) or QQ OCR https://ai.qq.com/product/ocr.shtml#common (remove numbers with regular expressions in Notepad++)
  5. Type /N in between double lines. Make sure the # of lines match in Notepad++. Recommended image viewer is Honeyview because it displays the current image count. Combine txt and srt in Aegisub with ctrl shift v
  6. Clean up in SubtitleEdit.

FAQ

What to do if I reach the 500 quota? Wait until midnight for your quota reset or create another Baidu Cloud account with a free phone number app such as TextNow.

What is Baidu's image height limit? About ~2500 pixels

How to change the amount of pics to merge? Inside moveandmerge.py, change picAmount

My video already has embedded DVB subtitles. How to access them? Images in png format and srt that can be accessed with Subtitle Edit. https://i.imgur.com/z3yEoQ7.jpg

References

API, SDK documentaion: Baidu-AIP on github or https://ai.baidu.com/ai-doc/OCR/Ek3h7xypm

sub-ocr's People

Contributors

firebfm avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar

Watchers

 avatar

Forkers

lehuuhanh atriops

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.