add-jyutping's Introduction

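Jyutping Subtitle Generator

Introduction

This Python script adds Jyutping subtitles to TVB television programmes. It is still at an experimental stage. The open-source license terms are stated in the program files.

The results are currently mediocre: the timing is slightly off, the OCR produces quite a few wrong characters, and characters with multiple pronunciations cannot be handled. On top of that, a twenty-minute video takes eighty minutes to process (...). Even so, it is already quite useful at this level; to borrow a famous line, "not satisfied, but acceptable".

The image processing and OCR are done with OpenCV and pyocr. Since I do not know much about these areas, this part borrows kerrickstaley's program; for the original, see kerrickstaley/extracting-chinese-subs.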

Usage

usage: jyutping.py [-h] [--top TOP] [--bottom BOTTOM] [--left LEFT]
                   [--right RIGHT]
                   video_file

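Here top, bottom, left and right are the boundaries of the rectangle that encloses the subtitles; if they are wrong, no text will be extracted. The default values are based on a 1280×720 video.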
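For a video at a different resolution, one possible starting point is to scale a rectangle that works at 1280×720 proportionally. The sketch below is only an illustration: the `Box` type, the `scale_box` helper, and the example numbers are hypothetical and not part of jyutping.py, and it assumes the subtitles sit at the same relative position in the frame.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Subtitle rectangle in pixels (hypothetical helper type)."""
    top: int
    bottom: int
    left: int
    right: int


# Frame size (width, height) that the script's defaults are tuned for.
REFERENCE_SIZE = (1280, 720)


def scale_box(box, width, height, reference=REFERENCE_SIZE):
    """Scale a rectangle from the reference frame size to width x height."""
    ref_width, ref_height = reference
    return Box(
        top=round(box.top * height / ref_height),
        bottom=round(box.bottom * height / ref_height),
        left=round(box.left * width / ref_width),
        right=round(box.right * width / ref_width),
    )


if __name__ == "__main__":
    # Example numbers only; these are NOT the script's real defaults.
    box_720p = Box(top=600, bottom=680, left=200, right=1080)
    box_1080p = scale_box(box_720p, 1920, 1080)
    print(f"--top {box_1080p.top} --bottom {box_1080p.bottom} "
          f"--left {box_1080p.left} --right {box_1080p.right}")
```

The printed flags can then be passed to jyutping.py; the rectangle may still need manual adjustment if the broadcaster positions subtitles differently.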

When run, the program writes progress information to stderr and the contents of the .srt file to stdout, so redirect stdout to a file when using it.

For example:

$ ./jyutping.py foobar.mp4 > foobar.srt
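The split between stderr and stdout can be sketched as follows. This is a minimal illustration of the pattern, not the script's actual code: the function names and the sample cue are assumptions, and only the general SubRip (.srt) cue layout is relied on.

```python
import sys


def format_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def format_cue(index, start, end, text):
    """Build one SRT cue: index line, time range line, then the text."""
    return f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}\n"


def write_srt(cues, out=sys.stdout, log=sys.stderr):
    """Write cues to `out` and progress messages to `log`."""
    for index, (start, end, text) in enumerate(cues, start=1):
        print(f"cue {index}/{len(cues)}", file=log)
        # print() appends the blank line that separates SRT cues.
        print(format_cue(index, start, end, text), file=out)


if __name__ == "__main__":
    # Sample data for illustration only.
    write_srt([(0.0, 2.5, "nei5 hou2")])
```

Because progress goes to stderr, redirecting stdout with `>` leaves the progress messages visible in the terminal while the file receives only subtitle data.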

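Is something this slapdash really OK?

Honestly, studying OpenCV / image processing in depth just to learn Cantonese would be getting the priorities backwards, so this will do for now. Besides, I have not felt much like programming lately; I am worn out.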

add-jyutping's People

Contributors

910jqk

add-jyutping's Issues

"not enough values to unpack"

Does this work with current Python 3 and module versions?

I got the following output when running it. My mp4 is in 720p and the subtitles fall within the default box values:

Traceback (most recent call last):
  File "./jyutping.py", line 167, in <module>
    main()
  File "./jyutping.py", line 146, in main
    text = extractor.extract(frame)
  File "/home/user/python/add-jyutping-master/extract.py", line 47, in extract
    self.cleaned = self.clean_image(img)
  File "/home/user/python/add-jyutping-master/extract.py", line 112, in clean_image
    return self.clean_after_crop(cropped)
  File "/home/user/python/add-jyutping-master/extract.py", line 175, in clean_after_crop
    img = super().clean_after_crop(cropped)
  File "/home/user/python/add-jyutping-master/extract.py", line 145, in clean_after_crop
    img = remove_small_islands(img)
  File "/home/user/python/add-jyutping-master/extract.py", line 310, in remove_small_islands
    im2, contours, hierarchy = cv2.findContours(img, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
ValueError: not enough values to unpack (expected 3, got 2)

The file I used was downloaded from YouTube. You can try it yourself using the y2mate site; I tried multiple videos from the YouTube TVB News channel.
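A likely explanation, assuming OpenCV 4.x is installed: `cv2.findContours` returns three values (image, contours, hierarchy) in OpenCV 3.x but only two (contours, hierarchy) in OpenCV 2.x and 4.x, which matches the "expected 3, got 2" message. The sketch below shows one version-independent way to unpack the result. The `unpack_contours` helper is hypothetical and not part of extract.py, and it does not cover any later use of `im2` in `remove_small_islands`.

```python
def unpack_contours(result):
    """Return (contours, hierarchy) from a cv2.findContours result tuple.

    OpenCV 3.x returns (image, contours, hierarchy); OpenCV 2.x and 4.x
    return (contours, hierarchy). The last two items are the same either way.
    """
    contours, hierarchy = result[-2:]
    return contours, hierarchy


# Hypothetical replacement for the failing line in remove_small_islands:
#
#   contours, hierarchy = unpack_contours(
#       cv2.findContours(img, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE))
```

Another option would be to install an OpenCV 3.x release, on the assumption that this is the version the script was written against.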
