mgoral / subconvert

Movie subtitles converter. (Repository moved to GitLab.)

Home Page: https://gitlab.com/mgoral/subconvert

License: GNU General Public License v3.0

Shell 0.28% Python 99.72%
python subtitle subtitle-formats graphical movie-subtitles-converter

subconvert's Introduction

Subconvert - movie subtitles converter

Subconvert is a movie subtitles converter and editor that aims to be fast, lightweight and easy to use. It supports a wide variety of subtitle formats, can process files in batches, and is available both as a terminal application and with a graphical frontend. Most things, like file encoding or movie frame rate, are detected automatically, so you can just sit back and quickly enjoy your lovely subtitles!

Installation

Install from PyPI

$ pip3 install --user subconvert

Install with tox

If you cloned the git repository, you can install Subconvert with the help of tox.

Warning

If your system has a Python version lower than 3.5, you'll need to install PyQt manually, as it's not available via PyPI.

$ cd subconvert
$ tox -e venv
$ ln -s {.venv,$HOME/.local}/bin/subconvert
$ ln -s {.venv,$HOME/.local}/share/applications/subconvert.desktop

Install with setup.py

Warning

These methods are not recommended for ordinary users, as they don't manage some dependencies automatically. Installing from PyPI or with tox is preferable.

You can alternatively create a Python distribution (like bdist_wheel) and install it:

$ cd subconvert
$ python3 setup.py bdist_wheel
$ pip3 install dist/*.whl

Or install it directly:

$ cd subconvert
$ python3 setup.py install

Removing

If you installed Subconvert with pip, uninstalling it is as simple as calling uninstall:

$ pip3 uninstall subconvert

Otherwise you'll have to manually remove all subconvert files, i.e.:

  • $prefix/lib/python*/site-packages/subconvert
  • $prefix/bin/subconvert
  • $prefix/share/applications/subconvert.desktop
  • $prefix/share/icons/hicolor/*/apps/subconvert.{svg,png}

Usage

Note

The most recent usage description is always available via subconvert --help. You can also refer to the documentation included in the docs/ directory.

You can use either the graphical or the command-line interface. Invoking subconvert without switches starts the graphical interface, an interactive window in which you can convert and edit movie subtitles.

To access the command-line interface, use the -c switch:

$ subconvert -c file1.srt file2.txt

The above invocation will convert file1.srt and file2.txt to the default subtitle format (which is SubRip). It will create file2.srt and will try to overwrite file1.srt (don't worry: unless you used the -f switch, Subconvert will first ask you what to do).

Output filename syntax

It's not uncommon to add some kind of prefix/suffix to converted subtitles. Like this:

my_subtitles.srt --> converted_my_subtitles.extension

When you specify an output filename (via the -o option), you can tell Subconvert to use the base name of the input file. Subconvert will substitute it for every occurrence of %f in the output file name. For example:

$ ls
file1.srt  file2.txt
$ subconvert -c file1.srt file2.txt -o "conv_%f.ABC"
$ ls
conv_file1.ABC  conv_file2.ABC  file1.srt  file2.txt

You can escape "%f" by adding a second percent sign ("%"):

$ subconvert -c file1.srt -o "conv_%%f.ABC"
$ ls
conv_%f.ABC  file1.srt  file2.txt

You can also use %e, which is replaced with the original file extension (without the dot):

$ subconvert -c file1.srt -o "conv_%f.%e_suffix"
$ ls
conv_file1.srt_suffix  file1.srt  file2.txt
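
The substitution rules above can be sketched in a few lines of Python. This is only an illustration of the documented behaviour, not Subconvert's actual implementation; the function name expand_output_name is hypothetical.

```python
import os
import re


def expand_output_name(template, input_path):
    """Expand %f, %e and %% in an output file name template."""
    base, ext = os.path.splitext(os.path.basename(input_path))
    values = {
        "f": base,             # input file name without its extension
        "e": ext.lstrip("."),  # original extension without the dot
        "%": "%",              # "%%" is an escaped literal percent sign
    }
    # Scanning left to right means "%%f" yields a literal "%f".
    return re.sub(r"%([fe%])", lambda m: values[m.group(1)], template)


print(expand_output_name("conv_%f.ABC", "file1.srt"))        # conv_file1.ABC
print(expand_output_name("conv_%%f.ABC", "file1.srt"))       # conv_%f.ABC
print(expand_output_name("conv_%f.%e_suffix", "file1.srt"))  # conv_file1.srt_suffix
```

Handling all three tokens in a single pass avoids the ordering problems that chained str.replace calls would have with the "%%" escape.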

Subtitle Property Files

You can create a common set of subtitle properties and apply all of them at once. Say, your subtitles are usually iso-8859-4 encoded and you usually convert them to TMP. You can set those settings with Subtitle Properties Editor (available via GUI: Tools -> Subtitle Properties Editor) and use them each time:

$ subconvert -c file1.srt file2.txt -o "~/subs/%f.tmp" -p ~/subs/iso88594_tmp.spf
$ ls ~/subs
file1.tmp  file2.tmp

Dependencies

  • Python 3.4+ (3.5+ is preferred)
  • python3-pyqt5
  • python3-chardet
  • MPlayer

Additionally, to build Subconvert you'll need:

  • setuptools
  • pyrcc5 (comes with pyqt5-dev-tools)

To build documentation:

  • asciidoctor

License

Subconvert is Free Software, available under the terms of the GNU General Public License 3 or (at your option) any later version. For details see LICENSE.txt, which should be delivered with Subconvert.

subconvert's Issues

GUI: fix encoding detecting

At the moment encoding detection in the GUI probably uses the old method, with ascii set as the default, while the SubConvert core uses the new, clearer method with the default set to None (changed during detection, utf8 by default).

Replace string concat in MicroDVD

TYPE: REFACTORING
PRIORITY: MEDIUM

Description:
Appending a string to a string via the += operator is slow in Python, but it wasn't replaced in the MicroDVD format_text function during earlier refactorings. Joining a list is far better optimised.

Other functions should be checked as well.
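
A minimal sketch of the list-joining pattern the issue asks for. The function name format_lines is hypothetical and the body only stands in for whatever format_text really does; the one project-independent detail is that MicroDVD separates lines within a subtitle with "|".

```python
def format_lines(lines):
    """Join subtitle lines with "|", the MicroDVD line separator."""
    parts = []
    for line in lines:
        # Appending to a list is cheap; no intermediate strings are built.
        parts.append(line.strip())
    # A single join performs one final concatenation.
    return "|".join(parts)
```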

Installer dependencies

TYPE: ISSUE
PRIORITY: MEDIUM

Description:
The installer script should check whether the required dependencies are installed on the user's system.

Auto get fps if there is a movie file in a directory

One idea is to use mplayer:

mplayer -vo null -ao null -frames 0 -identify filename 

and then just grep the output

time mplayer -vo null -ao null -frames 0 -identify Donnie.Darko.LIMITED.DVDRip.DivX.ViTE.avi
(...cut...)
real    0m0.255s
user    0m0.128s
sys 0m0.076s

So it's a little faster than sub parsing which is... well... pathetic.
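
The "run MPlayer and grep the output" idea could look roughly like the sketch below. With -identify, MPlayer prints key=value lines, and ID_VIDEO_FPS is the one that carries the frame rate. The function names parse_fps and detect_fps are hypothetical, and this is not necessarily how Subconvert ended up implementing it.

```python
import re
import subprocess

# Matches a line such as "ID_VIDEO_FPS=23.976" in MPlayer's -identify output.
FPS_PATTERN = re.compile(r"^ID_VIDEO_FPS=([0-9]+(?:\.[0-9]+)?)\s*$", re.MULTILINE)


def parse_fps(identify_output):
    """Return the frame rate found in the given output, or None."""
    match = FPS_PATTERN.search(identify_output)
    return float(match.group(1)) if match else None


def detect_fps(movie_path):
    """Run MPlayer on movie_path (requires mplayer on PATH) and parse its FPS."""
    cmd = ["mplayer", "-vo", "null", "-ao", "null",
           "-frames", "0", "-identify", movie_path]
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # MPlayer is noisy on stderr
        encoding="utf-8",
        errors="replace",           # file names in the output may not be valid UTF-8
    )
    return parse_fps(result.stdout)
```

Keeping the parsing separate from the subprocess call makes the parser testable without MPlayer installed.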

Create UnitTests

SubConvert is big and mature enough that it should have a set of unit tests which must pass before every release.

New buildsystem

New configure and Makefile should replace current install script.

PyPI

PRIORITY: HIGH
TYPE: REQUEST

Description:
Add SubConvert to the PyPI repository (it was once added from the unstable repository branch, but hopefully it was then removed).

This should be the last step before release, done on the master branch (as the PyPI link will point to SubConvert master).

SubConvert redesigned

SubConvert should be written as a single command, with improved GUI and CLI run as an option.

1) subconvert sub1.txt sub2.txt
2) subconvert -c sub1.txt sub2.txt [other options]

Option 1) should run a new GUI with subs opened while 2) should parse subs via CLI.

Keeping logs

TYPE: ISSUE
PRIORITY: LOW

SubConvert keeps logs for individual files (like with the GUI option --keep-logs) when started as subconvert -g

Add input file size limit

Limit the input file size to several MB. At the moment the parser tries to parse even very large files, which causes the program to hang. A very large file can be chosen by accident, for example in the GUI (when a movie file is selected instead of subtitles).
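
One way such a guard could be written is sketched below. The 5 MB limit and the names MAX_SUBTITLE_SIZE and check_size are assumptions for illustration; the issue only says "several MB".

```python
import os

MAX_SUBTITLE_SIZE = 5 * 1024 * 1024  # hypothetical limit: 5 MB


def check_size(path, limit=MAX_SUBTITLE_SIZE):
    """Return the file size in bytes, or raise ValueError if it exceeds limit."""
    size = os.path.getsize(path)
    if size > limit:
        raise ValueError(
            "%s is %d bytes, which is more than the %d byte limit; "
            "is it really a subtitle file?" % (path, size, limit)
        )
    return size
```

Checking the size before opening the file means a mistakenly selected movie is rejected without reading any of it.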

Unreadable characters

TYPE: ISSUE
PRIORITY: HIGH

Description:
Some subtitle files contain non-printable characters at the beginning of every line. They are usually 2-4 bytes long and are clearly visible in Vim's hex view. They should probably be omitted, as they cause all regex searches to fail.
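
A possible cleanup step, sketched under the assumption that the stray bytes decode to non-printable characters (for example a byte order mark or control characters). The issue does not say what the bytes actually are, so this is a guess at the shape of the fix; strip_leading_unprintable is a hypothetical name.

```python
def strip_leading_unprintable(line):
    """Drop non-printable, non-whitespace characters from the start of a line."""
    index = 0
    while (
        index < len(line)
        and not line[index].isprintable()
        and not line[index].isspace()  # keep newlines, tabs and spaces
    ):
        index += 1
    return line[index:]
```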

Detect file encoding

One way to do it might be to use the chardet library. Note that chardet is a third-party package rather than part of the Python standard library, so it would be an additional dependency.

Finish documentation

TYPE: REQUEST
PRIORITY: HIGH

Not writing GOOD documentation is a bad habit (self-documenting code is not documentation). No finished wiki pages = no release!

Working on one file when reading and writing to it

./subconvert.py ~/Wideo/Donnie.Darko.LIMITED.DVDRip.DivX.ViTE.srt -e cp1250 -vf
Overwriting /home/virgo/Wideo/Donnie.Darko.LIMITED.DVDRip.DivX.ViTE.srt
Traceback (most recent call last):
  File "./subconvert.py", line 383, in <module>
    main()
  File "./subconvert.py", line 363, in main
    p['sub']['time_from'].to_time(options.fps)
AttributeError: 'str' object has no attribute 'to_time'


./subconvert.py ~/Wideo/Donnie.Darko.LIMITED.DVDRip.DivX.ViTE.sub -e cp1250 -vf -m microdvd
Overwriting /home/virgo/Wideo/Donnie.Darko.LIMITED.DVDRip.DivX.ViTE.sub
Traceback (most recent call last):
  File "./subconvert.py", line 383, in <module>
    main()
  File "./subconvert.py", line 375, in main
    s = conv.convert(p)
  File "./subconvert.py", line 165, in convert
    gsp_from = self.get_time(sub['sub']['time_from'], 'time_from'),\
  File "./subconvert.py", line 241, in get_time
    return ft.frame
AttributeError: 'str' object has no attribute 'frame'

Wrong -m option causes crash

Traceback (most recent call last):
  File "./subconvert.py", line 496, in <module>
    main()
  File "./subconvert.py", line 432, in main
    if not conv:
UnboundLocalError: local variable 'conv' referenced before assignment
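
The traceback suggests that conv is only assigned when the -m value matches a known format. A sketch of one way to avoid that is to look the format up in a table and fail with a clear message. The table contents and the name pick_converter are hypothetical, not taken from subconvert.py.

```python
# Hypothetical mapping from -m values to converter objects (strings stand in here).
CONVERTERS = {"subrip": "SubRip", "microdvd": "MicroDVD"}


def pick_converter(fmt):
    """Return the converter for fmt, or raise ValueError for an unknown format."""
    conv = CONVERTERS.get(fmt.lower())  # always bound, even when fmt is unknown
    if conv is None:
        raise ValueError(
            "unknown output format %r (choose from: %s)"
            % (fmt, ", ".join(sorted(CONVERTERS)))
        )
    return conv
```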

Plugin system

PRIORITY: LOW
TYPE: REQUEST

Description:
Extend Parsers.py into an extensible plugin system where every parser (in a separate file) would be dynamically loaded from a given location (e.g. the user's $HOME).

GUI output encoding

Add a widget which will allow user to choose file output encoding (and possibly a checkbox to choose input one).
At the moment output encoding = input encoding.

Catch updater exceptions

Many errors are not caught (like IOError when trying to save a zipball in a root-owned directory).

Also, versions should actually be FLOAT-COMPARED (not string-compared), which will be exception-prone as well.
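
As a hedged side note on the comparison itself: floats misorder versions such as 0.10 and 0.9, so one alternative to the float comparison proposed above is to compare tuples of integers. The sketch assumes purely numeric, dot-separated version strings, and parse_version is a hypothetical helper.

```python
def parse_version(text):
    """Turn "0.10.2" into (0, 10, 2); raises ValueError on non-numeric parts."""
    return tuple(int(part) for part in text.strip().split("."))


def is_newer(candidate, current):
    """Return True if candidate is a strictly newer version than current."""
    return parse_version(candidate) > parse_version(current)
```

The ValueError raised for malformed input is one of the exceptions the updater would need to catch.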

Create translations

PRIORITY: MEDIUM
TYPE: REQUEST

Description:
Two translation files should be created:

  • main ".po" file which should contain all English messages
  • Polish translation file

Those translations should be put into a separate directory in the repo tree and should be installed by setup.py (or at least platform-specific instructions should be provided).

Create config

TYPE: REQUEST
PRIORITY: MEDIUM

Description:
Create a config file.
SubConvert should read it from two locations:

  1. $HOME/.subconvert
  2. $SHARE/subconvert/subconvert

This gives every user the ability to specify their own set of options and fall back to the system-wide options when there is no file in $HOME. If there is no config file in either location, SubConvert should fall back to hardcoded defaults. No config files should be created by SubConvert (but a default config file can be delivered in the SubConvert package).

Python has a built-in library for parsing INI-like config files.
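
The lookup order described above could be sketched with the standard configparser module. The section and option names in DEFAULTS are invented for illustration, and the two paths are passed in as parameters because the issue does not define what $SHARE expands to.

```python
import configparser
import os

# Hypothetical hardcoded defaults; the real option names are not specified.
DEFAULTS = {"general": {"format": "subrip", "encoding": "utf-8"}}


def load_config(user_path, system_path):
    """Load the user config if present, else the system one, else defaults only."""
    parser = configparser.ConfigParser()
    parser.read_dict(DEFAULTS)  # hardcoded fallback values
    for path in (user_path, system_path):
        if os.path.isfile(path):
            parser.read(path)   # the first existing file wins
            break
    return parser
```

Nothing is ever written to disk, which matches the requirement that SubConvert should not create config files itself. Options missing from the chosen file still fall back to the hardcoded defaults.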

SubViewer FILEPATH

FILEPATH in the SubViewer header should point to the new filename, not the old one (the input file).

*.deb

TYPE: REQUEST
PRIORITY: LOW

Description:
We all love Debian so why not create SubConvert debs and maybe submit them somewhere so it even might be included in repos? THAT would be something!

Command line options redesign

TYPE: REQUEST
PRIORITY: MEDIUM

Description:
Some options need redesign or removal. For example, the --output-encoding option probably doesn't need its short form (-E), as most people don't need to change the encoding of subtitle files frequently. This short form only obfuscates the overall 'options suite'.
Some other options probably need changes to their short names. An example is 'output format', which is ATM '-m'. Why 'm', you may ask? It's the first letter in 'format' which is not associated with commonly used options (-f is force and -o is usually used as OUTPUT but for FILES, not format type). But it might be wise to rename it to -t (from TYPE), which is probably more memorable.

After all, we're not Vi to extend 'y' into yank (before Vi I didn't even know that the word 'yank' existed). ;)

Fix logging

When subtitles are not parsed, the parser should report it (when verbose).
Moreover, on the command line it logs "Writing to %s file" while no file is created.

Video file extension

TYPE: ISSUE
PRIORITY: MEDIUM

Description:
When a movie file is given explicitly (with the '-v' option), SubConvert shouldn't check that file's extension (MPlayer doesn't take it into account).

No ending time

Handle the situation when no end time is provided in a subtitle. In that case the end time should be the next subtitle's start time minus some time delta, or the start time plus another time delta if the subtitle is the last one. The task is to figure out those time deltas (and the implementation, as subtitles are yielded: a buffer of two, maybe?).

It is necessary to provide some end time (even if none is provided by the subtitle) because in some formats (like SubRip) the subtitle end time is mandatory. The task is also to work out how to convert those time deltas to FrameTime with a single, simple, common approach.

Some ideas about those time deltas:

end1 = 0.9 * (start2 - start1)
end_last = start_last + 2 secs (* fps in frame format)
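
A sketch of these ideas in Python, working in seconds. It reads the first formula as a duration that is added to start1, which is an interpretation rather than something the issue states; the function name fill_end_times and its parameters are hypothetical.

```python
def fill_end_times(starts, ratio=0.9, last_duration=2.0):
    """Compute an end time (in seconds) for each subtitle start time.

    Every subtitle but the last ends after `ratio` of the gap to the next
    start; the last one is shown for `last_duration` seconds.
    """
    ends = []
    for index, start in enumerate(starts):
        if index + 1 < len(starts):
            gap = starts[index + 1] - start
            ends.append(start + ratio * gap)
        else:
            ends.append(start + last_duration)
    return ends
```

For frame-based formats the same function could be fed frame numbers, with last_duration given as 2 * fps.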

Create GUI

PyQt + Qt Designer seem to be suitable rapid development tools, as the GUI doesn't have to be complex and should use only a few controls to communicate with the core script.

GUI: on-list status

TYPE: REQUEST
PRIORITY: MEDIUM

Description:
This one is inspired by address fields in browsers, which show some status icons (like an RSS feed icon) next to the web address.
Some status icons (or colored-char-info) could be added to the list of files: at minimum, parsed and not parsed. That would improve the user experience a lot, since the output is not as readable as I would like it to be.

GUI: change the way that logging is handled

TYPE: REQUEST
PRIORITY: MEDIUM

Description:
The idea is to replace the log window (and hopefully also most of the console output) with individual logs for each file. Those logs should be presented in a window available via double-click on a list item.
The implementation will probably require extending QListWidgetItem, as some additional, SubConvert-specific data will be stored for every item. Also, a new window should be designed to properly show the log to the user.

SubRip: \n\n crash

Double empty lines cause a parsing error in the SubRip parser. It should detect them and skip them instead.
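
One tolerant approach, sketched here with a hypothetical split_blocks helper, is to treat any run of blank lines as a single block separator. This is not the actual SubRip parser, only an illustration of the idea.

```python
import re

# Two or more consecutive line breaks (optionally with trailing spaces or tabs).
BLANK_RUN = re.compile(r"(?:\r?\n[ \t]*){2,}")


def split_blocks(text):
    """Split SubRip text into subtitle blocks, ignoring extra empty lines."""
    return [block for block in BLANK_RUN.split(text.strip()) if block]
```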

New format

PRIORITY: HIGH
TYPE: REQUEST

Description:
Recently I've found a strange format:

[00][27]<i>subtitle

It's some kind of MicroDVD derivative, but the numbers inside the brackets ARE NOT frames (not directly, at least), because when I converted the brackets to curly braces the subtitles became desynchronised. There is also some kind of subtitle formatting. I have never seen this format before, nor do I know its name, so it might be a little tough to write proper parsing rules for it.

One hint: MPlayer correctly recognises that format.

Slow TMP

TYPE: ISSUE
PRIORITY: HIGH

Description:
Conversion from the TMP format is at the moment almost twice as slow as for bigger SubRip files and over twice as slow as for MicroDVD files. There might be several reasons, like too many FrameTime operations (there are several of them...), slow subtitle buffering, etc. A profiler should be used and the algorithm should be optimised.

Subtitles offsetting

TYPE: Feature Request
PRIORITY: MEDIUM

Add options (to GUI and CLI) to offset output subtitle times by a value given by the user (in milliseconds OR frames). In the CLI the options should be:

--add-ms +/-(integer)
--add-frames +/-(integer)

or similar. Names should be intuitive and self-descriptive.
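
The proposed options could be prototyped with argparse as sketched below. The option names come from the proposal above, while build_parser and offset_ms are hypothetical helpers; none of this is Subconvert's existing command line.

```python
import argparse


def build_parser():
    """Build a parser with the two proposed, mutually exclusive offset options."""
    parser = argparse.ArgumentParser(prog="subconvert")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--add-ms", type=int, default=0, metavar="MS",
                       help="shift subtitle times by MS milliseconds")
    group.add_argument("--add-frames", type=int, default=0, metavar="FRAMES",
                       help="shift subtitle times by FRAMES frames")
    return parser


def offset_ms(args, fps):
    """Return the requested offset in milliseconds for the given frame rate."""
    if args.add_frames:
        return round(args.add_frames * 1000.0 / fps)
    return args.add_ms
```

Because type=int accepts an explicit sign, both "--add-ms -500" and "--add-frames +50" parse as expected.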

Create more UT

TYPE: REQUEST
PRIORITY: HIGH

Current unit tests are great but their scope is minimal. Every method of every class should be tested.

File types in open dialog

In the GUI open dialog, display only given file types (two options: all files, and *.srt, *.sub, *.txt etc.).

Silence MPlayer errors

TYPE: REQUEST
PRIORITY: LOW

Description:
MPlayer should NOT print any errors to the console output, so just make it mplayer &> /dev/null :)
(Hint: subprocess supports streams)
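
Following the hint, a minimal sketch of discarding a child process's output with subprocess. The helper name run_quietly is hypothetical; subprocess.DEVNULL is available from Python 3.3.

```python
import subprocess


def run_quietly(cmd):
    """Run cmd with stdout and stderr discarded and return its exit code."""
    return subprocess.call(
        cmd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
```

When MPlayer's standard output still has to be parsed, only stderr would be redirected to DEVNULL while stdout goes to subprocess.PIPE.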

Feature: output file encoding

Make it possible to encode output files in a user-selected encoding. By default encode files in utf-8 (or don't change encoding).

GUI command line arguments

PRIORITY: LOW
TYPE: REQUEST

Description:
GUI should support command line arguments handling - at least filenames (written as args) should be automatically added to the file list.

GUI improvements

  • menu with keyboard shortcuts instead of ugly '+' and '-' buttons
  • context menu when list items are right-clicked (with positions like 'details', 'remove', etc)

Empty lines

Recently I failed to convert MicroDVD subtitles with an empty first line. This and similar cases should be fixed. There is a mechanism for detecting those, so it should be investigated for bugs and possibly extended. Further tests needed.

Remove assertions

Remove assertions that control program flow, validate arguments passed by clients, etc. Assertions are stripped anyway when Python optimization (-O) is enabled, so SubConvert's behaviour is undefined in that case.
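
A small sketch of the kind of change meant here: an explicit check that survives python -O in place of an assert. The function set_fps is a made-up example, not code from SubConvert.

```python
def set_fps(fps):
    """Validate and normalise a frame rate passed in by a caller."""
    # Instead of: assert fps > 0, "fps must be positive"
    # An explicit check is still executed when Python runs with -O.
    if isinstance(fps, bool) or not isinstance(fps, (int, float)) or fps <= 0:
        raise ValueError("fps must be a positive number, got %r" % (fps,))
    return float(fps)
```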
