adracea / rsubs-lib


rust library for subtitle manipulation and conversion

Home Page: https://crates.io/crates/rsubs-lib

License: MIT License

Rust 100.00%
conversion library manpulation rust srt ssa substation subtitles vtt webvtt



rsubs-lib's Issues

WebVTT line parameter not being parsed correctly to ASS/SSA

Hello again, I found a problem related to the line:50% parameter in VTT files when converting them to ASS/SSA files.

The fixable problem is that to_ass() does not set the vmargin of the line. I am fixing it like this:

+let vmargin = i.position.map_or(0.0, |p| p.line as f32);
let mut line = SSAEvent {
    line_end: i.line_end,
    line_start: i.line_start,
    line_text: i.line_text.clone(),
+   vmargin,
    ..Default::default()
};

The unfixable problem is that VTT defines the vertical position of the subtitle as a line number or a percentage, while ASS/SSA uses pixels. So if I send

line:50% or line:50

from VTT, ASS interprets it as 50 pixels. The only fix I can see is to get the video height and convert the percentage to pixels before passing the values to rsubs-lib, which is not related to your lib though.
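
A minimal sketch of that percentage-to-pixel conversion, assuming the caller already knows the target height (the video height, or the script's PlayResY). The function name line_percent_to_pixels is hypothetical and is not part of rsubs-lib:

```rust
/// Convert a WebVTT `line:N%` value into a pixel offset from the top edge.
/// Hypothetical helper for illustration; `height_px` is assumed to be the
/// video height (or the PlayResY value of the target script).
fn line_percent_to_pixels(percent: f32, height_px: u32) -> u32 {
    // Clamp so that out-of-range input cannot produce an offset outside the frame.
    let clamped = percent.clamp(0.0, 100.0);
    (clamped / 100.0 * height_px as f32).round() as u32
}
```

The result is measured from the top of the frame. How it maps onto an ASS MarginV value may depend on the alignment of the style in use, so treat this as a starting point rather than a drop-in fix.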

ParseIntError: panic parsing line number

I have this main.rs:

fn main() {
    rsubs_lib::srt::parse_from_file("./subs.srt".to_string()).unwrap();
}

and this subs.srt:

1
00:00:37,138 --> 00:00:38,990
first line

2
00:00:39,090 --> 00:00:42,076
second line

then:

$ RUST_BACKTRACE=1 cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.01s
     Running `target/debug/test_rsub`
thread 'main' panicked at /home/mrlogick/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rsubs-lib-0.2.1/src/srt.rs:229:18:
Failed to parse line number: ParseIntError { kind: InvalidDigit }
stack backtrace:
   0: rust_begin_unwind
             at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/std/src/panicking.rs:647:5
   1: core::panicking::panic_fmt
             at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/core/src/panicking.rs:72:14
   2: core::result::unwrap_failed
             at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/core/src/result.rs:1649:5
   3: core::result::Result<T,E>::expect
             at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/core/src/result.rs:1030:23
   4: rsubs_lib::srt::parse
             at /home/pingu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rsubs-lib-0.2.1/src/srt.rs:225:35
   5: rsubs_lib::srt::parse_from_file
             at /home/pingu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rsubs-lib-0.2.1/src/srt.rs:251:8
   6: test_rsub::main
             at ./src/main.rs:2:5
   7: core::ops::function::FnOnce::call_once
             at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

I think this is a bug.
The parser expects a string like "1", but it is trying to parse "\u{feff}1" instead, i.e. the file starts with a byte order mark (U+FEFF).
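
One possible workaround is to strip a leading byte order mark before the line number is parsed. This is only a sketch: strip_bom and parse_line_number are hypothetical helpers, not functions from rsubs-lib, and the real parser may be structured differently.

```rust
use std::num::ParseIntError;

/// Remove a leading U+FEFF (byte order mark) if present.
/// Hypothetical helper, not part of rsubs-lib.
fn strip_bom(input: &str) -> &str {
    input.strip_prefix('\u{feff}').unwrap_or(input)
}

/// Parse an SRT cue number, returning an error instead of panicking.
/// Hypothetical helper, not part of rsubs-lib.
fn parse_line_number(raw: &str) -> Result<u32, ParseIntError> {
    strip_bom(raw).trim().parse::<u32>()
}
```

Returning a Result here, rather than calling expect(), would also let callers handle a malformed file without a panic.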

Error when parsing positions and line cues that contain text

Library version: 0.1.7
Environment: WSL2 running under Windows 10
Current behaviour: Code panics when parsing a vtt file with a specific formatting

According to the spec, WebVTT allows the position and line cue settings to carry an optional text parameter after a comma, like this:

line:90%,end position:50%,center

The current code for parsing both the line and position cue settings splits the setting at the colon, removes the percent sign, and tries to parse the rest as a number. With the optional text parameter present this panics, because the code tries to parse 90,end and 50,center.
I've tried modifying the code in vtt.rs at line 421 from

spos.pos = py.replace('%', "").parse::<i32>().expect("number");

to

spos.pos = py.replace('%', "")
    .split(",")
    .collect::<Vec<&str>>()
    .first()
    .unwrap_or(&"")
    .parse::<i32>().expect("number");

and it works.

Of course, this is not the best way to handle the situation: the second value is simply discarded, which is probably unwanted behaviour.
I'm a Rust noob (this is literally the first project I've written in the language), which is why I didn't fix it properly and open a PR, but I still hope this has been somewhat helpful 🙂
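
A sketch of a variant that keeps the second value and does not panic on bad input. It assumes the part after the colon (for example 90%,end) has already been isolated. The name parse_cue_setting is hypothetical and not part of rsubs-lib, and it parses into f32 rather than the i32 used above:

```rust
/// Split a cue setting value such as "90%,end" into its number and the
/// optional keyword after the comma. Returns None when the numeric part
/// cannot be parsed. Hypothetical helper, not part of rsubs-lib.
fn parse_cue_setting(raw: &str) -> Option<(f32, Option<&str>)> {
    // Separate the numeric part from the optional keyword.
    let (number, keyword) = match raw.split_once(',') {
        Some((n, k)) => (n, Some(k.trim())),
        None => (raw, None),
    };
    // Drop a trailing percent sign, then parse what is left.
    let value = number.trim().trim_end_matches('%').parse::<f32>().ok()?;
    // Reject inputs like "inf" or "NaN", which f32 parsing would accept.
    if !value.is_finite() {
        return None;
    }
    Some((value, keyword))
}
```

This sketch does not record whether the number was a percentage or a plain line number; a fuller version would probably need to keep that distinction as well.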

VTT to ASS does not copy styles

Hello, great lib first of all.

I'm trying to convert a VTT subtitle to ASS. It seems to work great, but the styles from the VTT are not being carried over to the ASS:

WEBVTT

STYLE
::cue { color: #fccb00; font-weight: bold; background: #004dcf; font-family: sans-serif; text-shadow: 2px 2px 4px black; font-size: 40px; }


00:00:00.259 --> 00:00:00.899
how 

00:00:00.899 --> 00:00:01.200
to

[V4+ Styles]
Format: Name,Fontname,Fontsize,PrimaryColour,SecondaryColour,OutlineColour,BackColour,Bold,Italic,Underline,Strikeout,ScaleX,ScaleY,Spacing,Angle,BorderStyle,Outline,Shadow,Alignment,MarginL,MarginR,MarginV,Encoding
Style: Default,Trebuchet MS,20,&H00FFFFFF,&H00000000,&H00000000,&H00000000,-1,0,0,0,120,120,0,0,1,1,1,2,0000,0000,0030,0

[Events]
Format: Layer,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text
Dialogue: 0,0:00:00.89,0:00:01.20,Default,,0000,0000,0000,,to 
Dialogue: 0,0:00:01.20,0:00:01.53,Default,,0000,0000,0000,,strengthen 

Is this not implemented yet? Also, is the line parameter, which sets the vertical position of the subtitle, supported?
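
To illustrate one piece of what copying a style would involve, here is a sketch that converts a CSS #RRGGBB colour, like the ones in the ::cue rule above, into the &HAABBGGRR form used in ASS style fields. The helper css_hex_to_ass_colour is hypothetical and not part of rsubs-lib; it assumes six-digit hex input and a fully opaque alpha of 00:

```rust
/// Convert a CSS colour of the form "#RRGGBB" into an ASS colour string
/// "&H00BBGGRR" (alpha fixed at 00, i.e. opaque; channel order reversed).
/// Hypothetical helper, not part of rsubs-lib.
fn css_hex_to_ass_colour(css: &str) -> Option<String> {
    let hex = css.trim().strip_prefix('#')?;
    // Only the six-digit form is handled in this sketch.
    if hex.len() != 6 || !hex.chars().all(|c| c.is_ascii_hexdigit()) {
        return None;
    }
    let r = u8::from_str_radix(&hex[0..2], 16).ok()?;
    let g = u8::from_str_radix(&hex[2..4], 16).ok()?;
    let b = u8::from_str_radix(&hex[4..6], 16).ok()?;
    Some(format!("&H00{:02X}{:02X}{:02X}", b, g, r))
}
```

Under these assumptions the color value #fccb00 from the example would become &H0000CBFC for PrimaryColour. Other properties such as font-weight or text-shadow would need their own mapping.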

replace_invalid_lines() deletes rich text

I have this main.rs:

fn main() {
    rsubs_lib::vtt::parse_from_file("./subs.vtt".to_string())
        .unwrap()
        .to_ass()
        .to_file("./subs.ass")
        .unwrap();
}

and this subs.vtt:

WEBVTT

00:37.138 --> 00:38.990
<i>New Yorkers, a season high,
seven inches of rain last night.</i>

00:39.090 --> 00:42.076
<i>It looks like the storm has passed.</i>

I run cargo run and get this subs.ass:

[Script Info]
ScriptType: V4.00+
Synch Point:
WrapStyle: 0
PlayResY: 480
Title: subtitle
ScaledBorderAndShadows: yes
Script Updated By: rsubs lib
PlayResX: 640
Collisions: Normal

[V4+ Styles]
Format: Name,Fontname,Fontsize,PrimaryColour,SecondaryColour,OutlineColour,BackColour,Bold,Italic,Underline,Strikeout,ScaleX,ScaleY,Spacing,Angle,BorderStyle,Outline,Shadow,Alignment,MarginL,MarginR,MarginV,Encoding
Style: Default,Trebuchet MS,20,&H00FFFFFF,&H00000000,&H00000000,&H00000000,0,-1,-1,-1,120,120,0,0,1,1,1,2,0000,0000,0030,0

[Events]
Format: Layer,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text
Dialogue: 0,0:00:37.13,0:00:38.99,Default,,0000,0000,0000,,New Yorkers, a season high,
seven inches of rain last night.
Dialogue: 0,0:00:39.09,0:00:42.07,Default,,0000,0000,0000,,It looks like the storm has passed.

As you can see, the .ass file is losing rich-text information: <i> and </i> are being replaced with empty strings. (I think this happens inside replace_invalid_lines() in vtt.rs.)

In my opinion we should replace <i>text</i> (found very often in .srt or .vtt files) with {\i1}text{\i0} (supported by .ass files).
We should write the conversions somewhere, something like this:

line_text = str::replace(&line_text, "<i>", "{\\i1}");
line_text = str::replace(&line_text, "</i>", "{\\i0}");
line_text = str::replace(&line_text, "<b>", "{\\b1}");
line_text = str::replace(&line_text, "</b>", "{\\b0}");
line_text = str::replace(&line_text, "<u>", "{\\u1}");
line_text = str::replace(&line_text, "</u>", "{\\u0}");
line_text = str::replace(&line_text, "<s>", "{\\s1}");
line_text = str::replace(&line_text, "</s>", "{\\s0}");
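
The same replacements could also be driven by a small lookup table, which keeps the tag pairs in one place. This is only a sketch: TAG_MAP and vtt_tags_to_ass are hypothetical names, not part of rsubs-lib, and only the four tag pairs listed above are handled:

```rust
/// Pairs of (VTT/SRT tag, ASS override tag). Hypothetical table for illustration.
const TAG_MAP: [(&str, &str); 8] = [
    ("<i>", "{\\i1}"),
    ("</i>", "{\\i0}"),
    ("<b>", "{\\b1}"),
    ("</b>", "{\\b0}"),
    ("<u>", "{\\u1}"),
    ("</u>", "{\\u0}"),
    ("<s>", "{\\s1}"),
    ("</s>", "{\\s0}"),
];

/// Replace each known rich-text tag with its ASS override tag.
/// Tags not in the table are left untouched.
fn vtt_tags_to_ass(text: &str) -> String {
    TAG_MAP
        .iter()
        .fold(text.to_string(), |acc, &(from, to)| acc.replace(from, to))
}
```

With this in place, a cue text like the second one in subs.vtt would keep its italics as {\i1}...{\i0} instead of losing them.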
