fredmorcos / tvrank

Movies & series ranking

License: MIT License

Rust 100.00%

movies series imdb tvdb tmdb

tvrank's Issues

Add tests to ensure that queries with "the" as part of another word continue to work correctly

Add tests to ensure that #49 (which was fixed in #50) does not happen again.

Add tests for the internal TVrank library

Currently the internal TVrank library isn't very thoroughly tested.

A good start for adding tests would be the title*.rs and utils.rs files.
Then, in order: genre.rs, ratings.rs, error.rs, db.rs, service.rs and mod.rs.

Update the screencast showcase on the README file

The screencast on the README file is quite outdated and should be updated.

Document the code in the `TVrank` binary

Code comments and documentation are non-existent, and some parts of the binary's code are not always clear.

Add a `mark` subcommand to write the `tvrank.json` file for a directory entry

It is currently possible to add a file called tvrank.json under a title's directory - when using the movies-dir or series-dir subcommands - to force the use of title information (TitleInfo) like the IMDB ID when other pieces of information like title and year are ambiguous.

Writing that file by hand is a bit annoying. TVrank should have a subcommand called mark that takes a title's directory and an IMDB ID and writes the tvrank.json file for the user.
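A minimal sketch of what such a subcommand could do internally, using only the standard library. The function name `mark` and, in particular, the JSON layout `{"imdb":{"id":"..."}}` are assumptions made for illustration; the issue does not specify the schema of tvrank.json, so the real format should be taken from the TVrank sources.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Writes a `tvrank.json` file into `dir` for the given IMDB ID.
/// NOTE: the JSON layout used here is an assumption, not the confirmed format.
fn mark(dir: &Path, imdb_id: &str) -> io::Result<PathBuf> {
    // Accept only IDs of the form `tt` followed by one or more digits.
    let valid = imdb_id
        .strip_prefix("tt")
        .map_or(false, |d| !d.is_empty() && d.bytes().all(|b| b.is_ascii_digit()));
    if !valid {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "expected an IMDB ID like tt0123456",
        ));
    }

    let path = dir.join("tvrank.json");
    // The ID contains only `tt` and digits, so no JSON escaping is needed.
    fs::write(&path, format!("{{\"imdb\":{{\"id\":\"{}\"}}}}\n", imdb_id))?;
    Ok(path)
}
```

Validating the ID before writing keeps a typo from being persisted into a file that later overrides the title lookup.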

Separate workflows to speed up CI

Workflows can be separated into:

Lint (Linux)
Documentation (Linux)
Build and test (Linux, Windows, MacOS)

The release workflow should also be reviewed to see whether it can be split.

Additionally, it makes sense to share the cargo dependency cache between workflow jobs to avoid re-downloading and re-building them.

Support alternative REST-based services

Currently TVrank does not support REST-based services at all, so some infrastructure and design work will be needed for that. Some of the services that could be used are:

An IMDB API
OMDB
The TVDB
TMDB

Uppercase keywords retrieve no results - MacOS Monterey 12.1

I tried multiple movie titles using uppercase letters and got no results:

./tvrank -vvvv title "Coach Carter"
[2022-01-24T19:31:01Z DEBUG tvrank] Cache directory: /Users/arpad.kosorus/Library/Caches/com.fredmorcos.Fred-Morcos.tvrank
[2022-01-24T19:31:01Z DEBUG tvrank] Created cache directory
[2022-01-24T19:31:01Z DEBUG tvrank::imdb::service] IMDB database exists and is less than a month old
[2022-01-24T19:31:01Z DEBUG tvrank::imdb::service] Read IMDB database file in 46ms 547us 294ns
[2022-01-24T19:31:02Z DEBUG tvrank::imdb::service] Parsed IMDB database in 481ms 166us 453ns
[2022-01-24T19:31:02Z DEBUG tvrank::imdb::service] IMDB database (thread 0) contains 314751 movies and 43549 series (358300 entries)
[2022-01-24T19:31:02Z DEBUG tvrank::imdb::service] IMDB database (thread 1) contains 318222 movies and 43130 series (361352 entries)
[2022-01-24T19:31:02Z DEBUG tvrank::imdb::service] IMDB database (thread 2) contains 314007 movies and 44693 series (358700 entries)
[2022-01-24T19:31:02Z DEBUG tvrank::imdb::service] IMDB database (thread 3) contains 312181 movies and 43319 series (355500 entries)
[2022-01-24T19:31:02Z DEBUG tvrank::imdb::service] IMDB database (thread 4) contains 311612 movies and 43588 series (355200 entries)
[2022-01-24T19:31:02Z DEBUG tvrank::imdb::service] IMDB database (thread 5) contains 319204 movies and 43196 series (362400 entries)
[2022-01-24T19:31:02Z DEBUG tvrank::imdb::service] IMDB database contains 1889977 movies and 261475 series (2151452 entries)
[2022-01-24T19:31:02Z DEBUG tvrank] Loaded IMDB database in 527ms 926us 13ns
[2022-01-24T19:31:02Z DEBUG tvrank] Could not parse title and year from `Coach Carter`
[2022-01-24T19:31:02Z DEBUG tvrank] Going to use `Coach Carter` as keywords for search query
[2022-01-24T19:31:02Z DEBUG tvrank] Keywords: ["Carter", "Coach"]
No movie matches found for `Carter Coach`
No series matches found for `Carter Coach`
[2022-01-24T19:31:02Z DEBUG tvrank] IMDB query took 162ms 335us 936ns
Total time: 690ms 616us 749ns
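The log shows the keywords keeping their original case (`["Carter", "Coach"]`), which suggests, without confirming, that the comparison is case-sensitive. A minimal sketch of case-insensitive keyword matching follows; `normalize` and `title_matches` are hypothetical names, not functions from the TVrank library.

```rust
use std::collections::BTreeSet;

/// Splits text on whitespace and lowercases every word.
fn normalize(text: &str) -> BTreeSet<String> {
    text.split_whitespace().map(str::to_lowercase).collect()
}

/// True when the query has at least one keyword and every keyword
/// appears as a word in the title, ignoring case and word order.
fn title_matches(title: &str, query: &str) -> bool {
    let keywords = normalize(query);
    !keywords.is_empty() && keywords.is_subset(&normalize(title))
}
```

Lowercasing both sides at the same point keeps the stored titles and the query comparable regardless of how the user typed them.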

Document the internal TVrank library

A good start for documentation would be the title*.rs and utils.rs files.
Then, in order: genre.rs, ratings.rs, error.rs, db.rs, service.rs and mod.rs.

Support partial keyword matches

A search for equilib should return matches like the following (as an example):

Equilibrium
The Equilibrium

This requires keyword indexing: #3
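As a rough illustration of the intended behaviour, the following sketch treats each query keyword as a prefix of some word in the title. It is a linear check on a single title, not the keyword index referred to in #3, and `partial_match` is a hypothetical name.

```rust
/// True when every query keyword is a prefix of at least one word
/// in the title, ignoring case.
fn partial_match(title: &str, query: &str) -> bool {
    let title_words: Vec<String> = title.split_whitespace().map(str::to_lowercase).collect();
    let keywords: Vec<String> = query.split_whitespace().map(str::to_lowercase).collect();

    // An empty query should match nothing rather than everything.
    !keywords.is_empty()
        && keywords
            .iter()
            .all(|k| title_words.iter().any(|w| w.starts_with(k.as_str())))
}
```

With an index in place, the same prefix test could be run against the indexed keywords instead of against every title.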

Searching for "the weather man" or "the godfather" reveals no results

As the title says. Searching for "weather man" works fine and shows titles called "The Weather Man", but searching for "the weather man" returns no series or movie matches:

>  tvrank title "the weather man"
No movie matches found for `weather, the, man`
No series matches found for `weather, the, man`
Total time: 399ms 941us 178ns

>  tvrank title "the weatherman"
No movie matches found for `the, weatherman`
No series matches found for `the, weatherman`
Total time: 396ms 281us 867ns

It also doesn't seem to be the "the" in the search keywords causing the problem:

tvrank title "amazing spider man"
Found 30 movie matches for `spider, amazing, man`:
...
Found 2 series matches for `spider, amazing, man`:

tvrank title "the amazing spider man"
Found 27 movie matches for `amazing, the, man, spider`:
...
Found 1 series match for `amazing, the, man, spider`:

Move to `pretty_env_logger`

https://crates.io/crates/pretty_env_logger

The `mark` sub-command should issue a warning when the directory's title/year does not match the given ID

Delete database files if error occurs during transfer

TVrank should delete the already written parts of the database files if an error occurs during the download & conversion process.
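One way to get this behaviour is a guard object that removes the file when it goes out of scope unless the write was explicitly committed. The sketch below is an illustration under that assumption; `PartialFile` and `write_db` are hypothetical names, and the chunk list stands in for the real download and conversion stream.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Deletes the file at `path` on drop unless `commit` was called.
struct PartialFile {
    path: PathBuf,
    committed: bool,
}

impl PartialFile {
    fn new(path: &Path) -> Self {
        PartialFile { path: path.to_path_buf(), committed: false }
    }

    /// Marks the file as complete so that it is kept.
    fn commit(mut self) {
        self.committed = true;
    }
}

impl Drop for PartialFile {
    fn drop(&mut self) {
        if !self.committed {
            // Best effort: the file may not have been created yet.
            let _ = fs::remove_file(&self.path);
        }
    }
}

/// Writes all chunks to `path`. An `Err` chunk simulates a transfer error.
fn write_db(path: &Path, chunks: Vec<io::Result<Vec<u8>>>) -> io::Result<()> {
    let guard = PartialFile::new(path);
    let mut file = File::create(path)?;
    for chunk in chunks {
        // On error, `?` returns early: `file` is closed, then `guard` deletes it.
        file.write_all(&chunk?)?;
    }
    file.flush()?;
    guard.commit();
    Ok(())
}
```

Because the guard is declared before the file handle, the handle is closed first on an early return, which matters on platforms that refuse to delete open files.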

Print content output to `stdout` and status output to `stderr`

Currently the two are mixed up. Content output (like search results) should be printed to stdout, and all other status messages should be printed to stderr.
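A small sketch of the split, written against generic writers so it can be exercised without a terminal. The function name `report` and the wording of the status line are made up for illustration.

```rust
use std::io::{self, Write};

/// Writes status information to `err` and the results themselves to `out`.
fn report<O: Write, E: Write>(out: &mut O, err: &mut E, results: &[&str]) -> io::Result<()> {
    // Status line: not part of the content, so it goes to the error stream.
    writeln!(err, "Found {} result(s)", results.len())?;
    // Content: one result per line on the output stream.
    for result in results {
        writeln!(out, "{}", result)?;
    }
    Ok(())
}
```

In the binary this would be called with `io::stdout().lock()` and `io::stderr().lock()`, so that `tvrank ... > results.txt` captures only the results.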

Find alternatives to `actions-rs/toolchain` and `actions-rs/cargo`

GitHub is deprecating Node.js 12 actions, and both actions-rs/toolchain and actions-rs/cargo are still stuck on it. The last releases for each were in 2020 and 2019, respectively. The projects are probably dead, and we should figure out whether there are alternatives.

Support the display of only the top N results

A command-line option like --results 3 or --top 3 could be used to only show the top 3 search results of movies and series.
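Assuming the results are already sorted by rank, the limit itself can be a simple slice operation. In this sketch `top_n` is a hypothetical helper and `limit` stands for the parsed value of the proposed option.

```rust
/// Returns at most `limit` leading results, or all of them when no limit is given.
fn top_n<T>(results: &[T], limit: Option<usize>) -> &[T] {
    match limit {
        // Clamp so that a limit larger than the result count does not panic.
        Some(n) => &results[..n.min(results.len())],
        None => results,
    }
}
```

Applying the limit to movies and series separately would match the wording of the request, which asks for the top results of each.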

Fix speed reporting, and change spinner to progress bar

Currently, when downloading the IMDB database files, TVrank shows a spinner with an unreasonably high download rate. The reason behind this is two-fold:

IMDB database files are gzip compressed.
For efficiency reasons, TVrank uses a custom binary on-disk database format that is different from the IMDB database format.

The files are downloaded, unzipped, parsed and converted, then written to disk in a streaming fashion where each of those functions streams into the next one. The spinner is shown - along with an incorrect download rate - because the information is not in relation to the original compressed file size, but in relation to the uncompressed file size.

Updating the progress object between the download and the decompression streams would allow the display of an accurate download rate, and would enable the use of a progress bar instead of a spinner.
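The idea can be sketched as a `Read` adapter that sits between the download stream and the decompressor and counts compressed bytes as they pass through. `CountingReader` and `percent` are hypothetical names; knowing the total compressed size in advance (for example from the server's response) is an assumption.

```rust
use std::io::{self, Read};
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Wraps a reader and records how many bytes have been read through it.
struct CountingReader<R> {
    inner: R,
    bytes: Arc<AtomicU64>,
}

impl<R: Read> Read for CountingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.inner.read(buf)?;
        // Count compressed bytes before any decompression happens downstream.
        self.bytes.fetch_add(n as u64, Ordering::Relaxed);
        Ok(n)
    }
}

/// Progress in percent, given the bytes read so far and the expected total.
fn percent(done: u64, total: u64) -> u64 {
    if total == 0 { 0 } else { done * 100 / total }
}
```

The gzip decoder would then wrap the `CountingReader` rather than the raw download stream, and the progress bar would read the shared counter, so both the rate and the completion percentage refer to the compressed size.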

Support custom IMDb database sources

Add support for downloading the IMDb title database from sources other than the one that is currently hard-coded.

Support YAML output

Support results output in YAML format, as an example:

---
  movies:
    - primary title: ...
      original title: ...
      ...
    - primary title: ...
      ...
  series:
    - primary title: ...
      ...

TVrank should respect the `NO_COLOR` environment variable

When the NO_COLOR environment variable is set, TVrank should refrain from displaying colors (and perhaps even displaying unicode art for tables).

See https://no-color.org/ for more information.

Additionally, a command-line parameter (e.g. --color) should be added to override the NO_COLOR environment variable.

Ideally, --color should take one of the following values:

on (the default) means that color and unicode art are output only when stdout is a terminal.
off means to never output color and unicode art.
always means to always output color and unicode art even when stdout is not a terminal.
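The rules above can be sketched as a pure decision function. The names `ColorMode`, `parse_color_mode` and `use_color` are hypothetical, and treating an absent `--color` as "respect NO_COLOR, otherwise behave like on" is one reading of the proposal rather than a settled design.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum ColorMode {
    On,
    Off,
    Always,
}

/// Parses the value given to the proposed `--color` option.
fn parse_color_mode(value: &str) -> Option<ColorMode> {
    match value {
        "on" => Some(ColorMode::On),
        "off" => Some(ColorMode::Off),
        "always" => Some(ColorMode::Always),
        _ => None,
    }
}

/// Decides whether to emit color and unicode art.
/// `flag` is `None` when `--color` was not passed on the command line.
fn use_color(flag: Option<ColorMode>, no_color_set: bool, stdout_is_terminal: bool) -> bool {
    match flag {
        // An explicit flag overrides the NO_COLOR environment variable.
        Some(ColorMode::Always) => true,
        Some(ColorMode::Off) => false,
        Some(ColorMode::On) => stdout_is_terminal,
        // No flag: honor NO_COLOR, otherwise fall back to the `on` behaviour.
        None => !no_color_set && stdout_is_terminal,
    }
}
```

The two boolean inputs could be obtained with `std::env::var_os("NO_COLOR")` (present and non-empty, per no-color.org) and `std::io::IsTerminal::is_terminal` on `std::io::stdout()`, the latter being available from Rust 1.70.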

Searching for movie titles with 2 characters (e.g. Up) displays all existing movies and TV series

./tvrank -vvvv title 'Up'
[2022-01-24T20:21:17Z DEBUG tvrank] Cache directory: /Users/arpad.kosorus/Library/Caches/com.fredmorcos.Fred-Morcos.tvrank
[2022-01-24T20:21:17Z DEBUG tvrank] Created cache directory
[2022-01-24T20:21:17Z DEBUG tvrank::imdb::service] IMDB database exists and is less than a month old
[2022-01-24T20:21:17Z DEBUG tvrank::imdb::service] Read IMDB database file in 55ms 443us 23ns
[2022-01-24T20:21:17Z DEBUG tvrank::imdb::service] Parsed IMDB database in 500ms 305us 915ns
[2022-01-24T20:21:17Z DEBUG tvrank::imdb::service] IMDB database (thread 0) contains 317372 movies and 43280 series (360652 entries)
[2022-01-24T20:21:17Z DEBUG tvrank::imdb::service] IMDB database (thread 1) contains 317771 movies and 43229 series (361000 entries)
[2022-01-24T20:21:17Z DEBUG tvrank::imdb::service] IMDB database (thread 2) contains 314989 movies and 44911 series (359900 entries)
[2022-01-24T20:21:17Z DEBUG tvrank::imdb::service] IMDB database (thread 3) contains 314109 movies and 42491 series (356600 entries)
[2022-01-24T20:21:17Z DEBUG tvrank::imdb::service] IMDB database (thread 4) contains 310729 movies and 44271 series (355000 entries)
[2022-01-24T20:21:17Z DEBUG tvrank::imdb::service] IMDB database (thread 5) contains 315007 movies and 43293 series (358300 entries)
[2022-01-24T20:21:17Z DEBUG tvrank::imdb::service] IMDB database contains 1889977 movies and 261475 series (2151452 entries)
[2022-01-24T20:21:17Z DEBUG tvrank] Loaded IMDB database in 556ms 27us 476ns
[2022-01-24T20:21:17Z DEBUG tvrank] Could not parse title and year from `Up`
[2022-01-24T20:21:17Z DEBUG tvrank] Going to use `Up` as keywords for search query
[2022-01-24T20:21:17Z DEBUG tvrank] Keywords: []
Found 1889977 movie matches for ``:
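The log shows an empty keyword list (`Keywords: []`) followed by a match on every movie. One plausible explanation, not confirmed by the log, is that short words are dropped from the query and that an "all keywords match" test is then vacuously true. The sketch below illustrates that effect and a guard against it; both function names are hypothetical.

```rust
/// Naive matcher: true when every keyword is one of the title's words.
/// With an empty keyword list this is true for every title.
fn matches_naive(title_words: &[&str], keywords: &[&str]) -> bool {
    keywords.iter().all(|k| title_words.contains(k))
}

/// Guarded matcher: an empty keyword list matches nothing.
fn matches_guarded(title_words: &[&str], keywords: &[&str]) -> bool {
    !keywords.is_empty() && matches_naive(title_words, keywords)
}
```

The guard only stops the flood of results. Finding "Up" would still require keeping short words as keywords when nothing else is left in the query.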

Support JSON output

Support results output in JSON format, as an example:

{
  "movies": [
    ...
  ],
  "series": [
    ...
  ]
}

Add tests for the `TVrank` binary

Tests for the binary are non-existent.

`TitleID`'s `try_from` impl should fail when there are trailing non-numeric characters

Currently the atoi implementation will consume the input as long as there are digits, and will return successfully when either a non-digit character is reached or the end of the input is reached.

This means that invalid IDs like tt1234abc are still accepted by TVrank as tt1234. TVrank should instead reject such IDs.
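A strict parser can be sketched as follows. This is a stand-in that returns the numeric part as a `u64`; it does not use the real `TitleID` type or its `try_from` signature, neither of which is shown in the issue, and `parse_title_id` is a hypothetical name.

```rust
/// Parses IDs of the form `tt<digits>` and rejects anything else,
/// including trailing non-digit characters such as `tt1234abc`.
fn parse_title_id(input: &str) -> Option<u64> {
    let digits = input.strip_prefix("tt")?;
    // Check every byte explicitly: `str::parse` alone would accept a leading `+`.
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    // Still fallible: the number may not fit into a u64.
    digits.parse().ok()
}
```

If the existing atoi-style routine is kept, an equivalent fix would be to check that it consumed the whole input before accepting the result.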

TVrank cannot handle the cancellation of database downloads

TVrank leaves behind semi-complete files when the database downloads are cancelled. The current workaround is to delete $XDG_CACHE_HOME/tvrank/* and re-run TVrank and wait for the downloads to complete.

TVrank should instead detect that the user is canceling while downloads are running, and clean up after itself: either by deleting whatever has been downloaded and processed so far, or finding a way to resume from that during the next run.

Move from `structopt` to `clap`

structopt is now in maintenance mode, and TVrank should move to clap instead, which has incorporated almost all of structopt's features.

As clap v3 is now out, and the structopt features are integrated into (almost as-is), structopt is now in maintenance mode: no new feature will be added.

https://docs.rs/structopt/latest/structopt/

Respect the type of output (e.g. pipe, file, stdout)

TVrank should respect the type of output and print contents accordingly. As an example, when printing to a terminal, colors and tables should be rendered by default as usual. But when printing to a file or a pipe, contents should be printed in a way that is in line with other UNIX utilities (one line per entry).

One example is to print out the contents in tab-separated values.
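A minimal sketch of the tab-separated form, with one entry per line. The helper name `to_tsv_line` and the choice of fields are illustrative only; replacing embedded tabs and newlines with spaces is one simple way to keep each entry on a single line.

```rust
/// Joins the fields of one entry with tabs, replacing any tab or newline
/// inside a field with a space so that the entry stays on one line.
fn to_tsv_line(fields: &[&str]) -> String {
    fields
        .iter()
        .map(|f| f.replace(|c: char| c == '\t' || c == '\n', " "))
        .collect::<Vec<_>>()
        .join("\t")
}
```

Output in this shape can be fed directly to tools such as `cut`, `sort` and `awk`.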

Turn project into a workspace

Turn the project into a workspace. This will help us with a few things:

Split dependencies between the TVrank library and command-line binary.
Split dependencies for tests (e.g. indoc, tempfile).
In the future, be able to add different binaries (e.g. different GUI implementations).

Combined series and movie search

An option to simultaneously search both movie and series databases for a given title would be very helpful.

Improved search function

The search function should allow for partial and/or incomplete matches. For example:

a search for "the a team" should yield the 2010 movie "The A-Team" or the 1983 series "The A-Team"
a search for "equilib" should identify at least two movies titled "Equilibrium" (from 2002 and 2017)
ideally, the "strength" of the match should be customisable via flags (-e exact/strong, -d default, -t tentative/weak)

Support keyword-based searching

Currently, a search for perks wallflower returns nothing, but should return:

Perks of being a Wallflower

Add doctests to the internal `TVrank` library

Also see #25.

Move the `progress` module to `utils` or `utils::io`

The progress module should be moved under either utils or utils::io, or somewhere similar.

Share rustup and cargo caches between CI workflow jobs

Share the rustup installation directory, cargo installation directory and cargo dependency cache between CI workflow jobs to avoid re-downloading and re-building them.

Improve the CI caching and speed

Take note of what's written here and work on further improving TVrank's CI speed.

Make the TVrank command-line interface more convenient

Currently the TVrank command-line interface offers "application-wide" parameters like --force-update and --sort-by-year, which means that they cannot be used after a subcommand is specified. It would be great to be able to use them as part of subcommands to make the interface more convenient.

Example

Currently, passing --sort-by-year looks like so:

tvrank --sort-by-year title "foo" --exact

It should be possible to pass it as follows:

tvrank title "foo" --exact --sort-by-year

Evaluate `tabled` as a potential replacement for printing tables

tabled: https://github.com/zhiburt/tabled/
