AniSync

Mapping sites to AniList and back. Inspired by MalSync, this project takes data from popular tracking sites such as AniList and matches it with sites such as Zoro.To, MangaDex, and more.

How it Works

Note: The mapping system is not as optimized as it could be, so contributions are very much appreciated. The mapping code is located at /src/lib/mappings.ts.

The concept of AniSync is relatively simple. When a queried ID or search request doesn't exist in the database, AniSync maps provider IDs to an AniList ID. It first sends a search request to each provider. Then, for each result title (e.g. Mushoku Tensei: Jobless Reincarnation), it sends a search request to AniList and matches the result based on the similarity between the provider title and the AniList title. If you are confused about the details, take a look at the mappings.ts file in /src/lib.
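The title-matching idea can be sketched roughly as follows. This is a minimal illustration, not the code in /src/lib/mappings.ts: the ProviderResult and AniListResult shapes, the function names, the 0.8 threshold, and the choice of a bigram Dice coefficient as the similarity measure are all assumptions made for the example.

```typescript
// Hypothetical shapes; the real types in /src/lib/mappings.ts may differ.
interface ProviderResult { providerId: string; title: string; }
interface AniListResult { id: number; title: string; }

// Split a lowercased, whitespace-normalized title into overlapping two-character chunks.
function bigrams(text: string): string[] {
    const s = text.toLowerCase().replace(/\s+/g, " ").trim();
    const out: string[] = [];
    for (let i = 0; i < s.length - 1; i++) out.push(s.slice(i, i + 2));
    return out;
}

// Dice coefficient over bigrams: 1 means identical, 0 means nothing shared.
function similarity(a: string, b: string): number {
    const x = bigrams(a);
    const y = bigrams(b);
    if (x.length === 0 || y.length === 0) {
        // Titles shorter than two characters: fall back to an exact comparison.
        return a.toLowerCase().trim() === b.toLowerCase().trim() ? 1 : 0;
    }
    const counts = new Map<string, number>();
    for (const g of x) counts.set(g, (counts.get(g) ?? 0) + 1);
    let shared = 0;
    for (const g of y) {
        const n = counts.get(g) ?? 0;
        if (n > 0) { shared++; counts.set(g, n - 1); }
    }
    return (2 * shared) / (x.length + y.length);
}

// Pick the AniList candidate whose title is closest to the provider title,
// or null if no candidate reaches the threshold (an assumed value).
function bestMatch(title: string, candidates: AniListResult[], threshold = 0.8): AniListResult | null {
    let best: AniListResult | null = null;
    let bestScore = threshold;
    for (const c of candidates) {
        const score = similarity(title, c.title);
        if (score >= bestScore) { best = c; bestScore = score; }
    }
    return best;
}

// Map one provider result to an AniList ID using an injected search function,
// so the sketch stays independent of any real HTTP client.
async function mapResult(
    result: ProviderResult,
    searchAniList: (query: string) => Promise<AniListResult[]>,
): Promise<{ providerId: string; aniListId: number } | null> {
    const match = bestMatch(result.title, await searchAniList(result.title));
    return match ? { providerId: result.providerId, aniListId: match.id } : null;
}
```

Passing the AniList search in as a parameter keeps the matching logic testable without network access. The threshold trades missed mappings against wrong ones, so it is worth tuning per provider.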

Installation

To start, AniSync requires at least Node.js v16 (untested). In addition, the following are required for AniSync to run properly:

PostgreSQL
Python3 (for NovelUpdates)

You may also install Redis if you want caching enabled. Note: The web server doesn't work without Redis as of the current commit. A fix will be added soon.

Cloning the Repository

To start mapping, I recommend cloning the repository or downloading the code.

git clone https://github.com/Eltik/AniSync

PostgreSQL

AniSync requires PostgreSQL v15 to work as additional extensions are needed for searching the database.

Linux/Ubuntu Installation

Run the following commands in the terminal to install PostgreSQL 15.

# Create the file repository configuration
sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'

# Import the repository signing key
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -

# Update package lists
sudo apt-get update

# Install
sudo apt-get -y install postgresql-15

# Install postgresql-contrib, which provides pg_trgm
sudo apt-get install postgresql-contrib

# Starts the service
sudo systemctl start postgresql.service

Windows Installation

Navigate to the PostgreSQL website.
Find the downloads page and click on Windows.
Download and run the installer.

MacOS Installation

It's recommended you use Homebrew for installing PostgreSQL. Information on the formulae is located here. Simply run brew install postgresql@15 to install PostgreSQL.

After installing PostgreSQL...

After you install PostgreSQL on your OS, the remaining steps are fairly simple. You may need to adapt the steps above to your setup (for example, I don't have a Windows PC anymore, so I might be missing something). The important thing to note is that you will likely need postgresql-contrib to use extensions. If you get errors when running the commands below, I suggest searching on Google for how to install PostgreSQL extensions.

Enter the postgres shell. This can be done by opening your OS's terminal (Command Prompt on Windows, Terminal on MacOS, etc.) and running psql, or sudo -i -u postgres followed by psql.
If necessary, run ALTER USER postgres WITH PASSWORD 'password';. You will need to input the password or PostgreSQL database URL into the .env file later.
Run CREATE EXTENSION IF NOT EXISTS "pg_trgm"; to add the pg_trgm extension.
Run:

create or replace function most_similar(text, text[]) returns double precision
language sql as $$
    select max(similarity($1,x)) from unnest($2) f(x)
$$;

This adds the most_similar function.
Finally, edit the .env file and add the database URL. Take a look at the .env.example file for more information. The variable looks something like this:

DATABASE_URL="postgresql://postgres:password@localhost:5432"

That's it! Feel free to join my Discord and ask for help in the #coding channel if you need additional support.
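To get a feel for what most_similar computes, here is a rough TypeScript counterpart. It is only an approximation of pg_trgm-style trigram similarity (lowercase, ASCII alphanumeric words, each word padded with two leading spaces and one trailing space); the real extension may score some strings differently, and the function names here are made up for the example.

```typescript
// Approximate pg_trgm-style trigram set (assumption: ASCII-only word splitting).
function trigrams(text: string): Set<string> {
    const set = new Set<string>();
    const words = text.toLowerCase().split(/[^a-z0-9]+/).filter((w) => w.length > 0);
    for (const word of words) {
        const padded = `  ${word} `;
        for (let i = 0; i + 3 <= padded.length; i++) set.add(padded.slice(i, i + 3));
    }
    return set;
}

// Shared trigrams divided by all distinct trigrams across both strings.
function similarity(a: string, b: string): number {
    const x = trigrams(a);
    const y = trigrams(b);
    if (x.size === 0 && y.size === 0) return 0;
    let shared = 0;
    x.forEach((g) => { if (y.has(g)) shared++; });
    return shared / (x.size + y.size - shared);
}

// Counterpart of most_similar(text, text[]): the best score across all candidates.
function mostSimilar(query: string, candidates: string[]): number | null {
    if (candidates.length === 0) return null; // SQL max() over zero rows yields NULL
    return Math.max(...candidates.map((c) => similarity(query, c)));
}
```

Because the result is the maximum over the whole array, an entry scores well if any one of its known titles is close to the query.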

Python

Python3 is mainly needed for scraping NovelUpdates. If you don't want to go through the task of installing Python, you may remove the NovelUpdates class from /src/mapping/index.ts. Otherwise, installing Python is mostly self-explanatory, if a bit tedious. I suggest using Python3.

Linux/Ubuntu

Run the following command:

sudo apt install python3-pip

Windows

Navigate to the Python website.
Click Downloads.
Click the Download Python 3.x.x button.
Run the installer.

MacOS

Nothing needs to be done for most MacOS environments as Python3 is natively supported.

After installing Python...

Not much else needs to be done after installing Python. Run pip3 install cloudscraper or pip install cloudscraper, then edit the .env file and set USE_PYTHON3. If you used pip3 to install cloudscraper, set USE_PYTHON3="true". Otherwise, set USE_PYTHON3="false".
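As an illustration of how such a flag is typically consumed, the sketch below picks an interpreter name from USE_PYTHON3. The helper name pythonCommand is hypothetical, and the way AniSync actually reads this variable is an assumption, not something taken from its source.

```typescript
// Hypothetical helper: choose the interpreter name from the USE_PYTHON3 setting.
// Anything other than "true" (case-insensitive) falls back to plain "python".
function pythonCommand(env: Record<string, string | undefined>): string {
    const flag = (env["USE_PYTHON3"] ?? "").trim().toLowerCase();
    return flag === "true" ? "python3" : "python";
}
```

In a Node.js program this would be called as pythonCommand(process.env) after the .env file has been loaded.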

Final Installation

Once you have completed the steps above, simply run:

npm run build

Using AniSync

The following are some additional things you can do to help with using AniSync.

Crawling

Note: As of the current commit, crawling requires editing the /src/crawl.ts file. Change the variables under the CONFIGURE THINGS HERE comment. I'll update with a fix soon.
To start crawling, run npm run crawl. Once you have run the command, keep the terminal open and wait. The program will insert media using AniList's sitemap. Please note that crawling takes a long time. As mentioned above, more optimizations can be made to improve the crawling speed. The AniList rate limit is a big factor, and as of now, crawling all of AniList will take over 50 days for manga/light novels. I've added the ability to use Manami (an offline database) for anime, so anime mappings don't take very long. The only issue right now is manga/light novels, since there doesn't seem to be a good offline AniList database for them. However, the mappings are very accurate, and in the end it's worth it.
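To see why a rate limit dominates crawl time, a back-of-the-envelope estimate helps. The function below is only a sketch: its name and parameters are made up for the example, it ignores request latency and provider-side limits, and none of the figures you pass in come from AniSync or from AniList's documented limits.

```typescript
// Rough crawl-time estimate under a per-minute request limit.
// All inputs are hypothetical; plug in your own numbers.
function estimateCrawlDays(mediaCount: number, requestsPerMedia: number, requestsPerMinute: number): number {
    if (requestsPerMinute <= 0) throw new RangeError("requestsPerMinute must be positive");
    const minutes = (mediaCount * requestsPerMedia) / requestsPerMinute;
    return minutes / (60 * 24); // minutes in a day
}
```

Since each provider search result can trigger its own AniList search, requestsPerMedia grows quickly with the number of providers, which is where most of the time goes.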

To Crawl Specific Amounts of Media

Open the /src/crawl.ts file. Under the CONFIGURE THINGS HERE comment, change the variable maxIds: number = x, where x is the number of anime you want to map.
Open lastId.txt and enter a number. This number is the ID that mapping will start from.

Note: The `maxIds` variable will decide how much media to map. If you set it to `1000`, then it will only map up to index `1000`.

Importing/Exporting

There might be a database.json file located in the project. If there isn't, copy one into the root of the project. I have added an npm run import command to import pre-made databases so that crawling again isn't necessary. If you want to export the database, run npm run export.

Coding with AniSync

TBD. If you have anything you want to add to this section, please create a pull request!

Providers

Name	Link	Notes
9anime	Link	For self-hosting, this requires a special resolver and 9anime key. Resolver code not available to the public.
Zoro.To	Link	N/A
GogoAnime	Link	N/A
AnimePahe	Link	N/A
ComicK	Link	N/A
MangaDex	Link	N/A
BatoTo	Link	N/A
MangaSee	Link	N/A
TMDB	Link	Gets special artwork that AniList doesn't have.
Kitsu	Link	Meta provider for additional information AniList might not have.
NovelUpdates	Link	Requires Python3 to use a CloudFlare bypass package.
NovelBuddy	Link	N/A
MyAnimeList	Link	N/A
