
pietracoops / yugioh_cardlist_scraper


Yugioh Card Database Generator (offline CSV): a simple Python script that scrapes the KONAMI website to build a complete list of all Yugioh cards, together with their card information, as CSV files. It can serve as a useful tool for developers interested in the Yugioh domain.

Python 97.72% Batchfile 2.28%
card csv database yugioh konami python text-file webscraping local-database yugioh-tcg

yugioh_cardlist_scraper's Introduction

YUGIOH Database Cardlist Scraper

This is a very simple Python script that scrapes the KONAMI website to build a complete list of all Yugioh cards, together with their card information, as CSV files. It can serve as a useful tool for developers interested in the Yugioh domain.

Setup

This script was written with Python 3.8.10, so be sure to use the same version. On Windows, you can run the setup.bat file to create a virtual environment and install all dependencies. Otherwise, the repository comes with a requirements.txt file that can be used to install the dependencies into your own virtual environment on any platform, using the following command.

pip install -r requirements.txt

Running

To run the script, launch the entry point main.py and specify the language option (with no arguments, it defaults to English). An output folder will be created with one CSV file per pack. If the internet connection is lost, you can re-run the script and it will continue from where it left off. A picture of the output can be seen below.

A snippet of a single file in Excel (delimited using the $ character; this can be modified in the script as needed)
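Since the files are delimited with $ rather than a comma, they can also be loaded with Python's standard csv module by passing the delimiter explicitly. The following is a minimal sketch, not part of the script itself: the function name read_pack_rows and the sample column names are hypothetical, and the real files will contain whatever columns the script writes.

```python
import csv
import io


def read_pack_rows(handle, delimiter="$"):
    """Return all rows of a pack file, each as a list of field strings."""
    return list(csv.reader(handle, delimiter=delimiter))


# Hypothetical sample data standing in for one of the generated pack files.
sample = io.StringIO(
    "name$card_type$attack\n"
    "Blue-Eyes White Dragon$Monster$3000\n"
)
rows = read_pack_rows(sample)
header, first_card = rows[0], rows[1]
print(dict(zip(header, first_card)))
```

For a file on disk, open it with open(path, newline="", encoding="utf-8") and pass the handle to the same function; the encoding is an assumption and may need adjusting.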

Fast Option

You can skip the additional information (card_supports, card_anti_supports, card_actions, effect_types, status) by enabling the Fast option. This information is retrieved from a secondary website (the Yugioh wiki), and fetching it can make the scraping take significantly longer.

python main.py -f

Language Support

Language support is an experimental feature for specifying the language of the output. Languages include English, French, German, Italian, Spanish, and Portuguese (the option also accepts the codes ko and ja), and can be specified as follows:

python main.py --language fr # Accepts the following {en,fr,de,it,es,pt,ko,ja}
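For readers curious how options like these are typically declared, here is a minimal argparse sketch. It is an illustration under assumptions, not the script's actual parser: only -f and --language appear in this README, while the long form --fast, the help texts, and the build_parser name are hypothetical.

```python
import argparse

# Language codes listed in the command above.
LANGUAGES = ["en", "fr", "de", "it", "es", "pt", "ko", "ja"]


def build_parser():
    """Build a parser with a fast flag and a restricted language choice."""
    parser = argparse.ArgumentParser(
        description="Scrape Yugioh card lists into CSV files."
    )
    # "--fast" is an assumed long name; the README only documents "-f".
    parser.add_argument(
        "-f", "--fast", action="store_true",
        help="skip the additional wiki information",
    )
    parser.add_argument(
        "--language", choices=LANGUAGES, default="en",
        help="language of the output (defaults to English)",
    )
    return parser


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args(["--language", "fr"])
print(args.language, args.fast)
```

Using choices makes argparse reject an unsupported code with a usage message instead of letting it reach the scraper.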

Bugs

As always, if you find bugs, don't hesitate to contact me and I'll do my best to help. Thanks!

yugioh_cardlist_scraper's People

Contributors

pietracoops


Forkers

vashy sysbite

yugioh_cardlist_scraper's Issues

"$" delimiter not working with LibreOffice Calc

If I open up DuelistNexus.csv and set the only delimiter to be "$", the sheet doesn't load the data properly. See image below. I also tried including Tab, but there were still problems.

[image: screenshot attached to the issue]
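One possible workaround, independent of whatever is causing the import problem, is to rewrite the file as a conventional comma-separated CSV before opening it in a spreadsheet. The sketch below is an assumption-laden illustration: convert_delimiter is a hypothetical helper, the sample row is made up, and it assumes the source file follows the csv module's default quoting rules.

```python
import csv
import io


def convert_delimiter(src, dst, old="$", new=","):
    """Copy rows from src to dst, changing the delimiter; return the row count."""
    reader = csv.reader(src, delimiter=old)
    # QUOTE_MINIMAL quotes only fields that contain the new delimiter or quotes.
    writer = csv.writer(
        dst, delimiter=new, quoting=csv.QUOTE_MINIMAL, lineterminator="\n"
    )
    count = 0
    for row in reader:
        writer.writerow(row)
        count += 1
    return count


# Hypothetical sample row whose text contains a comma.
src = io.StringIO("name$description\nPot of Greed$Draw 2 cards, then smile.\n")
dst = io.StringIO()
written = convert_delimiter(src, dst)
print(dst.getvalue())
```

Because card text often contains commas, the writer quotes those fields so a spreadsheet still sees one value per column.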

Cannot run the main.py

Hello, I'm not much of a programmer, so sorry if this is just a skill issue. I tried to follow the README, but main.py was shutting down instantly, so I used PyCharm to run main.py (I installed the packages beforehand using the terminal). I now get this error, which I cannot understand because of my limited Python knowledge:

Traceback (most recent call last):
  File "C:\Users\Mathias\Documents\yugioh_cardlist_scraper-main\venv\lib\site-packages\git\__init__.py", line 89, in <module>
    refresh()
  File "C:\Users\Mathias\Documents\yugioh_cardlist_scraper-main\venv\lib\site-packages\git\__init__.py", line 76, in refresh
    if not Git.refresh(path=path):
  File "C:\Users\Mathias\Documents\yugioh_cardlist_scraper-main\venv\lib\site-packages\git\cmd.py", line 392, in refresh
    raise ImportError(err)
ImportError: Bad git executable.
The git executable must be specified in one of the following ways:
- be included in your $PATH
- be set via $GIT_PYTHON_GIT_EXECUTABLE
- explicitly set via git.refresh()

All git commands will error until this is rectified.

This initial warning can be silenced or aggravated in the future by setting the
$GIT_PYTHON_REFRESH environment variable. Use one of the following values:
- quiet|q|silence|s|none|n|0: for no warning or exception
- warn|w|warning|1: for a printed warning
- error|e|raise|r|2: for a raised exception

Example:
export GIT_PYTHON_REFRESH=quiet

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\Mathias\Documents\yugioh_cardlist_scraper-main\main.py", line 6, in <module>
    import helpers
  File "C:\Users\Mathias\Documents\yugioh_cardlist_scraper-main\helpers.py", line 3, in <module>
    import git
  File "C:\Users\Mathias\Documents\yugioh_cardlist_scraper-main\venv\lib\site-packages\git\__init__.py", line 91, in <module>
    raise ImportError("Failed to initialize: {0}".format(_exc)) from _exc
ImportError: Failed to initialize: Bad git executable.
The git executable must be specified in one of the following ways:
- be included in your $PATH
- be set via $GIT_PYTHON_GIT_EXECUTABLE
- explicitly set via git.refresh()

All git commands will error until this is rectified.

This initial warning can be silenced or aggravated in the future by setting the
$GIT_PYTHON_REFRESH environment variable. Use one of the following values:
- quiet|q|silence|s|none|n|0: for no warning or exception
- warn|w|warning|1: for a printed warning
- error|e|raise|r|2: for a raised exception

Example:
export GIT_PYTHON_REFRESH=quiet

I did use python 3.8.10.

Thank you for your consideration.
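The traceback says the git package imported by helpers.py could not find a git executable, and it names two places to look: the PATH and the GIT_PYTHON_GIT_EXECUTABLE environment variable. The following is a small diagnostic sketch using only the standard library; describe_git_setup is a hypothetical helper, and it only reports what is configured without checking that an override path really exists.

```python
import os
import shutil


def describe_git_setup(environ=None):
    """Report where a git executable is configured or found, if anywhere."""
    environ = os.environ if environ is None else environ
    # An explicit override named in the error message takes priority here.
    override = environ.get("GIT_PYTHON_GIT_EXECUTABLE")
    if override:
        return f"GIT_PYTHON_GIT_EXECUTABLE is set to {override}"
    # Otherwise search the directories listed in PATH.
    found = shutil.which("git", path=environ.get("PATH"))
    if found:
        return f"git found on PATH at {found}"
    return "git not found: install Git or set GIT_PYTHON_GIT_EXECUTABLE"


print(describe_git_setup())
```

If this prints the "git not found" message, installing Git and reopening the terminal (so the updated PATH is picked up) is the likely fix.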

Missing Descriptions for Spell and Trap Cards

The CSV files generated for the sets do not include descriptions for any of the spell or trap cards, whereas the monster cards have descriptions as expected.

Steps to Reproduce:
1. I cloned the repository and ran the script using the 'fast' option as instructed in the README.
2. After the script completed, I downloaded the CSV files and opened them in Excel.
3. Upon reviewing the files, I noticed that spell and trap cards lack description text.

I am not sure if this is a scraping issue or if the data is not available from the source. Could you please look into this?

Thank you for your time and assistance!
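To help narrow a report like this down, the affected rows can be listed programmatically instead of by eye. This is a rough sketch under assumptions: the column names name, card_type, and description are hypothetical stand-ins for the real header, and rows_missing_field is not part of the repository.

```python
import csv
import io


def rows_missing_field(handle, field, delimiter="$"):
    """Return the rows whose given field is absent, empty, or only whitespace."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    return [row for row in reader if not (row.get(field) or "").strip()]


# Hypothetical sample: one monster with text, one spell without.
sample = io.StringIO(
    "name$card_type$description\n"
    "Dark Magician$Monster$Placeholder monster text.\n"
    "Mystical Space Typhoon$Spell$\n"
)
missing = rows_missing_field(sample, "description")
print([row["name"] for row in missing])
```

Running this over each generated file would show whether the gap affects every spell and trap card or only some packs.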

Languages

Hi,
When I added the parameter request_locale=fr to download the French cards, only the pack names came out in French; the cards were still in English. How can I get them in French? Thanks

Downloading is taking a really long time

I ran main.py and the ETA to download everything says roughly 4.5 hours. This doesn't seem normal. I downloaded a different card database from another source and it took maybe 20-30 minutes. Did I do something wrong? I don't need any other languages, just English.
