maforget / bedetheque-scrapper-2


Script that scrapes data from Bedetheque.com for ComicRack

Python 100.00%

comicrack bedetheque scrapper

bedetheque-scrapper-2's People

Contributors

Stargazers

Watchers

Forkers

thilas pufftealers pcjco nakedlitttlezombie

bedetheque-scrapper-2's Issues

Resume is not always pulled from bedetheque.com

Describe the bug
The summary (résumé) is not always pulled from bedetheque.com.

Link
https://www.bedetheque.com/BD-Morgue-pleine-431669.html

Command-line interface?

Hello!

First of all, thank you very much for this tool, which has always been extremely useful to me.

I am in the middle of migrating entirely from ComicRack to a mix of Calibre and personal scripts, but once I have finished I will no longer have access to the bedetheque data.
So I was wondering whether it would be feasible to offer a version of the tool that can be used simply from the command line, in addition to the add-on version.

And thanks again for the tool!

Search is not regexp anymore?

Thanks for the fix for the "Could not create SSL/TLS secure channel" bug (I cannot comment on Reddit).
It is working fine again. However, I noticed a significant change in how flexible the scraper's search is.

It used to be more tolerant of partial titles and French accents.

For example, a search for the string "Tout sera oublie" would find the proper record, "Tout sera oublié".
Similarly, a search for "Les oublies" would open a dialog listing matching titles such as "Les oubliés" or "Les oubliés d'Annam".

It looks like a regexp problem to me.
I didn't keep my old version before installing your fix, so I cannot compare versions.

Worth looking at?
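
To make the expected behavior concrete, here is a minimal sketch of accent-insensitive, partial-title matching. It is not the scraper's actual search code: the function names (strip_accents, find_matches) are hypothetical, and it uses plain substring matching rather than a regexp.

```python
import unicodedata


def strip_accents(text):
    """Return text with combining accent marks removed (e.g. 'é' becomes 'e')."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def normalize_title(text):
    """Lowercase and strip accents so that 'Oubliés' and 'oublies' compare equal."""
    return strip_accents(text).lower()


def find_matches(query, titles):
    """Return every title that contains the query, ignoring case and accents."""
    needle = normalize_title(query)
    return [title for title in titles if needle in normalize_title(title)]


# Hypothetical list of titles, taken from the examples in this report.
titles = [u"Tout sera oublié", u"Les oubliés", u"Les oubliés d'Annam"]
matches = find_matches(u"Les oublies", titles)
```

With the sample list above, the query "Les oublies" matches both "Les oubliés" and "Les oubliés d'Annam", which is the behavior described for the older version.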

Fail opening configuration and program

Describe the bug
In the console and in a message box, this message appears: "sequence item 0: expected bytes or byte array, str found"

Expected behavior
The problem

Screenshots
If applicable, add screenshots to help explain your problem.

Initialzing script 'RenumberBooks' from 'Autonumber.py'

Initialzing script 'CommitProposed' from 'CommitProposed.py'

Initialzing script 'NewComicBooks' from 'NewComics.py'

Initialzing script 'SaveCSVList' from 'OtherScripts.py'

Initialzing script 'RenameBookFiles' from 'OtherScripts.py'

Initialzing script 'SetComicInfoDirty' from 'OtherScripts.py'

Initialzing script 'ConfigureBookHasBeenOpened' from 'Sample.py'

Initialzing script 'GetBooksWith' from 'Sample.py'

Initialzing script 'AutoadjustTwoPageMode' from 'Sample.py'

Initialzing script 'DummySearch' from 'Sample.py'

Initialzing script 'DummyHtmlInfoPanel' from 'Sample.py'

Initialzing script 'WebLinkInfoPanel' from 'Sample.py'

Initialzing script 'DummyUIInfoPanel' from 'Sample.py'

Initialzing script 'ComicRackUserForum' from 'Sample.py'

Initialzing script 'FadeReadThumbnails' from 'Sample.py'

Initialzing script 'ShowStartupMessage' from 'Sample.py'

Initialzing script 'ShowShutdownMessage' from 'Sample.py'

Initialzing script 'ParseComicPath' from 'Sample.py'

Initialzing script 'BookHasBeenOpened' from 'Sample.py'

Initialzing script 'SearchAndReplace' from 'SearchAndReplace.py'

Initialzing script 'BD_start' from 'BedethequeScraper2.py'

Initialzing script 'ConfigureBD2Quick' from 'BedethequeScraper2.py'

Initialzing script 'ConfigureBD2' from 'BedethequeScraper2.py'

Initialzing script 'QuickScrapeBD2' from 'BedethequeScraper2.py'

Initialzing script 'cvs_config' from 'ComicVineScraper.py'

Initialzing script 'cvs_scrape' from 'ComicVineScraper.py'

Initialzing script 'backupManager_Startup' from 'BackupManager.py'

Initialzing script 'backupManager_Shutdown' from 'BackupManager.py'

Initialzing script 'backupManager' from 'BackupManager.py'

Initialzing script 'cr_summary_translate' from 'cr_summary_translate_v2.py'

Initialzing script 'Filter_comic_by_image_type' from 'Filter_comic_by_image_type.py'

Initialzing script 'OnStartup' from 'avif.py'

Initialzing script 'translate_summary_field' from 'Summary_Field_Translator.py'

Initialzing script 'cr_summary_translate' from 'Translate_Summary.py'
Calling 'OnStartup'...
Compilation of 'C:\Users\tomas\AppData\Roaming\cYo\ComicRack\Scripts\Modern Image Formats Support\avif.py'

Modern Image Formats Support Loaded
Calling 'backupManager_Startup'...
Compilation of 'C:\Users\tomas\AppData\Roaming\cYo\ComicRack\Scripts\CR Backup Manager\BackupManager.py'
No se puede encontrar una parte de la ruta de acceso 'C:\Users\tomas\AppData\Local\cyo\ComicRack\cache\CustomThumbnail'.
Calling 'OnStartup'...
Calling 'OnStartup'...
Calling 'ConfigureBD2Quick'...
Compilation of 'C:\Users\tomas\AppData\Roaming\cYo\ComicRack\Scripts\Bedetheque Scraper 2\BedethequeScraper2.py'
sequence item 0: expected bytes or byte array, str found


Do you think I am missing a Microsoft Visual C++ Redistributable library? I have the 2008, 2010, 2012, 2013 and 2015-2022 versions installed.
I am writing a new plugin, and when I load the json library I get the same error. So I think the problem is with my system, but I don't know what it is.

Resizable window

Add the possibility to resize the window in which you choose the series or album name.

Uppercase missing when processing articles in series

With the current version (5.7.1), when a series name is processed to move the article (le, la, les, l', The) to the end, the remaining name is no longer properly capitalized.

Check for example this one: https://www.bedetheque.com/BD-Annee-des-quatre-empereurs-TL-Mai-68-369736.html

With Formatted Articles (series) checked and Formatted titles unchecked, the result of the scraper in the series field is année des quatre empereurs (L') instead of Année des quatre empereurs (L')

Note that if Formatted titles is checked, the series name is properly Année Des Quatre Empereurs (L')
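
As an illustration of the expected result, the sketch below moves a leading article to the end and re-capitalizes the first letter of what remains. It is only a guess at the logic, not the scraper's code: the ARTICLES tuple and the move_article function are hypothetical, and the input "L'année des quatre empereurs" is assumed to be the raw series name.

```python
# Articles named in this report; the apostrophe form has no trailing space.
ARTICLES = ("l'", "le ", "la ", "les ", "the ")


def move_article(name):
    """Move a leading article to the end, e.g. "Les oiseaux" -> "Oiseaux (Les)"."""
    lowered = name.lower()
    for article in ARTICLES:
        if lowered.startswith(article):
            found = name[:len(article)].strip()
            rest = name[len(article):]
            # Capitalize only the first letter; the rest of the name is untouched.
            rest = rest[:1].upper() + rest[1:]
            return u"%s (%s)" % (rest, found)
    return name


formatted = move_article(u"L'année des quatre empereurs")
```

Capitalizing only the first character, instead of title-casing every word, keeps the result independent of the Formatted titles option.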

Title not set when it's the same as the series name

This happens both for one-shot records and for full series where the title of the book is exactly the same as the series name.
For example:

https://www.bedetheque.com/BD-Oiseaux-416930.html scrapes Oiseaux (Les) for the series (with appropriate article handling), but the title is empty
https://www.bedetheque.com/BD-Blue-Note-INT-Blue-note-292282.html scrapes Blue Note for the series and INT for the number, but again the title is empty

Note that I would expect the article processing NOT to be applied to the title, meaning it would scrape Oiseaux (Les) for the series and Les oiseaux for the title.
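
The expectation above could be sketched as follows: when no separate title is found, fall back to the unformatted series name, and apply the article handling to the series field only. Both functions are hypothetical stand-ins, not the scraper's code, and format_series handles only a few capitalized French articles to keep the example short.

```python
def format_series(name):
    """Hypothetical stand-in for the article handling applied to the series."""
    for article in (u"Les ", u"Le ", u"La "):
        if name.startswith(article):
            rest = name[len(article):]
            return u"%s (%s)" % (rest[:1].upper() + rest[1:], article.strip())
    return name


def fill_fields(raw_series, raw_title):
    """Return (series, title); the title falls back to the raw series name."""
    series = format_series(raw_series)
    title = raw_title.strip() or raw_series
    return series, title


series, title = fill_fields(u"Les oiseaux", u"")
```

Here series is "Oiseaux (Les)" and title stays "Les oiseaux", matching the result asked for in this report.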

Question / Enhancement?

Hello,

Is there a way to select a specific issue (e.g. INT 2012a)?
https://www.bedetheque.com/BD-Batman-Annee-un-INT-Vengeance-Oblige-45144.html
The publisher information, etc. differs between editions, which would be handy for organizing files.

Thanks

Rescrape

Doesn't Work
Rescrape of multiple files opens multiple progress windows

Scraper returns the number of pages instead of the ISBN in some cases

In some cases, the scraper fills the ISBN field with the number of pages (planches) rather than the ISBN itself.
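
One possible safeguard, sketched here as an assumption rather than as the scraper's actual behavior, is to validate a scraped value with the standard ISBN-10 and ISBN-13 check-digit rules before writing it to the ISBN field. The function name is_valid_isbn is hypothetical.

```python
import re


def is_valid_isbn(value):
    """Return True if value is a well-formed ISBN-10 or ISBN-13 with a valid check digit."""
    digits = re.sub(r"[\s-]", "", value).upper()
    if re.match(r"^\d{9}[\dX]$", digits):
        # ISBN-10: weights 10 down to 1, 'X' stands for 10, sum divisible by 11.
        total = sum((10 - i) * (10 if ch == "X" else int(ch))
                    for i, ch in enumerate(digits))
        return total % 11 == 0
    if re.match(r"^\d{13}$", digits):
        # ISBN-13: alternating weights 1 and 3, sum divisible by 10.
        total = sum(int(ch) * (1 if i % 2 == 0 else 3)
                    for i, ch in enumerate(digits))
        return total % 10 == 0
    return False


looks_like_isbn = is_valid_isbn("978-3-16-148410-0")
looks_like_page_count = is_valid_isbn("46")
```

A short number such as a page count fails the length check, so it would be rejected instead of being stored as an ISBN.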

Scraper (Behavior)

Hello,

I notice that I don't get the same behavior depending on which button I use (they should normally do the same thing).

Example:
https://www.bedetheque.com/BD-Batman-Dark-Knight-INT-Batman-Dark-Knight-Edition-integrale-11931.html

Then I click on Bédéthèque Scraper 2 and get one additional piece of data (Genre).
Strange, isn't it?

Thanks for looking into it.

Language isn't set when using QuickScrape

If you QuickScrape from an album page, the language isn't set. The workaround is to scrape again afterwards in normal mode, or to set the language manually. A fix would require either setting it to fr by default or going back to the series page in this case, which would make the code too complex for such a small problem.
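
The "set it to fr by default" option could look roughly like the sketch below. The names DEFAULT_LANGUAGE and pick_language are hypothetical, and how the scraper actually stores the language is not shown in this report.

```python
DEFAULT_LANGUAGE = "fr"  # assumption: most Bedetheque records are in French


def pick_language(scraped_language, default=DEFAULT_LANGUAGE):
    """Return the scraped language, or the default when none was found."""
    language = (scraped_language or "").strip()
    return language if language else default


language = pick_language(None)
```

This avoids the extra request to the series page, at the cost of mislabeling the occasional non-French album.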