maforget / bedetheque-scrapper-2
Script that scrapes data from Bedetheque.com for ComicRack
Describe the bug
The summary (résumé) is not always pulled from bedetheque.com.
Link
https://www.bedetheque.com/BD-Morgue-pleine-431669.html
Hello!
First of all, thank you very much for this tool, which has always been extremely useful to me.
I am in the process of migrating entirely from ComicRack to a mix of Calibre and personal scripts, but once I am done I will no longer have access to the bedetheque data.
So I was wondering whether it would be feasible to provide a version of the tool that can be used simply from the command line, in addition to the add-on version.
And thanks again for the tool!
Thanks for the fix for the "Could not create SSL/TLS secure channel" bug (I cannot comment on Reddit).
It is working fine again. However, I noticed a significant change in how well the scraper's search works.
It used to be more flexible with partial titles and French accents.
For example, a search for the string Tout sera oublie would find the proper record, Tout sera oublié.
Similarly, a search for Les oublies would open a dialog with matching titles such as Les oubliés or Les oubliés d'Annam.
It looks like a RegExp problem to me.
I didn't keep my old version before installing your fix, so I cannot compare versions.
Worth looking at?
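The accent-tolerant behaviour described in this report can be illustrated with a minimal Python sketch. It is not the scraper's actual search code: the helper names `fold` and `find_matches` are hypothetical, and the approach (Unicode decomposition followed by dropping combining marks) is only one common way to compare titles without regard to accents or case.

```python
import unicodedata

def fold(text):
    """Lowercase text and strip combining accents (e.g. 'é' -> 'e')."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()

def find_matches(query, titles):
    """Return the titles whose folded form contains the folded query."""
    needle = fold(query)
    return [title for title in titles if needle in fold(title)]

# Titles taken from the report above; the list itself is illustrative.
titles = ["Tout sera oublié", "Les oubliés", "Les oubliés d'Annam"]
print(find_matches("Tout sera oublie", titles))
print(find_matches("Les oublies", titles))
```

With this kind of folding, an unaccented query matches the accented record, and a partial title matches every title that contains it.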
Describe the bug
The following message appears in the console and in a message box: "sequence item 0: expected bytes or byte array, str found"
Expected behavior
The problem
Screenshots
Initialzing script 'RenumberBooks' from 'Autonumber.py'
Initialzing script 'CommitProposed' from 'CommitProposed.py'
Initialzing script 'NewComicBooks' from 'NewComics.py'
Initialzing script 'SaveCSVList' from 'OtherScripts.py'
Initialzing script 'RenameBookFiles' from 'OtherScripts.py'
Initialzing script 'SetComicInfoDirty' from 'OtherScripts.py'
Initialzing script 'ConfigureBookHasBeenOpened' from 'Sample.py'
Initialzing script 'GetBooksWith' from 'Sample.py'
Initialzing script 'AutoadjustTwoPageMode' from 'Sample.py'
Initialzing script 'DummySearch' from 'Sample.py'
Initialzing script 'DummyHtmlInfoPanel' from 'Sample.py'
Initialzing script 'WebLinkInfoPanel' from 'Sample.py'
Initialzing script 'DummyUIInfoPanel' from 'Sample.py'
Initialzing script 'ComicRackUserForum' from 'Sample.py'
Initialzing script 'FadeReadThumbnails' from 'Sample.py'
Initialzing script 'ShowStartupMessage' from 'Sample.py'
Initialzing script 'ShowShutdownMessage' from 'Sample.py'
Initialzing script 'ParseComicPath' from 'Sample.py'
Initialzing script 'BookHasBeenOpened' from 'Sample.py'
Initialzing script 'SearchAndReplace' from 'SearchAndReplace.py'
Initialzing script 'BD_start' from 'BedethequeScraper2.py'
Initialzing script 'ConfigureBD2Quick' from 'BedethequeScraper2.py'
Initialzing script 'ConfigureBD2' from 'BedethequeScraper2.py'
Initialzing script 'QuickScrapeBD2' from 'BedethequeScraper2.py'
Initialzing script 'cvs_config' from 'ComicVineScraper.py'
Initialzing script 'cvs_scrape' from 'ComicVineScraper.py'
Initialzing script 'backupManager_Startup' from 'BackupManager.py'
Initialzing script 'backupManager_Shutdown' from 'BackupManager.py'
Initialzing script 'backupManager' from 'BackupManager.py'
Initialzing script 'cr_summary_translate' from 'cr_summary_translate_v2.py'
Initialzing script 'Filter_comic_by_image_type' from 'Filter_comic_by_image_type.py'
Initialzing script 'OnStartup' from 'avif.py'
Initialzing script 'translate_summary_field' from 'Summary_Field_Translator.py'
Initialzing script 'cr_summary_translate' from 'Translate_Summary.py'
Calling 'OnStartup'...
Compilation of 'C:\Users\tomas\AppData\Roaming\cYo\ComicRack\Scripts\Modern Image Formats Support\avif.py'
Do you think I am missing a Microsoft Visual C++ Redistributable library? I have the 2008, 2010, 2012, 2013 and 2015-2022 versions installed.
I am writing a new plugin, and when I load the json library I get the same error. So I think the problem is with my system, but I don't know what it is.
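For context, the wording of the message in this report resembles the `TypeError` that Python raises when `str` items are joined with a `bytes` separator. The sketch below only reproduces that class of error in standard Python 3 and shows one way to avoid it. It is not a diagnosis of this report, whose cause may well be environmental as the reporter suspects, and the helper `join_chunks` is hypothetical.

```python
def join_chunks(chunks, encoding="utf-8"):
    """Join chunks into bytes, encoding any str items first."""
    return b"".join(
        c.encode(encoding) if isinstance(c, str) else bytes(c) for c in chunks
    )

try:
    # Joining str items with a bytes separator raises TypeError.
    b"".join(["abc", "def"])
except TypeError as exc:
    print("TypeError:", exc)

# Normalizing every item to bytes before joining avoids the error.
print(join_chunks(["abc", b"def"]))
```

The exact text of the exception differs between Python implementations and versions, so the printed message may not match the one quoted in the report word for word.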
Feature request: make it possible to resize the window where you choose the series and album name.
With the current version (5.7.1), when a series name is processed to move the article (le, la, les, l', The) to the end, the remaining name is no longer properly capitalized.
Check for example this one: https://www.bedetheque.com/BD-Annee-des-quatre-empereurs-TL-Mai-68-369736.html
With Formatted Articles (series) checked and Formatted titles unchecked, the scraper sets the series field to année des quatre empereurs (L') instead of Année des quatre empereurs (L').
Note that if Formatted titles is checked, the series name is properly set to Année Des Quatre Empereurs (L').
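The expected result can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the scraper's own implementation: the function name `move_article` and the `ARTICLES` tuple are hypothetical, and only the articles listed in the report, written with a plain ASCII apostrophe, are handled.

```python
# Articles from the report; the elided form "l'" has no trailing space.
ARTICLES = ("l'", "le ", "la ", "les ", "the ")

def move_article(name):
    """Move a leading article to the end and capitalize the remainder."""
    for article in ARTICLES:
        if name[:len(article)].lower() == article:
            head = name[:len(article)].strip()
            rest = name[len(article):].strip()
            if not rest:
                return name  # nothing left after the article: leave as-is
            # Uppercase only the first letter; keep the rest untouched.
            return "%s%s (%s)" % (rest[0].upper(), rest[1:], head)
    return name

print(move_article("L'année des quatre empereurs"))
print(move_article("Les oiseaux"))
print(move_article("Blue Note"))
```

Uppercasing only the first character of the remainder keeps the rest of the name as written, which is the behaviour the report expects when Formatted titles is unchecked.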
The title is left empty both for "one shot" records and for full series in which the title of the book is exactly the same as the series.
For example:
Oiseaux (Les) for series (with appropriate article handling), but title is empty.
Blue Note for series, INT for number, but again title is empty.
Note that I would expect the article processing NOT to happen for the title, meaning the scraper would return Oiseaux (Les) for series and Les oiseaux for title.
Hello,
Is there a way to select a specific edition (e.g. INT 2012a)?
https://www.bedetheque.com/BD-Batman-Annee-un-INT-Vengeance-Oblige-45144.html
The publisher information, etc. differs between editions, which is handy for organizing files.
Thanks
In some cases, the scraper fills the ISBN field with the number of pages rather than the ISBN itself.
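One possible way to catch this kind of mix-up is to validate the scraped value before storing it, since a page count will not pass an ISBN checksum. The sketch below is an assumption about how such a check could look, not code from the scraper. The function name `is_valid_isbn` is hypothetical, and the sample ISBNs are well-known textbook examples rather than values from bedetheque.com.

```python
import re

def is_valid_isbn(value):
    """Return True if value is a well-formed ISBN-10 or ISBN-13."""
    digits = re.sub(r"[\s-]", "", value).upper()
    if re.fullmatch(r"[0-9]{9}[0-9X]", digits):
        # ISBN-10: weights 10 down to 1, 'X' counts as 10, sum divisible by 11.
        total = sum(
            (10 - i) * (10 if ch == "X" else int(ch))
            for i, ch in enumerate(digits)
        )
        return total % 11 == 0
    if re.fullmatch(r"[0-9]{13}", digits):
        # ISBN-13: weights alternate 1 and 3, sum divisible by 10.
        total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(digits))
        return total % 10 == 0
    return False

print(is_valid_isbn("978-3-16-148410-0"))  # sample ISBN-13
print(is_valid_isbn("0-306-40615-2"))      # sample ISBN-10
print(is_valid_isbn("48"))                 # a page count, not an ISBN
```

A value that fails the check could be discarded or logged instead of being written to the ISBN field.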
Hello,
I notice that I do not get the same behaviour depending on which button I use (normally the same feature).
Example:
https://www.bedetheque.com/BD-Batman-Dark-Knight-INT-Batman-Dark-Knight-Edition-integrale-11931.html
Then I click on Bédéthèque Scraper 2, and there I get one extra piece of data (Genre).
Strange, isn't it?
Thanks for looking into it.
If you quick-scrape from an album page, the language isn't set. The workaround is to run the scraper afterwards in normal mode or to set the language manually. A fix would require either defaulting the language to fr or going back to the series page in this case, which would make the code too complex for such a small problem.
Thanks for your awesome work; the scraper really saves me a lot of time adding metadata!
Question: is it possible to add the year of publication? At the moment I have to add the years manually.
Hello, thanks for the amazing job you've done.
Please find below some issues I have with this title.
https://www.bedetheque.com/BD-Deadpool-Marvel-France-7e-serie-2020-Tome-1-La-Guerre-des-royaumes-383212.html