safirex / wnovelarchiver

Python script to download web novels and keep them up to date

License: MIT License

Python 100.00%
python3 python python-script japanese-novel syosetu scraper kakuyomu webnovel webscraper web

wnovelarchiver's Introduction

note: ripping is bad, don't do it

WNovelArchiver

A simple Python script to easily download raw web novels from syosetu and kakuyomu and keep them up to date.
If you would like another WN site (JP/CN/KR/...) to be supported, feel free to open an issue.
If your connection isn't stable, the script may (will) crash while downloading.

Features:

  • batch download (1 to max) from the input.txt
  • update chapters of all the novels in the /novel_list/ directory
  • generate a status file recording the last downloaded chapter of every novel
  • compress each novel into a zip of its own (not yet accessible from the command line)
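
The zip feature is not exposed as a command yet, so here is a minimal sketch of what "one zip per novel" could look like. It is not the project's code: the function name `zip_each_novel` and the separate output directory are assumptions, and only the standard-library `shutil.make_archive` is used.

```python
import shutil
from pathlib import Path


def zip_each_novel(novel_root, out_dir):
    """Create one zip per novel folder found directly under novel_root.

    Hypothetical helper: the name and the out_dir argument are not part of
    WNovelArchiver. Returns the list of archive paths that were written.
    """
    novel_root, out_dir = Path(novel_root), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    archives = []
    # Only directories are treated as novels; loose files are skipped.
    for folder in sorted(p for p in novel_root.iterdir() if p.is_dir()):
        # make_archive appends ".zip" to the base name and returns the path.
        archive = shutil.make_archive(str(out_dir / folder.name), "zip", root_dir=folder)
        archives.append(Path(archive))
    return archives
```

Writing the archives to a directory outside /novel_list/ keeps them from being picked up as novel folders on the next update.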

Sites featured:

  • Syosetu ncode and novel18
  • Kakuyomu
  • Wuxiaworld.com

Instructions

The input.txt is used to give the script the entries to download.
Each line is written in CSV style, with a semicolon as the separator (code;novelname).
The novel name can be left empty; in that case the script fetches the novel name from the site.
For example, n5947eg is the code of the novel found at https://ncode.syosetu.com/n5947eg/
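
To make the format concrete, here is a minimal sketch of reading such entries. It is an illustration, not the script's actual parser: `parse_entry` and `read_entries` are hypothetical names. Stripping whitespace around both fields is a deliberate choice, since a stray trailing space would otherwise end up in the folder name.

```python
def parse_entry(line):
    """Split one 'code;novelname' line into (code, name).

    Returns None for blank lines. An empty name means "let the script
    fetch the title from the site". Hypothetical helper, not project code.
    """
    line = line.strip()
    if not line:
        return None
    code, _, name = line.partition(";")
    return code.strip(), name.strip()


def read_entries(path="input.txt"):
    """Read every non-blank entry from the input file."""
    with open(path, encoding="utf-8") as fh:
        return [entry for entry in map(parse_entry, fh) if entry is not None]
```

With this sketch, `parse_entry("n5947eg;")` yields the code with an empty name, so a caller can tell that the title still has to be looked up.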

codes:

  • syosetu : code of the novel
  • syosetu 18+: n18code of the novel
  • kakuyomu : code of the novel
  • wuxiaworld : Name-Of-The-Novel
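
As a rough illustration of how a code maps to a page, the sketch below builds the novel URL for the two sites whose URL shapes appear on this page (the syosetu example above and the kakuyomu URL in the issue log further down). The site keys and the function `novel_url` are assumptions made for this example; the templates for novel18 and Wuxiaworld are left out because they are not shown here.

```python
# URL shapes taken from the examples on this page; the dictionary keys are
# hypothetical labels, not identifiers used by WNovelArchiver.
URL_TEMPLATES = {
    "syosetu": "https://ncode.syosetu.com/{code}/",
    "kakuyomu": "https://kakuyomu.jp/works/{code}",
}


def novel_url(site, code):
    """Return the table-of-contents URL for a novel code on a known site."""
    try:
        template = URL_TEMPLATES[site]
    except KeyError:
        raise ValueError(f"no URL template for site {site!r}") from None
    return template.format(code=code)
```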

wnovelarchiver's Issues

bug on download

Describe the bug
when downloading, the novel folder created is not found

To Reproduce
Steps to reproduce the behavior:

  1. put a space at the end of a line in input.txt
  2. launch the download
  3. See error

Expected behavior
folder named "$novelName " not found

Kakuyomu download failed

Describe the bug
Kakuyomu download failed

To Reproduce
Python 3.10, Windows 11

python archive_updater.py d -i 1177354054934538788;

output

Namespace(i='1177354054934538788;', md=False, func=<function option_download at 0x00000133DCB36680>)
1177354054934538788;
accessing: https://kakuyomu.jp/works/1177354054934538788

ゆるふわ農家の文字化けスキル~異世界で、ネット通販やってます~
  File "D:\xxxxxx\WNovelArchiver\archive_updater.py", line 143, in parser
    print(args.func(args))
  File "D:\xxxxxx\WNovelArchiver\archive_updater.py", line 85, in option_download
    download_cli(args.i)
  File "D:\xxxxxx\WNovelArchiver\src\main_functions.py", line 235, in download_cli
    novel.processNovel()
  File "D:\xxxxxx\WNovelArchiver\src\Downloaders.py", line 247, in processNovel
    resumeContent = self.parseTocResume(html)
  File "D:\xxxxxx\WNovelArchiver\src\Downloaders.py", line 234, in parseTocResume
    raise("parseTocResume method is not defined")
TypeError: exceptions must derive from BaseException
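
The final TypeError hides the intended message: Python only allows raising instances or subclasses of BaseException, so `raise("...")` with a plain string fails before the text is ever shown. A minimal sketch of the usual pattern follows, assuming `parseTocResume` is meant to be overridden per site. The class names and `describe_failure` are hypothetical; only the method name and the message come from the traceback.

```python
class Novel:
    """Hypothetical base class standing in for the downloader base."""

    def parseTocResume(self, html):
        # Raise a real exception type so the message reaches the user.
        raise NotImplementedError("parseTocResume method is not defined")


class DemoNovel(Novel):
    """Hypothetical site-specific subclass that provides the method."""

    def parseTocResume(self, html):
        return html.strip()


def describe_failure(novel, html=""):
    """Report whether a downloader implements parseTocResume."""
    try:
        novel.parseTocResume(html)
    except NotImplementedError as exc:
        return f"{type(novel).__name__}: {exc}"
    return "ok"
```

With this change the log would name the missing method instead of ending in an unrelated TypeError.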

Bug: missing words when a Japanese novel uses emphasis marks

Describe the bug
Emphasis marks in Japanese (dots above the characters) cause words to go missing

To Reproduce
example: https://kakuyomu.jp/works/16816452219814758602/episodes/16816452219847612555 (14.図書館)
See the last few lines of the chapter.
raw:"...というわけではなさそうですね」...(middle context)...【会話を続ける】ことが発動条件となるものもあります」"
【】 marks the emphasized text

outcome:"というわけではなさそうですね」
ことが発動条件となるものもあります」"

The middle context, including the emphasized words, is missing from the output.

Expected behavior
thanks for your work:) It would be perfect without this bug
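
The report does not identify the cause. One plausible explanation, stated here purely as an assumption, is that emphasized characters sit inside nested inline tags and a parser that keeps only a paragraph's direct text drops them. The sketch below uses the standard-library `html.parser` to collect all text inside each `<p>`, nested tags included. The sample markup in the usage note is invented for illustration and is not copied from Kakuyomu.

```python
from html.parser import HTMLParser


class ParagraphText(HTMLParser):
    """Collect the full text of every <p>, including nested inline tags."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self._buf = None  # text pieces of the paragraph being read

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._buf = []

    def handle_endtag(self, tag):
        if tag == "p" and self._buf is not None:
            self.lines.append("".join(self._buf))
            self._buf = None

    def handle_data(self, data):
        # Called for text at any nesting depth, so <em>/<span> content is kept.
        if self._buf is not None:
            self._buf.append(data)


def paragraph_lines(html):
    """Return one string per <p> element in the given HTML."""
    parser = ParagraphText()
    parser.feed(html)
    parser.close()
    return parser.lines
```

For example, feeding it `<p>A<em><span>B</span></em>C</p>` keeps the emphasized `B` in place rather than splitting or dropping it.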

fix init novel download not finding path

Describe the bug
When the given title (plus folder path) is longer than 250 characters, the script creates a folder with a truncated name, but the novel object's name isn't updated, which results in a "directory not found" error.
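
A minimal sketch of one way to avoid the mismatch: truncate the name once, then build both the stored name and the folder path from that same value. The 250-character limit comes from the report; `fit_folder_name` and the `Novel` class are hypothetical and do not mirror the project's actual classes.

```python
import os

MAX_PATH_LEN = 250  # limit mentioned in the report


def fit_folder_name(parent, title, max_len=MAX_PATH_LEN):
    """Shorten title so that parent/title stays within max_len characters."""
    # Room left once the parent directory and its separator are counted.
    room = max_len - len(os.path.join(parent, ""))
    if room <= 0:
        raise ValueError("parent path leaves no room for a folder name")
    # Drop trailing whitespace that truncation may leave at the cut point.
    return title[:room].rstrip()


class Novel:
    """Hypothetical novel object that keeps its name and path in sync."""

    def __init__(self, parent, title):
        self.title = fit_folder_name(parent, title)
        self.path = os.path.join(parent, self.title)
```

Because `path` is derived from the already truncated `title`, later lookups use the same folder name that was created on disk.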

Kakuyomu download does not always work

I found a strange bug when trying to download this novel:

https://kakuyomu.jp/works/16816452219449457673

No chapters are fetched, and the download ends.

Other Kakuyomu links download without any problem; only this one link returns nothing.
Things I tried:
- updating the pip packages
- changing folders
- changing the "name" parameter in input.txt
- leaving the "name" parameter blank
- wiping the novel list folder
- creating the respective folder manually
None of these helped; this is all I get:

C:\WNovelArchiver>python archive_updater.py d
Namespace(mode='d')
downloading
16816452219449457673;

C:\WNovelArchiver>
