wulfre / e621dl
This project forked from viljo-kovanen/e621dl
An automated download script for e621.net.
Seeing as so many people post things like "I run it and it just closes instantly," which really isn't helpful, and telling each of those people individually that a .bat file can help is tedious, is it possible to add some sort of pause on error?
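A minimal sketch of the kind of pause being requested, wrapping the script's entry point in a try/except (run_with_pause and the failing function are hypothetical, not e621dl's actual code):

```python
import traceback

def run_with_pause(func):
    """Run func(); if it raises, print the traceback and wait for a
    keypress so the console window does not vanish instantly."""
    try:
        func()
    except Exception:
        traceback.print_exc()
        input("An error occurred. Press Enter to close...")
```

A .bat file ending in `pause` achieves the same effect without touching the script.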
When I restart the download, I see a message about checking for bad files, and the downloader deletes ALL .swf and .webm files from the download folder.
my settings file looks like this:
[Settings]
last_run = 2017-01-01
parallel_downloads = 2
[Blacklist]
tags =
[explicit_50_jpg]
tags = rating:explicit, score:>=50
Also, I sometimes get bad downloads, and there is no way to check the MD5 now, since the files are named by post number. Is there a way to make filenames md5.ext (like the original e621dl) or post_number_md5.ext?
That way I could just use a program to compare filenames to their checksums and delete all bad downloads.
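If filenames did embed the MD5, a check like the following could flag bad downloads automatically (a sketch, assuming the hypothetical postnumber_md5.ext naming scheme requested above):

```python
import hashlib
from pathlib import Path

def file_md5(path):
    """Compute the MD5 of a file, reading in chunks so large videos fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def find_corrupted(folder):
    """Yield files whose embedded MD5 (postnumber_md5.ext) does not match their contents."""
    for path in Path(folder).iterdir():
        if not path.is_file() or "_" not in path.stem:
            continue
        expected = path.stem.rsplit("_", 1)[1]
        if file_md5(path) != expected.lower():
            yield path
```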
Is there a way to change the download folder location? I want it on a different drive but don't see a way to make it happen.
Getting a lot of these errors all of a sudden.
latest one:
File "C:\Python38\lib\site-packages\urllib3\connectionpool.py", line 665, in urlopen
httplib_response = self._make_request(
File "C:\Python38\lib\site-packages\urllib3\connectionpool.py", line 421, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File "C:\Python38\lib\site-packages\urllib3\connectionpool.py", line 416, in _make_request
httplib_response = conn.getresponse()
File "C:\Python38\lib\http\client.py", line 1322, in getresponse
response.begin()
File "C:\Python38\lib\http\client.py", line 303, in begin
version, status, reason = self._read_status()
File "C:\Python38\lib\http\client.py", line 264, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "C:\Python38\lib\socket.py", line 669, in readinto
return self._sock.recv_into(b)
File "C:\Python38\lib\ssl.py", line 1241, in recv_into
return self.read(nbytes, buffer)
File "C:\Python38\lib\ssl.py", line 1099, in read
return self._sslobj.read(len, buffer)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Python38\lib\site-packages\httpx\_utils.py", line 364, in as_network_error
yield
File "C:\Python38\lib\site-packages\httpx\_dispatch\urllib3.py", line 98, in send
conn = self.pool.urlopen(
File "C:\Python38\lib\site-packages\urllib3\poolmanager.py", line 330, in urlopen
response = conn.urlopen(method, u.request_uri, **kw)
File "C:\Python38\lib\site-packages\urllib3\connectionpool.py", line 719, in urlopen
retries = retries.increment(
File "C:\Python38\lib\site-packages\urllib3\util\retry.py", line 436, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='e621.net', port=443): Max retries exceeded with url: /posts.json?limit=320&tags=-type%3Agif+uyu+date%3A%3E%3D2020-04-03+ (Caused by ProtocolError('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None)))
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "e621dl.py", line 42, in <module>
posts = remote.get_posts(client, ' '.join(search['tags']), search['start_date'], last_id)
File "H:\E621\e621dl-3.1.1\e621dl\remote.py", line 5, in get_posts
response = client.get(
File "C:\Python38\lib\site-packages\httpx\_client.py", line 706, in get
return self.request(
File "C:\Python38\lib\site-packages\httpx\_client.py", line 570, in request
return self.send(
File "C:\Python38\lib\site-packages\httpx\_client.py", line 590, in send
response = self.send_handling_redirects(
File "C:\Python38\lib\site-packages\httpx\_client.py", line 617, in send_handling_redirects
response = self.send_handling_auth(
File "C:\Python38\lib\site-packages\httpx\_client.py", line 654, in send_handling_auth
response = self.send_single_request(request, timeout)
File "C:\Python38\lib\site-packages\httpx\_client.py", line 678, in send_single_request
response = dispatcher.send(request, timeout=timeout)
File "C:\Python38\lib\site-packages\httpx\_dispatch\urllib3.py", line 98, in send
conn = self.pool.urlopen(
File "C:\Python38\lib\contextlib.py", line 131, in __exit__
self.gen.throw(type, value, traceback)
File "C:\Python38\lib\site-packages\httpx\_utils.py", line 368, in as_network_error
raise NetworkError(exc) from exc
httpx._exceptions.NetworkError: HTTPSConnectionPool(host='e621.net', port=443): Max retries exceeded with url: /posts.json?limit=320&tags=-type%3Agif+uyu+date%3A%3E%3D2020-04-03+ (Caused by ProtocolError('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None)))
Press any key to continue . . .
What the title says - It would be useful to be able to download a certain number of posts, sorted by a certain criteria.
I can't run the new version e621dl-3.2.0
I tried installing python-2.7.13, e621dl-3.2.0 did not work.
I tried installing python-3.6.1, e621dl-3.2.0 did not work.
I tried installing python-3.6.1-amd64, e621dl-3.2.0 did not work.
e621dl-3.1.1 still works with python-2.7.13
So how do I get e621dl-3.2.0 to work?
Thanks!
Hello Wulfre! I'm so sorry for wasting your precious time with this stupid person...
I know you have more important things to do, but...
Every time I execute e621dl, it shows this:
"The entry point of the ucrtbase.terminate procedure could not be located in the api-ms-win dynamic-link library crt-runtime-l1-1-0.dll."
I translated this from Portuguese (Brazil).
Any suggestions for what I can do?
I'm so happy and grateful to talk with you; thanks, and sorry one more time!
See ya!
If you have a global blacklist of tags, but you still have some searches that use and allow one of those tags, are you able to use that tag in those searches or not?
I've been happily using this program for a few years now to keep my library up to date, but recently it has ceased to function without any change to my config.yaml.
Specifically, I'm getting the following:
[i] Running e621dl version 5.0.0.
[i] Getting config...
[i] Getting posts for search 'ardan_norgate'.
Traceback (most recent call last):
File ".\e621dl.py", line 42, in <module>
posts = remote.get_posts(client, ' '.join(search['tags']), search['start_date'], last_id)
File "D:\Downloaders\E621DL\e621dl\remote.py", line 12, in get_posts
response.raise_for_status()
File "C:\Users\Mark\AppData\Local\Programs\Python\Python38-32\lib\site-packages\httpx\_models.py", line 841, in raise_for_status
raise HTTPError(message, response=self)
httpx._exceptions.HTTPError: 403 Client Error: Forbidden for url: https://e621.net/posts.json?limit=320&tags=ardan_norgate+date%3A%3E%3D1994-01-10+
For more information check: https://httpstatuses.com/403
I'm not entirely sure what's causing this 403. I'm not [currently] behind a VPN, I don't believe my account is banned (I'm able to browse normally), etc. I'm able to manually access the URL in question (https://e621.net/posts.json?limit=320&tags=ardan_norgate+date%3A%3E%3D1994-01-10+) and see the expected JSON, but running the script results in an immediate failure.
Like, should it be an option, or should it use hard-linking, or does it make the script too complex, so it should not be added at all?
Please add the option to save metadata (such as descriptions, tags, source links, and notes) to an XML file that goes alongside each post file.
This information should be available from the e621 API: https://e621.net/wiki/show/e621:api
Saving this would be useful for posts that have translations made with the notes feature, or for posts with stories in the description field (see the story_in_description tag for examples).
While executing the e621dl.exe file on my computer, I came across the following error:
remote ERROR The tag *_obese is spelled incorrectly or does not exist.
I'm using Windows 10 Version 10.0.17025.1000 and Python 3.6.3. I believe your script doesn't have any way of handling wildcard characters, and when it encounters any, it crashes.
So I noticed this a couple of days back when I was making a sort of GUI for the app for personal use. Basically, I have my entire library (currently 70GB+) in the same folder as the new downloads. The script always freezes when it's done downloading a new tag group, because it checks for damaged files in the entire downloads folder. Certainly this isn't a really big issue (I could always move older downloads into a different folder), but it could be improved by only checking the newly downloaded tag groups (unless told otherwise) in the downloads folder. Just an observation!
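One possible improvement along these lines would be to limit the integrity pass to recently modified files (a sketch; the real script's damaged-file check works on the whole folder, and the age cutoff here is an arbitrary assumption):

```python
import time
from pathlib import Path

def recent_files(folder, max_age_hours=24):
    """Yield only files modified within the last max_age_hours,
    so an integrity pass can skip the untouched back catalog."""
    cutoff = time.time() - max_age_hours * 3600
    for path in Path(folder).rglob("*"):
        if path.is_file() and path.stat().st_mtime >= cutoff:
            yield path
```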
https://puu.sh/FoqAk/a00cc0a471.png
Keeps giving that error. Using the newest release.
After running, I get this error. It has only started happening in the last few days. My guess is it might have to do with Cloudflare blocking multiple requests in such a short time. A workaround might be adding a counter that pauses the download for a certain amount of time after so many requests. I would try it myself if I knew Python.
Traceback (most recent call last):
File "C:\Python27\Lib\multiprocessing\process.py", line 258, in _bootstrap
self.run()
File "C:\Python27\Lib\multiprocessing\process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "E:\New folder\e621dl-master alt\e621dl-master\lib\downloader.py", line 41, in download_monitor
update_progress(len(managed_list), total_items)
File "", line 2, in len
File "C:\Python27\Lib\multiprocessing\managers.py", line 758, in _callmethod
conn.send((self._id, methodname, args, kwds))
IOError: [Errno 232] The pipe is being closed
Traceback (most recent call last):
File "e621dl.py", line 125, in <module>
downloader.multi_download(download_list, cpu_count())
File "E:\New folder\e621dl-master alt\e621dl-master\lib\downloader.py", line 78, in multi_download
work.get(0xFFFF)
File "C:\Python27\Lib\multiprocessing\pool.py", line 558, in get
raise self._value
IOError: [Errno socket error] [Errno 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
I can supply Debug if you want it.
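The pause-after-so-many-requests workaround suggested above could be sketched like this (the burst size and cooldown are made-up numbers, not Cloudflare's actual limits):

```python
import time

class RequestThrottle:
    """Sleep for cooldown seconds after every burst_size requests,
    to avoid tripping rate limiting on the remote side."""
    def __init__(self, burst_size=50, cooldown=10.0):
        self.burst_size = burst_size
        self.cooldown = cooldown
        self.count = 0

    def tick(self):
        """Call once before each request."""
        self.count += 1
        if self.count % self.burst_size == 0:
            time.sleep(self.cooldown)
```

Usage would be a single `throttle.tick()` call before each download request.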
So, recently I have been having problems with 10054 errors: when the script runs for a while (like 15-30 minutes), it errors out with this traceback:
Traceback (most recent call last):
File "C:\Python38\lib\site-packages\urllib3\connectionpool.py", line 665, in urlopen
httplib_response = self._make_request(
File "C:\Python38\lib\site-packages\urllib3\connectionpool.py", line 421, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File "C:\Python38\lib\site-packages\urllib3\connectionpool.py", line 416, in _make_request
httplib_response = conn.getresponse()
File "C:\Python38\lib\http\client.py", line 1322, in getresponse
response.begin()
File "C:\Python38\lib\http\client.py", line 303, in begin
version, status, reason = self._read_status()
File "C:\Python38\lib\http\client.py", line 264, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "C:\Python38\lib\socket.py", line 669, in readinto
return self._sock.recv_into(b)
File "C:\Python38\lib\ssl.py", line 1241, in recv_into
return self.read(nbytes, buffer)
File "C:\Python38\lib\ssl.py", line 1099, in read
return self._sslobj.read(len, buffer)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Python38\lib\site-packages\httpx\_utils.py", line 364, in as_network_error
yield
File "C:\Python38\lib\site-packages\httpx\_dispatch\urllib3.py", line 98, in send
conn = self.pool.urlopen(
File "C:\Python38\lib\site-packages\urllib3\poolmanager.py", line 330, in urlopen
response = conn.urlopen(method, u.request_uri, **kw)
File "C:\Python38\lib\site-packages\urllib3\connectionpool.py", line 719, in urlopen
retries = retries.increment(
File "C:\Python38\lib\site-packages\urllib3\util\retry.py", line 436, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='e621.net', port=443): Max retries exceeded with url: /posts.json?limit=320&tags=-type%3Agif+huge_breasts+date%3A%3E%3D2022-09-15+id%3A%3C3616482 (Caused by ProtocolError('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None)))
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "e621dl.py", line 42, in <module>
posts = remote.get_posts(client, ' '.join(search['tags']), search['start_date'], last_id)
File "I:\E621\e621dl-3.1.1\e621dl\remote.py", line 5, in get_posts
response = client.get(
File "C:\Python38\lib\site-packages\httpx\_client.py", line 706, in get
return self.request(
File "C:\Python38\lib\site-packages\httpx\_client.py", line 570, in request
return self.send(
File "C:\Python38\lib\site-packages\httpx\_client.py", line 590, in send
response = self.send_handling_redirects(
File "C:\Python38\lib\site-packages\httpx\_client.py", line 617, in send_handling_redirects
response = self.send_handling_auth(
File "C:\Python38\lib\site-packages\httpx\_client.py", line 654, in send_handling_auth
response = self.send_single_request(request, timeout)
File "C:\Python38\lib\site-packages\httpx\_client.py", line 678, in send_single_request
response = dispatcher.send(request, timeout=timeout)
File "C:\Python38\lib\site-packages\httpx\_dispatch\urllib3.py", line 98, in send
conn = self.pool.urlopen(
File "C:\Python38\lib\contextlib.py", line 131, in __exit__
self.gen.throw(type, value, traceback)
File "C:\Python38\lib\site-packages\httpx\_utils.py", line 368, in as_network_error
raise NetworkError(exc) from exc
httpx._exceptions.NetworkError: HTTPSConnectionPool(host='e621.net', port=443): Max retries exceeded with url: /posts.json?limit=320&tags=-type%3Agif+huge_breasts+date%3A%3E%3D2022-09-15+id%3A%3C3616482 (Caused by ProtocolError('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None)))
While talking with some of the admins on the e621 Discord, they were sure it's between Cloudflare and me. But I've tested my connection and found no problems with it. They also confirmed that it is not a data transfer cap.
I have also noticed, in rare instances, that if an image is deleted by moderators after get_posts sees the post, but before or during remote.download_post, it will cause the script to crash with a 404 error. As that happens so rarely, I do not have a traceback for it.
Is it possible for exception handlers to catch these and keep the script from ending?
Edit: I am incompetent, forgot to add text :')
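A retry wrapper along the lines being asked for might look like this (a sketch; with_retries and the exception list are hypothetical stand-ins, not e621dl's actual code):

```python
import time

def with_retries(func, attempts=3, delay=5.0, swallow=(ConnectionError,)):
    """Call func(); on a listed exception, wait and try again.
    Returns None if every attempt fails, instead of crashing the run."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except swallow as exc:
            print(f"[!] Attempt {attempt}/{attempts} failed: {exc}")
            if attempt < attempts:
                time.sleep(delay)
    return None
```

In the script, the get_posts and download calls would be the natural places to wrap, with httpx's network/HTTP error types in the swallow list.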
I got this error when I ran it the second time. I made the config file and ran the program when this happened:
root@DegenerateDownloads:~/e621dl-4.4.1# python3 e621dl.py
e621dl INFO Running e621dl version 4.4.1.
e621dl INFO Checking for partial downloads.
e621dl INFO Parsing config.
Traceback (most recent call last):
File "e621dl.py", line 125, in <module>
print('\u250c' + '\u2500' * row_len + '\u2510')
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-71: ordinal not in range(128)
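One possible workaround, assuming Python 3.7+, is to force UTF-8 on stdout before the box-drawing characters are printed (a sketch, not the project's actual fix):

```python
import sys

# Reconfigure stdout so box-drawing characters like '\u250c' survive
# an ASCII or cp1252 console encoding (TextIOWrapper.reconfigure, 3.7+).
if hasattr(sys.stdout, "reconfigure"):
    sys.stdout.reconfigure(encoding="utf-8", errors="replace")

print('\u250c' + '\u2500' * 10 + '\u2510')
```

Setting the PYTHONIOENCODING=utf-8 environment variable before launching would have a similar effect without editing the script.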
Is there a way to combo-blacklist a group of tags, so that any post with all of said tags will be blacklisted, but not a post that has only one or two of them?
For example, if I put "male solo" in the combo blacklist, it won't download any posts with only a male in them, but will download any posts with only a female.
And is there a way to have multiple of these combo blacklists?
If there isn't, is there a way for this to become possible?
An example of this is in the settings of e621.us.to which lets people search e621.net on a mobile device.
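A combo blacklist boils down to an "all tags present" test per group; a sketch of that check (the function and the config shape are hypothetical, not existing e621dl options):

```python
def is_combo_blacklisted(post_tags, combo_blacklists):
    """Return True if every tag in any one combo appears on the post.
    A post matching only part of a combo is still allowed."""
    tags = set(post_tags)
    return any(set(combo) <= tags for combo in combo_blacklists)
```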
gnome@gnome-VirtualBox:~/Documents/e621dl-4.1.0$ python3 e621dl.py
Traceback (most recent call last):
File "e621dl.py", line 18, in <module>
import colorama
ImportError: No module named 'colorama'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "e621dl.py", line 20, in <module>
import pip
ImportError: No module named 'pip'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "e621dl.py", line 32, in <module>
import colorama
ImportError: No module named 'colorama'
I'm using the exe version of the program and input some tags. The program created the folder it was all supposed to go into and showed the interface as if it had downloaded things, but all values were set to 0. The program then said it was done and to hit enter to exit. As expected, nothing had been downloaded. I have no idea what is causing this, and would like help with it; as well as bringing this to the attention of the devs. Thank you in advance.
Downloads stopped working. When I start the downloader, it just closes after launching and showing some text that I can't read (it goes by too fast).
It was working just fine a week ago.
I just got this error after pulling the latest changes.
Looking into it, I think it's because default_search became search_defaults in commit 9310d00.
After that, I got the same error because in my config I was using login, and it was changed to auth in that commit.
EDIT: Just now I saw that constants.py was updated with the correct formatting; it's just weird that it didn't get reflected elsewhere :/
Most pictures are low-scored but have relatively high favorite counts,
because most people will add pictures to their favorites instead of upvoting them.
I think adding min_favCount to the keys could make the search results more precise.
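Such a filter would mirror the existing min_score check (a sketch; min_fav_count is the proposed key, and fav_count is assumed to be the field name in the e621 post JSON):

```python
def passes_fav_filter(post, min_fav_count=0):
    """Keep posts whose favorite count meets the configured minimum.
    Missing counts are treated as zero."""
    return post.get("fav_count", 0) >= min_fav_count
```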
Add a configuration option that'll make it delete any downloads a set number of days after your days-to-check option in the config.
My tag aliasing method is put together in a hacky manner and is breaking some functionality. A specific case I have discovered is with the tag 'pokemon', which contains the Unicode character 'é' in the e621 tag list and breaks the script.
I will clarify this issue in the future when I have more information.
When I attempt to run e621dl on Pop!_OS 17.10 with the latest release of Python and Requests it produces the following syntax error:
SyntaxError: Non-ASCII character '\xe2' in file e621dl.py on line 129, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details
I saw that adding # coding: utf-8
to the top of the file would fix the syntax error. So far, that seems to work.
I ran it on Python 3, and it gave me this when I ran it:
Traceback (most recent call last):
File "e621dl.py", line 107, in <module>
results = remote.get_posts(search_string, min_score, earliest_date, last_id, session)
File "C:\Users\(unimportnant garbo)\e621dl-4.4.0\e621dl-4.4.0\lib\remote.py", line 16, in get_posts
response.raise_for_status()
File "C:\Users\andol\AppData\Roaming\Python\Python36\site-packages\requests\models.py", line 893, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: https://e621.net/post/index.json
For some reason, the program closes before it even begins to search. Not sure why that is. I could maybe try running the Python version and see if it gives me an error code.
Here is all it does now.
e621dl INFO Running e621dl version 4.4.1.
e621dl INFO Checking for partial downloads.
e621dl INFO Parsing config.
Traceback (most recent call last):
File "e621dl.py", line 74, in
File "e621dl.py", line 74, in
File "lib\remote.py", line 71, in get_tag_alias
File "site-packages\requests\models.py", line 935, in raise_for_status
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://e621.net/tag/index.json
[15384] Failed to execute script e621dl
Press any key to continue . . .
It would be really useful for searching large collections of e621 images if their tags were stored in the file's title or image description (I don't know if PNG has a description field).
Glad it's back and working with the new changes to e621! Thanks for your hard work!
I updated my copy of it and I was getting
Traceback (most recent call last):
File "e621dl.py", line 115, in <module>
print(f"[\u2713] Post {post['id']} is being downloaded.")
File "~~~~~~~~~~\AppData\Local\Programs\Python\Python37-32\lib\encodings\cp1252.py", line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2713' in position 1: character maps to <undefined>
when I ran it.
I looked, and it appears my system doesn't like the checkmark in print(f"[✓] Post {post['id']} is being downloaded."). I don't know if that's a Windows thing, a Windows 7 thing, a Python thing, or a Python 3.7 thing. Obviously, since it's your code, you don't have this problem, but would you happen to be familiar with the issue? I'm not so up on Python, so I'm not sure what to do to resolve it.
Thanks again for your work on this!
So I set up my Task Scheduler to automatically run e621dl, but it runs the default searches instead of my searches. I added the number signs before the default searches thinking that was the problem, but that didn't change anything.
config.yaml:
blacklist:
search_defaults:
days: 999999999999999999999
min_score: 0
min_fav_count: 0
allowed_ratings:
- e
#searches:
# cats:
# tags:
# - cat
# - yellow_fur
# dogs:
# tags:
# - dog
# - brown_fur
#
# The most common search structure has already been exemplified, but you may overwrite any of the default search settings for a specific search.
#
# searches:
# dogs:
# days: 30
# min_score: 10
# min_fav_count: 10
# allowed_ratings:
# - s
# - q
# - e
# tags:
# - dog
# - brown_fur
searches:
Latias:
days: 2
tags:
- latias
Ember Spyro:
days: 2
tags:
- ember_(spyro)
If the Blacklist section is empty, the following error is thrown:
Traceback (most recent call last):
File ".\e621dl.py", line 115, in
elif [x for y in tags if any(fnmatch(x, y) for y in blacklist)]:
File ".\e621dl.py", line 115, in
elif [x for y in tags if any(fnmatch(x, y) for y in blacklist)]:
TypeError: 'NoneType' object is not iterable
Config File is attached
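The crash happens because an empty YAML section parses to None, which the fnmatch filter then tries to iterate. A defensive normalization after loading the config would sidestep it (a sketch, not the script's actual code):

```python
def normalize_blacklist(config):
    """Ensure 'blacklist' is always an iterable, even when the
    YAML section is left empty (which parses to None)."""
    config["blacklist"] = config.get("blacklist") or []
    return config
```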
Is there a way to log in at the moment to see hidden posts?
I've been tinkering with it, and I'm not that great with code.
I download a LOT of stuff and the MD5 name is the only way to check for errors.
Sometimes the downloads crash and the file is corrupted/not fully downloaded.
When you have 200GB+ of downloaded files, using a renamer tool to compare [PostName]_[MD5].[ext] against the existing files is the only practical way to find corrupted ones.
I am using http://www.den4b.com/products/renamer for that.
At least add options for the filename format: just the post name, and postname_md5.
Thank you!
I am still not completely sure about doing this. The main benefit is that you would not need to add negative tags to searches anymore, freeing up the first 3 "true" tag slots, or even better, allowing you to have more than 3 negative tags.
Hello,
This might be a stupid question since my Python knowledge is very limited, but is there any way to change how the config file works so that, instead of specifying a number of days in the past to check, you specify a date to download from when you first add a section, and on each run the script writes the current date into the date slot of every section it updates? Or, if the e621 API only takes a number of days, could the script take the date value in the config and subtract it from the current date to determine how many days in the past to check?
Thanks!
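If the e621 API only takes a relative number of days, the conversion described above is a small datetime calculation (a sketch; days_since and the ISO date format are assumptions, not existing config behavior):

```python
from datetime import date

def days_since(start, today=None):
    """Turn a 'download from this date' config value (YYYY-MM-DD)
    into the days-in-the-past count the existing code expects."""
    today = today or date.today()
    start_date = date.fromisoformat(start)
    return max((today - start_date).days, 0)
```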
File "e621dl.py", line 145
print ''
^
SyntaxError: Missing parentheses in call to 'print'
Hello!
Would it be possible to specify a days value which downloads files from the first upload of the artist to the present? I realize I might be able to use the date: tags, but I am unsure whether or not the days value will conflict with the tag. Would it be possible to implement such a system?
Thank you kindly for your consideration on the matter!
I've been having issues with installing it. Could you make a really detailed, step-by-step installation guide? I've done everything it says (I'm using the exe, by the way), but it just opens and says there's an error, or it closes automatically. Please help!