BTC: bc1qefrtlq8yw0qeljyf4pfj7qrg8fd6edaswhrh4l
ETH: 0xf31Ba658fff0D85829991Ed292f55234234B00d5
Reddit archiver
License: MIT License
Add sorting by date, vote count, comment count, etc., as with DataTables, for example.
Add POSTGRES_PASSWORD envar to .env file
When I first set this up, I checked submissions?author=-Archivist and it worked, but now I have to use submissions?author=-archivist (lower-case a).
The same goes for subreddits: subreddit=AskReddit is now subreddit=askreddit.
I think this is a breaking change. Reddit ignores capitalization and returns pages for both upper and lower case, so your API should too, for when people copy and paste names as they appear on Reddit. Examples:
https://old.reddit.com/user/-archivist/
https://old.reddit.com/user/-Archivist/
https://old.reddit.com/r/askreddit/ (actually corrects/redirects)
https://old.reddit.com/r/AskReddit/
The suggested fix is to support both, ideally, given the inconsistencies in ps data.
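One way to make lookups tolerant of capitalization is to normalize the incoming value and compare against a lower-cased column. The following is only a sketch, not the project's actual query code: the column names `author` and `subreddit` and the `%s` placeholder style (as used by psycopg2) are assumptions.

```python
# Hypothetical column names; the real schema may differ.
ALLOWED_COLUMNS = {"author", "subreddit"}


def normalize_name(value: str) -> str:
    """Return a canonical lower-case form of a user or subreddit name."""
    return value.strip().lower()


def case_insensitive_filter(column: str, value: str) -> tuple[str, tuple[str, ...]]:
    """Build a parameterized WHERE fragment that ignores capitalization.

    The column is checked against a whitelist because it is interpolated
    into the SQL text; the value is passed separately as a parameter.
    """
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unsupported column: {column}")
    return f"LOWER({column}) = %s", (normalize_name(value),)
```

With this approach, `-Archivist` and `-archivist` produce the same filter, so either spelling copied from Reddit would match. On large tables an index on the lower-cased expression would likely be needed to keep such queries fast.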
Add a toggle to hide/collapse removed/deleted comments in threads, so that users don't have to scroll through heavily nuked threads to find the remaining comments.
Don't remove them altogether, because remaining comments are sometimes nested among removed ones.
After running docker pull postgres, I tried to run this with the given code, but I got the error below:
The path /applications/postgres-docker is not shared from the host and is not known to Docker. You can configure shared paths from Docker -> Preferences... -> Resources -> File Sharing.
It looks like docker pull postgres does not work properly, but I'm not sure yet.
Please help me solve this problem. Thanks.
Add options to enable/disable usernames in the browsable index.
The default should stay anonymous, but add the option to enable usernames. Ideally this should be able to be turned on/off without having to reroll.
Add an alternate image when the thread has no thumbnail and make it more clear what this is. Change 'toggle' to 'toggle thumbnail'? (also note to users that it's external?)
Everything has to have a dark mode these days 🤣
Can we get some more documentation on Elasticsearch? docker-compose-es.yml seems to get it running, but setting the host to http://localhost doesn't seem to work, as I get "curl: (7) Failed to connect to localhost port 9200 after 2 ms: Connection refused" when trying to use es_batch.sh. I noticed that the docker container created by compose doesn't mention port 9200 anywhere, so I tried adding it in, but I get the same error.
EDIT:
Changing the ports in compose to "127.0.0.1:9200:9200" seems to get the connection working. But now I'm getting this error:
split: illegal option -- -
usage: split [-l line_count] [-a suffix_length] [file [prefix]]
split -b byte_count[K|k|M|m|G|g] [-a suffix_length] [file [prefix]]
split -n chunk_count [-a suffix_length] [file [prefix]]
split -p pattern [-a suffix_length] [file [prefix]]
split: illegal option -- -
usage: split [-l line_count] [-a suffix_length] [file [prefix]]
split -b byte_count[K|k|M|m|G|g] [-a suffix_length] [file [prefix]]
split -n chunk_count [-a suffix_length] [file [prefix]]
split -p pattern [-a suffix_length] [file [prefix]]
Warning: Couldn't read data from file "./torrents_es_sub.*", this makes an
Warning: empty POST.
{"error":{"root_cause":[{"type":"parse_exception","reason":"request body is required"}],"type":"parse_exception","reason":"request body is required"},"status":400}
Warning: Couldn't read data from file "./torrents_es_com.*", this makes an
Warning: empty POST.
{"error":{"root_cause":[{"type":"parse_exception","reason":"request body is required"}],"type":"parse_exception","reason":"request body is required"},"status":400}
EDIT 2:
I had to remove --verbose from the split commands in es_batch.sh. It seems to run fine, but the search still doesn't work. I get: Error 500. Something went wrong or searching is disabled
EDIT 3:
The data appears to be in the volume in the es01 container:
However, the file size of 19.2MB matches the processed submission size, but 86.9MB does not match the processed comments size, which is 94.1MB. I'm not sure if this is the problem.
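The `split: illegal option -- -` errors and the usage text above are consistent with the BSD `split` shipped with macOS, which does not accept GNU long options such as `--verbose`. As a platform-independent alternative, the chunking step could be done in Python. This is a minimal sketch and not a drop-in replacement for es_batch.sh: the function name, chunk naming scheme, and chunk size are assumptions.

```python
from pathlib import Path


def split_lines(src, prefix, lines_per_chunk):
    """Split src into files named <prefix>0000, <prefix>0001, ...

    Each chunk holds at most lines_per_chunk lines. Files are handled in
    binary mode so line contents are copied through unchanged.
    """
    if lines_per_chunk <= 0:
        raise ValueError("lines_per_chunk must be positive")
    chunks = []
    out = None
    with Path(src).open("rb") as fh:
        for i, line in enumerate(fh):
            if i % lines_per_chunk == 0:
                # Start a new chunk file every lines_per_chunk lines.
                if out is not None:
                    out.close()
                path = Path(f"{prefix}{len(chunks):04d}")
                out = path.open("wb")
                chunks.append(path)
            out.write(line)
    if out is not None:
        out.close()
    return chunks
```

If the input is in Elasticsearch bulk format, where each action line is followed by its document line, choosing an even `lines_per_chunk` keeps those pairs in the same chunk.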
Add a loading indicator to browsable indexes. When the database is busy, I've noticed the page looks like it's doing nothing for 4-10 seconds when loading a thread. It would be nice to notify users that content is loading rather than leaving them thinking nothing is happening.
Sorry this may not be an issue but just a very basic question. I have now successfully built with docker, and I do see the Redarc site when opening localhost. However, I'm not sure how exactly to query things. Based on your API instructions, I tried things like this:
http://localhost/api/search/comments?body=love
But this would always result in an 'Internal Server Error'. Any help would be much appreciated!
First off, thanks so much for starting this project. Much appreciated!
I have followed your instructions for Docker installation. The first 3 commands work fine. But then when running the $ docker build . -t redarc command, I get the error below.
[12/16] RUN npm ci:
#0 0.512 npm ERR! code EUSAGE
#0 0.514 npm ERR!
#0 0.514 npm ERR! The `npm ci` command can only install with an existing package-lock.json or
#0 0.514 npm ERR! npm-shrinkwrap.json with lockfileVersion >= 1. Run an install with npm@5 or
#0 0.514 npm ERR! later to generate a package-lock.json file, then try again.
#0 0.515 npm ERR!
#0 0.515 npm ERR! Clean install a project
#0 0.515 npm ERR!
#0 0.515 npm ERR! Usage:
#0 0.515 npm ERR! npm ci
#0 0.515 npm ERR!
#0 0.515 npm ERR! Options:
#0 0.515 npm ERR! [-S|--save|--no-save|--save-prod|--save-dev|--save-optional|--save-peer|--save-bundle]
#0 0.515 npm ERR! [-E|--save-exact] [-g|--global] [--global-style] [--legacy-bundling]
#0 0.515 npm ERR! [--omit <dev|optional|peer> [--omit <dev|optional|peer> ...]]
#0 0.515 npm ERR! [--strict-peer-deps] [--no-package-lock] [--foreground-scripts]
#0 0.515 npm ERR! [--ignore-scripts] [--no-audit] [--no-bin-links] [--no-fund] [--dry-run]
#0 0.515 npm ERR! [-w|--workspace [-w|--workspace ...]]
#0 0.515 npm ERR! [-ws|--workspaces] [--include-workspace-root] [--install-links]
#0 0.515 npm ERR!
#0 0.515 npm ERR! aliases: clean-install, ic, install-clean, isntall-clean
#0 0.515 npm ERR!
#0 0.515 npm ERR! Run "npm help ci" for more info
#0 0.516
#0 0.516 npm ERR! A complete log of this run can be found in:
#0 0.516 npm ERR! /root/.npm/_logs/2023-06-01T13_12_09_578Z-debug-0.log
ERROR: failed to solve: process "/bin/sh -c npm ci" did not complete successfully: exit code: 1
I'm running into a number of errors attempting to get this running. If any of them are real errors and not my own mistakes, then I will create separate issues for them. For now, I am assuming this is my own misunderstanding of the instructions, hence this issue requesting better documentation.
I can't for the life of me figure this second one out. I tried searching the codebase for references to the submissions zst files and couldn't find anything.
For number 3, I found I needed the following to get the first script running:
python3
python3-pip
pip install psycopg2-binary
I ran the first script on reddit/submissions/2023-09.zst. I am unsure if that's what I'm supposed to do. Anyway, I tried running it from both docker exec inside the redarc container and from outside the container. Either way, I would get some sort of connection error. Wrong password or connection refused, depending on... I don't know. Oddly, it seems to be attempting to connect to localhost? That's not where the postgres db is. And the working directory is only in the redarc container. Maybe I'm misunderstanding this.
The web frontend does load. But obviously as above, there's no subreddits listed.
Cheers and thanks for the excellent frontend.
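On the connection errors described above: a loader script that defaults to localhost will not reach a database running in a separate container unless the host is overridden. One common pattern is to read the connection settings from environment variables. The sketch below is an illustration only and does not reflect how the project's scripts are actually configured; `PG_HOST`, `PG_PORT`, `PG_USER`, and `PG_DATABASE` are hypothetical names, while `POSTGRES_PASSWORD` is the variable mentioned elsewhere on this page.

```python
import os


def postgres_params(env=None):
    """Collect PostgreSQL connection settings from environment variables.

    All variable names except POSTGRES_PASSWORD are hypothetical.
    Defaults point at a local server on the standard port.
    """
    env = os.environ if env is None else env
    return {
        "host": env.get("PG_HOST", "localhost"),
        "port": int(env.get("PG_PORT", "5432")),
        "user": env.get("PG_USER", "postgres"),
        "password": env.get("POSTGRES_PASSWORD", ""),
        "dbname": env.get("PG_DATABASE", "postgres"),
    }


# Usage with psycopg2 (requires psycopg2-binary to be installed):
#   import psycopg2
#   conn = psycopg2.connect(**postgres_params())
```

When running from the host machine, the host would be wherever the database port is published; when running inside a container on the same Docker network, it would typically be the database service name.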
Add total submission/comment count to index.
A lot of academic research on Reddit involves political stuff. I know the UI is mainly supposed to be a demo, but if you are going to add a few more subreddits those might be interesting contenders.
Thanks so much for making and maintaining this -- really awesome project and super impressive progress so far!
Traceback (most recent call last):
  File "/home/red/redarc_docker/redarc/scripts/load_sub.py", line 33, in <module>
    gilded = sub_dict['gilded']
KeyError: 'gilded'
When entering AskReddit_submissions
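The traceback shows that the record being loaded has no `gilded` key, so the direct lookup on line 33 of load_sub.py raises `KeyError`. A tolerant lookup with `dict.get` would avoid the crash. This is a sketch of the idea only: the helper name is hypothetical, and using 0 as the fallback is an assumption about what a missing value should mean.

```python
import json


def get_gilded(sub_dict):
    """Return the 'gilded' count, or 0 when the field is absent or null."""
    value = sub_dict.get("gilded", 0)
    # Treat an explicit null the same as a missing field (assumption).
    return 0 if value is None else value


# Example with two made-up records, one of which lacks the field.
with_field = json.loads('{"id": "abc", "gilded": 2}')
without_field = json.loads('{"id": "def"}')
print(get_gilded(with_field), get_gilded(without_field))
```

The same pattern could be applied to any other optional field that is not present in every record.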