kanishka-linux / reminiscence

Self-Hosted Bookmark And Archive Manager

License: GNU Affero General Public License v3.0

Python 44.99% HTML 3.23% CSS 0.27% JavaScript 51.45% Dockerfile 0.06%

selfhosted django archive self-hosted bookmark bookmark-manager bookmarks

reminiscence's Introduction

Reminiscence

Self-hosted Bookmark and Archive manager

Features
Installation
- Normal Method
- Using Docker
Documentation
Future Roadmap
Motivation

Features

Bookmark links and edit their metadata (like title, tags, summary) via web-interface.
Archive link content in HTML, PDF or full-page PNG format.
Automatic archival of links to non-HTML content like pdf, jpg, txt etc.

i.e. bookmarking links to pdf, jpg etc. via the web-interface will automatically save those files on the server.
Supports archival of media elements of a web-page using third party download managers.
Directory based categorization of bookmarks.
Automatic tagging of HTML links.
Automatic summarization of HTML content.
Special readability mode.
Search bookmarks according to url, title, tags or summary.
Supports multiple user accounts.
Supports public and group directory for every user.
Upload any file from web-interface for archiving.
Easy to use admin interface for managing multiple users.
Import bookmarks from Netscape Bookmark HTML file format.
Supports streaming of archived media elements.
Annotation support for both HTML and its readable version.
Annotation support for both archived and uploaded pdf/epub files.
Remembers last read position of html (and its readable version), pdf and epub.
Rudimentary support for adding custom notes.

Installation

First make sure that Python 3.9+ (recommended version is 3.10+) is installed on the system, then install the following packages using the native package manager.

 1. virtualenv

 2. ~wkhtmltopdf (for html to pdf/png conversion)~ deprecated from v4.0+ due to security vulnerability.

     * [hlspy](https://github.com/kanishka-linux/hlspy) is now default headless browser which is based on QTWebEngine.

 3. hlspy (mandatory from v4.0+)

 4. redis-server

 5. chromium (optional from v0.2+)

 6. PyQt5

 7. PyQtWebEngine

Installation of above dependencies in Arch or Arch based distros

 $ sudo pacman -S python-virtualenv redis chromium python-pyqt5 qt5-webengine python-pyqtwebengine

Installation of above dependencies in Debian or Ubuntu based distros

 $ sudo apt install virtualenv redis-server chromium-browser python3-pyqt5 python3-pyqt5.qtwebengine

Install hlspy

 $ sudo pip3 install git+https://github.com/kanishka-linux/hlspy

Note: Names of the above dependencies may differ depending on distro or OS, so install accordingly. Once the above dependencies are installed, execute the following commands in a terminal; they are distro/platform independent.

$ mkdir reminiscence

$ cd reminiscence

$ virtualenv -p python3 venv

$ python3 -m venv venv (for python3.10+)

$ source venv/bin/activate

$ cd venv

$ git clone https://github.com/kanishka-linux/reminiscence.git

$ cd reminiscence

$ source hlspy.env

$ pip install -r requirements.txt

$ mkdir logs archive tmp

$ python manage.py generatesecretkey

$ python manage.py nltkdownload

$ python manage.py migrate

$ python manage.py createsuperuser

$ python manage.py runserver 0.0.0.0:8000

Open 0.0.0.0:8000 using any browser, login and start adding links.

**Note:** replace localhost address with local ip address of your server to access web-interface from anywhere on the local network.

Admin interface available at: /admin/

Setting up Celery (mandatory from v0.4 onwards):

Generating PDFs and PNGs is resource intensive and time consuming. We can delegate these tasks to celery, in order to execute them in the background. Edit the reminiscence/settings.py file and set `USE_CELERY = True`.

Now open another terminal in the same topmost project directory and execute following commands:

 $ sudo systemctl start redis-server

 $ cd venv

 $ source bin/activate

 $ cd reminiscence

 $ source hlspy.env

 $ celery -A reminiscence worker --loglevel=info -c 4 --detach

Using Docker

Note: The following procedure may not work exactly as described from v4.0+. The dockerfiles have been updated, but users may still face some issues, in which case they are advised to make changes in the respective Dockerfile or docker-compose file as required.

Using docker is convenient compared to the normal installation method described above. It will take care of configuring and setting up gunicorn, nginx and a postgresql database, along with redis and the worker. (Setting up and running these three things can be a bit cumbersome if done manually, which is described below in separate sections.) It will also automatically download the headless browser hlspy and the nltk data set, apart from installing python based dependencies.

Note: from v4.0+, wkhtmltopdf is replaced with hlspy. Users are advised to migrate to v4.0 due to a security vulnerability in wkhtmltopdf. Users who find it difficult to migrate should at least disable automatic pdf/png generation of web-pages in older reminiscence versions and use chromium manually for pdf generation instead.

Install docker and docker-compose.
Enable/start the docker service. Instructions for enabling docker might differ between distros. Sample instructions for enabling/starting docker will look like
```
 $ sudo systemctl enable docker.service
 $ sudo systemctl start docker.service
```

clone github repository and enter directory

 $ git clone https://github.com/kanishka-linux/reminiscence.git

 $ cd reminiscence

build and start

 $ sudo docker-compose up --build

 Note: Above instruction will take some time when executed for the first time.

Above step will also create default user: 'admin' with default password: 'changepassword'

If IP address of server is '192.168.1.2' then admin interface will be available at

 192.168.1.2/admin/

 Note: In this method, there is no need to
       attach port number to IP address.

Change the default admin password from the admin interface and create a new regular user. After that logout, and open '192.168.1.2'. Now login as the regular user for regular activity.
For custom configuration, modify nginx.conf and the dockerfiles available in the repository. After that, execute the build and start step (docker-compose up --build) again.

Note: Windows users facing problems in mounting the data volume for Postgres are advised to refer to this issue.

Note: Ubuntu 16.04 users might have to modify the docker-compose.yml file and change version 3 to 2 (see this issue).

Note: For setting up celery inside docker, follow these instructions. Sometimes gunicorn doesn't work properly with the default background task handler inside docker. In such cases users can enable celery.

Documentation

Adding Directories And Links

Creating Directory

Users first have to create a directory from the web interface.

Note: Currently '/' and a few other special characters are not allowed in directory names. Users who face problems when accessing a directory are advised to rename it and remove special characters.

Adding Links

Users have to navigate to the required directory and then add links to it. URLs are fetched asynchronously from the source for gathering metadata initially. Users have to wait for a few seconds, after which the page will refresh automatically showing new content. If nothing shows up after the automatic page refresh (e.g. due to slow URL fetching), try refreshing the page manually by clicking on the directory entry again. Maybe in future, I will have to look into django channels and websockets to enable real-time duplex communication between client and server.

Automatic Tagging and Summarization

This feature has been implemented using the NLTK library. The library is used for proper tokenization and for removing stopwords from sentences. Once stopwords are removed, the top K high frequency words (where the value of K is decided by the user) are used as tags. In order to generate a summary of HTML content, a score is allotted to each sentence based on the frequency of the non-stopwords contained in it. After that, the highest scoring sentences (forming one third of the total content) are used to generate the summary. It is one of the simplest methods for automatic tagging and summarization, hence not perfect. It can't tag a group of meaningful words, e.g. it will not consider 'data structure' as a single tag. Supporting multi-word tags is in the TODO list of the project.

About summarization, there are many advanced methods which may give better results, which users can find in this paper. Both these features need to be activated from the Settings box. They are off by default.
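
The frequency-based approach described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: it uses only the standard library instead of NLTK, and the tiny `STOPWORDS` set, the regex tokenizer and the function names `top_k_tags` and `summarize` are assumptions made for the example.

```python
import re
from collections import Counter

# Tiny illustrative stopword set (assumption); the project relies on NLTK's lists.
STOPWORDS = {"a", "an", "and", "are", "as", "for", "in", "is", "it",
             "of", "on", "or", "the", "to", "with"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def top_k_tags(text, k):
    """Return the k most frequent non-stopword tokens as tags."""
    words = [w for w in tokenize(text) if w not in STOPWORDS]
    return [word for word, _ in Counter(words).most_common(k)]


def summarize(text, fraction=1 / 3):
    """Keep the highest scoring sentences (about `fraction` of them), in original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(w for w in tokenize(text) if w not in STOPWORDS)
    # Score of a sentence = sum of the document-wide frequencies of its non-stopwords.
    scored = [
        (sum(freq[w] for w in tokenize(s) if w not in STOPWORDS), index)
        for index, s in enumerate(sentences)
    ]
    keep = max(1, round(len(sentences) * fraction))
    best = sorted(scored, key=lambda item: (-item[0], item[1]))[:keep]
    return " ".join(sentences[index] for index in sorted(i for _, i in best))
```

Because every token is scored on its own, a phrase like 'data structure' is split into two independent candidates, which is the limitation mentioned above.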

Reader mode

Once the user opens a link using the inbuilt reader, the application will try to present text content properly formatted for mobile devices whenever possible. In reader mode the user will also find the options Original, PDF and PNG in the top header. These options will be available only when the user has archived the link in those formats. Options for selecting archive file formats are available in every user's Settings box. If the Original format is selected, users can see the text content along with the original stylesheet and linked images. Javascript will be removed from the original file format for security reasons. If the page can't be displayed due to lack of javascript, users have to depend on either the PDF or full-page PNG formats.

Generating PDF and PNG

PDF and full-page screenshot in PNG format of an HTML page will be generated using wkhtmltopdf. It is a headless tool, but in some distros it might not be packaged with the headless feature. In such cases, users have to run it using Xvfb. In order to use it headlessly using Xvfb, set USE_XVFB = True in the reminiscence/settings.py file and then install xvfb using the command line.

Note: Use Xvfb only when wkhtmltopdf is not packaged with the headless feature.

Note: Alternatively, users can also download the official headless wkhtmltopdf for their respective distro/OS from here. The only problem is that users will have to update the package manually on their own for every new update.

Why not use Headless Chromium?

Currently headless chromium doesn't support full page screenshots, otherwise I might have used it blindly. There is another headless browser, hlspy, based on QtWebEngine, which I built for my personal use. hlspy can generate the entire html content, a pdf document and a full page screenshot in one single request, and that too using just one single process. In both chromium and wkhtmltopdf, one has to execute at least two separate processes for doing the same thing. The main problem with hlspy is that it is not completely headless: it can't run without X. It requires xvfb for running in a headless environment.

In future, I'll try to provide a way to choose between different backends (i.e. chromium, wkhtmltopdf or hlspy) for performing these tasks.

Note: From v0.2+ onwards, support for headless Chromium has been added for generating HTML and PDF content. Users can use this feature if the default archived content has some discrepancies. Users need to install Chromium to use this feature.

Archiving Media Elements

Note: This feature is available from v0.2+ onwards

In the settings.py file, add your favourite download manager to the DOWNLOAD_MANAGERS_ALLOWED list. Defaults are curl and wget. In case of the docker based method, users have to make corresponding changes in the dockersettings.py file. For large arbitrary files with direct download links, curl and wget are good enough. For complex use cases users will need something like youtube-dl, which they have to install and manage on their own and which needs to be added to the DOWNLOAD_MANAGERS_ALLOWED list.

Open the web-interface settings box and add the command to the Download Manager field:

 ex: wget {iurl} -O {output}

 iurl -> input url
 output -> output path

 OR

 ex: youtube-dl {iurl} -o {output}

Users should not substitute anything for the {iurl} and {output} fields; they should be kept as they are. In short, users should just write the regular command with parameters and leave the {iurl} and {output} fields untouched. (Note: do not even remove the curly brackets.)
The Reminiscence server will take care of setting up the input url, i.e. {iurl}, and the output path, i.e. {output}.
If the user is using youtube-dl as a download manager, then it is advisable to install ffmpeg along with it. In this case the user has to take care of regularly updating youtube-dl on their own. In a docker based installation, users have to add installation instructions for ffmpeg to the Dockerfile, and then modify requirements.txt to add youtube_dl as a dependency.
The web-interface settings box also contains a streaming option. If this option is enabled, HTML5 compliant media files can be played inside browsers; otherwise they will be available for download on the client machine.
Users upgrading from an older version are advised to apply database migrations using the following commands before using the new features:
```
 python manage.py makemigrations

 python manage.py migrate
```

Finally, when adding url to any directory just prepend md: to url, so that the particular entry will be recognized by custom download manager.

 ex=> md:https://some-website-with-media-link.org/media-link

 Every entry added by this way will be treated as containing media

 or single arbitrary file with direct download link.
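
To make the `md:` prefix and the `{iurl}`/`{output}` placeholders more concrete, here is a small sketch of how such a template could be expanded. It is an assumption about the general mechanism, not the server's actual implementation: the helper names `split_media_prefix` and `build_command` are hypothetical, and the allow-list simply mirrors the documented defaults of DOWNLOAD_MANAGERS_ALLOWED.

```python
import re
import shlex

# Assumed to mirror the DOWNLOAD_MANAGERS_ALLOWED list in settings.py (documented defaults).
DOWNLOAD_MANAGERS_ALLOWED = ["curl", "wget"]

_PLACEHOLDER = re.compile(r"\{(iurl|output)\}")


def split_media_prefix(entry):
    """Return (is_media, url) for an entry typed into the url box."""
    if entry.startswith("md:"):
        return True, entry[len("md:"):]
    return False, entry


def build_command(template, url, output_path):
    """Expand a template such as 'wget {iurl} -O {output}' into an argument list."""
    args = shlex.split(template)
    if not args or args[0] not in DOWNLOAD_MANAGERS_ALLOWED:
        raise ValueError("download manager is not in DOWNLOAD_MANAGERS_ALLOWED")
    values = {"iurl": url, "output": output_path}
    # Substitute per argument so the url and path each stay a single argument.
    return [_PLACEHOLDER.sub(lambda m: values[m.group(1)], arg) for arg in args]
```

Building an argument list rather than one shell string keeps the url and output path from being re-split by a shell, which is one reason the placeholders have to stay intact in the configured command.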

Archived files are normally saved in the archive folder. Users can change the location of this folder via the settings.py file. Users should note that in order to archive media files, the archive location should not contain any space, e.g. the archive location '/home/user/my downloads/archive' is not allowed. However, a location without spaces such as '/home/user/my_downloads/archive' is allowed.
By default, archived media links are not shared with anyone. However, users can create public links for some fixed time. Once a public link has been created, it will remain valid for 24 hours. Users can change this value by changing the value of VIDEO_ID_EXPIRY_LIMIT in settings.py. These public links are also useful for playing non-HTML5 compliant archived media on regular media players like mpv/mplayer/vlc etc. It is also possible to generate a playlist in m3u format for a directory containing media links, which can be played by any popular media player.
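
As a rough illustration of the two ideas above (time-limited public links and m3u playlists), consider the following sketch. It is not taken from the project: the unit of VIDEO_ID_EXPIRY_LIMIT is assumed to be hours, and the helper names `link_is_valid` and `build_m3u`, as well as the shape of the public urls, are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: the limit is expressed in hours, matching the documented 24 hour default.
VIDEO_ID_EXPIRY_LIMIT = 24


def link_is_valid(created_at, now, limit_hours=VIDEO_ID_EXPIRY_LIMIT):
    """A public link stays valid until `limit_hours` have passed since creation."""
    return now - created_at < timedelta(hours=limit_hours)


def build_m3u(entries):
    """Build extended m3u playlist text from (title, public_url) pairs."""
    lines = ["#EXTM3U"]
    for title, url in entries:
        lines.append(f"#EXTINF:-1,{title}")  # -1 means the duration is unknown
        lines.append(url)
    return "\n".join(lines) + "\n"
```

A playlist built this way can be saved with an .m3u extension and opened in players such as mpv or vlc for as long as the public links in it remain valid.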

Annotation And Read-it-later Feature

This is the latest feature and is available from v0.3+ onwards. This feature allows addition, deletion and modification of annotations.

Users can annotate an archived HTML page, its readable version and also its pdf version.
Users can also annotate archived or uploaded pdf/epub files.
The application will remember the last read position of html, pdf and epub.

Annotation support works well on desktop browsers. On mobile, this feature works mostly on firefox (for annotating html/pdf/epub).

How to use this feature on desktop browsers?

Highlight text -> an annotation balloon will pop up -> click on it -> add/save comment.
Click on the back button at the bottom right corner to save the last read position and go back to the previous page.

How to use this feature on mobile firefox?

Double tap on the starting word from where you want to highlight -> selection markers will appear (and the annotation balloon too, but don't tap on it) -> drag the end of the selection marker to the desired end point -> now single tap on the last word -> an annotation balloon will pop up -> tap on the balloon -> add/save comment.
Click on the back button at the bottom right corner to save the last read position and go back to the previous page.

How have these features been implemented?

For annotation, annotator.js has been used at the client side.
PDFs are displayed using pdf.js within the browser, on which an annotation layer is applied using annotator.js.
EPUBs are displayed using epub.js within the browser, on which an annotation layer is applied using annotator.js.
Annotation data for each file and the last read position are saved at the backend.

REST API

Reminiscence uses Django Rest Framework for exposing a few functionalities via REST endpoints (available from v0.3+ onwards).

Following are a few examples of API usage using cURL

 $ curl -d username=mypy -d password=foobarbaz http://127.0.0.1:8000/restapi/login/

The token obtained with the above request needs to be passed in the header of every subsequent request. We'll call the token AUTHTOKEN for the rest of the examples.

Add url to Reminiscence instance in a specific directory (POST) /restapi/add-url/

 $ curl -H "Authorization: Token AUTHTOKEN" -d url="https://mr.wikipedia.org" -d media_link=no -d directory="/Wiki/Marathi" http://127.0.0.1:8000/restapi/add-url/

List all urls added to a specific directory (POST) /restapi/list-added-urls/

 $ curl -H "Authorization: Token AUTHTOKEN" -d directory="/Wiki" http://127.0.0.1:8000/restapi/list-added-urls/

List all directories (GET) /restapi/list-directories/

 $ curl -H "Authorization: Token AUTHTOKEN" http://127.0.0.1:8000/restapi/list-directories/

Logout and remove token (GET) /restapi/logout/

 $ curl -H "Authorization: Token AUTHTOKEN" http://127.0.0.1:8000/restapi/logout/
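
The same calls can be prepared from Python with the standard library. The sketch below only builds the requests and does not send them; the helper name `api_request` is hypothetical, and the layout of the responses (for example, which field carries the token) is not described here, so it is deliberately left out.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://127.0.0.1:8000"  # same local address as in the cURL examples


def api_request(path, token=None, data=None):
    """Build a request: POST with form data when `data` is given, GET otherwise."""
    body = urlencode(data).encode() if data is not None else None
    headers = {"Authorization": f"Token {token}"} if token else {}
    return Request(BASE_URL + path, data=body, headers=headers)


# Login: form fields as in the first cURL example.
login = api_request("/restapi/login/",
                    data={"username": "mypy", "password": "foobarbaz"})

# Add a url to a specific directory (POST).
add_url = api_request(
    "/restapi/add-url/",
    token="AUTHTOKEN",
    data={"url": "https://mr.wikipedia.org", "media_link": "no",
          "directory": "/Wiki/Marathi"},
)

# List all directories (GET).
list_dirs = api_request("/restapi/list-directories/", token="AUTHTOKEN")
```

Each prepared request can then be sent with `urllib.request.urlopen(request)` against a running instance.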

Running Tests

When running all tests, exclude async tests. Async tests need to run separately.
```
  $ python manage.py test --exclude-tag=async
```
Only the tests_drf.py file contains async tests, so run it separately.
```
  $ python manage.py test tests.tests_drf
```

Note Taking

From v0.3 onwards, users can add arbitrary notes to their collection. Support for note taking is rudimentary and provides note taking using a simple WYSIWYG editor.

For adding a note, use the following command in the input url box:

    note:New note

    above command will create *New note* in the current folder

Public-Private-Group directories

By default, all directories and all links are private and are not shared with anyone. However, users can select one public directory and one group directory from all available directories for sharing links. Users can set the public and group directory via settings. Links placed in the public directory will be available for public viewing, and links placed in the group directory will be available to a pre-determined list of users selected by the account holder.

Public links of a user can be accessed at the url:

    /username/profile/public

Group links of a user can be accessed by the pre-determined group of users at the url:

    /username/profile/group

Searching Bookmarks

Bookmarks can be searched according to title, url, tag or summary by using the search menu available in the topmost navigation bar. By default, bookmarks are searched according to title. In order to search according to url, tag or summary, users have to prefix the search term with url:, tag:, or sum: in the search box.

Note: Special search prefix tag-wall: will display all available tags.
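
The prefix convention above can be pictured with a small parser. This is only a sketch of the idea; the function name `parse_search` and the returned field labels are hypothetical and do not describe how the application implements search internally.

```python
# Maps each documented search prefix to the field it selects.
SEARCH_PREFIXES = {"url:": "url", "tag:": "tag", "sum:": "summary"}


def parse_search(query):
    """Return (field, term) for the text typed into the search box."""
    query = query.strip()
    if query.startswith("tag-wall:"):
        # Special prefix: show all available tags, no search term needed.
        return "tag-wall", ""
    for prefix, field in SEARCH_PREFIXES.items():
        if query.startswith(prefix):
            return field, query[len(prefix):].strip()
    # No prefix: search by title, which is the documented default.
    return "title", query
```

For example, `parse_search("tag:python")` selects the tag field, while a bare term such as `django` falls back to a title search.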

About Database

By default, reminiscence uses an sqlite database, but users can replace it with any database supported by the django ORM, like postgresql. Some simple instructions for using postgresql with django are available here. Users can also take a look at this wiki for proper postgresql database setup. The instructions may vary depending on the OS and distribution you are using.
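
For orientation, a postgresql entry in a Django settings file generally looks like the fragment below. This is a generic Django sketch rather than the project's shipped configuration: the environment variable names and the fallback values are placeholders chosen for this example, and the psycopg2 driver has to be installed separately.

```python
import os

# Generic Django DATABASES setting for postgresql.
# The REMINISCENCE_DB_* variable names and defaults are hypothetical placeholders.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": os.environ.get("REMINISCENCE_DB_NAME", "reminiscence"),
        "USER": os.environ.get("REMINISCENCE_DB_USER", "reminiscence"),
        "PASSWORD": os.environ.get("REMINISCENCE_DB_PASSWORD", ""),
        "HOST": os.environ.get("REMINISCENCE_DB_HOST", "127.0.0.1"),
        "PORT": os.environ.get("REMINISCENCE_DB_PORT", "5432"),
    }
}
```

After changing the database backend, the migrate command from the installation section has to be run again so that the tables are created in the new database.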

Understanding Settings Files

reminiscence folder contains three settings files

1. settings.py

2. defaultsettings.py

3. dockersettings.py

In the normal installation procedure, the settings.py file is used. Any changes the user makes to it will take effect in the normal installation method.
In the docker based method, the dockersettings.py file is used. Settings from this file will be copied during docker installation.
defaultsettings.py is the backup file. If the user has somehow corrupted the settings files while manually editing them, the original settings can be restored using this file.

Gunicorn plus Nginx setup

(optional)

Install gunicorn, if not installed (pip install gunicorn).
Instead of using the python manage.py runserver command mentioned in the installation instructions above, use the following command. Users can change parameters according to need. Only make sure to keep the value of the timeout argument somewhat large. A larger timeout value is useful if upload speed is slow and the user wants to upload a relatively large body from the web-interface.
```
  $ gunicorn --max-requests 100 --worker-class gthread --workers 2 --threads 5 --timeout 300 --bind 0.0.0.0:8000 reminiscence.wsgi
```
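
The same parameters can also be kept in a gunicorn configuration file, which is plain Python. The following is a sketch of an equivalent `gunicorn.conf.py` under the assumption that it is placed in the project root; the `wsgi_app` setting requires a reasonably recent gunicorn (20.1 or later), otherwise the application still has to be named on the command line.

```python
# gunicorn.conf.py (assumed to live next to manage.py)

bind = "0.0.0.0:8000"      # address and port to listen on
workers = 2                # number of worker processes
worker_class = "gthread"   # threaded workers
threads = 5                # threads per worker
timeout = 300              # generous timeout for slow, large uploads
max_requests = 100         # recycle a worker after this many requests
wsgi_app = "reminiscence.wsgi"  # needs gunicorn 20.1+
```

With this file in place the server can be started with `gunicorn -c gunicorn.conf.py`.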

Install nginx using the native package manager of the distro and then make adjustments to the nginx config files as given below. The following is a sample configuration. Adjust it according to need, but pay special attention to the proxy_read_timeout and client_max_body_size variables. Incorrect values of these two variables can make upload from the web-interface impractical.

      worker_processes  2;
  
      events {
          worker_connections  1024;
      }
  
  
      http {
          include       mime.types;
          default_type  application/octet-stream;
      
          sendfile        on;
          sendfile_max_chunk 512k;
          keepalive_timeout  65;
          proxy_read_timeout 300s;
      
          server {
              listen       80;
              server_name  localhost;
              client_max_body_size 1024m;

              location /static/ {
                  root /home/reminiscence/venv/reminiscence; # root of project directory
                  aio threads;
              }
              location = /favicon.ico { access_log off; log_not_found off; }
              location / {
                  proxy_pass http://127.0.0.1:8000;
                  proxy_set_header Host $host;
                  proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
                  root /home/reminiscence/venv/reminiscence; # root of project directory
              }

              error_page   500 502 503 504  /50x.html;
              location = /50x.html {
                  root   /usr/share/nginx/html;
              }
          }
      }

Once the nginx config file is properly configured, start/enable nginx.service. For detailed instructions take a look at this tutorial or refer to this wiki. There are some barebone instructions available here, which users might find useful.
When using gunicorn as web server and nginx as reverse proxy, it is necessary to add the static files of the admin interface to the static folder. Otherwise, the admin interface won't render properly. Users can do it manually. Alternatively, they can modify the settings.py file and add
```
  STATIC_ROOT = os.path.join(BASE_DIR, "static")
```

to it. After that, collect static files using the command

    $ python manage.py collectstatic

Once the static files of the admin interface have been collected in the static folder, users should remove STATIC_ROOT from settings.py before running the web server.

Motivation

Until a few years ago, I used to think that once something has been published on the web, it is going to remain there forever in some form or other. But the web of today is different. Now we never know when some valuable web resource (like web-pages, images, text, pdf etc.) will disappear from the web completely. There might be a variety of reasons for the disappearance (e.g. the author of the resource lost interest in maintaining it, low traffic, or some other political-economic reasons). I don't want to go into details, but there are plenty of reasons due to which a web resource that we savoured in the past might become unavailable in the future. If we are lucky, we may find mirrors of popular sites of the past, archived by volunteers. But the same can't be said of obscure and rare web content. So, I decided to develop Reminiscence for saving personal memories of the web effectively and in a well organized manner, which somehow turned into a hybrid of bookmark manager and personal wayback machine.

reminiscence's Issues

logo design contribution

Hi @kanishka-linux

I am designing logos to contribute to open source software. To make projects more recognizable... I can design a few logo samples for you if you want. I'll wait for feedback. Thanks!

Support for WARC format

I just wanted to point out that there's a dedicated file format for archiving webpages called Web ARChive (WARC) [1]. It's an open standard used by libraries and afaik can also be uploaded to the waybackmachine [2].
Support for this format for the "archive" functionality would be quite nice and fitting.

[1] https://en.wikipedia.org/wiki/Web_ARChive
[2] https://www.archiveteam.org/index.php/Frequently_Asked_Questions

Support keywords extraction for other current languages

Hello,

currently it seems that in the keywords extraction process, stop words are hard coded to be for English language. Thus, when archiving content in some other language, the selected keywords are very often stop words in that language (I mainly archive content in French...)

Maybe the list of stop words could be selected dynamically, based on automatic language detection ? (see https://github.com/Mimino666/langdetect for example)

Thanks for great product :)

docker.env username variable not read by web container (hardcoded)

I modified the standard settings in docker.env but after the docker installation I get these logs. The connection is fine if I change only the db name and the password.

 File "/usr/local/lib/python3.6/site-packages/django/db/migrations/recorder.py", line 73, in applied_migrations
   if self.has_table():
 File "/usr/local/lib/python3.6/site-packages/django/db/migrations/recorder.py", line 56, in has_table
   return self.Migration._meta.db_table in self.connection.introspection.table_names(self.connection.cursor())
 File "/usr/local/lib/python3.6/site-packages/django/db/backends/base/base.py", line 256, in cursor
   return self._cursor()
 File "/usr/local/lib/python3.6/site-packages/django/db/backends/base/base.py", line 233, in _cursor
   self.ensure_connection()
 File "/usr/local/lib/python3.6/site-packages/django/db/backends/base/base.py", line 217, in ensure_connection
   self.connect()
 File "/usr/local/lib/python3.6/site-packages/django/db/utils.py", line 89, in __exit__
   raise dj_exc_value.with_traceback(traceback) from exc_value
 File "/usr/local/lib/python3.6/site-packages/django/db/backends/base/base.py", line 217, in ensure_connection
   self.connect()
 File "/usr/local/lib/python3.6/site-packages/django/db/backends/base/base.py", line 195, in connect
   self.connection = self.get_new_connection(conn_params)
 File "/usr/local/lib/python3.6/site-packages/django/db/backends/postgresql/base.py", line 178, in get_new_connection
   connection = Database.connect(**conn_params)
 File "/usr/local/lib/python3.6/site-packages/psycopg2/__init__.py", line 130, in connect
   conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
django.db.utils.OperationalError: FATAL:  password authentication failed for user "postgres"
/usr/local/lib/python3.6/site-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: <http://initd.org/psycopg/docs/install.html#binary-install-from-pypi>.
 """)
Traceback (most recent call last):
 File "/usr/local/lib/python3.6/site-packages/django/db/backends/base/base.py", line 217, in ensure_connection
   self.connect()
 File "/usr/local/lib/python3.6/site-packages/django/db/backends/base/base.py", line 195, in connect
   self.connection = self.get_new_connection(conn_params)
 File "/usr/local/lib/python3.6/site-packages/django/db/backends/postgresql/base.py", line 178, in get_new_connection
   connection = Database.connect(**conn_params)
 File "/usr/local/lib/python3.6/site-packages/psycopg2/__init__.py", line 130, in connect
   conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: FATAL:  password authentication failed for user "postgres"
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
 File "manage.py", line 15, in <module>
   execute_from_command_line(sys.argv)
 File "/usr/local/lib/python3.6/site-packages/django/core/management/__init__.py", line 381, in execute_from_command_line
   utility.execute()
 File "/usr/local/lib/python3.6/site-packages/django/core/management/__init__.py", line 375, in execute
   self.fetch_command(subcommand).run_from_argv(self.argv)
 File "/usr/local/lib/python3.6/site-packages/django/core/management/base.py", line 323, in run_from_argv
   self.execute(*args, **cmd_options)
 File "/usr/local/lib/python3.6/site-packages/django/core/management/base.py", line 364, in execute
   output = self.handle(*args, **options)
 File "/usr/src/reminiscence/pages/management/commands/createdefaultsu.py", line 9, in handle
   if not qlist:
 File "/usr/local/lib/python3.6/site-packages/django/db/models/query.py", line 278, in __bool__
   self._fetch_all()
 File "/usr/local/lib/python3.6/site-packages/django/db/models/query.py", line 1242, in _fetch_all
   self._result_cache = list(self._iterable_class(self))
 File "/usr/local/lib/python3.6/site-packages/django/db/models/query.py", line 55, in __iter__
   results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
 File "/usr/local/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 1138, in execute_sql
   cursor = self.connection.cursor()
 File "/usr/local/lib/python3.6/site-packages/django/db/backends/base/base.py", line 256, in cursor
   return self._cursor()
 File "/usr/local/lib/python3.6/site-packages/django/db/backends/base/base.py", line 233, in _cursor
   self.ensure_connection()
 File "/usr/local/lib/python3.6/site-packages/django/db/backends/base/base.py", line 217, in ensure_connection
   self.connect()
 File "/usr/local/lib/python3.6/site-packages/django/db/utils.py", line 89, in __exit__
   raise dj_exc_value.with_traceback(traceback) from exc_value
 File "/usr/local/lib/python3.6/site-packages/django/db/backends/base/base.py", line 217, in ensure_connection
   self.connect()
 File "/usr/local/lib/python3.6/site-packages/django/db/backends/base/base.py", line 195, in connect
   self.connection = self.get_new_connection(conn_params)
 File "/usr/local/lib/python3.6/site-packages/django/db/backends/postgresql/base.py", line 178, in get_new_connection
   connection = Database.connect(**conn_params)
 File "/usr/local/lib/python3.6/site-packages/psycopg2/__init__.py", line 130, in connect
   conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
django.db.utils.OperationalError: FATAL:  password authentication failed for user "postgres"

Cannot remake subfolder with same name as previously deleted subfolder

I made a subfolder youtube/Videos, then I deleted it and wanted to remake it but when I enter "Videos" in the "Make new subfolder" input box it sends the request and doesn't make it. Furthermore, if I type in anything else it works just fine

It sends the request just fine but doesn't make the folder.

Way to save bookmark via Javascript bookmark

Bookmarking should be a quick and easy task.

To have to navigate to Reminiscence and paste in the URL is a lot of labor, especially on mobile.

Is there a solution in place for this that I did not see, or is this something that could be added at some point?

Issues using MySQL backend

Hi,

I'm attempting some tinkering with Reminiscence, specifically switching the database backend to MySQL in settings.py, as I primarily use a MariaDB instance for a number of applications hosted on the same machine. However, there are some issues around character sets.

Using the MySQL default latin1 character set, the migrations work and everything is mostly functional. However, saving a URL occasionally fails with a Unicode character error, like this:

django.db.utils.OperationalError: (1366, "Incorrect string value: '\\xE2\\x80\\x90 A ...' for column reminiscence.pages_library.summary at row 1")

Dropping that database and trying again with utf8mb4 character set (which is MySQL's implementation of standard utf8) is worse, as the migrations fail with a row too large error (1118). Checking through them I can see that some of your Varchar fields are quite large (4096 or more), but after doing some tinkering I can't quite pinpoint the problem.

Do you have any insight on the issue?
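
One possible contributor to the 1118 error, sketched as arithmetic: MySQL caps the combined declared size of a row's columns at 65,535 bytes, and a utf8mb4 VARCHAR reserves up to 4 bytes per character where latin1 reserves 1. The column names and lengths below are hypothetical and are not the real Reminiscence schema; the point is only that several VARCHAR(4096) columns can fit under latin1 and overflow under utf8mb4.

```python
# Rough worst-case row size for VARCHAR columns under two MySQL charsets.
MYSQL_MAX_ROW_BYTES = 65535            # MySQL's per-row limit for declared columns
BYTES_PER_CHAR = {"latin1": 1, "utf8mb4": 4}


def varchar_bytes(length, charset):
    """Worst-case bytes a VARCHAR(length) column contributes to the row."""
    data = length * BYTES_PER_CHAR[charset]
    prefix = 1 if data <= 255 else 2   # length prefix stored with the value
    return data + prefix


def row_bytes(columns, charset):
    """Sum the worst-case size of every column in the (hypothetical) table."""
    return sum(varchar_bytes(n, charset) for n in columns.values())


# Hypothetical columns for illustration only.
columns = {"field_a": 2048, "field_b": 4096, "field_c": 4096,
           "field_d": 4096, "field_e": 4096}

for charset in BYTES_PER_CHAR:
    total = row_bytes(columns, charset)
    verdict = "fits" if total <= MYSQL_MAX_ROW_BYTES else "too large"
    print(charset, total, verdict)
```

If this is the cause, shortening the largest VARCHAR fields or switching them to TEXT-style fields would be the usual direction to explore, but that has not been checked against the actual models.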

'ascii' codec can't encode character

After I add a link and try to read the page, it returns an error:

'ascii' codec can't encode character '\u2026' in position 8422: ordinal not in range(128)

Full error log here

Using the Chrome-HTML option works fine.

I've had a search, but can't find anything related, so any suggestions are welcome!
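
For readers hitting the same message, here is a minimal sketch of this class of error and one way around it. It only illustrates the mechanism; where exactly Reminiscence encodes the text is not shown in the report. A common trigger is a process running under a C/POSIX locale, where Python may fall back to ASCII for text I/O, which is an assumption about this setup rather than a confirmed diagnosis.

```python
import os
import tempfile

text = "Read more\u2026"  # ends with U+2026 HORIZONTAL ELLIPSIS

# Encoding with the ASCII codec reproduces the class of error from the report.
try:
    text.encode("ascii")
    error_message = None
except UnicodeEncodeError as err:
    error_message = str(err)

# Naming the encoding explicitly avoids depending on the process locale.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "page.html")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(text)
    with open(path, encoding="utf-8") as fh:
        round_trip = fh.read()

print(error_message)
print(round_trip == text)
```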

Donation/Patreon/etc

Are there any plans for a Patreon or another donation platform through which users can show their appreciation? Reminiscence is my full-time bookmark and archival software, so I'd love to show my support for it!

How can I save webpages that require an account?

I want to save webpages that require an account. I already have an account, but I don't know how to log in from Reminiscence.

I use the Docker container, but if necessary I can use a Raspberry Pi without Docker.

postgres cannot init, operation not permitted

Postgres cannot run in Docker. It looks like a chmod problem. How can I solve this?

chmod: changing permissions of '/var/lib/postgresql/data': Operation not permitted
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

Reminiscence as Subdomain with Nginx reverse proxy

Hi,

Thanks for this amazing project. I would like to set up a reverse proxy to a subdomain like this:

reminiscence.domain.xx

But I am not very familiar with reverse proxies. Could anyone help me do it, or give me some info so I can do it myself?

I'm using this Docker container for all my reverse proxying:

https://hub.docker.com/r/linuxserver/letsencrypt/

With this:

https://github.com/linuxserver/reverse-proxy-confs

And if possible I will submit the config to that repository as a pull request.

Thanks.

Merging Folders and multiple select

Thank you for this project, it is AWESOME!!

I have one issue. I have imported several bookmark backups, and for whatever reason the same folder was imported twice, as
"Foldername"
"Foldername 1"

An example is I have an Arduino bookmarks folder on 2 computers. After importing it shows up as
"Arduino"
"Arduino 1"

Is there an easy way to merge both folders? Are there any plans to have a check-box to select multiple bookmarks and move them to another folder or tag them? I've been pretty happy with Buku server but it doesn't implement the archiving and auto-tagging. However, it does allow you to easily merge/move bookmarks around.

Thanks for this project!!

[HELP] When trying to log in, reminiscence reports Forbidden error (403)

When I try to log in, reminiscence reports a Forbidden error (403).

Forbidden (403)
CSRF verification failed. Request aborted.
You are seeing this message because this HTTPS site requires a 'Referer header' to be sent by your Web browser, but none was sent. This header is required for security reasons, to ensure that your browser is not being hijacked by third parties.
If you have configured your browser to disable 'Referer' headers, please re-enable them, at least for this site, or for HTTPS connections, or for 'same-origin' requests.
If you are using the <meta name="referrer" content="no-referrer"> tag or including the 'Referrer-Policy: no-referrer' header, please remove them. The CSRF protection requires the 'Referer' header to do strict referer checking. If you're concerned about privacy, use alternatives like <a rel="noreferrer" ...> for links to third-party sites.

How can I fix this issue?

Python: 3.6.10
Browser: Google Chrome 79.0.3945.130, Safari 13.0.4 (15608.4.9.1.3), Firefox 72.0

Archiving Media Element of a web page

A few commits have been pushed to the devel branch which will help users archive media elements of a web page.

How to Enable Archiving of Media elements?

In the settings.py file, add your favourite download manager to the DOWNLOAD_MANAGERS_ALLOWED list. The defaults are curl and wget. With the Docker-based method, users have to make the corresponding changes in the dockersettings.py file.

Open the web-interface settings box and add the command to the Download Manager field:

 ex: wget {iurl} -O {output}

 iurl -> input url
 output -> output path

Users should not substitute anything for the 'iurl' and 'output' fields; {iurl} and {output} should be kept as they are.
The Reminiscence server will take care of the input and output fields. However, the position of these two fields may change depending on the type of download manager. Users can add extra parameters to this command.
If youtube-dl is used as the download manager, it is advisable to install ffmpeg along with it. In this case users have to take care of regularly updating youtube-dl on their own.
The web-interface also contains a streaming option. If this option is enabled, HTML5-compliant media files can be played inside browsers; otherwise they will be available for download on the client machine.
Currently, it is not advisable to test this on an existing database. However, users who want to test this feature on a test database are advised to apply database migrations using the following commands before using the new features:
```
 python manage.py makemigrations

 python manage.py migrate
```
Finally, when adding a url to any directory, just prepend md: to the url so that the entry will be recognized by the custom download manager.
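
To make the {iurl}/{output} template concrete, here is a minimal sketch of how such a command string could be expanded into an argument list. This is an assumption about the general technique, not the code Reminiscence actually runs; `build_command` and the example paths are hypothetical.

```python
import shlex


def build_command(template, url, output_path):
    """Split the template into arguments first, then fill the placeholders,
    so spaces in the URL or path cannot create extra arguments."""
    return [
        part.replace("{iurl}", url).replace("{output}", output_path)
        for part in shlex.split(template)
    ]


cmd = build_command(
    "wget {iurl} -O {output}",
    "https://example.com/a b.mp4",
    "/tmp/archive/a b.mp4",
)
print(cmd)
```

A list built this way can be handed to `subprocess.run(cmd)` without `shell=True`, which keeps the URL from being interpreted by a shell.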

Feature: export annotations

Would like to request adding a way to export annotations. Adding the feature to the REST API would be nice as well.

Docker image does not find static content

Ref: commit 8299635
Running docker compose as in the README, we are unable to get the static files.
There are a lot of errors on web_1, like:

web_1 | /usr/local/lib/python3.6/site-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: http://initd.org/psycopg/docs/install.html#binary-install-from-pypi.
web_1 | """)
web_1 | WARNING:django.request:Not Found: /static/css/bootstrap.min.css
...
WARNING:django.request:Not Found: /static/js/main.js

How can we fix it?

Is there a REST API?

Is there a REST API for Reminiscence? I just did a couple of quick searches and didn't find any mention of one.

Docker won't install i/o timeout

What should I do here? I can't pull the required data for the docker install. See screenshot below.

Responsible disclosure policy

Hello 👋

I run a security community that finds and fixes vulnerabilities in OSS. A researcher (@Gaurav-G2) has found a potential issue, which I would be eager to share with you.

Could you add a SECURITY.md file with an e-mail address for me to send further details to? GitHub recommends a security policy to ensure issues are responsibly disclosed, and it would help direct researchers in the future.

Looking forward to hearing from you 👍

(cc @huntr-helper)

docker-compose up --build : Error: postgres container doesn't start up but throws error

Hi,

couldn't get the project running via docker because:

When doing sudo docker-compose up --build the db container doesn't start up but throws error:

Error: Database is uninitialized and superuser password is not specified.
       You must specify POSTGRES_PASSWORD for the superuser. Use
       "-e POSTGRES_PASSWORD=password" to set it in "docker run".

       You may also use POSTGRES_HOST_AUTH_METHOD=trust to allow all connections
       without a password. This is *not* recommended. See PostgreSQL
       documentation about "trust":
       https://www.postgresql.org/docs/current/auth-trust.html

this is related to this breaking change in image postgres and discussed here

setting the env vars in docker-compose.yml fixes this.
i'll send you a pull request for this (hope you don't mind).

greetings, Sven

Subfolder show amount of links

For a top-level folder without any subfolders you can clearly see the number of links. However, for a folder that has a couple of subfolders containing links, the number of links inside is not shown.

Ideally it should count up all the links from the subfolders and output them like it does with a top-level folder
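
A minimal sketch of the requested roll-up, assuming folders are identified by slash-separated paths such as youtube/Videos and that a per-folder count of directly stored links is available. The `direct_counts` mapping is hypothetical data, not something read from the Reminiscence database.

```python
def total_links(direct_counts, folder):
    """Links stored directly in `folder` plus those in any nested subfolder."""
    prefix = folder + "/"
    return sum(
        count
        for path, count in direct_counts.items()
        if path == folder or path.startswith(prefix)
    )


# Hypothetical per-folder counts of directly stored links.
direct_counts = {
    "youtube": 2,
    "youtube/Videos": 5,
    "youtube/Videos/Talks": 3,
    "youtube2": 7,  # a sibling whose name merely starts with "youtube"
}

print(total_links(direct_counts, "youtube"))
print(total_links(direct_counts, "youtube/Videos"))
```

Matching on `folder + "/"` rather than on the bare name keeps a sibling like youtube2 out of the total for youtube.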

docker-compose version important?

The default versions of docker.io and docker-compose that you get from the standard apt-get repos in Ubuntu 16.04.3 complain when I run docker-compose up --build:

$ sudo docker-compose up --build
ERROR: Version in "./docker-compose.yml" is unsupported. You might be seeing this error because you're using the wrong Compose file version. Either specify a version of "2" (or "2.0") and place your service definitions under the `services` key, or omit the `version` key and place your service definitions at the root of the file to use version 1.
For more on the Compose file format versions, see https://docs.docker.com/compose/compose-file/

I switched the version to 2 in the docker-compose.yml file, and it works, but was there something in 3 we needed to make all of this work, or is version 2 of the docker-compose engine sufficient?

Can't find where to create users

This is my admin panel:

I followed the Docker install and there is no way to create users.

Can't add URL links

Hello
I just did a fresh install of Reminiscence on my Debian system (no errors during the install process).
In Reminiscence I can upload and save images or PDFs, but when I try to add a URL
nothing happens and no error is shown (with or without http://).
In the terminal I find this pseudo-error:
INFO:django.server:"GET /pi/AddToReminiscence HTTP/1.1" 200 5655
307::INFO::vinanti::start: queue = False
INFO:vinanti.vinanti:queue = False
"POST /pi/AddToReminiscence HTTP/1.1" 200 5656
...
Do you have any clue ?
Many Thanks

Installation failed "Permission denied"

After running sudo docker-compose up --build I get

ERROR: Service 'web' failed to build: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"rootfs_linux.go:58: mounting \\\"proc\\\" to rootfs \\\"/var/lib/docker/vfs/dir/36fa96f72ff32dc39f833fdc73a2832fddd1b90d78d3201fb243e27b414bacb7\\\" at \\\"/proc\\\" caused \\\"permission denied\\\"\"": unknown

Installation on fresh Debian 10 on a VPS (LXC).

Thanks!

Support serving reminiscence using subdirectories

I wanted to create a pull request for this, but I was not completely sure whether this is actually a problem or an issue with my configuration. Initially, I attempted to run the service under a subdirectory called bookmark. Here is the nginx location configuration that I am using.

  location /bookmark {
     rewrite ^/bookmark/(.*)$ /$1 break;
     proxy_pass http://localhost:8002;
     proxy_set_header Host $host;
     proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
     proxy_cookie_path / /bookmark;
     proxy_set_header X-Script-Name /bookmark;
     alias /home/aerex/Documents/repos/git/reminiscence/reminiscence;
  }
  location /bookmark/static {
     alias /home/aerex/Documents/repos/git/reminiscence/reminiscence/static;
  }

Login works, but any other links simply forward to root, resulting in a 404. I looked into the code and saw that the base urls are generated dynamically for each user and directory.

A solution would be to simply add in the subdirectory if the FORCE_SCRIPT_NAME setting is set or is passed in as a request header. Any other suggestions?
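
A minimal sketch of the idea, assuming the prefix comes from a setting such as Django's FORCE_SCRIPT_NAME. The helper name `prefixed` is hypothetical and this is not Reminiscence's code. In Django, URLs produced by `reverse()` or the `{% url %}` tag already honour the script prefix, so only paths assembled by hand would need something like this.

```python
def prefixed(path, script_name=""):
    """Prepend a deployment prefix such as '/bookmark' to an absolute app path."""
    script_name = script_name.rstrip("/")
    if not path.startswith("/"):
        path = "/" + path
    return script_name + path


print(prefixed("/user/bookmarks", "/bookmark"))
print(prefixed("/user/bookmarks"))
```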

Subfolder Support

This type of self-hosted software is exactly what I am looking for, however, there is one thing that makes me not want to use it... I like folders... Lots of folders... Organization with only a top level folder would be a nightmare to manage and find things... Can we have the ability to create unlimited subfolders?

Docker build fails to copy static files

For anybody wanting to build their own docker image: make sure the static files are not excluded in .dockerignore.

In the current version, they are (due to the '**/static' rule).

Otherwise, you will experience a very barebones UI as all css and javascript files are missing and cannot be sent to the browser.

Chrome does not wrap long title

I'm using reminiscence git cloned 18/Dec/2018.

After I register the following url, (I replaced the foo.com part from actual sitename.)
http://gcp.foo.com/sites/svssqa/SQA/3sec/Shared%20Documents/Forms/AllItems.aspx?RootFolder=%2Fsites%2Fsvssqa%2FSQA%2F3sec%2FShared%20Documents%2F3%E8%AA%B2%E5%85%B1%E9%80%9A%2F%E5%8B%89%E5%BC%B7%E4%BC%9A%2FFY16%2F%E7%94%A8%E8%AA%9E%E3%83%BB%E6%B4%BB%E5%8B%95%E5%86%85%E5%AE%B9%E3%81%AE%E6%95%B4%E7%90%86&FolderCTID=0x01200089EB69C4C7A89E42A06022869492E55D&View=%7B874C9850-1EF4-402F-8C57-C98883630B6F%7D
The "menu" no longer appears and a horizontal scroll bar is shown in Chrome, on both Windows and Linux.
The "menu" does appear in IE on Windows, but right and left clicks have no effect.
What is wrong?

enhancement : public folder - adding support for subfolders

Would like to request adding a way to support subfolders in the user's public folder. At the moment public links of a user can be accessed at the url: /username/profile/public - but not folders inside this one.

Special character in folder names returns 404 upon navigation (context: importing)

Related to issue #3 and #7
I have imported a long list of bookmark from Chrome
I had folder names like "$$" and "§§": reminiscence imported them, but then the link was unusable
For instance the url
http://localhost/jj/%C2%A7%C2%A7%C2%A7-64
returned a 404

Machine: ubuntu linux
Env: dockerized
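
A small sketch of what the failing URL contains, in case it helps narrow this down. The stored folder name below is a guess for illustration; the point is only that the percent-encoded path segment and the stored name match only after decoding.

```python
from urllib.parse import unquote

segment = "%C2%A7%C2%A7%C2%A7-64"   # last path segment of the URL that returned 404
stored_name = "§§§-64"              # hypothetical name as it might sit in the database

print(segment == stored_name)            # comparing the raw segment
print(unquote(segment) == stored_name)   # comparing after percent-decoding
```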

PWA support

Reminiscence is a good web application, and I think it would be more convenient if it supported PWA.

static files result in 404 when using docker installation

when using the docker installation, all static assets (js and css files) result in a 404,
which makes the site look like a basic html site

Adding support for annotations and notes

Experimental support for annotations has been added in the annotation branch.

Annotation feature is available for both

Reader mode text (i.e. clean-up version of html)
and original archived html page

It supports regular features like creation, deletion and modification of annotations.

Apart from annotations, an experimental note-taking feature has also been added.

In order to add a note, enter a command similar to the one below in the url input box

note:new_note

The above command will add new_note to the current folder. Users can edit the note using a WYSIWYG editor, and the note will be saved on the reminiscence server.

Both features are experimental and may contain bugs, so avoid using them on an existing archived collection.
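
To illustrate how a prefixed entry such as note:new_note can be told apart from an ordinary URL, here is a minimal sketch. It is an assumption about the general approach and not the actual parsing code; `parse_entry` is a hypothetical helper, and md is included because it is the media prefix mentioned for the download-manager feature.

```python
def parse_entry(text, prefixes=("note", "md")):
    """Return (kind, value); kind is 'url' when no known prefix is present."""
    head, sep, rest = text.partition(":")
    if sep and head in prefixes:
        return head, rest
    return "url", text


print(parse_entry("note:new_note"))
print(parse_entry("md:https://example.com/video"))
print(parse_entry("https://example.com"))
```

Checking the prefix against a fixed list keeps the scheme of a normal URL, such as https, from being mistaken for a command.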

building docker image on armv7l architecture (raspberrypi) fails

running sudo docker-compose up --build on a raspberryPi fails with errors:

...
The following packages have unmet dependencies:
 wkhtmltox:amd64 : Depends: libc6:amd64 but it is not installable
                   Depends: libfreetype6:amd64 but it is not installable
                   Depends: libjpeg62-turbo:amd64 but it is not installable
                   Depends: libpng16-16:amd64 but it is not installable
                   Depends: libssl1.1:amd64 but it is not installable
                   Depends: libstdc++6:amd64 but it is not installable
                   Depends: libx11-6:amd64 but it is not installable
                   Depends: libxcb1:amd64 but it is not installable
                   Depends: libxext6:amd64 but it is not installable
                   Depends: libxrender1:amd64 but it is not installable
                   Depends: zlib1g:amd64 but it is not installable
E: Unable to correct problems, you have held broken packages.
...
  Downloading psycopg2-2.7.5.tar.gz (426 kB)
    ERROR: Command errored out with exit status 1:
     command: /usr/local/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-nqdq_4ug/psycopg2/setup.py'"'"'; __file__='"'"'/tmp/pip-install-nqdq_4ug/psycopg2/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-install-nqdq_4ug/psycopg2/pip-egg-info
         cwd: /tmp/pip-install-nqdq_4ug/psycopg2/
    Complete output (23 lines):
    running egg_info
    creating /tmp/pip-install-nqdq_4ug/psycopg2/pip-egg-info/psycopg2.egg-info
    writing /tmp/pip-install-nqdq_4ug/psycopg2/pip-egg-info/psycopg2.egg-info/PKG-INFO
    writing dependency_links to /tmp/pip-install-nqdq_4ug/psycopg2/pip-egg-info/psycopg2.egg-info/dependency_links.txt
    writing top-level names to /tmp/pip-install-nqdq_4ug/psycopg2/pip-egg-info/psycopg2.egg-info/top_level.txt
    writing manifest file '/tmp/pip-install-nqdq_4ug/psycopg2/pip-egg-info/psycopg2.egg-info/SOURCES.txt'

    Error: pg_config executable not found.

    pg_config is required to build psycopg2 from source.  Please add the directory
    containing pg_config to the $PATH or specify the full executable path with the
    option:

        python setup.py build_ext --pg-config /path/to/pg_config build ...

    or with the pg_config option in 'setup.cfg'.

    If you prefer to avoid building psycopg2 from source, please install the PyPI
    'psycopg2-binary' package instead.

Seems to be caused by:

the hard-wired amd64-arch package wkhtmltox_0.12.5-1.stretch_amd64.deb
installation of most pip packages failing on armv7l, because they have to be built from source and the build dependencies are missing

I tried to solve this by adding a suitable Dockerfile.armv7l in which these issues are addressed by installing https://github.com/wkhtmltopdf/wkhtmltopdf/releases/download/0.12.5/wkhtmltox_0.12.5-1.raspbian.stretch_armhf.deb instead (works), plus the dependencies for building the packages in requirements.txt (lxml==4.2.4 is giving me trouble at the moment, it doesn't work :/), relying on sudo apt-get build-dep python3-libxml2 etc. ...

Does anybody already have a better solution for this before I invest more time?

Greetings,
Sven.
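
One way to avoid hard-wiring the amd64 package is to choose it from the machine type, sketched below. The two file names are the ones quoted in this report; the mapping and the `pick_package` helper are hypothetical and are not part of the project's Dockerfile.

```python
import platform

# Package file names as quoted in the report; the mapping itself is an assumption.
WKHTMLTOX_DEB = {
    "x86_64": "wkhtmltox_0.12.5-1.stretch_amd64.deb",
    "armv7l": "wkhtmltox_0.12.5-1.raspbian.stretch_armhf.deb",
}


def pick_package(machine=None):
    """Return the .deb name for the given (or current) machine type."""
    machine = machine or platform.machine()
    try:
        return WKHTMLTOX_DEB[machine]
    except KeyError:
        raise SystemExit(f"no wkhtmltox package known for {machine!r}")


print(pick_package("armv7l"))
```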

reminiscence on nginx can't show reminiscence page?

My environment is Linux Mint.

I'm trying to use "reminiscence" through Nginx by referring to the following page.
https://github.com/kanishka-linux/reminiscence/blob/master/README.md#gunicorn-plus-nginx-setup

Both
(venv) yama@jpx20120007:~/reminiscence/venv/reminiscence$ python manage.py runserver 127.0.0.1:8000
and
(venv) yama@jpx20120007:~/reminiscence/venv/reminiscence$ gunicorn --max-requests 100 --worker-class gthread --workers 2 --thread 5 --timeout 300 --bind 0.0.0:8000 reminiscence.wsgi
work fine.

Since I can see "Welcome to nginx" at http://localhost/, nginx works fine.

But
http://localhost/static shows "404 Not found"
http://127.0.0.1:8000 shows "ERR_CONNECTION_REFUSED"

Could you advise me on what is wrong?

My nginx reminiscence.conf is here.

(venv) yama@jpx20120007:~/reminiscence/venv/reminiscence$ cat /etc/nginx/conf.d/reminiscence.conf 
#worker_processes  2;
  
#events {
#    worker_connections  1024;
#}

#http {
#   include       mime.types;
#   default_type  application/octet-stream;

#   sendfile        on;
#   sendfile_max_chunk 512k;
#   keepalive_timeout  65;
#   proxy_read_timeout 300s;

    server {
	listen       80;
	server_name  localhost;
	client_max_body_size 1024m;
      
	location /static/ {
		root /home/yama/reminiscence/venv/reminiscence; # root of project directory
		aio threads;
        }
	location = /favicon.ico { access_log off; log_not_found off; }
	location / {
        	 proxy_pass http://127.0.0.1:8000;
		 proxy_set_header Host $host;
		 proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
		 root /home/yama/reminiscence/venv/reminiscence; #root of project directory
	}

 	error_page   500 502 503 504  /50x.html;
 	location = /50x.html {
 	root   /usr/share/nginx/html;
 	}
    }
#}

The nginx config file syntax is OK.

[sudo] password for yama:
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful

Package psycopg2 installation is broken on Ubuntu 20.04

During the installation:

pip install -r requirements.txt

Pip halts the installation abruptly while trying to install the psycopg2 package.

Here's a dump of the error log:

Collecting psycopg2
  Using cached psycopg2-2.8.6.tar.gz (383 kB)
    ERROR: Command errored out with exit status 1:
     command: /home/ashklempton/venv/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-5oaf9ymi/psycopg2/setup.py'"'"'; __file__='"'"'/tmp/pip-install-5oaf9ymi/psycopg2/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-install-5oaf9ymi/psycopg2/pip-egg-info
         cwd: /tmp/pip-install-5oaf9ymi/psycopg2/
    Complete output (23 lines):
    running egg_info
    creating /tmp/pip-install-5oaf9ymi/psycopg2/pip-egg-info/psycopg2.egg-info
    writing /tmp/pip-install-5oaf9ymi/psycopg2/pip-egg-info/psycopg2.egg-info/PKG-INFO
    writing dependency_links to /tmp/pip-install-5oaf9ymi/psycopg2/pip-egg-info/psycopg2.egg-info/dependency_links.txt
    writing top-level names to /tmp/pip-install-5oaf9ymi/psycopg2/pip-egg-info/psycopg2.egg-info/top_level.txt
    writing manifest file '/tmp/pip-install-5oaf9ymi/psycopg2/pip-egg-info/psycopg2.egg-info/SOURCES.txt'
    
    Error: pg_config executable not found.
    
    pg_config is required to build psycopg2 from source.  Please add the directory
    containing pg_config to the $PATH or specify the full executable path with the
    option:
    
        python setup.py build_ext --pg-config /path/to/pg_config build ...
    
    or with the pg_config option in 'setup.cfg'.
    
    If you prefer to avoid building psycopg2 from source, please install the PyPI
    'psycopg2-binary' package instead.
    
    For further information please check the 'doc/src/install.rst' file (also at
    <https://www.psycopg.org/docs/install.html>).
    
    ----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

The error mentions pg_config, which I did not find listed as a prerequisite in the Installation section of the Reminiscence README.

Note: As suggested in the error, I solved the problem easily by using the psycopg2-binary package instead. You may want to update requirements.txt accordingly.

can't add any bookmarks

I'm using Firefox 88.
I can't add bookmarks.
I don't understand what the output means, so I can't tell you more about my problem.

here is what my terminal looks like:

(venv) [user@browsing reminiscence]$ python manage.py runserver 127.0.0.1:8000 
Watching for file changes with StatReloader
INFO:django.utils.autoreload:Watching for file changes with StatReloader
Performing system checks...

System check identified some issues:

WARNINGS:
pages.GroupTable: (models.W042) Auto-created primary key used when not defining a primary key type, by default 'django.db.models.AutoField'.
	HINT: Configure the DEFAULT_AUTO_FIELD setting or the PagesConfig.default_auto_field attribute to point to a subclass of AutoField, e.g. 'django.db.models.BigAutoField'.
pages.Library: (models.W042) Auto-created primary key used when not defining a primary key type, by default 'django.db.models.AutoField'.
	HINT: Configure the DEFAULT_AUTO_FIELD setting or the PagesConfig.default_auto_field attribute to point to a subclass of AutoField, e.g. 'django.db.models.BigAutoField'.
pages.Tags: (models.W042) Auto-created primary key used when not defining a primary key type, by default 'django.db.models.AutoField'.
	HINT: Configure the DEFAULT_AUTO_FIELD setting or the PagesConfig.default_auto_field attribute to point to a subclass of AutoField, e.g. 'django.db.models.BigAutoField'.
pages.URLTags: (models.W042) Auto-created primary key used when not defining a primary key type, by default 'django.db.models.AutoField'.
	HINT: Configure the DEFAULT_AUTO_FIELD setting or the PagesConfig.default_auto_field attribute to point to a subclass of AutoField, e.g. 'django.db.models.BigAutoField'.
pages.UserSettings: (models.W042) Auto-created primary key used when not defining a primary key type, by default 'django.db.models.AutoField'.
	HINT: Configure the DEFAULT_AUTO_FIELD setting or the PagesConfig.default_auto_field attribute to point to a subclass of AutoField, e.g. 'django.db.models.BigAutoField'.

System check identified 5 issues (0 silenced).
May 17, 2021 - 07:38:02
Django version 3.2.3, using settings 'reminiscence.settings'
Starting development server at http://127.0.0.1:8000/
Quit the server with CONTROL-C.
"GET / HTTP/1.1" 200 4669
INFO:django.server:"GET / HTTP/1.1" 200 4669
"GET /static/css/bootstrap.min.css HTTP/1.1" 304 0
INFO:django.server:"GET /static/css/bootstrap.min.css HTTP/1.1" 304 0
"GET /static/folder.svg HTTP/1.1" 304 0
INFO:django.server:"GET /static/folder.svg HTTP/1.1" 304 0
"GET /static/menu.svg HTTP/1.1" 304 0
INFO:django.server:"GET /static/menu.svg HTTP/1.1" 304 0
"GET /user/bookmarks HTTP/1.1" 200 5623
INFO:django.server:"GET /user/bookmarks HTTP/1.1" 200 5623
476::INFO::vinanti::__start_fetching__: using backend: urllib for url : https://springmerchant.com/bigcommerce/psycopg2-virtualenv-install-pg_config-executable-not-found/
INFO:vinanti.vinanti:using backend: urllib for url : https://springmerchant.com/bigcommerce/psycopg2-virtualenv-install-pg_config-executable-not-found/
478::INFO::vinanti::__start_fetching__: 
Requesting url: https://springmerchant.com/bigcommerce/psycopg2-virtualenv-install-pg_config-executable-not-found/

INFO:vinanti.vinanti:
Requesting url: https://springmerchant.com/bigcommerce/psycopg2-virtualenv-install-pg_config-executable-not-found/

"POST /user/bookmarks HTTP/1.1" 200 5624
INFO:django.server:"POST /user/bookmarks HTTP/1.1" 200 5624
391::INFO::vinanti::__finished_task_postprocess__: 
completed: 1

INFO:vinanti.vinanti:
completed: 1

407::INFO::vinanti::__finished_task_postprocess__: arranging callback, task 0 https://springmerchant.com/bigcommerce/psycopg2-virtualenv-install-pg_config-executable-not-found/
INFO:vinanti.vinanti:arranging callback, task 0 https://springmerchant.com/bigcommerce/psycopg2-virtualenv-install-pg_config-executable-not-found/
DEBUG:pages.dbaccess:text/html ----> .html
ERROR:asyncio:Task exception was never retrieved
future: <Task finished name='Task-1' coro=<Vinanti.__start_fetching__() done, defined at /home/user/reminiscence/venv/reminiscence/vinanti/vinanti.py:431> exception=SynchronousOnlyOperation('You cannot call this from an async context - use a thread or sync_to_async.')>
Traceback (most recent call last):
  File "/home/user/reminiscence/venv/reminiscence/vinanti/vinanti.py", line 509, in __start_fetching__
    self.__finished_task_postprocess__(session, netloc, onfinished,
  File "/home/user/reminiscence/venv/reminiscence/vinanti/vinanti.py", line 409, in __finished_task_postprocess__
    onfinished(task_num, url, result)
  File "/home/user/reminiscence/venv/reminiscence/pages/dbaccess.py", line 222, in url_fetch_completed
    row = Library.objects.create(usr=usr,
  File "/home/user/reminiscence/venv/lib/python3.9/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/home/user/reminiscence/venv/lib/python3.9/site-packages/django/db/models/query.py", line 453, in create
    obj.save(force_insert=True, using=self.db)
  File "/home/user/reminiscence/venv/lib/python3.9/site-packages/django/db/models/base.py", line 726, in save
    self.save_base(using=using, force_insert=force_insert,
  File "/home/user/reminiscence/venv/lib/python3.9/site-packages/django/db/models/base.py", line 763, in save_base
    updated = self._save_table(
  File "/home/user/reminiscence/venv/lib/python3.9/site-packages/django/db/models/base.py", line 868, in _save_table
    results = self._do_insert(cls._base_manager, using, fields, returning_fields, raw)
  File "/home/user/reminiscence/venv/lib/python3.9/site-packages/django/db/models/base.py", line 906, in _do_insert
    return manager._insert(
  File "/home/user/reminiscence/venv/lib/python3.9/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/home/user/reminiscence/venv/lib/python3.9/site-packages/django/db/models/query.py", line 1270, in _insert
    return query.get_compiler(using=using).execute_sql(returning_fields)
  File "/home/user/reminiscence/venv/lib/python3.9/site-packages/django/db/models/sql/compiler.py", line 1414, in execute_sql
    with self.connection.cursor() as cursor:
  File "/home/user/reminiscence/venv/lib/python3.9/site-packages/django/utils/asyncio.py", line 24, in inner
    raise SynchronousOnlyOperation(message)
django.core.exceptions.SynchronousOnlyOperation: You cannot call this from an async context - use a thread or sync_to_async.
"GET /user/bookmarks HTTP/1.1" 200 5623
INFO:django.server:"GET /user/bookmarks HTTP/1.1" 200 5623
"GET /user/bookmarks HTTP/1.1" 200 5623
INFO:django.server:"GET /user/bookmarks HTTP/1.1" 200 5623
"GET /static/js/jquery-3.3.1.min.js HTTP/1.1" 304 0
INFO:django.server:"GET /static/js/jquery-3.3.1.min.js HTTP/1.1" 304 0
"GET /static/js/popper.min.js HTTP/1.1" 304 0
INFO:django.server:"GET /static/js/popper.min.js HTTP/1.1" 304 0
"GET /static/js/bootstrap.min.js HTTP/1.1" 304 0
INFO:django.server:"GET /static/js/bootstrap.min.js HTTP/1.1" 304 0
"GET /static/js/bootbox.min.js HTTP/1.1" 304 0
INFO:django.server:"GET /static/js/bootbox.min.js HTTP/1.1" 304 0
"GET /static/js/main.js HTTP/1.1" 304 0
INFO:django.server:"GET /static/js/main.js HTTP/1.1" 304 0
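
The exception message above suggests "use a thread or sync_to_async". As a hedged illustration of the thread option, the sketch below moves a blocking call off the event loop with `asyncio.to_thread` (Python 3.9+). `save_row` is a hypothetical stand-in for the `Library.objects.create(...)` call in the traceback; whether the project resolved the problem this way is not stated here.

```python
import asyncio
import threading


def save_row(url):
    """Stand-in for a blocking ORM call such as Library.objects.create(...)."""
    return {"url": url, "thread": threading.current_thread().name}


async def on_fetch_finished(url):
    # Run the blocking call in a worker thread instead of on the event loop,
    # so it does not execute inside the async context.
    return await asyncio.to_thread(save_row, url)


row = asyncio.run(on_fetch_finished("https://example.com/"))
print(row["thread"] != threading.main_thread().name)
```

In a Django project, `asgiref.sync.sync_to_async` (the helper the error message names) plays a similar role and also takes care of Django's thread-sensitivity rules.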

Special characters in folder name break the folder

Summary
Some URL special characters (&, /, etc.) are allowed in folder names, and render the folder inaccessible.

Environment
Version: Release v0.1
Python: Anaconda 3.6.5
Set up using Normal Method on Debian Stretch
Running gunicorn behind Nginx as described in README.md

Reproduction Steps

Create a new folder with an alphanumeric name (e.g. Recipes)
Add URLs to the folder
From Home, rename the folder (Select -> Rename)
Enter a name which contains a forward slash (/) or ampersand (&) (e.g. Recipes & Cooking)
Observe the rename succeeding, and the new name appearing properly on the Home page
Attempt to open the folder to view the URLs inside

Expected Behaviour
The folder should open as expected, and the special characters should be properly escaped in the page URL

Observed Behaviour
The page fails to load (HTTP 404), as the special characters present in the URL are not escaped and are invalid (e.g. Recipes & Cooking above tries loading as https://myserver.example/myusername/Recipes & Cooking)

Recovery
I was able to repair my installation by editing the database manually and replacing all instances of the invalid folder name with a valid one in the directory column of the page_library table.

Notes
It would be nice if these characters were allowed (and properly escaped so the page continues to load), but if this is infeasible then they should be disallowed in the folder name field to prevent breaking an installation.
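
Both options mentioned in these notes can be sketched with the Python standard library. This is not the project's code: the function names folder_path and is_plain_folder_name and the set of forbidden characters are assumptions made for illustration. The first function percent-encodes each path segment so that & and / survive in a URL; the second rejects names containing characters that are awkward in URLs.

```python
from urllib.parse import quote


def folder_path(username, folder):
    """Build /<username>/<folder> with every path segment percent-encoded.

    safe="" makes quote() escape "/" as well, so a slash inside a folder
    name is not mistaken for a path separator.
    """
    return "/" + quote(username, safe="") + "/" + quote(folder, safe="")


def is_plain_folder_name(name, forbidden="/&?#%"):
    """Return True if name is non-empty and contains no forbidden character.

    The default forbidden set is an illustrative choice, not a project rule.
    """
    return bool(name) and not any(ch in forbidden for ch in name)
```

With this sketch, the folder from the reproduction steps would be linked as /myusername/Recipes%20%26%20Cooking. Whether the server and URL router accept an encoded slash (%2F) inside a segment depends on the deployment, so rejecting such names at input time may be the simpler route.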

I'd be happy to dig in to the project and contribute a pull request if you'd like. Thanks for all your work on this project so far, it's awesome!

I'm guessing this repo is abandoned?

It has been 4 years since the last release.

Firefox-Addon ported to Chromium/Chrome

Hi, by the way, I ported Alec's Firefox-Addon to Chromium/Chrome because I couldn't find one. It works fine for me, so maybe it can be useful for others too:

https://github.com/s3h10r/AddToReminiscence-Chromium-Extension

Greetings,
Sven

bookmark-file import fails

Hi,

I have an 8400-line bookmark file that I'd like to import. Currently the web GUI says everything is OK, and the log says:

dg01     | DEBUG:pages.views:<MultiValueDict: {'file-upload': [<InMemoryUploadedFile: pinboard_export (application/octet-stream)>]}>
dg01     | INFO:pages.views:pinboard_export
dg01     | INFO:pages.views:None

Or, when I rename the file to foo.html:

dg01     | DEBUG:pages.views:yes
dg01     | DEBUG:pages.views:<MultiValueDict: {'file-upload': [<InMemoryUploadedFile: pinboard_export.html (text/html)>]}>
dg01     | INFO:pages.views:pinboard_export.html
dg01     | INFO:pages.views:text/html

Can I enable some verbose debug logging to see where the problem is?
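
One way to narrow the problem down outside of Reminiscence is to check whether the export file contains links that a generic HTML parser can find. The sketch below uses only the standard library and does not call the project's importer. It assumes the export is Netscape-style bookmark HTML (entries of the form <DT><A HREF="...">), and the names LinkCollector and extract_links are hypothetical.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <A HREF=...> matches.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(text):
    """Return all link targets found in a bookmark export given as a string."""
    parser = LinkCollector()
    parser.feed(text)
    parser.close()
    return parser.links
```

Calling len(extract_links(open("pinboard_export.html", encoding="utf-8").read())) gives the number of links found. If that number is zero, the file is probably not in the expected format; if it is close to the expected bookmark count, the problem more likely lies in the import step itself.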

Hello from bookmark-archiver!

Hi! I maintain https://github.com/pirate/bookmark-archiver, and I just learned that this project exists today from HN & the LWN post about archiving sites!

You have a lot of good ideas in this repo, very similar to how I've been planning to improve bookmark-archiver in the coming months:

django
dramatiq instead of celery
stable mysql db of archived sites with migrations
json/csv/xml output for the index
warc/html/pdf/screenshot/youtubedl/git output for sites
1, 2, and 3-link deep crawling with https://github.com/internetarchive/brozzler

You're welcome to use any of the code from bookmark-archiver of course, and I may take inspiration from your repo as well for the UI and the NLTK automatic tagging and summarization; we've had tickets open for that for a while.

Best of luck! Please hit me up on twitter: @theSquashSH if you ever want to chat or cooperate on stuff, I just added a link to reminiscence at the bottom of the BA readme.

P.S. I may meet up with the author of the LWN article in Montreal at some point, I'll talk to him as well about Reminiscence.

No Access From Other Machines On LAN

Cannot seem to access from other machines on my LAN - connection times out.

Attempted with:

hostname;
Local IP.

I have created ufw exceptions for port 8000, but to no avail.
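
A quick way to separate a firewall problem from a binding problem is to test the TCP port from another machine. The sketch below is a generic check, not part of Reminiscence; the function name can_connect is made up for illustration. One cause worth ruling out (an assumption, since the report does not say how the server was started) is that Django's development server listens only on 127.0.0.1 unless it is started with an address such as 0.0.0.0:8000.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Running can_connect("<server-ip>", 8000) on the server itself and then on a second machine shows where the connection stops: success locally but a timeout remotely points at the firewall or the bind address rather than at the application.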

Docker setup does not load static assets

I'm on macOS 10.15

docker-compose up --build completes:

web_1    | [2020-11-23 12:30:54 +0000] [69] [INFO] Listening at: http://0.0.0.0:8000 (69)
web_1    | [2020-11-23 12:30:54 +0000] [69] [INFO] Using worker: gthread
web_1    | [2020-11-23 12:30:54 +0000] [71] [INFO] Booting worker with pid: 71
web_1    | [2020-11-23 12:30:54 +0000] [82] [INFO] Booting worker with pid: 82
web_1    | [2020-11-23 12:30:54 +0000] [83] [INFO] Booting worker with pid: 83
web_1    | [2020-11-23 12:30:55 +0000] [88] [INFO] Booting worker with pid: 88
web_1    | WARNING:django.security.SuspiciousSession:Session data corrupted
web_1    | WARNING:django.request:Not Found: /static/css/bootstrap.min.css
web_1    | WARNING:django.request:Not Found: /static/js/bootstrap.min.js
web_1    | WARNING:django.request:Not Found: /static/js/main.js
web_1    | WARNING:django.request:Not Found: /static/js/popper.min.js
web_1    | WARNING:django.request:Not Found: /static/js/jquery-3.3.1.min.js
web_1    | WARNING:django.request:Not Found: /static/js/bootbox.min.js
web_1    | WARNING:django.request:Not Found: /static/js/popper.min.js
web_1    | WARNING:django.request:Not Found: /static/js/bootstrap.min.js
web_1    | WARNING:django.request:Not Found: /static/js/bootbox.min.js
web_1    | WARNING:django.request:Not Found: /static/js/main.js

Interestingly, it reports listening at http://0.0.0.0:8000. Is there a mistake in the README?
Whenever I open it in the browser, none of the static assets load.

Docker pg01 start issue

When I do docker-compose up --build, I get this:

Starting pg01 ... done
Starting dg01 ... done
Creating ng01 ... done
Attaching to pg01, dg01, ng01
pg01     | 2018-09-09 15:57:20.932 UTC [1] FATAL:  data directory "/var/lib/postgresql/data" has wrong ownership
pg01     | 2018-09-09 15:57:20.932 UTC [1] HINT:  The server must be started by the user that owns the data directory.
pg01 exited with code 1
dg01     | db: forward host lookup failed: Unknown host
dg01     | db: forward host lookup failed: Unknown host

No matching distribution found for aiohttp==3.3.2

Running Ubuntu 16.04.

Any ideas? The versions listed only go up to 3.0.0b0.
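
A likely explanation, though not confirmed in this thread, is the interpreter version: aiohttp 3.x requires Python 3.5.3 or newer, and pip hides releases whose declared Python requirement the running interpreter does not meet. The system Python 3 on Ubuntu 16.04 is older than that. The sketch below simply compares version tuples; the names AIOHTTP_3_MIN_PYTHON and python_is_new_enough are made up for illustration.

```python
import sys

# Minimum interpreter version for aiohttp 3.x, per its changelog.
AIOHTTP_3_MIN_PYTHON = (3, 5, 3)


def python_is_new_enough(version=None, minimum=AIOHTTP_3_MIN_PYTHON):
    """Return True if version (default: the running interpreter) meets minimum."""
    if version is None:
        version = sys.version_info[:3]
    return tuple(version) >= minimum
```

If python_is_new_enough() returns False inside the virtualenv, creating the virtualenv with a newer Python would be the first thing to try.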

Firefox addon for Reminiscence

Currently there is no official addon available for Reminiscence, but an unofficial and experimental addon developed by NordicDev is available at the Firefox addon store. It allows users to log in to a Reminiscence instance and send links to it.

Users can browse the source of the addon at gitlab; for any further issues with it, they should contact the dev.