roznoshchik / lurnby

A tool for active reading and personal knowledge management

License: BSD 3-Clause "New" or "Revised" License

Dockerfile 0.07% Python 27.09% JavaScript 21.70% HTML 25.40% CSS 7.82% Shell 0.02% Mako 0.03% SCSS 17.86% Procfile 0.01%

reading spaced-repetition-system personal-knowledge-management personal-knowledge-system

lurnby's Issues

add by email fails if there is more than 1 recipient.

def add_by_email():
    recipient = request.form['to']
    if '<' in recipient:
        recipient = recipient.split('<')[1][:-1]

This is the code used to get the recipient of the email. When someone emails something to Lurnby and request.form['to'] contains more than one address, the function fails because it doesn't pull out the right email.

A better solution would likely be to use regex to pull out the email that has @add-article.lurnby.com as the ending.
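A minimal sketch of the regex idea, assuming the `to` value is a plain string that may hold several comma-separated recipients, with or without display names. The helper name `extract_lurnby_recipient` and the exact character set allowed in the local part are assumptions, not what the app currently does.

```python
import re

# Assumption: add-by-email addresses always end in @add-article.lurnby.com.
# The local-part pattern is deliberately simple and may need widening.
LURNBY_ADDRESS = re.compile(r"[\w.+-]+@add-article\.lurnby\.com", re.IGNORECASE)


def extract_lurnby_recipient(to_value):
    """Return the first Lurnby add-by-email address found in the string, or None."""
    match = LURNBY_ADDRESS.search(to_value)
    return match.group(0) if match else None
```

Inside add_by_email this could replace the split on '<', for example `recipient = extract_lurnby_recipient(request.form['to'])`, with a guard for the None case.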

Import lurnby web to lurnby self hosted

In a case where a user wants to self-host lurnby, it would be useful to be able to import the data exported from the current website https://www.lurnby.com.

My current use case is that I'm testing your application and would like to self-host it later.

Highlighting images and media doesn't work yet.

Currently Lurnby's highlighting only works for text content. Ideally you should be able to highlight images, graphs, and charts.

The existing libraries all seem to rely on making a screenshot by recreating the DOM and I'm not positive if that would work in all use cases such as pdfs and the like.

In any case, a few things would need to happen for that:

Open up an option to take a screenshot instead of create a highlight in reader mode.
Create a new db field for storing image location in the Highlight model.
Change how the highlight displays if it's an image or if it's text.

Need a more comprehensive help section

Lurnby has a lot of things that aren't obvious. There are some videos that show the different functionality in action, but it would be much better to also have a focused getting started guide that linked you to guides on the different features and functions.

These would probably be a combination of text and GIFs.

Markdown support for all text area inputs.

Would be nice to add markdown support to the different text area inputs in the site.

Decouple from Amazon

For storing image content the app currently sends images to Amazon S3. This is fine for the web-app version, but if the app is meant to run locally, then it's not necessary.

There should be a flag somewhere that determines whether this is running as a web app or an offline app, and that removes the Amazon dependency in the offline case.
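One way such a flag could look, as a rough sketch only. The environment variable name `LURNBY_USE_S3`, the `uploads` directory, and the `upload_to_s3(data, filename)` callable are all hypothetical; the callable stands in for whatever S3 upload code the app already has.

```python
import os
from pathlib import Path


def s3_enabled(env=os.environ):
    """Read the (hypothetical) LURNBY_USE_S3 flag; defaults to off."""
    return env.get("LURNBY_USE_S3", "false").lower() in ("1", "true", "yes")


def save_image(data, filename, upload_to_s3=None, local_dir="uploads", env=os.environ):
    """Store image bytes and return a string describing where they ended up."""
    if s3_enabled(env) and upload_to_s3 is not None:
        # Web-app mode: delegate to the existing S3 upload code.
        return upload_to_s3(data, filename)
    # Offline / self-hosted mode: write to the local filesystem instead.
    target = Path(local_dir) / filename
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return str(target)
```

Keeping the S3 code behind an injected callable means a local install would never need to import an AWS client at all.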

Creating a Lurnby api

For some of the planned features for lurnby, including offline support and native mobile apps, it's important to first separate the data from the application to allow for multiple clients. This is a sketch of the API:

User

Method	Endpoint	Description
POST	`/user`	create new user
GET	`/user/<id>`	get user info
GET	`/user/<id>/email`	enable add by email
GET	`/user/<id>/senders`	get approved senders
PUT	`/user/<id>/senders`	update approved senders
GET	`/user/<id>/export`	export all of the user's data
GET	`/user/<id>/preferences`	get user communication preferences
PUT	`/user/<id>/preferences`	update user communication preferences
PUT	`/user/<id>`	update user
DELETE	`/user/<id>`	delete user

Auth

Method	Endpoint	Description
POST	`/authorize`	log in / receive tokens
POST	`/refresh`	refresh tokens

Articles

Method	Endpoint	Description
GET	`/article`	Get articles
POST	`/article`	create new article
GET	`/article/<id>`	get article
PUT	`/article/<id>`	update article
DELETE	`/article/<id>`	delete article
GET	`/article/<id>/notes`	get article notes
PUT	`/article/<id>/notes`	update article notes
GET	`/article/<id>/highlights`	get article highlights
GET	`/article/<id>/export`	export article

Highlights

Method	Endpoint	Description
GET	`/highlight`	Get highlights
GET	`/highlight/export`	export highlights
POST	`/highlight`	create new highlight
GET	`/highlight/<id>`	get highlight
PUT	`/highlight/<id>`	update highlight
DELETE	`/highlight/<id>`	delete highlight
GET	`/highlight/review`	get highlights for review

Tags

Method	Endpoint	Description
GET	`/tag`	Get tags
POST	`/tag`	create new tag
GET	`/tag/<id>`	get tag
PUT	`/tag/<id>`	update tag
DELETE	`/tag/<id>`	delete tag

Offline Mode for web hosted lurnby

Currently lurnby.com doesn't work offline. If you try to access it offline, the service worker just shows a standard "this app doesn't work offline" message.

But the idea is that it should also work offline to some degree, although I am not sure exactly how much.

A simple idea is that it should cache the x most recent articles so that you could read them offline. Or it should cache x most recent highlights so that review is possible.

In the case of articles, I think that becomes a bit challenging when figuring out how to also allow highlighting in offline mode.
Highlights actually change the text of the article so to create a highlight object, you would need to:

Capture highlighted text
Capture notes added to highlight
Capture any tags/topics
Capture the precise location in the text
Add to some sort of queue that then updates the db when network access returns.

One possible solution for this is that when creating a highlight while offline, the highlight is created with a temporary ID and then rendered to the screen as normal. A javascript object takes the place of the DB.

Once network functionality is regained and an actual ID is generated by the db, the article text gets updated so that the highlight points to the proper place.
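The temporary-ID idea can be modelled roughly like this. A real client would be JavaScript; Python is used here only to sketch the bookkeeping. The class name, the `tmp-N` ID format, and the assumption that highlight markup in the article carries an `id="..."` attribute are all hypothetical.

```python
import itertools


class OfflineHighlightStore:
    """In-memory stand-in for the DB while the client is offline."""

    def __init__(self):
        self._counter = itertools.count(1)
        self.highlights = {}  # id -> highlight dict
        self.pending = []     # temporary ids waiting to be synced

    def create(self, text, position, note="", tags=()):
        """Record a highlight under a temporary id and queue it for sync."""
        temp_id = f"tmp-{next(self._counter)}"
        self.highlights[temp_id] = {
            "id": temp_id, "text": text, "position": position,
            "note": note, "tags": list(tags),
        }
        self.pending.append(temp_id)
        return temp_id

    def sync(self, save_to_server, article_html):
        """Flush the queue once online and return the updated article text.

        save_to_server(highlight) is assumed to return the real db id.
        If it raises, the current item stays queued for a later attempt.
        """
        while self.pending:
            temp_id = self.pending[0]
            highlight = self.highlights[temp_id]
            real_id = save_to_server(highlight)
            self.pending.pop(0)
            del self.highlights[temp_id]
            highlight["id"] = real_id
            self.highlights[real_id] = highlight
            # Re-point the markup in the article text at the real id.
            article_html = article_html.replace(f'id="{temp_id}"', f'id="{real_id}"')
        return article_html
```

Matching on the full `id="tmp-1"` attribute, including the closing quote, keeps `tmp-1` from accidentally rewriting `tmp-10`.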

Allow external highlights

All highlights are currently connected to an article inside of the app. But this is a limitation that shouldn't be there as you should be able to import highlights from anywhere to start using with the platform.

The easiest approach might be to add a boolean external=True field and an externalSource=... field to the highlight model.

Then the highlight page should have a create highlight button to allow for manual creation.
It should also have an import highlights button that uses something like this import code as base.

Also the web extensions can then be updated to allow for sending highlights and not just for sending the url to the app.

Export should offer html or plaintext options

Since switching over the highlights and most text inputs to support html, the actual content now being exported is html. This means that it's a bit limiting what you can use the content for if you export it from lurnby.

As a user I would want to specify when choosing my export if highlights and notes should be exported as html content or parsed for their plaintext versions.
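A rough sketch of what the plaintext option could do, using only the standard library. The function names, the `fmt` parameter, and the choice of which tags start a new line are assumptions for illustration.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text content, starting a new line at block-level tags."""

    BLOCK_TAGS = {"p", "div", "br", "li", "blockquote", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def to_plaintext(html):
    """Strip tags and decode entities, one non-empty line per block."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    lines = (line.strip() for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)


def export_content(html, fmt="html"):
    """Return the stored html unchanged, or its plaintext version."""
    return to_plaintext(html) if fmt == "plaintext" else html
```

The export form would then only need to pass the user's choice through as `fmt` for each highlight and note.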

PDF parsing is poor

The current pdf library leaves a lot to be desired.

It only works for simple pdfs with plain images and text.

Anything more complex that has graphs, charts, etc, comes through very poorly.

One idea is to just work with PDFs as images, and then possibly run OCR on the text content.

But there is a lot that needs to be explored there to render things properly so that it works with lurnby.

DB Connection Timeout Issue

There is an issue where, if you leave a screen open for too long, the database connection closes before the app connection does. If someone then tries to do something on that page, they will get an error.

The log for this issue is:

Exception on /app/articles [GET]
Traceback (most recent call last):
 File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1283, in _execute_context
   self.dialect.do_execute(
 File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 590, in do_execute
   cursor.execute(statement, parameters)
psycopg2.OperationalError: terminating connection due to administrator command
SSL connection has been closed unexpectedly


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
 File "/app/.heroku/python/lib/python3.9/site-packages/flask/app.py", line 2447, in wsgi_app
   response = self.full_dispatch_request()
 File "/app/.heroku/python/lib/python3.9/site-packages/flask/app.py", line 1952, in full_dispatch_request
   rv = self.handle_user_exception(e)
 File "/app/.heroku/python/lib/python3.9/site-packages/flask_cors/extension.py", line 165, in wrapped_function
   return cors_after_request(app.make_response(f(*args, **kwargs)))
 File "/app/.heroku/python/lib/python3.9/site-packages/flask/app.py", line 1821, in handle_user_exception
   reraise(exc_type, exc_value, tb)
 File "/app/.heroku/python/lib/python3.9/site-packages/flask/_compat.py", line 39, in reraise
   raise value
 File "/app/.heroku/python/lib/python3.9/site-packages/flask/app.py", line 1948, in full_dispatch_request
   rv = self.preprocess_request()
 File "/app/.heroku/python/lib/python3.9/site-packages/flask/app.py", line 2242, in preprocess_request
   rv = func()
 File "/app/app/__init__.py", line 75, in before_request_func
   if current_user.is_authenticated:
 File "/app/.heroku/python/lib/python3.9/site-packages/werkzeug/local.py", line 432, in __get__
   obj = instance._get_current_object()
 File "/app/.heroku/python/lib/python3.9/site-packages/werkzeug/local.py", line 554, in _get_current_object
   return self.__local()  # type: ignore
 File "/app/.heroku/python/lib/python3.9/site-packages/flask_login/utils.py", line 26, in <lambda>
   current_user = LocalProxy(lambda: _get_user())
 File "/app/.heroku/python/lib/python3.9/site-packages/flask_login/utils.py", line 346, in _get_user
   current_app.login_manager._load_user()
 File "/app/.heroku/python/lib/python3.9/site-packages/flask_login/login_manager.py", line 329, in _load_user
   user = self._load_user_from_remember_cookie(cookie)
 File "/app/.heroku/python/lib/python3.9/site-packages/flask_login/login_manager.py", line 372, in _load_user_from_remember_cookie
   user = self._user_callback(user_id)
 File "/app/app/models.py", line 135, in load_user
   return User.query.get(int(id))
 File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 1021, in get
   return self._get_impl(ident, loading.load_on_pk_identity)
 File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 1138, in _get_impl
   return db_load_fn(self, primary_key_identity)
 File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/orm/loading.py", line 287, in load_on_pk_identity
   return q.one()
 File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3436, in one
   ret = self.one_or_none()
 File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3405, in one_or_none
   ret = list(self)
 File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3481, in __iter__
   return self._execute_and_instances(context)
 File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3506, in _execute_and_instances
   result = conn.execute(querycontext.statement, self._params)
 File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1020, in execute
   return meth(self, multiparams, params)
 File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
   return connection._execute_clauseelement(self, multiparams, params)
 File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1133, in _execute_clauseelement
   ret = self._execute_context(
 File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1323, in _execute_context
   self._handle_dbapi_exception(
 File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1517, in _handle_dbapi_exception
   util.raise_(
 File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 178, in raise_
   raise exception
 File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1283, in _execute_context
   self.dialect.do_execute(
 File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 590, in do_execute
   cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) terminating connection due to administrator command
SSL connection has been closed unexpectedly

[SQL: SELECT "user".id AS user_id, "user".goog_id AS user_goog_id, "user".firstname AS user_firstname, "user".username AS user_username, "user".email AS user_email, "user".password_hash AS user_password_hash, "user".admin AS user_admin, "user".test_account AS user_test_account, "user".deleted AS user_deleted, "user".suggestion_id AS user_suggestion_id, "user".account_created_date AS user_account_created_date, "user".last_active AS user_last_active, "user".last_action AS user_last_action, "user".tos AS user_tos, "user".token AS user_token, "user".token_expiration AS user_token_expiration, "user".preferences AS user_preferences, "user".add_by_email AS user_add_by_email, "user".review_count AS user_review_count
FROM "user"
WHERE "user".id = %(param_1)s]
[parameters: {'param_1': 9}]
(Background on this error at: http://url2468.lurnby.com/ls/click?upn=O95o0jN-2F92mJdpf7ZhEbgLXtFoI7wvvm0Nlb71SAqYc7I4OlCr9vuEBttZ6PLGT1WNJK_-2FZKyCIFGBlWZyR9dtbwHIdOuvcuQq8Y2fMr-2B-2FbeShjLyosNNioPDJzhSgpKKo74YFbAenDCCE7-2Bkqr8yx5SEv084ovLO1u39hYHllO4yfmmzrwczMZ24bimgvQg0j9VfPKhsZ6407LPtIKOe92SaPIyJNjOYcQUjO5GidlprUpwSgFJX1-2FmYyDnyXm5ii6-2BzSv3dLzcrWZIMfMen51r0Zw-3D-3D

I saw an issue with Recommended Systems that was extremely similar, and the solution was outlined here: https://blog.stigok.com/2021/02/28/sqlalchemy-postgres-ssl-eof-detected.html

Ultimately, it was to include a pool_pre_ping configuration in the config.py file. It looks like this:

SQLALCHEMY_ENGINE_OPTIONS = {"pool_pre_ping": True}

Here's how it works:

The reasons for the SSL SYSCALL error: EOF detected is that the client (ORM) thinks that the TCP connection is still up, but the server has already hung up without saying so. The client then starts sending a query down the pipe and when it does, it notices the connection is broken, resulting in a sudden EOF.

What pool_pre_ping does is to test the connection before attempting to execute the actual query. This comes with an extra round-trip for all queries, but at least in my small-scale application this doesn’t matter at all. Behind the scenes, it sends a query similar to SELECT 1 to sort of ping the database. If it succeeds it follows up with the actual query you wanted to send – if it fails it recycles the connection along with all other connections established earlier than the connection it tried, and establishes a new one before sending the query again.

Add TinyMCE support to manual article entry.

This should be an easy quick fix. And should be present with or without markdown support.

Finding Epub Images

Epubs seem to have very limited consistency with how they organize their internal file structure.

I haven't figured out a great way of finding the image folder.

images = soup.find_all('img')
if images:
    for img in images:
        img["loading"] = "lazy"
        filename = img['src']
        filename = filename.replace("../", path + "/")

        if not os.path.exists(filename):
            filename = f"{path}/{img['src']}"

        if not os.path.exists(filename):
            filename = f"{path}/EPUB/media/{img['src']}"

        if not os.path.exists(filename):
            filename = f"{path}/EPUB/images/{img['src']}"

        if not os.path.exists(filename):
            filename = img['src']
            filename = filename.replace("../", path + "/OEBPS/")

Whenever I encounter an epub whose images don't load, I need to load up the epub, look at the folder structure and then manually add in the branching path.

I'm sure there's a better way to search the epub to locate the image folder itself which would work for any yet undiscovered filepaths.
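One possible approach, sketched under a few assumptions: the epub is already extracted to a folder, and the directory of the HTML file being parsed is known. Image paths in an epub are normally relative to the document that references them, so resolving against that directory first should cover most books; a filename index built by walking the whole folder then catches anything else. The function names and the extension list are hypothetical.

```python
import os
import posixpath
from urllib.parse import unquote

IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp")


def build_image_index(epub_root):
    """Map each image filename to every path where it occurs in the epub."""
    index = {}
    for dirpath, _dirs, files in os.walk(epub_root):
        for name in files:
            if name.lower().endswith(IMAGE_EXTENSIONS):
                index.setdefault(name, []).append(os.path.join(dirpath, name))
    return index


def resolve_image(src, html_dir, index):
    """Resolve an img src to a file on disk, or None if nothing matches."""
    src = unquote(src)  # src attributes may be percent-encoded
    # First try the src relative to the HTML file that contains it.
    candidate = os.path.normpath(os.path.join(html_dir, src))
    if os.path.exists(candidate):
        return candidate
    # Fall back to looking the filename up anywhere in the epub.
    matches = index.get(posixpath.basename(src), [])
    return matches[0] if matches else None
```

The index is built once per epub, so the loop over images would call `resolve_image(img['src'], html_dir, index)` instead of growing a new branch for each folder layout. If two folders contain the same filename the fallback simply takes the first match, which may need refining.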

Titles in Highlights Email are Tiny

See screencap here from a gmail edition of the "Recent Highlights Email" (also on mobile):

By changing h6 to h4 tag, it looks better:

Separate recommendation for consideration:

Group highlights from individual articles together?

Deleting articles

I could not find a button to do it. The closest thing I found was "Archive", where the article disappears into an undisclosed location.

Offtopic: Great app! I am thinking of replacing Pocket with it.

Incomplete Unit tests

The unit tests haven't been updated in a long time and need to be written from scratch. Currently the app is tested manually when changes are made and then tested again in a staging environment, but human error and all that.

Need to first brainstorm what the unit tests should be and then write them.