roznoshchik / lurnby
A tool for active reading and personal knowledge management
Home Page: https://www.lurnby.com
License: BSD 3-Clause "New" or "Revised" License
def add_by_email():
    recipient = request.form['to']
    if '<' in recipient:
        recipient = recipient.split('<')[1][:-1]
This is the code used to get the recipient of the email. When someone emails something to Lurnby and request.form['to'] contains more than one email address, the function fails because it doesn't pull out the right address.
A better solution would likely be to use a regex to pull out the address that ends in @add-article.lurnby.com.
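A minimal sketch of that regex approach, assuming the "to" field is a plain string that may hold several addresses. The helper name extract_lurnby_recipient and the allowed local-part characters are assumptions, not the app's actual code:

```python
import re

# Matches any address at the add-article domain. The character class for the
# local part is an assumption about what Lurnby generates.
ADD_ARTICLE_RE = re.compile(
    r"[A-Za-z0-9._%+-]+@add-article\.lurnby\.com", re.IGNORECASE
)


def extract_lurnby_recipient(to_field):
    """Return the first @add-article.lurnby.com address in the field, or None."""
    match = ADD_ARTICLE_RE.search(to_field)
    return match.group(0).lower() if match else None
```

Because the search ignores everything that is not an add-article address, it should not matter where that address sits in the list or whether it is wrapped in angle brackets.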
For users who want to self-host Lurnby, it would be useful to be able to import the data exported from the current website, https://www.lurnby.com.
My current use case is that I'm testing your application and would like to self-host it later.
Currently Lurnby's highlighting only works for text content. Ideally you should be able to highlight images, graphs, and charts.
The existing libraries all seem to rely on making a screenshot by recreating the DOM, and I'm not sure that would work in all use cases, such as PDFs and the like.
In any case, a few things would need to be done for that.
Lurnby has a lot of things that aren't obvious. There are some videos that show the different functionality in action, but it would be much better to also have a focused getting started guide that linked you to guides on the different features and functions.
These would probably be a combination of text and GIFs.
Would be nice to add markdown support to the different text area inputs in the site.
For storing image content, the app currently sends images to Amazon S3. This is fine for the web-app version, but if the app is meant to run locally, it's not necessary.
There should be a flag somewhere that determines whether this is a web app or an offline app, and that removes the Amazon dependency in the offline case.
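One way such a flag could look, as a rough sketch only. The LURNBY_OFFLINE environment variable, the function names, and the upload_to_s3 callback are all hypothetical and stand in for whatever the app really uses:

```python
import os
from pathlib import Path


def is_offline_mode(env=None):
    """Read a hypothetical LURNBY_OFFLINE flag from the environment."""
    env = os.environ if env is None else env
    return env.get("LURNBY_OFFLINE", "").lower() in {"1", "true", "yes"}


def store_image(data, filename, *, offline, local_dir, upload_to_s3=None):
    """Save an image locally in offline mode, otherwise hand it to an uploader.

    upload_to_s3 is a placeholder callable (data, filename) -> location, so
    the S3 client only needs to exist when the app runs as a web app.
    """
    if offline:
        target = Path(local_dir) / filename
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
        return str(target)
    if upload_to_s3 is None:
        raise RuntimeError("web-app mode needs an upload_to_s3 callable")
    return upload_to_s3(data, filename)
```

Passing the uploader in as an argument keeps the Amazon dependency out of the import path entirely when the flag is set.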
For some of the planned features for Lurnby, including offline support and native mobile apps, it's important to first separate the data from the application to allow for multiple clients. This is a sketch of the API:

Method | Endpoint | Description |
---|---|---|
POST | /user | create new user |
GET | /user/<id> | get user info |
GET | /user/<id>/email | enable add by email |
GET | /user/<id>/senders | get approved senders |
PUT | /user/<id>/senders | update approved senders |
GET | /user/<id>/export | export all of the user's data |
GET | /user/<id>/preferences | get user communication preferences |
PUT | /user/<id>/preferences | update user communication preferences |
PUT | /user/<id> | update user |
DELETE | /user/<id> | delete user |

Method | Endpoint | Description |
---|---|---|
POST | /authorize | log in / receive tokens |
POST | /refresh | refresh tokens |

Method | Endpoint | Description |
---|---|---|
GET | /article | get articles |
POST | /article | create new article |
GET | /article/<id> | get article |
PUT | /article/<id> | update article |
DELETE | /article/<id> | delete article |
GET | /article/<id>/notes | get article notes |
PUT | /article/<id>/notes | update article notes |
GET | /article/<id>/highlights | get article highlights |
GET | /article/<id>/export | export article |

Method | Endpoint | Description |
---|---|---|
GET | /highlight | get highlights |
GET | /highlight/export | export highlights |
POST | /highlight | create new highlight |
GET | /highlight/<id> | get highlight |
PUT | /highlight/<id> | update highlight |
DELETE | /highlight/<id> | delete highlight |
GET | /highlight/review | get highlights for review |

Method | Endpoint | Description |
---|---|---|
GET | /tag | get tags |
POST | /tag | create new tag |
GET | /tag/<id> | get tag |
PUT | /tag/<id> | update tag |
DELETE | /tag/<id> | delete tag |
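
To make the shape of these routes concrete, here is a small framework-free sketch that resolves a method and path against the tag endpoints. The handler names, and the use of DELETE as the HTTP method for the delete rows, are assumptions for illustration. A real implementation would more likely register these routes with the web framework the app already uses.

```python
import re

# (method, path template, hypothetical handler name) for the tag endpoints.
ROUTES = [
    ("GET", "/tag", "list_tags"),
    ("POST", "/tag", "create_tag"),
    ("GET", "/tag/<id>", "get_tag"),
    ("PUT", "/tag/<id>", "update_tag"),
    ("DELETE", "/tag/<id>", "delete_tag"),
]


def compile_route(template):
    """Turn a template such as /tag/<id> into a regex with named groups."""
    pattern = re.sub(r"<(\w+)>", r"(?P<\1>[^/]+)", template)
    return re.compile(f"^{pattern}$")


COMPILED = [(method, compile_route(tpl), handler) for method, tpl, handler in ROUTES]


def resolve(method, path):
    """Return (handler name, path params) for a request, or None if unmatched."""
    for route_method, pattern, handler in COMPILED:
        match = pattern.match(path)
        if match and route_method == method.upper():
            return handler, match.groupdict()
    return None
```

The same list-of-tuples layout would extend to the user, article, and highlight tables without changing resolve.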
Currently lurnby.com doesn't work offline. If you try to access it offline, the service worker just shows a standard "this app doesn't work offline" message.
But the idea is that it should also work offline to some degree, although I am not sure exactly how much.
A simple idea is that it should cache the x most recent articles so that you could read them offline. Or it should cache the x most recent highlights so that review is possible.
In the case of articles, I think that becomes a bit challenging when figuring out how to also allow highlighting in offline mode.
Highlights actually change the text of the article so to create a highlight object, you would need to:
One possible solution is that when a highlight is created while offline, it is created with a temporary ID and then rendered to the screen as normal. A JavaScript object takes the place of the DB.
Once network functionality is regained and an actual ID is generated by the DB, the article text gets updated so that the highlight points to the proper place.
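The temporary-ID flow can be sketched as below. This is written in Python for consistency with the rest of the codebase, although in the browser it would be JavaScript. The span markup, the data-id attribute, and the function names are assumptions and not how Lurnby actually stores highlights:

```python
import uuid


def create_offline_highlight(article_html, selected_text, pending):
    """Wrap the first occurrence of the selection and record it under a temp ID."""
    temp_id = f"tmp-{uuid.uuid4().hex}"
    marked = f'<span class="highlight" data-id="{temp_id}">{selected_text}</span>'
    pending[temp_id] = {"text": selected_text}  # stands in for the DB while offline
    return article_html.replace(selected_text, marked, 1), temp_id


def reconcile(article_html, pending, id_map):
    """After syncing, swap each temporary ID in the text for the real DB ID."""
    for temp_id, real_id in id_map.items():
        article_html = article_html.replace(
            f'data-id="{temp_id}"', f'data-id="{real_id}"'
        )
        pending.pop(temp_id, None)
    return article_html
```

Here id_map is whatever mapping of temporary to real IDs the server returns once the queued highlights have been saved.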
All highlights are currently connected to an article inside the app. This is a limitation that shouldn't be there, as you should be able to import highlights from anywhere and start using them with the platform.
The easiest approach might be to add a boolean external=True field and an externalSource=... field to the highlight model.
Then the highlight page should have a create highlight button to allow for manual creation.
It should also have an import highlights button that uses something like this import code as a base.
The web extensions can then also be updated to allow for sending highlights, and not just for sending the URL to the app.
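A rough sketch of what the two extra fields might look like, using a plain dataclass instead of the real database model. The snake_case name external_source, the validation rule, and the import_highlights helper are assumptions:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Highlight:
    text: str
    article_id: Optional[int] = None       # no article for external highlights
    external: bool = False                 # True when created outside an article
    external_source: Optional[str] = None  # e.g. the name of the importing tool

    def __post_init__(self):
        # Assumed rule: only external highlights may exist without an article.
        if not self.external and self.article_id is None:
            raise ValueError("internal highlights need an article_id")


def import_highlights(rows, source):
    """Build external highlights from imported rows, skipping empty ones."""
    return [
        Highlight(text=row["text"], external=True, external_source=source)
        for row in rows
        if row.get("text")
    ]
```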
Since the highlights and most text inputs were switched over to support HTML, the content now being exported is HTML. This limits what you can use the content for once you export it from Lurnby.
As a user, I would want to specify when choosing my export whether highlights and notes should be exported as HTML content or parsed into their plaintext versions.
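A minimal sketch of the plaintext option using only the standard library's html.parser. The function names, the as_html switch, and the choice of which tags start a new line are assumptions:

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text content, starting a new line at block-level tags."""

    BLOCK_TAGS = {"p", "div", "br", "li", "blockquote", "h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_plaintext(html):
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace and drop empty lines.
    lines = (" ".join(line.split()) for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)


def export_content(content, as_html=True):
    """Return stored HTML unchanged, or its plaintext version on request."""
    return content if as_html else html_to_plaintext(content)
```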
The current PDF library leaves a lot to be desired.
It only works for simple PDFs with plain images and text.
Anything more complex, with graphs, charts, etc., comes through very poorly.
One idea is to just work with PDFs as images, and then possibly run OCR on the text content.
But there is a lot that needs to be explored there to render things properly so that it works with Lurnby.
There is an issue where, if you leave a screen open for too long, the database connection closes before the app connection does. If someone then tries to do something on that page, they get an error that looks like this:
The log for this issue is:
Exception on /app/articles [GET]
Traceback (most recent call last):
File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1283, in _execute_context
self.dialect.do_execute(
File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 590, in do_execute
cursor.execute(statement, parameters)
psycopg2.OperationalError: terminating connection due to administrator command
SSL connection has been closed unexpectedly
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/app/.heroku/python/lib/python3.9/site-packages/flask/app.py", line 2447, in wsgi_app
response = self.full_dispatch_request()
File "/app/.heroku/python/lib/python3.9/site-packages/flask/app.py", line 1952, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/app/.heroku/python/lib/python3.9/site-packages/flask_cors/extension.py", line 165, in wrapped_function
return cors_after_request(app.make_response(f(*args, **kwargs)))
File "/app/.heroku/python/lib/python3.9/site-packages/flask/app.py", line 1821, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/app/.heroku/python/lib/python3.9/site-packages/flask/_compat.py", line 39, in reraise
raise value
File "/app/.heroku/python/lib/python3.9/site-packages/flask/app.py", line 1948, in full_dispatch_request
rv = self.preprocess_request()
File "/app/.heroku/python/lib/python3.9/site-packages/flask/app.py", line 2242, in preprocess_request
rv = func()
File "/app/app/__init__.py", line 75, in before_request_func
if current_user.is_authenticated:
File "/app/.heroku/python/lib/python3.9/site-packages/werkzeug/local.py", line 432, in __get__
obj = instance._get_current_object()
File "/app/.heroku/python/lib/python3.9/site-packages/werkzeug/local.py", line 554, in _get_current_object
return self.__local() # type: ignore
File "/app/.heroku/python/lib/python3.9/site-packages/flask_login/utils.py", line 26, in <lambda>
current_user = LocalProxy(lambda: _get_user())
File "/app/.heroku/python/lib/python3.9/site-packages/flask_login/utils.py", line 346, in _get_user
current_app.login_manager._load_user()
File "/app/.heroku/python/lib/python3.9/site-packages/flask_login/login_manager.py", line 329, in _load_user
user = self._load_user_from_remember_cookie(cookie)
File "/app/.heroku/python/lib/python3.9/site-packages/flask_login/login_manager.py", line 372, in _load_user_from_remember_cookie
user = self._user_callback(user_id)
File "/app/app/models.py", line 135, in load_user
return User.query.get(int(id))
File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 1021, in get
return self._get_impl(ident, loading.load_on_pk_identity)
File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 1138, in _get_impl
return db_load_fn(self, primary_key_identity)
File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/orm/loading.py", line 287, in load_on_pk_identity
return q.one()
File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3436, in one
ret = self.one_or_none()
File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3405, in one_or_none
ret = list(self)
File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3481, in __iter__
return self._execute_and_instances(context)
File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3506, in _execute_and_instances
result = conn.execute(querycontext.statement, self._params)
File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1020, in execute
return meth(self, multiparams, params)
File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1133, in _execute_clauseelement
ret = self._execute_context(
File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1323, in _execute_context
self._handle_dbapi_exception(
File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1517, in _handle_dbapi_exception
util.raise_(
File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 178, in raise_
raise exception
File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1283, in _execute_context
self.dialect.do_execute(
File "/app/.heroku/python/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 590, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) terminating connection due to administrator command
SSL connection has been closed unexpectedly
[SQL: SELECT "user".id AS user_id, "user".goog_id AS user_goog_id, "user".firstname AS user_firstname, "user".username AS user_username, "user".email AS user_email, "user".password_hash AS user_password_hash, "user".admin AS user_admin, "user".test_account AS user_test_account, "user".deleted AS user_deleted, "user".suggestion_id AS user_suggestion_id, "user".account_created_date AS user_account_created_date, "user".last_active AS user_last_active, "user".last_action AS user_last_action, "user".tos AS user_tos, "user".token AS user_token, "user".token_expiration AS user_token_expiration, "user".preferences AS user_preferences, "user".add_by_email AS user_add_by_email, "user".review_count AS user_review_count
FROM "user"
WHERE "user".id = %(param_1)s]
[parameters: {'param_1': 9}]
(Background on this error at: http://url2468.lurnby.com/ls/click?upn=O95o0jN-2F92mJdpf7ZhEbgLXtFoI7wvvm0Nlb71SAqYc7I4OlCr9vuEBttZ6PLGT1WNJK_-2FZKyCIFGBlWZyR9dtbwHIdOuvcuQq8Y2fMr-2B-2FbeShjLyosNNioPDJzhSgpKKo74YFbAenDCCE7-2Bkqr8yx5SEv084ovLO1u39hYHllO4yfmmzrwczMZ24bimgvQg0j9VfPKhsZ6407LPtIKOe92SaPIyJNjOYcQUjO5GidlprUpwSgFJX1-2FmYyDnyXm5ii6-2BzSv3dLzcrWZIMfMen51r0Zw-3D-3D
I saw an extremely similar issue with Recommended Systems, and the solution is outlined here: https://blog.stigok.com/2021/02/28/sqlalchemy-postgres-ssl-eof-detected.html
Ultimately, the fix was to include a pool_pre_ping configuration in the config.py file. It looks like this:
SQLALCHEMY_ENGINE_OPTIONS = {"pool_pre_ping": True}
Here's how it works:
The reasons for the SSL SYSCALL error: EOF detected is that the client (ORM) thinks that the TCP connection is still up, but the server has already hung up without saying so. The client then starts sending a query down the pipe and when it does, it notices the connection is broken, resulting in a sudden EOF.
What pool_pre_ping does is to test the connection before attempting to execute the actual query. This comes with an extra round-trip for all queries, but at least in my small-scale application this doesn’t matter at all. Behind the scenes, it sends a query similar to SELECT 1 to sort of ping the database. If it succeeds it follows up with the actual query you wanted to send – if it fails it recycles the connection along with all other connections established earlier than the connection it tried, and establishes a new one before sending the query again.
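The pre-ping idea described above can be illustrated with a tiny wrapper around the standard library's sqlite3 module. This is only a sketch of the concept and not how SQLAlchemy implements it. The class name and the reconnects counter are made up for the example:

```python
import sqlite3


class PrePingConnection:
    """Ping with SELECT 1 before each query and reconnect if the ping fails."""

    def __init__(self, connect):
        self._connect = connect  # zero-argument factory returning a connection
        self._conn = connect()
        self.reconnects = 0

    def _ping(self):
        try:
            self._conn.execute("SELECT 1")
            return True
        except sqlite3.Error:
            return False

    def execute(self, sql, params=()):
        if not self._ping():
            # Stale connection: replace it before running the real query.
            self._conn = self._connect()
            self.reconnects += 1
        return self._conn.execute(sql, params).fetchall()
```

The cost is the same as in the quoted explanation: one extra round-trip per query in exchange for not surfacing a dead connection to the user.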
This should be a quick fix, and it should be present with or without Markdown support.
Epubs seem to have very limited consistency with how they organize their internal file structure.
I haven't figured out a great way of finding the image folder.
images = soup.find_all('img')
if images:
    for img in images:
        img["loading"] = "lazy"
        filename = img['src']
        filename = filename.replace("../", path+"/")
        if not os.path.exists(filename):
            filename = f"{path}/{img['src']}"
        if not os.path.exists(filename):
            filename = f"{path}/EPUB/media/{img['src']}"
        if not os.path.exists(filename):
            filename = f"{path}/EPUB/images/{img['src']}"
        if not os.path.exists(filename):
            filename = img['src']
            filename = filename.replace("../", path+"/OEBPS/")
Whenever I encounter an epub whose images don't load, I need to open the epub, look at the folder structure, and then manually add another branch for that path.
I'm sure there's a better way to search the epub and locate the image folder itself, one that would work for any as-yet-undiscovered file paths.
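One possible approach, sketched under the assumption that the epub has already been extracted to a directory: walk the whole tree and match on the image's file name, instead of guessing folder names. The function name find_image is hypothetical:

```python
import os


def find_image(root, src):
    """Search an extracted epub for the file an <img src> refers to.

    Only the base name of src is used, so ../ prefixes and unknown folder
    layouts do not matter. Returns the first match, or None if none exists.
    """
    wanted = os.path.basename(src)
    for dirpath, _dirnames, filenames in os.walk(root):
        if wanted in filenames:
            return os.path.join(dirpath, wanted)
    return None
```

A caveat with this sketch: if two folders contain images with the same file name, it returns whichever one os.walk reaches first, so a stricter version might compare more of the path.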
I could not find a button to do it. The closest thing I found was "Archive", where the article disappears into an undisclosed location.
Offtopic: Great app! I am thinking of replacing Pocket with it.
The unit tests haven't been updated in a long time and need to be written from scratch. Currently the app is tested manually when changes are made and then tested again in a staging environment, but human error and all that.
Need to first brainstorm what the unit tests should be and then write them.
I received an email from Lurnby the other day that looked like this:
It looks like the sender name is coming through as "team" because of "[email protected]" in the Config.py file.
To fix this, you can format the name as Name <[email protected]>. Reference: https://stackoverflow.com/questions/44385652/add-senders-name-in-the-from-field-of-the-email-in-python
I did this in RecSys and it made my emails more user friendly & better branded.
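For reference, the standard library can build that "Name <address>" form, including any quoting the display name needs. The helper name and the example address below are placeholders, not the values from the app's config:

```python
from email.utils import formataddr


def sender_header(display_name, address):
    """Build a From value of the form: Display Name <address>."""
    return formataddr((display_name, address))
```

For example, sender_header("Lurnby", "team@example.com") returns "Lurnby <team@example.com>".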
Currently the app can't parse articles that are behind a paywall.
This includes Medium articles that are rate limited, as well as articles from Bloomberg properties like CityLab.
The alternative for now is to copy and paste, adding the articles manually.
In the future, one option is to connect via an API, or to look into how a user might be authenticated to these sites.