croque-scp / notifier

Forum notification service for Wikidot

Home Page: http://notifications.wikidot.com

License: MIT License

Languages: Python 75.33%, SQL 13.41%, HTML 10.13%, Shell 0.58%, Dockerfile 0.55%
Topics: wikidot, scp, scp-wiki, scp-foundation, lambda, mysql


Wikidot Notifications


This is the open-source codebase for Wikidot Notifications, a cloud service providing forum notifications for sites on Wikidot.

This notifications service searches for new forum posts on Wikidot and delivers notifications, via email or Wikidot private message, to users who are subscribed to them. Manual subscriptions are supported, as well as a set of rules defining 'automatic subscriptions'. Notifications can be delivered in several languages, contributed by members of the Wikidot community.

This service is operated and developed by Wikidot user Croquembouche and is not associated with Wikidot Inc. or any particular site hosted on Wikidot other than notifications.wikidot.com.


The notifications service is written in Python and runs on AWS Lambda using a MySQL database on AWS EC2.

Usage

Warning

There must only be one instance of this service active. Duplication would result in duplicated messages and would cause spam.

The instructions below are provided in case this specific service is no longer able to operate and/or I am no longer able to maintain it. Do not attempt to launch another instance of this service outside of that circumstance.

Installation

With Docker

Requires Docker.

The Dockerfile specifies a number of stages. For local testing, set the target stage to 'execute':

docker build --target execute --tag notifier:latest .

Locally

Requires at least Python 3.8.

Via Poetry:

poetry install

Authentication

In addition to a config file (based on the one provided in this repository), notifier requires an authentication file that provides passwords etc. for the various services it needs.

See docs/auth.md for more information and instructions.

Database setup

For local development and testing, notifier requires a MySQL database. See docs/database.md for more information and instructions.

Local execution

To start the notifier service in a Docker container:

docker run --rm notifier:latest path_to_config_file path_to_auth_file

Or locally:

poetry run python3 -m notifier path_to_config_file path_to_auth_file

Or with Docker:

docker build --target execute --tag notifier:execute .
docker run --rm notifier:execute path_to_config_file path_to_auth_file

The config file that my notifier instance uses is config/config.toml. A sample auth file with dummy secrets, used for CI tests, can be found at config/auth.ci.toml.

The service will run continuously and activate an automatically-determined set of notification channels each hour.

To activate an automatically-determined set of channels immediately and once only, add the --execute-now switch with no parameter. Note that this must be run during the first minute of an hour to match any channels.

To activate a manually-chosen channel or set of channels immediately and once only, even at a time when such a channel would not normally be activated, add the --execute-now switch followed by any of hourly, 8hourly, daily, weekly, monthly, or test.

The test channel will never be activated during normal usage. Note that the user config setting for the test channel is hidden, and can be selected by executing the following JavaScript while editing a user config page:

document.querySelector("[name=field-frequency]").value = "test"

To restrict which wikis posts will be downloaded from, add --limit-wikis [list].
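
For example, a combined invocation might look like the following (whether wiki IDs are passed space-separated is an assumption; check the CLI's help output for the exact syntax):

poetry run python3 -m notifier path_to_config_file path_to_auth_file --execute-now hourly --limit-wikis scp-wiki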

Remote deployment

In production, the notifier service is not intended to be executed locally, or even as a continuously-running service; rather, it is deployed to the cloud using AWS Lambda with a handler that calls --execute-now.

See docs/deployment.md for more information and instructions.

Development

Produce a sample digest and print it to stdout, where [lang] is the code of any supported language and [method] is either pm or email:

poetry run python3 tests/make_sample_digest.py [lang] [method]

Lint:

poetry run pylint notifier
poetry run black notifier

Typecheck:

poetry run mypy notifier

Testing

Testing locally

To run tests directly on your machine:

poetry run pytest --notifier-config path_to_config_file --notifier-auth path_to_auth_file

"_test" will be appended to whatever database name is configured, as described above. Database tests (tests/test_database.py) require that this database already exist.

I recommend using a MySQL server on localhost for tests.
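
For example, a minimal sketch of creating that test database with pymysql (the database name "notifier" and the connection credentials are placeholders; use whatever your config and auth files actually specify):

import pymysql

# Connect as a user with CREATE privileges; host and credentials are placeholders.
conn = pymysql.connect(host="localhost", user="root", password="root")
with conn.cursor() as cursor:
    # If the configured database name is "notifier", tests expect "notifier_test".
    cursor.execute("CREATE DATABASE IF NOT EXISTS notifier_test")
conn.commit()
conn.close()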

Testing with Docker

A Docker Compose setup is present that will spin up a temporary MySQL database and run tests against it:

docker compose -f docker-compose.test.yml up --build --abort-on-container-exit

Status

Status frontends are located at:

Contributors

h-lekter, hoah2333, rossjrw, ulrik54


notifier's Issues

Investigate cheaper database using EC2

My cost breakdown so far is as follows:

| Resource | Service | Cost |
|---|---|---|
| Aurora Serverless | Usage | 60% |
| Tax | | 20% |
| Aurora Serverless | IO | 11% |
| Secrets Manager | Storage | 8% |
| Aurora Serverless | Storage | ~0% |
| Secrets Manager | IO | ~0% |
| Lambda | Usage | 0% |

While it feels great being fully serverless, Aurora Serverless is still the vast majority of my costs. It'd be real nice to reduce that.

One way I could do that would be to use an EC2 instance as a database host.

When architecting this initially, I dismissed this option as I was of the understanding that an hour was the minimum billable usage granularity for EC2. The notifier runs every hour, so this would be effectively continuous usage. At that point, EC2 was about equivalent to Aurora Serverless price-wise, which was otherwise the better option.

However, either my assumption was incorrect or that information has changed. Billing for most instance types is actually per-second, with a 60-second minimum. Additionally, there are several ways to reduce the cost further, including Savings Plans and Reserved Instances - but by far the most efficient is Spot Instances, which only come with the caveat that AWS may withdraw their compute capacity to give to other On-Demand customers. However, they give 2 minutes' notice for this, so if I can fit all of my database operations into a two-minute timeframe, this won't affect me at all.

That is unlikely, though, especially considering that I'll need to first boot up and also shut down the instance with the main lambda. More likely I'd need some sort of spot instance management system - I'll have to look up what the standard way of going about it is.
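
For reference, a minimal sketch of the detection half of such a system: a spot instance learns of its own interruption through the instance metadata service, whose spot/instance-action path returns 404 until an interruption is scheduled. The shutdown step is a placeholder:

import time
import urllib.error
import urllib.request

# Documented IMDS path; returns 404 until AWS schedules an interruption.
INSTANCE_ACTION_URL = "http://169.254.169.254/latest/meta-data/spot/instance-action"

def interruption_pending() -> bool:
    try:
        with urllib.request.urlopen(INSTANCE_ACTION_URL, timeout=2) as response:
            return response.status == 200
    except urllib.error.URLError:
        return False

while not interruption_pending():
    time.sleep(5)
# From here there are roughly 2 minutes to flush and stop MySQL cleanly.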

I would also need to a) attach an EBS volume to give the EC2 instance persistent storage, and b) install MySQL (or use Docker). But it should be cheaper than Aurora Serverless regardless.

Purge old data

Execution time tends to increase:

[Graph: execution duration increasing over successive runs]

Linear increase suggests to me that this is growing with the number of downloaded posts (as opposed to the number of users, which I would expect to elicit irregular jumps). (It's worth noting that the large jumps depicted are from me refining and optimising the application.)

If this assumption is true, could it be the case that a lot of the posts I have stored are junk data which will never be useful?

If so, and if they can be detected, could clearing them out on a regular basis help to flatten out the execution time increase?

Autosubscribe users to their pages' discussions

The frontmatter of a thread, if the thread is a page discussion, contains the slug of the page that it discusses. This is the only available information about the page.

From this, I can in theory track the page and who created that page, and autosubscribe that user to it if they're not already. This is easier said than done for several reasons:

  • There are several Wikidot calls in between me and the author information.
  • Several of the wikis that use the notification service have attribution metadata that is entirely external, and not consistent between wikis.

It doesn't make much sense to reinvent the wheel here (like I have before... rossjrw/tars#170 rossjrw/tars#221).

If I ask SMLT nicely I might be able to use Crom as a centralised source of attribution data. The set of wikis that Crom covers broadly overlaps with the set that notifier covers.

I will need to bear in mind the following...

  • Attribution metadata can change over time. How will I keep my internal records as up-to-date as they need to be while minimising load on Crom?
  • Should I be recording attribution metadata in my database at all?
    • I think yes, because I need to know it in order to know who to notify - especially after #59 where all of that work happens in the database
  • What happens if attribution metadata changes, but I don't refresh my cache in time, and the wrong person gets notified?
    • In theory the right person would be notified once the cache refreshes, because I track whether someone has been notified about a post per user rather than per post. BUT that wouldn't happen if they were notified in the meantime, because I only track the date of the most recent post they were notified about, and assume they were already notified about anything posted before then.
  • I could reduce load generally by only handling author stuff, say, in a cleanup task that runs daily. Effectively these new autosubscriptions would be run at daily intervals. Hourly users wouldn't receive these notifications for 23 of 24 runs, 8hourly users for 2 of 3 runs, and daily+ users would be unaffected.
    • This will be really hard to document in a way that makes sense to users.
    • I also just flat out don't like it. I want each run of each channel to be the same.

I will need to discuss with SMLT the best ways of doing these things.
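
To make that discussion concrete, here is a sketch of what fetching attribution might look like. Crom does expose a GraphQL API, but the endpoint URL, query shape, and field names below are all assumptions to be verified against Crom's actual schema:

import requests

# Hypothetical endpoint and schema - verify against Crom's documentation.
CROM_ENDPOINT = "https://api.crom.avn.sh/graphql"
ATTRIBUTION_QUERY = """
query ($url: URL!) {
  page(url: $url) {
    attributions {
      user { name }
      type
    }
  }
}
"""

def fetch_attributions(page_url: str) -> list:
    response = requests.post(
        CROM_ENDPOINT,
        json={"query": ATTRIBUTION_QUERY, "variables": {"url": page_url}},
        timeout=10,
    )
    response.raise_for_status()
    return response.json()["data"]["page"]["attributions"]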

Notify about mentions of user's pages

After #77, and similarly to #88, it should be possible to be notified about URL mentions of pages the user has created.

Similarly to #88, there is a possible attack vector here; I'll contain discussion about it to #88.

Notify about mentions by name

The service could notify users when their name is mentioned in forum posts.

First suggested here by Kufat: https://scp-wiki.wikidot.com/forum/t-14209085/forum-notifications-service#post-5094544

This would be a new type of configuration. The UI would probably be a separate checkbox on the user config page.

Unsubscribed threads should be respected where possible.

Not sure what the correct approach is re grandfathering in existing users. I foresee a possible attack vector where a malicious user could ping lots of subscribed users to spam them - in fact this already happens on e.g. IRC. It might end up being the case that this is a poor idea.
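
If it does go ahead, the detection itself is straightforward. A minimal sketch, assuming a mention means the literal username on word boundaries (the function and matching rule are illustrative; real Wikidot posts may mention users via link syntax, which would need richer parsing):

import re

def mentions_user(post_text: str, username: str) -> bool:
    # Case-insensitive match on the literal username, not inside a larger word.
    pattern = re.compile(r"(?<!\w)" + re.escape(username) + r"(?!\w)", re.IGNORECASE)
    return bool(pattern.search(post_text))

# mentions_user("Thanks Croquembouche!", "Croquembouche") -> True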

Indicate how many notifications are waiting for error users

When a user has an error in their config that prevents Notifier from sending them anything, current behaviour is to tag their config page with a reason for the error. The config wiki reacts to the tag by adding an explanation of what the error is, but still cannot notify the user.

An error user might be encouraged to fix their error if they were able to see how many notifications were actually waiting for them. It could be worth exploring adding some metadata to the page to show that value, which would not only encourage an ambivalent user to fix the issue, but may also encourage other users stumbling across the page to let them know. As someone who frequently stumbles across config pages myself, having a citable number would give me some authority when justifying why I am reaching out to a user.

The value being added to the page must account for the following:

  • The user must not be able to edit the value.
    • That rules out storing the value in a form field on the page, as the user will be able to edit that, even if the field is hidden.
  • It should be easy for Notifier to add this value as part of an automated process.
    • This should be pretty easy - almost anything you can do on Wikidot is exposed as a module.
  • The information added by the value must be able to be expressed verbosely on the page.
    • This would be easy if, for example, the value was a tag - I could add a note that says 'Hi %%user%%, you have %%tag%% notifications waiting for you.'
    • This rules out storing this value solely in the database, as it would never be exposed to the user.

A few ideas on where this value could be stored:

  • As a tag
    • Easily extend extant effects
    • I don't know if tags can be parsed. Ideally I'd have something like 'waiting-notifs:20'. But as far as I know tags are just immutable strings, and you can't filter a ListPages by pages with tags that start with a string. I could add 'waiting-notifs 20', filter for 'waiting-notifs', and then no parsing is needed - but how do I then select the numeric tag for display?
    • There is the option of adding the value as a hidden tag and then using %%_tags%% to show all of them - which would just be the value, assuming it's the only one. I think there would still be an underscore to remove though.
      • The leading underscore can actually be removed by interpolating it as part of underline formatting. CSS could then be used to strip the underline.
  • As a page HTML meta attribute
    • Not queryable with ListPages, so not useful for reporting.

It looks like using a hidden tag is the way to go:

[[module ListPages range="."]]
_%%_tags%%__
[[/module]]

[[module CSS]]
span[style="text-decoration: underline;"] {
  text-decoration: none !important;
}
[[/module]]

Stats frontend: Provide minimal stats file

Requires #46

The status frontend only needs timing information from the single latest run. I expect this to be downloaded a lot more than the graphs frontend. A separate stats dataset should be created for it to minimise how much data is transferred.

Regression testing for notification accuracy

As part of #85 I need to regression test that the notifications retrieved for a given user are the same as they were.

This obviously needs to be done outside of those changes so that it has something to test against.

Graphs frontend: sort channels by timestamp

I'm pretty sure I made channels execute in reverse frequency order (58943c9) - therefore the hourly channel should be stacked on top of the others, but it's on the bottom:

[Graph: channel durations stacked, with the hourly channel on the bottom]

This makes the graph inaccurate as it's supposed to show the order of stages that the process goes through.

I don't think I can sort the stacked bars on a per-column basis because they're all different chartjs datasets, but I can sort the datasets so that they're in the right order based on what notifier is supposed to do.

Fix user config titles

There's a bug in user config creation where if the username has a space in it, the title of the resulting user page will be the text of their username up until the space.

This is a limitation of Wikidot's link syntax so I don't think I can do anything about it.

Instead I should correct these broken titles as part of the cleanup task.

I want to allow users to have any title they want. I only want to correct this one specific break, so the cleanup rule should be very permissive.

Stats frontend: better caching

After #81, the stats frontend now uses a lot of data while checking for new updates - if the stats are about 500 kB, then it can use up to 7.5 MB as it redownloads the file each minute.

Instead, better caching should be leveraged.

There must be some sort of way to get fetch() and CloudFront to play nicely with each other, letting the browser know whether the content has actually changed so that only a minimal request needs to be made.
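
The standard mechanism here is conditional requests: the browser caches the ETag that CloudFront returns and sends it back as If-None-Match, receiving a near-empty 304 response when nothing has changed. Sketched below in Python with requests for clarity (the stats URL is a placeholder; the frontend would do the equivalent with fetch()):

from typing import Optional, Tuple

import requests

STATS_URL = "https://example.com/stats.json"  # placeholder

def poll_stats(etag: Optional[str]) -> Tuple[Optional[str], Optional[dict]]:
    """Fetch the stats file only if it changed since the given ETag."""
    headers = {"If-None-Match": etag} if etag else {}
    response = requests.get(STATS_URL, headers=headers, timeout=10)
    if response.status_code == 304:
        return etag, None  # Not modified; only headers were transferred.
    return response.headers.get("ETag"), response.json()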

Offset weekly channel by 1hr

The deletion cleanup task for the weekly channel, which happens on the run that precedes it, currently intersects with the daily channel. Runs that contain the daily channel last the longest and adding this cleanup task to it increases the risk of timeout. (#9)

I don't feel there's any risk involved in stacking the weekly and monthly channels, but in principle the monthly channel should be offset by 1hr as well. Possibly 2 hours so that its cleanup task doesn't intersect with the weekly channel execution.

  • Offset weekly channel
  • Offset monthly channel
  • Update FAQ

Threads missing first post entries

The following query reveals 333 threads in the test dataset (2023-06-07) that are lacking an entry in the first post table:

select
    id,
    substring(title, 1, 20),
    wiki_id,
    creator_username,
    created_timestamp,
    (select count(*) from post where post.thread_id = thread.id) as post_count,
    is_deleted
from thread
where not exists (
    select null from thread_first_post
    where thread_first_post.thread_id = thread.id
)
order by post_count;

Of those, 16 have zero posts associated with them. Most have 1 or 2 posts; some have more, with the top having 47 and the second having 28.

Many, but not all, of these threads have actually been deleted from their wikis. None of them are marked as deleted in the database.

I don't know if this is causing problems or not.

It's possible that this is caused by the first post in a thread being deleted, but the thread remaining in place. This is entirely speculation, though. I actually think that in that case I would only have marked the post as deleted, not actually deleted it, so maybe not.

I assume re-running migration 1 will one-off fix the issue for the current dataset, but I'd really like to discover what the root cause is first.

Service notifications

From time to time I may wish to notify users of changes to the notification service. A big one will be #77.

Couple of ways I could go about this:

  • A thread that all users are always autosubscribed to.

    • I like the idea of using the service in a meta way like this, with minimal added functionality.
    • Users would still be able to manually unsubscribe if they really wanted to.
    • I'd have to be absolutely perfectly 100% certain that only I can post in that thread. I'd have to make sure to only do it for really important things.
  • I'd want users to be notified about posts in that thread, but not about the OP itself. A global override could work for this.
    • The concept of 'has user X seen service notification Y?' is always answered, provided that my logic around notification guaranteeability is sound.
    • Batching service notifications to users who don't receive notifications very often is handled by the current notification system.
    • Downside: users may go ahead and ignore the notification, as only the snippet would be visible.
    • Downside: there's no way to passively delay the service notification until the user would have been notified by something else. It's a notification, so they receive it immediately.
  • A custom thingy that inlines service notifications directly.

    • I can do whatever I want with the notification, e.g. send immediately or wait until the user would be notified.
    • I'll need to track on a per-user basis what service notifications they've seen, so I can always catch them up to speed.
    • I can put the full body of the notification right into the digest.
    • Downside: I can't quickly go and fix a mistake I made in the text and hope that there's still a lot of users who won't have seen it yet.
  • Downside: Service notifications will be committed to version control unless I find some way to store them elsewhere.
  • A mix of the above - a forum thread somewhere, with special logic natively implemented.

    • The database doesn't have the option to store the full thread text, so it would still need to be a snippet.
    • The special logic could make the 'notification' not appear when checking if a user has notifications, but only when gathering notifications for a user, so it would not prompt a digest if it's the only item.
    • The special logic could put the notification at the top of the digest.
    • I can write the post with the snippet's restrictions in mind, to make it appear intentional.
    • The special logic could add decoration to the notification, e.g. 'Read more'

Upgrade to MySQL 8

After #57 I can use whatever database software I like. I should upgrade to MySQL 8.

  • MySQL 8 is compatible with ARM.
  • If I can run on ARM, I can run on AWS Graviton servers
  • Graviton servers are cheaper than their non-Graviton equivalents.

Additionally, while there are non-burstable instance types that I can switch to in order to get both a cost saving and a performance increase, all instance types with both of those criteria are Graviton.

Status frontend: refresh automatically

Most of the states the status frontend can be in involve some sort of 'X mins until Y' statement, which quickly becomes inaccurate. It would be better if those values were automatically updated to reflect the current time, and if the status frontend overall refreshed completely as new runs happen.

Reduce stats file size by reducing repetition

Requires #46

The size of the stats file can be optimised.

Most of the keys end with "_timestamp" - this could be replaced with "_t" or even omitted.

There might be a better way of sending the timestamp data - e.g. I might send only the smallest timestamp, replace each other timestamp with its difference from that one, and calculate the real timestamps on the client.
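
A sketch of that delta encoding (the key names are made up for illustration):

def compress_timestamps(timestamps: list) -> dict:
    """Store one base timestamp plus small offsets instead of full values."""
    base = min(timestamps)
    return {"base": base, "offsets": [t - base for t in timestamps]}

def decompress_timestamps(data: dict) -> list:
    return [data["base"] + offset for offset in data["offsets"]]

# compress_timestamps([1700000000, 1700000060, 1700000125])
# -> {"base": 1700000000, "offsets": [0, 60, 125]}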

Custom crontabs

Other than the first term needing to be 0, there's no reason a custom crontab per user wouldn't work.

I'd need to be sure to honour it only if the frequency is set to custom, even if the field is filled.

I'd need to only show the field if the custom frequency is selected. Possible with CSS alone?
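
As a sketch of the matching logic, using the croniter library (the "custom" frequency value and the function shape are assumptions about how the config would be extended):

from datetime import datetime

from croniter import croniter

def custom_channel_matches(frequency: str, crontab: str, now: datetime) -> bool:
    # Honour the crontab only when the frequency is explicitly set to custom.
    if frequency != "custom":
        return False
    # The first (minute) term must be 0, since runs only start on the hour.
    if crontab.split()[0] != "0":
        return False
    return croniter.match(crontab, now)

# custom_channel_matches("custom", "0 */6 * * *", datetime(2024, 1, 1, 12, 0)) -> True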

Condense list of error users by removing inactive users

I currently publicly list error users as a way of making them visible to myself and other users who may then go on to let them know they have an error.

However, a significant portion of these users are inactive, estimated based on their account history - e.g. an inactive SCP user might have a sandbox page, a forum thread about it, their notification config, and no activity since those pages were created. There's not much point listing these users as they're unlikely to ever notice that they have an error and it makes active users in the list more difficult to notice.

I should remove these users from the list. I could maybe list them separately in a collapsible.

From the site side, this is easy - filter the first list for those lacking a tag that indicates the user is inactive, and filter for that tag's presence in the second list.

From the backend side, adding that tag is easy - I already do this in some situations.

Programmatically determining exactly when a user is considered inactive may take some nuance. I might consider conditions like:

  • Number of waiting notifications > 0
    • Difficulty: trivial
    • This metric might be a red herring. After #59 nothing is evaluated for a user who doesn't have any waiting notifications, so they won't even be marked as an error user (other than users marked before then).
  • Recent forum activity
    • Difficulty: medium - no current implementation, but I have the data already
    • Note that I would still have this data even after #65 as I need to track forum posts created by subscribed users regardless
    • I don't think this is fair - an inactive user wanting notifications from old posts is a perfectly valid use case.
      • Note to self: this change doesn't prevent that user from getting notifications, just reduces the visibility of their error report. It's not that big a deal
  • Some metric for 'has ever been regularly active'
    • E.g. a broad filter could be 'has ever received 5 replies to their posts/threads'.

Current best filter I can think of:

  • has made 5 posts OR has received 5 replies OR subscribed < 3mo ago OR last posted < 1mo ago
  • That filters out users who have been active at some point (either by making lots of posts or making a few well-received posts), while not filtering out new users who only became active in the last 3 months, and not filtering out users who are provably active right now
  • Account creation date doesn't matter, history starts from when they subscribed to notifier
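
A sketch of that filter as a predicate (the parameter names are illustrative, not the actual schema):

from datetime import datetime, timedelta

def is_active(posts_made: int, replies_received: int,
              subscribed_at: datetime, last_posted_at: datetime,
              now: datetime) -> bool:
    # Thresholds are the ones proposed above; the function itself is illustrative.
    return (
        posts_made >= 5
        or replies_received >= 5
        or subscribed_at > now - timedelta(days=90)
        or last_posted_at > now - timedelta(days=30)
    )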

Record end timestamp after fail

If something fails during the notifications process, before the lambda terminates, an end timestamp should be recorded. This will prevent the duration graph from appearing more successful than it is.

Show users unnecessary manual subscriptions

Some users subscribe to threads that they don't need to, as they're already covered by the automatic subscription rules. It would be nice to show them a list of such subscriptions such that they can remove them if they choose.

Find alternative to IPv4 addresses on the Lambda ENIs

Amazon have recently announced that they will soon be charging for all IPv4 addresses, rather than just those reserved but unassigned. I currently use two such addresses, which are bound to the Lambda ENIs to give Lambda executions internet access. These will soon be charged the same as unused addresses, which will significantly increase my hosting costs.

Off the top of my head, one alternative could be to use the database server as a NAT gateway - but this seems like a pretty bad idea - keeping the database on a private server with no public internet access feels like the right call. I'll have to see if there are any security implications, or if it's possible to do it safely.

Amazon recommends adopting IPv6; I don't know how feasible this is or what the ramifications are.

Remove 'automatic subscriptions' key phrase

I don't like 'automatic subscriptions' as a key phrase that generically covers all the ways a user might be automatically subscribed to something. I don't think it even makes sense; subscription implies an action taken to get to that point, so it should be synonymous with 'manual subscription', but here we are.

All places that mention 'automatic subscription' should be replaced with a list of what that actually means. It's like 2 things right now, more soon, and it'd be good to be as explicit as possible at all times.

Requires:

  • Changes to documentation
  • Changes to lexicon

Fetch notifiable user list before iterating users

Currently, the notifier filters users by the selected channel and then iterates one-by-one to notify all of them.

However, on any given run, the vast majority of users do not have any notifications waiting for them. Iterating them one-by-one, in this case, is exceptionally stupid.

Instead, the database should first be asked for a list of users that have notifications waiting for them. Only this list of users should then be iterated.


Requirements:

  • The query should not be slow. It can take probably about 100x the duration of the query for a single user and still be faster overall, but a faster query will have greater benefits.
  • The query must return all users that have notifications waiting, otherwise notifications will be missed.
    • 'Notifications waiting' is defined as 'if a notification digest were to be compiled for this user right now, it would contain at least one notification'
  • The query should not return users that do not have notifications waiting. It should return as few false positives as possible, but the only downside is time wasted on these users.

Counterpoint: this might not be possible, because the bottom bounding timestamp for the post search for each user varies between users - it's the timestamp at which they were last notified. There's not a catch-all query that could answer this question for all users regardless of timestamp.

Countercounterpoint: a given user's last notification timestamp is recorded in the database, and I can just use that. This will possibly need a subquery to retrieve that data for immediate use, and that'll be real slow, but I don't care, because it's competing against running main query 400 times in a row and that's not a very high bar.
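
A sketch of what that query might look like (the user_config and user_last_notified tables are mentioned in another issue below; the columns and the subscription conditions here are simplified stand-ins, not the real schema):

# Illustrative query, not the real schema.
NOTIFIABLE_USERS_QUERY = """
SELECT user_config.user_id
FROM user_config
JOIN user_last_notified
  ON user_last_notified.user_id = user_config.user_id
WHERE user_config.frequency = %(frequency)s
  AND EXISTS (
    SELECT NULL FROM post
    WHERE post.posted_timestamp > user_last_notified.notified_timestamp
    -- ...plus the conditions that make a post notifiable for this user
  )
"""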


Manual subscriptions are going to be difficult.

I don't need to worry about manual unsubscriptions, because it doesn't matter if there are some users in the resulting list who shouldn't be. I do need to worry about manual subscriptions, because the list must contain everyone that it needs to.

Let users request a list of their automatic subscriptions

The conditions for what constitutes an automatic subscription are complicated enough, but if I'm going to add Wikidot authoring and Crom as additional factors, it'll be almost too complex to document.

I can ease the user experience by letting users request a list of exactly what they're subscribed to.

Requesting a list should be entirely automated. One possible way for the user to communicate their intent would be to add a checkbox to their config. When notifier processes the request, it could edit the page and uncheck it.

The list should contain information about the post or thread AND the reason that it is autosubscribed. The list might also contain autosubscribed posts that have since been unsubscribed from.

I don't think there's much point batching requests until e.g. the daily channel. They should be processed in the next run. This would also make the feature appear responsive and would let users get and action the info faster.

There might be some benefit to staggering how many requests can be processed in a single run, as a way to protect against an attack where many requests are made simultaneously. In that case, requests should be processed in page edit order, stalest first.

Store activation timestamps one-by-one

Currently I log all activation timestamps to the database in a single query at the end of the process.

However, if the process overruns the time limit and is cancelled, this logging does not get a chance to happen even if most of the other steps did. There is therefore no stats-based evidence that the service ran at all.

By logging activation timestamps as soon as they're recorded, the activation that follows a failed one can upload the failed activation's logs as well.

Track thread page for each post to optimise deletion

Deletion currently runs on an entirely per-post basis and isn't aware of anything else. If it scans a remote thread page and observes that a post seen there, other than its target post, is in its check queue, it can optimise by marking that post as not deleted, which prevents it from checking that thread page another time. But that only happens coincidentally, and it can't select posts to check from the database in a way that optimises how often that case happens.

By tracking thread page for each post, the posts that are selected to check for deletion can be optimised to minimise the number of unique pages that will be checked. This will enable the above optimisation to have as much impact as possible; additionally, it will be able to guarantee that a post not seen in that thread page was actually deleted (which the optimisation doesn't account for, needing to check for that post individually every time).

However, thread page isn't immutable. If a post earlier in the thread is deleted, posts after it in the thread move up a slot; there are 12 slots per page so necessarily 1/12 of the remaining top-level posts will change page number. After #85 we do not track posts that are not notifiable so this deletion will most likely be missed on our end. The consequence of that will be that every time an untracked post is deleted, 1/12 of the tracked top-level posts in that thread would be erroneously marked as deleted on our end.

What's the solution for that?

Store last notification time with user config

Tables user_config and user_last_notified track user configs and last notification date for users respectively. This information is gathered together into a unified object for the Python side by get_user_configs_for_frequency.sql which left joins the two tables. This process takes 1-2 mins on my local machine, indicating that it is a significant time sink.

The reason the two tables are separate is because the user_config table is wiped with each run and redownloaded from Wikidot. This fact also makes adding indexes to speed up the process useless, because rewriting the data is guaranteed.

A better approach would be:

  1. Track last notification date in the same table as the rest of the user info.
  2. Instead of wiping the data, selectively update/add/remove rows as needed according to the new data.

As only a tiny minority of users will update their config between each run, this should make both writes and reads vastly more efficient.
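
A sketch of step 2, using MySQL's INSERT ... ON DUPLICATE KEY UPDATE (the column names are illustrative):

# Illustrative upsert; the real columns will differ.
UPSERT_USER_CONFIG = """
INSERT INTO user_config (user_id, frequency, language, delivery_method)
VALUES (%(user_id)s, %(frequency)s, %(language)s, %(delivery_method)s)
ON DUPLICATE KEY UPDATE
  frequency = VALUES(frequency),
  language = VALUES(language),
  delivery_method = VALUES(delivery_method)
"""
# Rows for users absent from the newly-downloaded data would then be
# deleted separately, leaving each remaining user's last-notified date intact.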

Fix wiki secure config automatically

Whether a wiki is http or https is set as a bool flag on the remote config. When a wiki changes whether or not it's secure (which is rare, but happens), this is breaking, because Wikidot will only accept requests with the correct protocol. The fix is that I have to go to the config and switch the flag, but only after having observed the error at least once.

Instead, the notifier account could do this edit for me when it hits such an error.

Set up metrics for database server

I currently don't have any metrics set up for the database server after #57, beyond what AWS provides by default. The default metrics don't touch the EBS volume - it will at some point run out of storage. I will need to have metrics set up before that happens.

The EBS volume is 8GB. Last I checked, the database was about 300MB? I think? It should be a while before it starts hitting a limit, but I want to be on top of things by the time that happens.

Potential optimisation for hourly channel

Currently I treat the hourly channel the same as all others, by asking the database if there are any users subscribed to that channel with waiting notifications. This results in a task being performed for each user on that channel.

I could reduce that calculation by recording a list of usernames that notifier saw during new post collection, and only asking the database about those users.
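
A sketch of that bookkeeping (the names are illustrative, and the real notifiability conditions are broader than the two shown):

# Usernames that could plausibly have new notifications this run.
seen_usernames = set()

def record_new_post(thread_starter: str, parent_author: str) -> None:
    # Collected while downloading new posts; illustrative fields only.
    seen_usernames.add(thread_starter)
    seen_usernames.add(parent_author)

# For the hourly channel, the database would then be asked only about
# users whose names are in seen_usernames, e.g.:
# users = get_users_with_waiting_notifications("hourly", restrict_to=seen_usernames)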

Stats frontend: indicate if current user has errors

Requires #46

In the status frontend, the UI could attempt to determine whether the current user has configuration errors, as another medium by which the user might discover the error.

This will be a lot easier said than done. It is possible but nontrivial to load the current user's username into an iframe, but the SCP Wiki forbids iframes on its site from transmitting information about the user, so using that value as part of a request is a non-starter.

Here's a couple of ideas:

  • A list of users with errors is part of the public log, which is downloaded into the frontend. If the current user is in that list, show the value.
    • It will not be trivial to add this data to the log, as it would be an entirely new field. I don't know if I want this data to be stored in the database even.
    • I suppose this field could just be a second public log, and would get uploaded as its own file during the maintenance stage.
    • Probably the minimal amount of information that needs to be downloaded.
  • http://notifications.wikidot.com/redirect-to-your-config exists to automatically send users to their own config page. The frontend could fetch that URL, follow redirects, and use the results to determine whether the user has errors.
    • This does not technically involve transmitting information about the user, although it does involve downloading information about them.
    • I have no idea whether this redirect actually works when initiated this way.
    • An HTTP request probably doesn't work in an iframe loaded on an HTTPS site.
    • Uses more data than might be necessary (it will load at least 2 entire pages behind the scenes).

Display a reason for each notification

#85 added a set of flags that indicate which context/contexts is/are responsible for a notification being raised:

-- Flags indicating the reasons that a post emits a notification
CASE WHEN (
    thread_sub.sub = 1
) THEN 1 ELSE 0 END AS flag_user_subscribed_to_thread,
CASE WHEN (
    post_sub.sub = 1
) THEN 1 ELSE 0 END AS flag_user_subscribed_to_post,
CASE WHEN (
    context_thread.first_post_author_user_id = %(user_id)s
) THEN 1 ELSE 0 END AS flag_user_started_thread,
CASE WHEN (
    context_parent_post.author_user_id = %(user_id)s
) THEN 1 ELSE 0 END AS flag_user_posted_parent
These should be used in the digest to justify each notification's existence, which prompts the user to deduplicate contexts and also lets them know exactly how to e.g. unsubscribe from a given notification specifically. This will become increasingly important as new contexts are added (#77, #86, etc).
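
A sketch of how the digest might use them (the flag names come from the query above; the phrasing and function are illustrative):

# Map each flag to a human-readable reason for the digest.
REASON_LABELS = {
    "flag_user_subscribed_to_thread": "you are subscribed to this thread",
    "flag_user_subscribed_to_post": "you are subscribed to this post",
    "flag_user_started_thread": "you started this thread",
    "flag_user_posted_parent": "this is a reply to your post",
}

def notification_reasons(row: dict) -> list:
    return [label for flag, label in REASON_LABELS.items() if row.get(flag) == 1]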

Subscription confirmation

A user should receive a subscription confirmation on the next hourly channel after their initial sign up.

Documentation on the site should lead the user to expect to receive this confirmation that their subscription is set up correctly.

I'll need to research how best to implement this, as it's crucial I only ever send each user one confirmation, no matter what. This is absolutely a solved problem. That being said, instinct tells me I should track a positive value when I add a new user and then remove it when that user is confirmed. This makes the 'default case' a user who is already confirmed, so should hopefully prevent me from ever accidentally sending a confirmation - because to send one, I need to explicitly opt in.
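
A sketch of that opt-in pattern (the schema is illustrative): a new user gets an explicit pending flag, and the absence of the flag means "already confirmed", so nothing is ever re-sent by default.

# Illustrative schema; a row exists only while a confirmation is owed.
CLEAR_PENDING = """
DELETE FROM pending_confirmation WHERE user_id = %(user_id)s
"""

def confirm_new_users(cursor, send_confirmation) -> None:
    cursor.execute("SELECT user_id FROM pending_confirmation")
    for (user_id,) in cursor.fetchall():
        # Clear before sending: at-most-once delivery, matching the
        # requirement to never send a duplicate confirmation.
        cursor.execute(CLEAR_PENDING, {"user_id": user_id})
        send_confirmation(user_id)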

Hourly channel not recorded after monthly channel

The monthly channel ran extremely quickly (100ms). The hourly channel took the normal amount of time after that. The channel timestamp log, as depicted in the graph, only shows the monthly channel, and shows it as taking the time that both channels combined took. This doesn't happen for any of the other non-hourly channels.

The hourly channel timestamps were not recorded - they're not in the logged outbound S3 storage request, and they're not in the inbound CloudFront JSON response - only the monthly. The frontend naively interprets the gap between monthly and the start of cleanup as belonging to the monthly channel.

Might be worth checking if the hourly channel was recorded to the database. I should also start logging timestamp record events so I don't have to do that in future.

Impact: almost none, slight inaccuracy in graph.

Crom support for Backrooms Branches

Since Crom support for Backrooms branches has left beta and entered its stable phase, the Crom field for the "backrooms-wiki" wiki on the Supported Wikis page should be ticked, and Crom should be used on that wiki.
