Current Logic
- When the user goes to the homepage, pull all the user's flashcard sets.
- The flashcard sets are pulled in a paginated way (3 at a time).
- The sync is purely additive as update/delete were initially mega broken.
User Problems
- This process is really slow with large flashcard sets, as they can be massive going over the wire (imagine 1,000 flashcards in a single flashcard set). Each cycle of pulling 3 flashcard sets, writing them into the local DB, and redrawing the UI can take up to 5 seconds. The query is also expensive on the server side as we need to do a join.
- Users are complaining about their flashcard changes being overwritten. This happens to existing users (users using the same device over and over again): they sign into the app, go into a flashcard set, and make a change, and then the server-side fetch completes too late and overwrites their changes.
- People are noticing that the syncing of content across several devices is janky in general, since it's really slow and the update task doesn't continue in the background if the app is closed too early.
Lay of the Land
The vast majority of users will never need to read, only write. They have the app on a single device and make updates to their content locally, which are then pushed to the server (so for them, the client is always ahead of the server). We need to make sure that for these users, we minimize bugs and load on their device. And of course, while doing this, the users who do have multiple devices or end up migrating from one device to another (e.g. from an old phone) should have their content stay in sync.
Proposed Solution
Decouple flashcard sets from flashcards
The problem is that a single flashcard set can be too heavy for the system, but you don't need the actual flashcards up front (e.g. when rendering the homepage). When refreshing the homepage, we should only pull the data needed to keep the flashcard set models up to date: updating counts and learned percentage, adding new flashcard sets, and removing deleted flashcard sets. This will be tricky, as the client currently infers each set's count from the actual number of flashcards in the system. The client will need to introduce a new field like "superficial_flashcards_count" (passed down by the server) that it renders until it gets the full list of flashcards and can then render the actual count. Overall, this means we can probably get rid of pagination, given that the flashcard set model is extremely light (just a few fields) and most users don't even have 10 flashcard sets.
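As a rough illustration of the count fallback described above, here is a minimal Python sketch (the real client would use its own language and models). The class name, `local_flashcards_count`, and `displayed_count` are hypothetical names for illustration; only the "superficial" fields come from this doc.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FlashcardSetSummary:
    """Lightweight homepage model: no flashcards attached."""
    set_id: str
    name: str
    superficial_flashcards_count: int    # passed down by the server
    superficial_learned_percent: float   # passed down by the server
    # Hypothetical field: None until the full flashcard list has been fetched.
    local_flashcards_count: Optional[int] = None

    def displayed_count(self) -> int:
        # Prefer the actual count once the flashcards are present locally.
        if self.local_flashcards_count is not None:
            return self.local_flashcards_count
        return self.superficial_flashcards_count


summary = FlashcardSetSummary("set-1", "Biology 101", 1000, 42.0)
before_fetch = summary.displayed_count()   # server-provided count
summary.local_flashcards_count = 998       # full list has now arrived
after_fetch = summary.displayed_count()    # actual local count
```

The homepage only ever needs this summary object, which is why dropping pagination looks feasible.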
Fetch the flashcards on demand
We should refresh the flashcards only when we need them. It is very possible that a user makes a flashcard set and then never uses it again, especially given that studying is seasonal. Luckily, the flashcard set landing page creates a gap between the user clicking into a flashcard set and actually interacting with their flashcards. We should kick off the API call to update the flashcards at that stage. The only real user experience loss is on a new device: the user may see a blank screen if we "lose the race" and are still fetching their flashcards.
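The "race" can be sketched as follows, assuming an async client. This is only an illustration in Python: `fetch_flashcards`, `FlashcardSetLandingPage`, and `open_browse` are made-up names, and the sleep stands in for the network call. It models the new-device case, where nothing is cached locally.

```python
import asyncio


async def fetch_flashcards(set_id: str) -> list[str]:
    # Stand-in for the real API call; the sleep simulates network latency.
    await asyncio.sleep(0.01)
    return [f"{set_id}-card-{i}" for i in range(3)]


class FlashcardSetLandingPage:
    def __init__(self, set_id: str) -> None:
        # Kick off the fetch as soon as the landing page opens.
        self.fetch_task = asyncio.create_task(fetch_flashcards(set_id))

    def lost_race(self) -> bool:
        # True if the user moved on before the fetch finished.
        return not self.fetch_task.done()

    async def open_browse(self) -> list[str]:
        if self.lost_race():
            print("loading...")  # placeholder for a loading state
        return await self.fetch_task


async def demo() -> tuple[bool, list[str]]:
    page = FlashcardSetLandingPage("set-1")
    lost = page.lost_race()           # user taps through immediately
    cards = await page.open_browse()  # waits for the in-flight fetch
    return lost, cards


lost, cards = asyncio.run(demo())
```

Starting the task in the landing page constructor means time the user spends on that page counts toward the fetch, so most users should never reach the loading branch.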
Only update what's needed
Right now, the app refreshes everything all the time, which is wasteful. Users who never or rarely update their flashcard sets in particular should be rewarded for not putting much strain on our system. We should introduce the concept of a last updated time on each relevant model, so we only pull what's needed.
Flashcard Sets Update (Library)
- The client will have a timestamp of when it updated all of its flashcard set models. It will be initialized at 0 (maximum time in the past).
- When fetching flashcard sets (i.e. when rendering the homepage), the client will pass this time and the server will only pass back the flashcard set objects that changed since that time.
- The server will have to keep track of last updated time for each flashcard set. This means that on any endpoint updating the flashcard set (like renaming) or flashcard count (add/delete flashcard endpoints), the server will update this new "time_last_updated" field for the relevant flashcard set.
- When the client updates a flashcard set locally, we will overwrite the local timestamp once the ACK for the update arrives. This way, the single device users aren't having the client unnecessarily "catch up" to the server.
- We will need to get rid of hard delete and have a concept of update type from the server ("added", "updated", "deleted").
- When a flashcard set is deleted, mark it as deleted and hard delete all the flashcards within it.
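The server side of the bullets above can be sketched roughly as follows, using an in-memory store in Python in place of the real database. Everything except "time_last_updated" and the three change types is an assumption for illustration. In particular, `time_created` is a hypothetical column used to tell "added" from "updated", and an integer logical clock stands in for server time.

```python
from dataclasses import dataclass


@dataclass
class FlashcardSetRow:
    set_id: str
    name: str
    time_created: int         # hypothetical: distinguishes "added" from "updated"
    time_last_updated: int
    is_deleted: bool = False  # soft-delete flag instead of removing the row


class FlashcardSetStore:
    def __init__(self) -> None:
        self.rows: dict[str, FlashcardSetRow] = {}
        self.clock = 0  # logical clock standing in for server time

    def _tick(self) -> int:
        self.clock += 1
        return self.clock

    def add(self, set_id: str, name: str) -> None:
        now = self._tick()
        self.rows[set_id] = FlashcardSetRow(set_id, name, now, now)

    def rename(self, set_id: str, name: str) -> None:
        row = self.rows[set_id]
        row.name = name
        row.time_last_updated = self._tick()

    def soft_delete(self, set_id: str) -> None:
        row = self.rows[set_id]
        row.is_deleted = True
        row.time_last_updated = self._tick()

    def changes_since(self, since: int) -> list[dict]:
        changes = []
        for row in self.rows.values():
            if row.time_last_updated <= since:
                continue  # client already has this version
            if row.is_deleted:
                change_type = "deleted"
            elif row.time_created > since:
                change_type = "added"
            else:
                change_type = "updated"
            changes.append({"set_id": row.set_id, "name": row.name,
                            "change_type": change_type})
        return changes


store = FlashcardSetStore()
store.add("a", "Biology")            # t=1
store.add("b", "Chemistry")          # t=2
first_sync = store.changes_since(0)  # fresh client starts at 0
store.rename("a", "Biology 101")     # t=3
store.soft_delete("b")               # t=4
store.add("c", "Physics")            # t=5
second_sync = store.changes_since(2)
```

In this sketch the first sync reports both sets as "added", and the second reports one "updated", one "deleted", and one "added". A client that is already caught up gets an empty list back, which is the cheap path we want for single-device users.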
Flashcards Update (Flashcard Set)
- Similarly, we will need to do the same for flashcards. The client will need to remember when it last updated all the flashcards within a set and send that to the server when updating the flashcard pool.
- When the client finishes a flashcard sync, update the local timestamp.
- When the client updates a flashcard locally and gets the ACK, update the local timestamp.
- The server will need to know when each flashcard was last updated within the database (new column). It will be updated whenever the flashcard changes (term update, definition update, delete).
- This way, when the client asks the server for any updated flashcards, the server only passes down what changed. This will have huge gains for large flashcard sets, as you won't need to rewrite 1,000 objects in the client database every single time the user opens their favorite giant flashcard set.
- We will need to get rid of hard delete and have a concept of update type from the server ("added", "updated", "deleted").
- When a flashcard is deleted, mark it as deleted.
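On the client, applying a flashcard delta could look something like the sketch below (Python for illustration only). The payload shape is an assumption: `card_id` and a per-change `time_last_updated` are hypothetical field names, and the local database is modeled as a plain dict.

```python
def apply_flashcard_changes(local_cards: dict, last_synced: int, changes: list) -> int:
    """Apply a server delta to the local store and return the new local timestamp."""
    for change in changes:
        card_id = change["card_id"]
        if change["change_type"] == "deleted":
            # The card may never have reached this device, so ignore misses.
            local_cards.pop(card_id, None)
        else:
            # "added" and "updated" are both handled as an upsert.
            local_cards[card_id] = {
                "term": change["term"],
                "definition": change["definition"],
            }
        last_synced = max(last_synced, change["time_last_updated"])
    return last_synced


local_cards = {
    "c1": {"term": "cell", "definition": "unit of life"},
    "c2": {"term": "gene", "definition": "unit of heredity"},
}
changes = [
    {"card_id": "c1", "change_type": "updated", "term": "cell",
     "definition": "basic unit of life", "time_last_updated": 7},
    {"card_id": "c2", "change_type": "deleted", "time_last_updated": 8},
    {"card_id": "c3", "change_type": "added", "term": "atom",
     "definition": "smallest unit of an element", "time_last_updated": 9},
    {"card_id": "c9", "change_type": "deleted", "time_last_updated": 10},
]
new_timestamp = apply_flashcard_changes(local_cards, 5, changes)
```

Only the changed rows are touched, so opening a 1,000-card set with three changes rewrites three objects instead of all of them. Whether the new timestamp is derived from the changes (as here) or returned explicitly by the server is an open design choice.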
Client Action Items
- Add new "superficial_flashcards_count" field to flashcard set model. Render it on homepage and flashcard set landing page.
- Add new "superficial_learned_percent" field to flashcard set model. Render it on homepage and flashcard set landing page.
- Add new timestamp for last library (flashcard sets) update. Store in SharedPreferences. Update on any flashcard set change (add, delete, field change).
- Add new timestamp for flashcards update on each flashcard set. Put on flashcard set model. Update on any flashcard update alongside the mass learn/unlearn.
- Integrate new flashcard sets fetching API on homepage.
- Integrate new flashcards fetching on flashcard set landing page.
- [P1] Loading state on browse and add/edit page if we lose the race for a large flashcard set on a new device.
Server Action Items
- Change flashcard set deletion endpoint to be soft delete (but hard delete flashcards still). This is an existing endpoint change. We will also need to change the current flashcard set fetching endpoint so it filters out deleted flashcard sets (otherwise they will keep coming back!).
- Change flashcard deletion endpoint to be soft delete (update existing endpoint).
- Add timestamp of last update to flashcard set model. It will need to be updated on any flashcard set field change alongside add/delete flashcard (these are all existing endpoints).
- Add timestamp of last update to flashcard model. It will need to be updated when it's learned/unlearned, has its term changed, has its definition changed, is added, or is deleted (these are all existing endpoints).
- New endpoint (non-paginated to start?) to fetch flashcard sets changed since a timestamp. We will need a "change_type" field to let the client know if it's "added", "updated", or "deleted". On top of the flashcard set fields (like name), we will need "flashcards_count" and "learned_percent" fields, since the client needs them to render the homepage UI.
- New endpoint (non-paginated to start) to fetch flashcards changed since a timestamp. We will need a "change_type" field to let the client know if it's "added", "updated", or "deleted".
NOTE: We should do all this for folders eventually, but it's not a big deal as folder usage is very, very low.