open-spaced-repetition / fsrs-rs
FSRS for Rust, including Optimizer and Scheduler
Home Page: https://crates.io/crates/fsrs
License: BSD 3-Clause "New" or "Revised" License
There are two methods that can be used to search for the optimal retention for users.
Simulator-based: https://huggingface.co/spaces/open-spaced-repetition/fsrs4anki_simulator
SSP-MMC: https://github.com/open-spaced-repetition/fsrs-optimizer/blob/95694b787bb71ac9883db1201af09e334ee5ee0b/src/fsrs_optimizer/fsrs_optimizer.py#L739-L843
@L-M-Sherlock I have a question about FSRSDataset before I refactor it.
Currently there are ::train() and ::test() methods that return the same dataset. Is there a plan to make them return separate data in the future? Or would one method be enough?
Likewise with dataloader_train/_test - will they differ in the future? Or should they always have the same source data?
The outlier filter has already been applied to the benchmark dataset, but the current FSRS-rs applies the outlier filter inside the API as well. So the outlier filter is applied twice, which is unintended. @dae, could you help me add an option to turn off the outlier filter when benchmarking?
@L-M-Sherlock, you mentioned that you plan to implement the online learning version of the FSRS optimizer in Anki. (open-spaced-repetition/fsrs-vs-sm15#36 (comment))
But does this version support online learning? I am asking because I didn't see any mention of experience replay in this repo.
The python implementation: open-spaced-repetition/fsrs-optimizer#61
In Anki 23.10 beta 3, for cards with incomplete review history, Anki is using the last revlog entry to calculate memory states from SM-2 ivl and ease. However, I recommend using the first revlog entry. The reason is simple: The earlier we can calculate the memory states, the more accurate the results will be.
I described the algorithm for approaching such cards here: #63 (comment)
I implemented the same thing in the Helper add-on (with help from @L-M-Sherlock) in this PR: open-spaced-repetition/fsrs4anki-helper#248
Screenshots:
Python implementation:
It is useful to control the batch sampler. Sometimes we need to sort the time-series samples by their creation time. Sometimes we can collect samples with the same seq_len to reduce padding and speed up training.
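For illustration, a minimal length-bucketing sampler could look like the sketch below. The `samples` structure and the bucketing policy are assumptions for the example, not the crate's actual dataset API:

```python
from collections import defaultdict

def batches_by_seq_len(samples):
    # `samples` is a list of (created_time, sequence) pairs -- an
    # illustrative stand-in for real dataset items, not the crate's types.
    # Sort chronologically first (useful for time-series data), then group
    # indices by sequence length so each batch needs no padding.
    order = sorted(range(len(samples)), key=lambda i: samples[i][0])
    buckets = defaultdict(list)
    for i in order:
        buckets[len(samples[i][1])].append(i)
    return list(buckets.values())
```

Each returned bucket can then be split into fixed-size batches whose tensors all share one shape.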
The weights generated by Anki are very different from those generated by the python optimizer.
This issue is not caused by the bug where Anki calculates weights for all the reviews (not respecting the preset).
Originally posted by @user1823 in ankitects/anki#2443 (comment)
@L-M-Sherlock @asukaminato0721 I'm a bit confused by part of the revlog importing:
This line adds one to all revlog types. Since the minimum type is 0 before this, the minimum type afterwards will be 1:
This line looks like it will never match, since review_kind will never be 0:
Is there something wrong here? Maybe it would be clearer without adding 1 to everything?
It would require implementing an inference module.
Python implementation:
https://github.com/open-spaced-repetition/fsrs-optimizer/blob/95694b787bb71ac9883db1201af09e334ee5ee0b/src/fsrs_optimizer/fsrs_optimizer.py#L845-L949
I'm starting to learn Rust, and I'd like to contribute to the project as soon as possible.
What are the initial motivations for the project? Where do you intend to integrate it?
Just a note for anyone interested.
warning: the following packages contain code that will be rejected by a future version of Rust: nom v1.2.4
❯ cargo tree -i [email protected]
nom v1.2.4
└── meval v0.2.0
└── textplots v0.8.0
└── burn-train v0.9.0 (https://github.com/burn-rs/burn.git?rev=a4a9844da3c906e3d4795a6c72ae8c099b7187bb#a4a9844d)
├── burn v0.9.0 (https://github.com/burn-rs/burn.git?rev=a4a9844da3c906e3d4795a6c72ae8c099b7187bb#a4a9844d)
│ └── fsrs-optimizer-rs v0.1.0 (/home/w/gitproject/fsrs-optimizer-burn)
└── fsrs-optimizer-rs v0.1.0 (/home/w/gitproject/fsrs-optimizer-burn)
not a big deal though.
This version doesn't use splits, so instead I propose a different approach. Obtain 2 different sets of initial parameters by running this version on 70+ user collections, taking the number of reviews and the optimal parameters, and calculating optimal parameters weighted by n_reviews and by ln(n_reviews). Since the weighting is different, those 2 sets will also be different.
Then use these 2 sets of parameters to run the optimizer twice. First, using one set (say, the one that was obtained by weighting by n_reviews) of initial parameters, then using the other set. Then average the result.
The key idea is that instead of using one starting point in the space of all possible configurations of parameters, we use two starting points. This could improve convergence without using splits.
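The weighting step could be sketched as follows. This is a hypothetical helper for the proposal above; the function name and data layout are illustrative:

```python
import math

def weighted_mean_params(param_sets, n_reviews, log_weight=False):
    # Average per-collection optimal parameters, weighted either by the raw
    # review count (n_reviews) or by its natural log (ln(n_reviews)).
    weights = [math.log(n) if log_weight else float(n) for n in n_reviews]
    total = sum(weights)
    dim = len(param_sets[0])
    return [
        sum(w * p[i] for w, p in zip(weights, param_sets)) / total
        for i in range(dim)
    ]
```

Running it once with `log_weight=False` and once with `log_weight=True` yields the two distinct starting points described above.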
Preparing the sets of initial parameters:
Deployment:
[2023-08-24T07:47:59.094608+00:00 - ERROR - /Users/jarrettye/.cargo/git/checkouts/burn-acfbee6a141c1b41/8808ee2/burn-train/src/learner/log.rs:82] PANIC => panicked at 'ndarray: could not broadcast array from shape: [2] to: [1]', /Users/jarrettye/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ndarray-0.15.6/src/lib.rs:1529:13
Keep track:
pub struct FSRSReview {
    pub reviews: Vec<Review>,
    pub delta_t: f32,
    pub label: f32,
}
struct Review {
    pub rating: i32,
    pub delta_time: i32,
}
Originally posted by @L-M-Sherlock in #3 (comment)
Edit: new PR is here: ankitects/anki#2633
In the current beta, the bounds for desired retention are 0.8 and 0.97. However, it seems that in optimal_retention.rs, they are 0.75 and 0.95 instead.
Unrelated to this issue, but I noticed that changing the number of days to simulate in the beta has such a big impact on optimal retention that just by changing that setting alone, I can make the simulator output any number between 0.8 and 0.95. This makes me question whether the simulator is accurate.
@L-M-Sherlock I presume this will be licensed as MIT like your other repos; if so, please add this line to the Cargo.toml file, and add the LICENSE file to the repo:
diff --git a/Cargo.toml b/Cargo.toml
index c9cd01d..6d8e711 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -2,6 +2,7 @@
name = "fsrs-optimizer-rs"
version = "0.1.0"
edition = "2021"
+license = "MIT"
PyTorch API:
Python implementation:
import math

def cosine_annealing_lr(lr, step_count, T_max, eta_min=0):
    lr = eta_min + (lr - eta_min) * (1 + math.cos(math.pi * step_count / T_max)) / (1 + math.cos(math.pi * (step_count - 1) / T_max))
    return lr
Reference: https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.CosineAnnealingLR.html
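To see the schedule in action, here is a sketch that steps the recursive form above from an initial lr of 1.0 down to eta_min over T_max steps:

```python
import math

def cosine_annealing_lr(lr, step_count, T_max, eta_min=0):
    # Same recursive schedule as above: each call advances one step.
    return eta_min + (lr - eta_min) * (1 + math.cos(math.pi * step_count / T_max)) / (
        1 + math.cos(math.pi * (step_count - 1) / T_max)
    )

T_max = 10
lr = 1.0
history = []
for step in range(1, T_max + 1):
    lr = cosine_annealing_lr(lr, step, T_max)
    history.append(lr)
# history decays monotonically from ~0.98 toward eta_min (0 here),
# reaching it exactly at step T_max.
```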
FSRS doesn't accept interval < 1, so we need to filter out those review logs with the same date and only keep the first one.
When "reschedule cards based on my answers in this deck" is disabled, these revlogs should be filtered out from the review history. These revlogs' review_kind is REVLOG_CRAM, so it's easy to filter them out.
There are two kinds of manual reset: Set Due Date and Forget. Their review_kind is REVLOG_RESCHED, but the former's ease (factor) is nonzero while the latter's ease (factor) is zero. These revlogs should be filtered out from the review history too, because I think they are equivalent to Ctrl+J (Reschedule repetition) in SuperMemo: https://supermemopedia.com/wiki/Ctrl%2BJ_vs._Ctrl%2BShift%2BR
We should remove the review logs before a Forget entry and the Forget entry itself. Then we need to treat the first review log after the Forget as the first learning step, whose rating will be used to initialize the memory state.
Due to legacy behavior, manual resets were not recorded in old versions of Anki/AnkiDroid, so we need to handle this case in another way. Normally, the revlog preceding a REVLOG_LRN shouldn't be REVLOG_REV or REVLOG_RELRN; if it is, we need to remove those revlogs. In a word, we should find the first REVLOG_LRN in the last successive run of REVLOG_LRNs.
Besides, some users may use "Transfer scheduling data from one card to another", which can generate even weirder review histories.
We need to deal with these case by case; a more convenient alternative is to skip such cards.
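To make the rules above concrete, here is a rough Python sketch of the filtering logic. The constant values and the tuple layout are illustrative assumptions, not Anki's actual schema:

```python
# Revlog kinds, mirroring Anki's constants (numeric values are assumptions).
LEARN, REVIEW, RELEARN, CRAM, RESCHED = 0, 1, 2, 3, 4

def filter_revlogs(entries):
    # entries: chronological list of (review_kind, ease) tuples.
    # - drop CRAM (filtered-deck) entries;
    # - RESCHED with ease == 0 is a manual Forget: drop it and everything
    #   before it, so the next entry becomes the first learning step;
    # - RESCHED with ease != 0 is Set Due Date: drop just that entry.
    out = []
    for kind, ease in entries:
        if kind == CRAM:
            continue
        if kind == RESCHED:
            if ease == 0:
                out = []  # restart the history after a Forget
            continue
        out.append((kind, ease))
    return out
```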
@user1823 and @Expertium, if I miss anything, please let me know.
Context: #95 (comment)
It seems that the previous conversation ended before the idea was evaluated well enough.
The current loss function is CrossEntropyLoss, which already includes a softmax layer; its input is expected to contain unnormalized logits for each class.
However, the FSRS model's output is the stability, which is used to calculate the retention at a certain delta_t. The retention is already a probability, so the softmax layer inside CrossEntropyLoss processes it in an unexpected way, and the weights can't be optimized as expected.
Reference:
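Since retention is already a probability, the natural fix is a binary cross-entropy computed directly on it, with no softmax involved. A minimal sketch:

```python
import math

def bce_loss(retention, label, eps=1e-7):
    # label: 1.0 = recalled, 0.0 = forgotten. `retention` is already a
    # probability, so no softmax is applied -- it is only clamped away
    # from 0 and 1 for numerical stability.
    p = min(max(retention, eps), 1.0 - eps)
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))
```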
Currently we require at least 1000 reviews to train. A very enthusiastic user could potentially do that in a few days if they introduced many new cards, but the review history would not result in useful weights. Perhaps we should also require a minimum time span of, say, 1 month between earliest and latest review?
It requires at least two features: calculating stability and outputting the interval.
To calculate stability, there are two methods:
Method one: pass the current stability and the latest review to update the stability.
Method two: pass the full review history of a card and calculate the stability from scratch.
To output an interval, it requires the stability and the requested retention.
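Assuming the power forgetting curve R(t, s) = (1 + t / (9 s))^-1 used elsewhere in this discussion, the interval for a requested retention can be obtained by inverting it:

```python
def next_interval(stability, request_retention):
    # Invert R(t, s) = (1 + t / (9 * s)) ** -1 for t:
    # t = 9 * s * (1 / R - 1)
    return 9.0 * stability * (1.0 / request_retention - 1.0)
```

Note that with this curve, a requested retention of 0.9 gives an interval equal to the stability itself.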
Running locally in the browser would mean no longer depending on a big company's server.
Supporting WASM would be nice; is it possible?
As I mentioned in #88 (comment), the current outlier filter is too aggressive (it removes too many reviews). This is fine for the pretrain function (because the pretrain method is fragile). However, the train function should have access to more reviews.
I agree that some sort of outlier filter is required for the trainset also. But, I think that the same outlier filter should not be used for both.
Originally posted by @user1823 in #119 (comment)
Taking the example of my collection (data: stability_for_pretrain.zip)
- For first_rating = 1, I think that it is reasonable to keep delta_t = 1 and delta_t = 2 (in my collection) in the trainset.
- For first_rating = 2, the data in my collection is insufficient, so I will not discuss it.
- For first_rating = 3, the initial stability is calculated as about 14. So, it doesn't make sense to ONLY keep the data with delta_t < 5 in the trainset.
- For first_rating = 4, there is no major problem in how the outlier filter is working with my data.
Originally posted by @user1823 in #119 (comment)
If a card is reviewed too early or too late, the first response will be unreliable.
So, what if we create another condition based on the ratio (or something similar) of delta_t to stability, and then remove the reviews of cards that fulfill BOTH conditions? (By both, I mean the ratio condition and the pretrain outlier condition.)
Note: This suggestion is for trainset. I recommend keeping the pretrain filter unchanged.
Originally posted by @user1823 in #119 (comment)
As we know, Anki sets a limit on the number of reviews required for optimizing FSRS:
According to my recent benchmark, only pre-training is significantly better than the default parameters:
So I think we could allow users to optimize their FSRS parameters even when they have fewer than 1000 reviews.
It's pretty easy, because pre-training only provides the first four weights of FSRS. The related code is:
Lines 260 to 263 in 51d38b4
We can pass an option to FSRS and let it skip the rest of the code, replace the first four weights in the default parameters, and return the final weights.
A threshold on the number of reviews is still necessary; 100~200 is fine. Then users will enjoy the benefit of personalized parameters sooner.
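A sketch of the proposed splice, using the crate's DEFAULT_WEIGHTS constant; the helper function itself is hypothetical, not the crate's API:

```python
# The crate's default parameters (quoted from its DEFAULT_WEIGHTS constant).
DEFAULT_WEIGHTS = [
    0.27, 0.74, 1.3, 5.52, 5.1, 1.02, 0.78, 0.06, 1.57, 0.14, 0.94, 2.16,
    0.06, 0.31, 1.34, 0.21, 2.69,
]

def pretrain_only_weights(pretrained_s0, defaults=DEFAULT_WEIGHTS):
    # Keep the four pretrained initial-stability weights and fall back to
    # the defaults for everything else.
    assert len(pretrained_s0) == 4
    return list(pretrained_s0) + list(defaults[4:])
```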
@dae, what do you think?
pub(crate) const DECAY: f64 = -0.5;
/// (9/10) ^ (1 / DECAY) - 1
pub(crate) const FACTOR: f64 = 19f64 / 81f64;
/// This is a slice for efficiency, but should always be 17 in length.
pub type Weights = [f32];
pub static DEFAULT_WEIGHTS: [f32; 17] = [
0.27, 0.74, 1.3, 5.52, 5.1, 1.02, 0.78, 0.06, 1.57, 0.14, 0.94, 2.16, 0.06, 0.31, 1.34, 0.21,
2.69,
];
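A quick sanity check of these constants, assuming the FSRS-4.5 power forgetting curve R(t, s) = (1 + FACTOR · t / s)^DECAY: FACTOR is chosen so that retrievability is exactly 90% when t equals the stability.

```python
# Constants as defined above.
DECAY = -0.5
FACTOR = 19.0 / 81.0  # = 0.9 ** (1 / DECAY) - 1

def retrievability(t, s):
    # R(t, s) = (1 + FACTOR * t / s) ** DECAY; R(s, s) == 0.9 by design.
    return (1.0 + FACTOR * t / s) ** DECAY
```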
Hopefully this will be fixed before 23.12 is released.
Very minor - what do you think about renaming this repo to fsrs-optimizer-rs? The crate name would remain the same, and GitHub should automatically redirect references to the old name.
I noticed that there exists the other repo: https://github.com/open-spaced-repetition/rs-fsrs
What's the difference?
If I understand things correctly, we may be able to remove the .reshape() calls after tracel-ai/burn#686 is merged.
Original issue: tracel-ai/burn#680
I'd love to integrate this work into my iOS projects; however, I'm unfamiliar with Rust and the mechanics of the FSRS optimizer. Is there documentation, or another project integrating this optimizer, that I can reference to understand how to use it? Thanks!
Apply a more efficient method to calculate the initial stability. Then pass the initial stability to the initialization of FSRS model, and freeze the first four parameters.
Python implementation:
Input:
delta_t = [1, 2, 3, 4, 5]
recall = [0.866842, 0.907582, 0.733485, 0.767769, 0.687690]
count = [435, 97, 63, 38, 28]
Note: this input is generated by:
S0_dataset = df[df['i'] == 2]
self.S0_dataset_group = S0_dataset.groupby(by=['r_history', 'delta_t'], group_keys=False).agg({'y': ['mean', 'count']}).reset_index()
We can create it from Vec<FSRSItem>.
Output:
stability = 1.0671915877802147
The output will minimize the loss:
import numpy as np

def power_forgetting_curve(t, s):
    return (1 + t / (9 * s)) ** -1

def loss(stability):
    # total_count, init_s0 and s0_size come from the surrounding optimizer
    # context (sum of counts, prior S0 estimate, and group size).
    y_pred = power_forgetting_curve(delta_t, stability)
    rmse = np.sqrt(np.sum((recall - y_pred) ** 2 * count) / total_count)
    l1 = np.abs(stability - init_s0) / np.sqrt(s0_size) / total_count
    return rmse + l1
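For example, a crude grid search over the RMSE term alone (dropping the L1 anchor, whose init_s0 and s0_size come from the optimizer context) already lands in the same ballpark as the reported stability:

```python
import math

# Grouped first-review data from the Input section above.
delta_t = [1, 2, 3, 4, 5]
recall = [0.866842, 0.907582, 0.733485, 0.767769, 0.687690]
count = [435, 97, 63, 38, 28]
total_count = sum(count)

def power_forgetting_curve(t, s):
    return (1 + t / (9 * s)) ** -1

def weighted_rmse(s):
    # RMSE term of the loss, weighted by the review count at each delta_t.
    se = sum((r - power_forgetting_curve(t, s)) ** 2 * c
             for t, r, c in zip(delta_t, recall, count))
    return math.sqrt(se / total_count)

# Coarse grid search over candidate stabilities.
candidates = [0.1 + 0.01 * i for i in range(1000)]
best_s = min(candidates, key=weighted_rmse)
```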
Python implementation:
Some weights could be calculated by other methods (not gradient descent). We should freeze them in the training process.
Reference:
Currently the generated weights from this crate are slightly behind the Python optimizer. Not urgent, but in the long run, it would be nice if this crate could perform as well.
https://github.com/open-spaced-repetition/fsrs-benchmark#weighted-by-number-of-reviews
Any thoughts on what might be driving the differences?
The current version doesn't support batch_size > 1, because Tensor::cat() requires all tensors to have the same shape, but in the current version the length of t_history varies.
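One common workaround is to right-pad the variable-length histories to a common length before stacking them into one batch. A plain-Python sketch:

```python
def pad_histories(t_histories, pad_value=0):
    # Right-pad every t_history to the longest length in the batch so the
    # padded sequences share one shape (the requirement Tensor::cat() has).
    max_len = max(len(h) for h in t_histories)
    return [list(h) + [pad_value] * (max_len - len(h)) for h in t_histories]
```

Combined with the seq_len bucketing idea discussed earlier, most of this padding can be avoided entirely.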
Python implementation:
The following code removes revlogs before the last continuous group of LEARN type ratings. So, it (partially) deals with cards that have been reset ("Forget").
https://github.com/ankitects/anki/blob/afa84b5f5299eda1fda09c4f9571993aaebbc75b/rslib/src/scheduler/fsrs/weights.rs#L116
However, what if a card is reset while it is still in its learning stage? In this case, we will need to find the "Forget" ratings and remove the reviews before such ratings.
Secondly, the following code filters out reviews before the instances where the card goes from Review to Learning. But, isn't this case already dealt with when we are looking for the last continuous group of LEARN type ratings? So, I can't think of any reason to include the following code:
Thirdly, does "last continuous group of LEARN type ratings" actually need to be a "group" (> 1 LEARN type ratings)? In that case, we will miss many cases where there was only a single LEARN rating.
By the way, I also put forward the first two issues in #63 (comment). But, they were not addressed and I also forgot about them. I was reminded of the issues by https://forums.ankiweb.net/t/anki-23-10-release-candidate/35967/108.
This looks promising!
Are there plans to write a user guide? If so, any timeline in mind?
In the past, I suggested excluding the incomplete revlogs from training because we didn't have any reasonable formula to calculate their memory states. However, now we do have a formula. So, including these revlogs makes sense now.
Corruption of the weights is not really a problem (now) because the effect of these revlogs on the weights will depend on their number. And if a user has a large number of such cards (e.g., they deleted the old revlogs for most of their cards), they will want the optimizer to ensure that the RMSE is minimized for these cards as well.
The only problem I see is that the weights will depend on the input value of SM-2 retention.
It would be helpful to have an example file with a single demonstration of how the library is meant to be used.
Initial weights:
[
0.4, 0.6, 2.4, 5.8, // initial stability
4.93, 0.94, 0.86, 0.01, // difficulty
1.49, 0.14, 0.94, // success
2.18, 0.05, 0.34, 1.26, // failure
0.29, 2.61, // hard penalty, easy bonus
]
Trained weights:
[src/training.rs:122] &model_trained.w.clone() = Param {
id: ParamId {
value: "f8c5b41d-186f-4b76-9760-c166e3e0df97",
},
value: Tensor {
primitive: ADTensor {
primitive: NdArrayTensor {
array: [0.524094, 0.9928922, 2.9164228, 5.49887, 4.93, 0.94, 0.86, 0.01, 1.49, 0.14, 0.94, 2.18, 0.05, 0.34, 1.26, 0.29, 2.61], shape=[17], strides=[1], layout=CFcf (0xf), dynamic ndim=1,
},
node: Node {
parents: [],
order: 0,
id: NodeID {
value: "31dc7e00-851a-4c20-aa78-dd1f8d1fae2a",
},
requirement: Grad,
},
graph: Graph {
steps: Mutex { data: {
NodeID {
value: "31dc7e00-851a-4c20-aa78-dd1f8d1fae2a",
}: RootStep {
node: Node {
parents: [],
order: 0,
id: NodeID {
value: "31dc7e00-851a-4c20-aa78-dd1f8d1fae2a",
},
requirement: Grad,
},
},
}},
},
},
},
}
Some weights are not trained. Need to check the gradients during the optimization.
error[E0308]: arguments to this method are incorrect
--> main.rs:57:31
|
57 | ... let new_d = new_d.clamp(1.0, 10.0); // TODO: consider constraining the associated type `<B as Backend>::FloatElem` to `{float}` or ...
| ^^^^^
|
note: expected associated type, found floating-point number
--> main.rs:57:37
|
57 | ... let new_d = new_d.clamp(1.0, 10.0); // TODO: consider constraining the associated type `<B as Backend>::FloatElem` to `{float}` or ...
| ^^^
= note: expected associated type `<B as Backend>::FloatElem`
found type `{float}`
= help: consider constraining the associated type `<B as Backend>::FloatElem` to `{float}` or calling a method that returns `<B as Backend>::FloatElem`
= note: for more information, visit https://doc.rust-lang.org/book/ch19-03-advanced-traits.html
note: expected associated type, found floating-point number
--> main.rs:57:42
|
57 | ... let new_d = new_d.clamp(1.0, 10.0); // TODO: consider constraining the associated type `<B as Backend>::FloatElem` to `{float}` or ...
| ^^^^
= note: expected associated type `<B as Backend>::FloatElem`
found type `{float}`
= help: consider constraining the associated type `<B as Backend>::FloatElem` to `{float}` or calling a method that returns `<B as Backend>::FloatElem`
= note: for more information, visit https://doc.rust-lang.org/book/ch19-03-advanced-traits.html
note: method defined here
--> /Users/jarrettye/.cargo/git/checkouts/burn-acfbee6a141c1b41/8808ee2/burn-tensor/src/tensor/api/numeric.rs:414:12
|
414 | pub fn clamp(self, min: K::Elem, max: K::Elem) -> Self {
| ^^^^^
Got this idea from: open-spaced-repetition/fsrs4anki#540 (comment)
But, this suggestion also solves other issues (there can be other causes of Time = 0 too).
Rationale: no card can be answered in 0 s, so such a time indicates that something abnormal happened with that review, and it is better not to include it in the average answer times.
To be clear, I want such reviews to be used for calculating the average Again/Hard/Good/Easy rates. Just don't use the time in the calculation of average times.
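A sketch of the suggested treatment (hypothetical helper; only the time average skips zero entries, while the reviews themselves still count toward the rating statistics):

```python
def average_answer_time(times_ms):
    # Ignore impossible 0 ms entries when averaging answer times; the
    # reviews themselves still count toward Again/Hard/Good/Easy rates.
    valid = [t for t in times_ms if t > 0]
    return sum(valid) / len(valid) if valid else None
```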
I don't know what's the appropriate repo for asking, so I'll ask here.
As I mentioned in another issue, just by adjusting "Days to simulate", it's possible to make the simulator output any value of retention within the allowed range. This raises the question: how to choose the appropriate value of days to simulate? Right now I don't know whether 1 year is "better" (in some sense) than 10 years.
Also, I believe this deserves its own entry in the wiki. I'm sure a lot of users would like to know the inner workings of the simulator.
The current rating_prob is fixed:
fsrs-rs/src/optimal_retention.rs
Lines 104 to 110 in 95ea10a
It's better to use the real Answer Buttons probability from Anki:
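The real probabilities could be estimated directly from the user's own revlogs. A sketch (hypothetical helper, not the crate's API):

```python
from collections import Counter

def rating_probabilities(ratings):
    # Empirical per-button probabilities (Again/Hard/Good/Easy = 1..4)
    # computed from a user's actual review ratings, replacing the fixed
    # rating_prob constants.
    counts = Counter(ratings)
    total = len(ratings)
    return {r: counts.get(r, 0) / total for r in (1, 2, 3, 4)}
```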