Comments (10)
Thanks @dae. Just wanted to say on behalf of all my language learning friends/med school friends, we appreciate all you've done with Anki. :-)
Hey again @L-M-Sherlock,
Still familiarizing myself with burn; it seems like a neat project. This one looks pretty simple: poking around their repo, it looks like we can modify the grads in the step function before returning the TrainOutput struct.
Their example:
https://github.com/burn-rs/burn/blob/2fefc820996085c7e96763d96437876075e0f6ba/examples/text-generation/src/model.rs#L97-L106
So the step here:
https://github.com/open-spaced-repetition/fsrs-optimizer-burn/blob/938cc9286469c8b3e565109d95bf3e146172266f/src/training.rs#L46-L57
would change to something like this:
impl<B: ADBackend<FloatElem = f32>> TrainStep<FSRSBatch<B>, ClassificationOutput<B>> for Model<B> {
    fn step(&self, batch: FSRSBatch<B>) -> TrainOutput<ClassificationOutput<B>> {
        let item = self.forward_classification(
            batch.t_historys,
            batch.r_historys,
            batch.delta_ts,
            batch.labels,
        );
        let grads = item.loss.backward();
        // Change the grads to zero for the weights we want to freeze
        TrainOutput::new(self, grads, item)
    }
}
From what I can tell, when we run learner.fit, it creates a TrainEpoch struct and calls its run method, which calls the model.step above. It takes the grads returned from model.step and passes them into optim.step, which just does the optimizer-specific work. So I'd imagine that if we zero the gradients out in model.step, it should result in frozen weights? (I've sketched the loop after the link below.)
TrainEpoch run method:
https://github.com/burn-rs/burn/blob/2fefc820996085c7e96763d96437876075e0f6ba/burn-train/src/learner/epoch.rs#L108-L121
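Roughly, that loop boils down to this (a loose paraphrase of the linked run method for reasoning purposes, not burn's actual code; dataloader, optimizer, and lr are stand-ins):

// Loose paraphrase of TrainEpoch::run. model.step owns the backward pass
// and returns whatever grads it chooses; the optimizer consumes those
// grads verbatim, so an entry we zeroed in step never moves its weight.
for batch in dataloader.iter() {
    let output = model.step(batch); // forward + backward
    model = optimizer.step(lr, model, output.grads); // applies the returned grads
}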
I might be misunderstanding some stuff, but these are my findings. I'd like to try and build an implementation, but I don't have the SQLite file for testing. How might I get my hands on that?
The test file is available at https://github.com/open-spaced-repetition/fsrs-optimizer-burn/files/12394182/collection.anki21.zip
@L-M-Sherlock at the moment we're a little stuck. We can get the grads in tensor form and modify them much like you do in PyTorch, but once modified we can't get them back into the Gradients type that TrainOutput requires. I've talked to Nathaniel on Discord about it, and he said he could add a grad_replace method for this particular use case.
// training.rs l52
impl<B: ADBackend<FloatElem = f32>> TrainStep<FSRSBatch<B>, ClassificationOutput<B>> for Model<B> {
    fn step(&self, batch: FSRSBatch<B>) -> TrainOutput<ClassificationOutput<B>> {
        let item = self.forward_classification(
            batch.t_historys,
            batch.r_historys,
            batch.delta_ts,
            batch.labels,
        );
        let mut gradients = item.loss.backward();
        let grad_tensor = self.w.grad(&gradients).unwrap();
        let updated_grad_tensor = grad_tensor.slice_assign([0..4], Tensor::zeros([4]));
        // Can't get the updated tensor back into the B::Gradients type,
        // so Nathaniel said he could create something like this:
        self.w.grad_replace(&mut gradients, updated_grad_tensor);
        TrainOutput::new(self, gradients, item)
    }
}
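To convince ourselves the zeroing itself does what we want, here's a minimal standalone check outside the training loop (treating burn's ones/zeros/slice_assign as documented, and assuming 17 weights as in FSRS v4 with the first 4 being the initial-stability parameters, if I'm reading the PyTorch code right):

// Hypothetical sanity check: build a 17-element "gradient" of ones,
// zero the first 4 entries, and leave the remaining 13 untouched.
let grad = Tensor::<B, 1>::ones([17]);
let frozen = grad.slice_assign([0..4], Tensor::zeros([4]));
// frozen is now [0, 0, 0, 0, 1, 1, ..., 1]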
grad_replace was added in tracel-ai/burn#688
The clipping might be better done after tracel-ai/burn#689 is merged?
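If so, I'd expect it to mirror what the Python optimizer does after each step: clamp every weight into its own allowed range. A rough sketch only; LOWER_BOUNDS, UPPER_BOUNDS, max_pair, and min_pair are assumptions here, not confirmed burn API:

// Hypothetical post-step weight clipping: element-wise clamp of each
// weight into its own [lower, upper] range via max-then-min.
let lower = Tensor::<B, 1>::from_floats(LOWER_BOUNDS); // per-weight minimums
let upper = Tensor::<B, 1>::from_floats(UPPER_BOUNDS); // per-weight maximums
let clipped = model.w.val().max_pair(lower).min_pair(upper);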
> grad_replace was added in burn-rs/burn#688
Awesome, I'll make a PR for this later today
@L-M-Sherlock @dae is this something you want configurable in the ModelConfig struct, or should it be baked right into model.step? From what I can tell, it's just hardcoded in the PyTorch implementation.
I'll need to defer to @L-M-Sherlock on whether it should be hard-coded or whether it would make sense for it to be changeable by the user.
It should be hard-coded, because it relates to the details of the optimization, which users shouldn't be able to hack on.
By the way, freezing weights is only needed after we have implemented the pre-train stage. In the current version, the training process shouldn't apply weight freezing.
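Once pre-train lands, I'd expect the hard-coded version to be gated on the training stage, something like this (freeze_initial_stability is a hypothetical field set internally by the training pipeline, not exposed in ModelConfig):

// Hypothetical: zero the grads of the first 4 weights only during the
// fine-tuning stage that runs after pre-train has estimated them.
let mut gradients = item.loss.backward();
if self.freeze_initial_stability {
    let grad = self.w.grad(&gradients).unwrap();
    let frozen = grad.slice_assign([0..4], Tensor::zeros([4]));
    self.w.grad_replace(&mut gradients, frozen);
}
TrainOutput::new(self, gradients, item)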
Related Issues (20)
- [Enhancement] Use more splits while training with larger datasets
- Request: Ignore reviews before "Forget"
- Enhancement: Include incomplete revlogs even when training
- Consider time-frame limitation?
- TODO: speed up finding optimal retention via Brent's method
- Better outlier filter for trainset
- Skip reviews with time = 0 when calculating average answer times
- What's the difference between this repo and rs-fsrs?
- User guide
- Add an option to turn off outlier filter when benchmark
- Inference.rs uses the new power curve, but the default parameters are from v4
- Add a example file
- Reference usage?
- Pre-training Only when the number of reviews is less than 1000
- [BUG] Potential inconsistency in optimal_retention.rs
- [Question] How to choose "Days to simulate"?
- [Feature Request] Use two different sets of initial parameters, then average out the results
- Use the first revlog in the "known" review history for converting SM-2 ivl & ease to memory states
- Achieve parity with the Python optimizer
- support WASM