
Comments (10)

GraesonB avatar GraesonB commented on June 11, 2024 1

Thanks @dae. Just wanted to say on behalf of all my language learning friends/med school friends, we appreciate all you've done with Anki. :-)

from fsrs-rs.

GraesonB avatar GraesonB commented on June 11, 2024

Hey again @L-M-Sherlock,

I'm still familiarizing myself with burn; it seems like a neat project. This one looks pretty simple: poking around their repo, it looks like we can modify the grads in the step function before returning the TrainOutput struct.

Their example:
https://github.com/burn-rs/burn/blob/2fefc820996085c7e96763d96437876075e0f6ba/examples/text-generation/src/model.rs#L97-L106

So the step here:
https://github.com/open-spaced-repetition/fsrs-optimizer-burn/blob/938cc9286469c8b3e565109d95bf3e146172266f/src/training.rs#L46-L57

would change to something like this:

impl<B: ADBackend<FloatElem = f32>> TrainStep<FSRSBatch<B>, ClassificationOutput<B>> for Model<B> {
    fn step(&self, batch: FSRSBatch<B>) -> TrainOutput<ClassificationOutput<B>> {
        let item = self.forward_classification(
            batch.t_historys,
            batch.r_historys,
            batch.delta_ts,
            batch.labels,
        );
        let grads = item.loss.backward();
        // Change the grads to zero for the weights we want to freeze


        TrainOutput::new(self, grads, item)
    }
}

From what I can tell, when we run learner.fit, it creates a TrainEpoch struct and calls its run method, which calls the model.step above. It takes the grads returned by model.step and passes them into optim.step, which just does the optimizer-specific stuff. So I'd imagine that if we zero the gradients out in model.step, it should result in frozen weights?

TrainEpoch run method:
https://github.com/burn-rs/burn/blob/2fefc820996085c7e96763d96437876075e0f6ba/burn-train/src/learner/epoch.rs#L108-L121
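
To make the reasoning above concrete, here is a minimal sketch in plain Rust, with no burn types involved. It assumes a plain SGD update (w <- w - lr * g); the names `mask_frozen`, `sgd_step` and `demo` are made up for illustration and are not part of burn or fsrs-optimizer-burn. With that update rule, a zeroed gradient entry leaves the matching weight untouched.

```rust
/// Zero the gradient entry of every parameter marked as frozen.
/// (Hypothetical helper, not a burn API.)
pub fn mask_frozen(grads: &mut [f32], frozen: &[bool]) {
    for (g, &is_frozen) in grads.iter_mut().zip(frozen) {
        if is_frozen {
            *g = 0.0;
        }
    }
}

/// One plain SGD update: w <- w - lr * g.
pub fn sgd_step(weights: &mut [f32], grads: &[f32], lr: f32) {
    for (w, g) in weights.iter_mut().zip(grads) {
        *w -= lr * g;
    }
}

/// Freeze the first four of six weights, then take one SGD step.
pub fn demo() -> Vec<f32> {
    let mut weights = vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
    let mut grads = vec![0.5; 6];
    let frozen = [true, true, true, true, false, false];

    mask_frozen(&mut grads, &frozen);
    sgd_step(&mut weights, &grads, 0.1);
    weights
}
```

Calling `demo()` leaves the first four weights exactly as they were and moves only the last two. One caveat: this only holds if the optimizer's update depends on nothing but the gradient. An optimizer that applies weight decay, or that carries non-zero momentum state for those entries, could still move a weight whose gradient is zero.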

I might be misunderstanding some stuff, but these are my findings. I'd like to try and build an implementation, but I don't have the SQLite file for testing. How might I get my hands on that?


dae avatar dae commented on June 11, 2024

The test file is available at https://github.com/open-spaced-repetition/fsrs-optimizer-burn/files/12394182/collection.anki21.zip


GraesonB avatar GraesonB commented on June 11, 2024

@L-M-Sherlock at the moment we are a little stuck. We can get the grads in tensor form and modify them, similar to how you are doing it in PyTorch, but once we modify them we can't get them back into the Gradients type, which is required to pass into TrainOutput. I've talked to Nathaniel on Discord about it and he said he could add a grad_replace method for this particular use case.

// training.rs l52
impl<B: ADBackend<FloatElem = f32>> TrainStep<FSRSBatch<B>, ClassificationOutput<B>> for Model<B> {
    fn step(&self, batch: FSRSBatch<B>) -> TrainOutput<ClassificationOutput<B>> {
        let item = self.forward_classification(
            batch.t_historys,
            batch.r_historys,
            batch.delta_ts,
            batch.labels
        );
        let mut gradients = item.loss.backward();
        let grad_tensor = self.w.grad(&gradients).unwrap();
        let updated_grad_tensor = grad_tensor.slice_assign([0..4], Tensor::zeros([4]));
        // Can't get grad_tensor back into B::Gradients type, so Nathaniel said he could create something like this:
        self.w.grad_replace(&mut gradients, updated_grad_tensor);

        TrainOutput::new(self, gradients, item)
    }
}
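
The "read, edit, write back" pattern described above can be sketched without burn. `GradStore` below is a hypothetical stand-in for an opaque gradients container, not burn's Gradients type: `get` hands out a copy, so editing that copy changes nothing in the store until a `replace`-style setter writes it back. All names here are assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an opaque gradients container.
pub struct GradStore {
    grads: HashMap<String, Vec<f32>>,
}

impl GradStore {
    pub fn new() -> Self {
        Self { grads: HashMap::new() }
    }

    /// Returns a copy of the gradient; editing the copy does not touch the store.
    pub fn get(&self, id: &str) -> Option<Vec<f32>> {
        self.grads.get(id).cloned()
    }

    /// Overwrites (or inserts) the gradient stored under `id`.
    pub fn replace(&mut self, id: &str, grad: Vec<f32>) {
        self.grads.insert(id.to_string(), grad);
    }
}

/// Zero the first `n` gradient entries of parameter `id` and write them back.
pub fn freeze_prefix(store: &mut GradStore, id: &str, n: usize) {
    if let Some(mut grad) = store.get(id) {
        let end = n.min(grad.len());
        grad[..end].fill(0.0);
        store.replace(id, grad);
    }
}
```

For example, after `store.replace("w", vec![0.5; 6])` and `freeze_prefix(&mut store, "w", 4)`, the gradient stored under "w" has its first four entries zeroed and the last two unchanged. Without the final `replace` call, the store would still hold the original values, which is the gap a setter like grad_replace is meant to close.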


dae avatar dae commented on June 11, 2024

grad_replace was added in tracel-ai/burn#688

The clipping might be better done after tracel-ai/burn#689 is merged?


GraesonB avatar GraesonB commented on June 11, 2024

> grad_replace was added in burn-rs/burn#688

Awesome, I'll make a PR for this later today.


GraesonB avatar GraesonB commented on June 11, 2024

@L-M-Sherlock @dae is this something you want configured in the ModelConfig struct, or should it be baked right into model.step? From what I can tell, in the PyTorch implementation it's just hardcoded in.


dae avatar dae commented on June 11, 2024

I'll need to defer to @L-M-Sherlock on whether it should be hard-coded or whether it would make sense for it to be changeable by the user.


L-M-Sherlock avatar L-M-Sherlock commented on June 11, 2024

It should be hard-coded, because it's a detail of the optimization process, which users shouldn't be allowed to tamper with.


L-M-Sherlock avatar L-M-Sherlock commented on June 11, 2024

By the way, weight freezing will only be used once we have implemented the pre-training stage. In the current version, the training process shouldn't freeze any weights.

