
Comments (6)

CameronBieganek commented on July 1, 2024

If we set aside learning networks, it seems like we don't really need evaluate!. The non-mutating evaluate is sufficient and more intuitive.

Is there a way to write a non-mutating evaluate that works on learning networks? Or is there already a way to use evaluate on learning networks that is not immediately obvious to me? I'm rusty on MLJ, since I unfortunately have to use Python at work. :(
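For reference, the two entry points being compared look like this in ordinary usage (a sketch assuming MLJ and a decision-tree interface package are installed; the tree model is just a placeholder):

```julia
using MLJ

X, y = make_regression(100, 3)                       # synthetic data
Tree = @load DecisionTreeRegressor pkg=DecisionTree
model = Tree()

# Non-mutating form: model and data are passed directly; nothing is retained.
evaluate(model, X, y; resampling=CV(nfolds=5), measure=rms)

# Mutating form: the machine is trained as a side effect of evaluation.
mach = machine(model, X, y)
evaluate!(mach; resampling=CV(nfolds=5), measure=rms)
```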

from mlj.jl.

ablaom commented on July 1, 2024

@CameronBieganek @tylerjthomas9


tylerjthomas9 commented on July 1, 2024

I think that this is a fantastic change. As you said, the current mach returned by evaluate! is not useful. It may add some complication that evaluate and evaluate! have different default values for retrain, but I think it makes logical sense not to retrain by default with evaluate(model, data...).
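Under that proposal the two call patterns might default differently, something like (hypothetical retrain keyword, illustrative only, not current MLJ API):

```julia
# Hypothetical defaults under the proposal (illustrative only):
evaluate(model, data...; retrain=false, kwargs...)  # never mutates anything
evaluate!(mach; retrain=true, kwargs...)            # may retrain `mach` in place
```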


OkonSamuel commented on July 1, 2024

If we set aside learning networks, it seems like we don't really need evaluate!. The non-mutating evaluate is sufficient and more intuitive.

Is there a way to write a non-mutating evaluate that works on learning networks? Or is there already a way to use evaluate on learning networks that is not immediately obvious to me? I'm rusty on MLJ, since I unfortunately have to use Python at work. :(

I agree with @CameronBieganek that it would be best if evaluate just returned a performance evaluation object and did not mutate the machine.
We could still keep the evaluate! method as an internal convenience.
For learning networks, now that the data isn't attached to the network anymore, we could make calling evaluate on a machine pass a deepcopy of the machine to the internal evaluate! method.
I don't know how this would affect models that wrap external C libraries.
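The deepcopy idea above can be sketched as follows (hypothetical method, not current MLJ API; whether learned parameters that wrap external C libraries survive deepcopy is precisely the open question):

```julia
# Hypothetical non-mutating method: evaluate a deep copy of the
# machine so the caller's machine is left untouched.
function evaluate(mach::Machine; kwargs...)
    evaluate!(deepcopy(mach); kwargs...)
end
```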


ablaom commented on July 1, 2024

This is great feedback, thanks.

I'm torn between dumping mutation altogether (advantage: simplicity) and proceeding as I propose above (convenience). Some context: In a summer project we are working on auto logging of workflows using MLFlow, and in this context it seemed natural to log a training score, and a serialised set of learned parameters, each time a model is "evaluated". This saves having to specify metrics a second time if wanting the training score.

Is there a way to write a non-mutating evaluate that works on learning networks? Or is there already a way to use evaluate on learning networks that is not immediately obvious to me?

@CameronBieganek Not sure what you mean here. Perhaps you mean calling evaluate! on "learning network machines" (a special kind of machine), now deprecated? Generally, the idea with learning networks is that in serious use they should be "exported" as stand-alone composite models. There was a simplification to this exporting process you may have missed: learning network machines, once an intermediate step in the export process, have been deprecated as an unnecessary complication.


ablaom commented on July 1, 2024

Okay, I now remember the reason for the existing behaviour. The use case is evaluating models that support "warm restart". If you use Holdout() (or another single train/test pair of row indices) and re-evaluate the machine after changing a model hyperparameter, then for some hyperparameters, like n_iterations, the re-evaluation will not require retraining the machine from scratch.
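The warm-restart scenario looks something like this (a sketch; `iterated_model` and the `n_iterations` field are illustrative placeholders, not names from any particular model):

```julia
mach = machine(iterated_model, X, y)
evaluate!(mach; resampling=Holdout(), measure=rms)

# Increase the iteration count; for models supporting warm restart,
# re-evaluation resumes training rather than starting from scratch.
iterated_model.n_iterations += 50
evaluate!(mach; resampling=Holdout(), measure=rms)
```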

On another matter, perhaps a better way to get training scores is to add a resampling strategy InSample() which has the single train/test pair of row indices train=rows and test=rows.
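A minimal sketch of such a strategy, assuming the usual MLJBase extension point (`train_test_pairs`) for custom resampling strategies:

```julia
import MLJBase

struct InSample <: MLJBase.ResamplingStrategy end

# A single train/test pair with train == test, so the reported
# measure is a training score.
MLJBase.train_test_pairs(::InSample, rows) = [(rows, rows)]
```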


