Comments (6)
If we set aside learning networks, it seems like we don't really need `evaluate!`. The non-mutating `evaluate` is sufficient and more intuitive.

Is there a way to write a non-mutating `evaluate` that works on learning networks? Or is there already a way to use `evaluate` on learning networks that is not immediately obvious to me? I'm rusty on MLJ, since I unfortunately have to use Python at work. :(
from mlj.jl.
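For concreteness, the two call styles under discussion can be sketched as follows. Both `evaluate` and `evaluate!` exist in MLJ today; the snippet assumes DecisionTree.jl is installed and uses synthetic data:

```julia
using MLJ

# Synthetic data and a model to evaluate (assumes DecisionTree.jl is installed).
X, y = make_regression(100, 3)
Tree = @load DecisionTreeRegressor pkg=DecisionTree verbosity=0
model = Tree()

# Mutating form: wraps model and data in a machine, which `evaluate!`
# trains as a side effect.
mach = machine(model, X, y)
evaluate!(mach; resampling=CV(nfolds=5), measure=rms)

# Non-mutating form: model and data are passed directly; no machine is
# exposed, so there is nothing for the caller to see mutated.
evaluate(model, X, y; resampling=CV(nfolds=5), measure=rms)
```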
@CameronBieganek @tylerjthomas9
I think that this is a fantastic change. As you said, the current `mach` returned by `evaluate!` is not useful. It may add some complication that `evaluate!` and `evaluate` have different default values for `retrain`, but I think it makes logical sense not to retrain by default with `evaluate(model, data...)`.
> If we set aside learning networks, it seems like we don't really need `evaluate!`. The non-mutating `evaluate` is sufficient and more intuitive. Is there a way to write a non-mutating `evaluate` that works on learning networks? Or is there already a way to use `evaluate` on learning networks that is not immediately obvious to me? I'm rusty on MLJ, since I unfortunately have to use Python at work. :(
I agree with @CameronBieganek that it would be best if `evaluate` just returned a performance evaluation object and did not mutate the machine. We could still have the `evaluate!` method as an internal method for convenience.

For learning networks, now that the data isn't attached to the network anymore, we could make it so that calling `evaluate` on a machine passes a `deepcopy` of the machine to the internal `evaluate!` method. I don't know how this would affect models that wrap external C libraries.
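That suggestion could be sketched as below. This is hypothetical code (with a hypothetical name, `evaluate_nonmutating`), not MLJ's actual implementation; the cost is one `deepcopy` per call, and, as noted, the copy may be problematic for models wrapping external C libraries:

```julia
import MLJBase

# Hypothetical non-mutating evaluation for machines: delegate to the existing
# mutating `evaluate!`, but on a deep copy, so the caller's machine is untouched.
function evaluate_nonmutating(mach::MLJBase.Machine; kwargs...)
    mach_copy = deepcopy(mach)   # original machine is left unchanged
    return MLJBase.evaluate!(mach_copy; kwargs...)
end
```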
This is great feedback, thanks.

I'm torn between dropping mutation altogether (advantage: simplicity) and proceeding as I propose above (advantage: convenience). Some context: in a summer project we are working on automatic logging of workflows using MLflow, and in that context it seemed natural to log a training score, and a serialised set of learned parameters, each time a model is "evaluated". This saves having to specify metrics a second time when wanting the training score.

> Is there a way to write a non-mutating evaluate that works on learning networks? Or is there already a way to use evaluate on learning networks that is not immediately obvious to me?

@CameronBieganek Not sure what you mean here. Perhaps you mean calling `evaluate!` on "learning network machines" (a special kind of machine), now deprecated? Generally the idea with learning networks is that they should be "exported" as stand-alone composite models in serious use. There was a simplification to this exporting process you may have missed: learning network machines, once an intermediate step in the export process, have been deprecated as an unnecessary complication.
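For reference, a minimal sketch of the simplified exporting process (the post-deprecation API, using `prefit` and symbolic machine references; treat the details as approximate rather than authoritative):

```julia
using MLJ
import MLJBase

# A learning network exported as a stand-alone composite model type, on which
# the ordinary non-mutating `evaluate(model, data...)` then works as usual.
mutable struct SimpleComposite <: MLJBase.DeterministicNetworkComposite
    regressor   # any deterministic regressor
end

function MLJBase.prefit(composite::SimpleComposite, verbosity, X, y)
    Xs, ys = source(X), source(y)
    mach = machine(:regressor, Xs, ys)   # symbol refers to the field above
    yhat = predict(mach, Xs)
    return (; predict=yhat)              # interface points of the network
end

# Usage sketch (assumes DecisionTree.jl is installed):
# Tree = @load DecisionTreeRegressor pkg=DecisionTree verbosity=0
# X, y = make_regression(100, 3)
# evaluate(SimpleComposite(Tree()), X, y; resampling=CV(), measure=rms)
```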
Okay, I now remember the reason for the existing behaviour. The use case is evaluating models that support "warm restart". If you are using `Holdout()` (or another single train/test pair of row indices) and you re-evaluate the machine after changing a model hyperparameter, then for some hyperparameters, such as `n_iterations`, the re-evaluation will not require retraining the machine from scratch.
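A sketch of that use case, with `EnsembleModel` standing in for any model whose iteration parameter (here the number of atoms, `n`) supports warm restart; assumes DecisionTree.jl is installed:

```julia
using MLJ

X, y = make_regression(100, 3)
Tree = @load DecisionTreeRegressor pkg=DecisionTree verbosity=0
forest = EnsembleModel(model=Tree(), n=50)

mach = machine(forest, X, y)
holdout = Holdout(fraction_train=0.7)
evaluate!(mach; resampling=holdout, measure=rms)

forest.n = 100  # bump the iteration count ...
# ... so re-evaluation can warm-restart: only the 50 new atoms are trained,
# rather than refitting the machine from scratch.
evaluate!(mach; resampling=holdout, measure=rms)
```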
On another matter, perhaps a better way to get training scores is to add a resampling strategy `InSample()`, which has the single train/test pair of row indices `train=rows` and `test=rows`.
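The semantics of such an `InSample()` strategy can already be spelled out by hand, since `evaluate` accepts an explicit vector of `(train, test)` row-index pairs (a sketch; assumes DecisionTree.jl is installed):

```julia
using MLJ

X, y = make_regression(100, 3)
Tree = @load DecisionTreeRegressor pkg=DecisionTree verbosity=0
model = Tree()

rows = 1:nrows(X)
# Train on all rows and test on the same rows, i.e. a training score --
# exactly the single pair an `InSample()` strategy would supply.
evaluate(model, X, y; resampling=[(rows, rows)], measure=rms)
```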
Related Issues (20)
- `evaluate` errors
- Add AutoEncoderMLJ model (part of BetaML)
- Need a tutorial for using logger with dagshub and mlflow
- Document how to add plot recipes in a new model implementation
- Add new model descriptors to fix doc-generation fail
- Models that fail integration tests but defy isolation
- Update list of BetaML models
- Reinstate CatBoost integration test
- Update ROADMAP.md
- Improve documentation by additional hierarchy
- Include support for MixedModels.jl
- Deserialisation fails for wrappers like `TunedModel` when atomic model overloads `save/restore`
- `feature_importances` for Pipeline including XGBoost don't work
- Current performance evaluation objects, recently added to TunedModel histories, are too big
- Update cheat sheet instance of deprecated `@from_network` code
- Requesting better exposure to MLJFlux in the model browser
- Reexport `CompactPerformanceEvaluation` and `InSample`
- Remove `info(rms)` from the cheatsheet
- Re-instate integration tests for scikit-learn models
- [tracking] Add default logger to MLJ