Comments (3)
@vboussange Thanks for putting MLJ through its paces, and for the positive feedback.
The issue here is that you are using `TunedModel` to wrap two models with different prediction types. One predicts probability distributions, while the other predicts point values:
```julia
julia> prediction_type(LinearRegressor)
:probabilistic

julia> prediction_type(NN)
:deterministic
```

This should be disallowed, but isn't, and we can see the wrapped model decides, without any good reason, to be `:deterministic`:

```julia
julia> prediction_type(multi_model)
:deterministic
```
So training the `TunedModel` tries to apply `rms` directly to probabilistic (Poisson) distributions, and so fails.
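To see concretely why `rms` chokes here, below is a minimal sketch (the prediction values `ŷ` are made up for illustration; only Distributions.jl is assumed):

```julia
using Distributions

# `rms` compares numbers, but a probabilistic model returns a vector of
# Distribution objects, which cannot be subtracted from integer targets:
ŷ = [Poisson(2.0), Poisson(3.5)]   # hypothetical probabilistic predictions
y = [2, 4]                         # observed counts
# rms(ŷ, y)  # would error: no method for subtracting a Poisson from an Int

# Taking the mean of each predicted distribution yields point predictions
# that `rms` can handle — which is exactly what change A does:
mean.(ŷ)  # == [2.0, 3.5], since mean(Poisson(λ)) == λ
```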
Below is a workaround. The changes I've made are marked A and B:

- **A** forces predictions of the linear model to be deterministic, by applying `mean` to the probabilistic predictions.
- **B** makes sure that NN receives `Continuous` data, instead of `Count` data, to suppress the scitype warning you have been getting (not a critical correction).
```julia
using Distributions
using LinearAlgebra
using DataFrames
using UnPack
using MLJ

## Generating synthetic data
n_features = 5
datasize = 1500
T = Float64
TI = Int64

# covariance matrix, needs to be symmetric
Σ = rand(T, n_features, n_features) * 2 .- 1
Σ = Σ * Σ'
μ = randn(T, n_features)
X = DataFrame(rand(MvNormal(μ, Σ), datasize)', :auto);
a = randn(T, n_features)
y = TI[]
for i in 1:datasize
    Xi = X[i, :] |> Vector
    push!(y, rand(Poisson(exp.(a' * Xi))))
end

# building a GLM
LinearRegressor = MLJ.@load LinearCountRegressor pkg=GLM
linearregressor = LinearRegressor()
linearregressor = linearregressor |> (y -> mean.(y)) # <--------- A
mach = machine(linearregressor, X, y)
fit!(mach)
# works

# building a neural network regressor
using Flux, MLJFlux

mutable struct MyBuilder <: MLJFlux.Builder
    nhidden  # number of neurons in hidden layers
    σ1       # hidden layers activation function
    σ2       # output activation function
end

function MLJFlux.build(nn::MyBuilder, rng, n_in, n_out)
    init = Flux.glorot_uniform(rng)
    @unpack nhidden, σ1, σ2 = nn
    return Chain(Dense(n_in, nhidden, σ1, init=init),
                 BatchNorm(nhidden),
                 Dense(nhidden, nhidden, σ1, init=init),
                 BatchNorm(nhidden),
                 Dense(nhidden, n_out, σ1, init=init),
                 σ2)
end

NN = MLJ.@load NeuralNetworkRegressor pkg=MLJFlux
nnflux = NN(builder = MyBuilder(64, relu, softplus),
            batch_size=100,
            epochs=200,
            loss = Flux.poisson_loss)
nnflux = ContinuousEncoder() |> nnflux # <------------ B
mach = machine(nnflux, X, y)
fit!(mach) # works

# comparing multiple models
mymodels = [nnflux, linearregressor]
multi_model = TunedModel(
    models=mymodels,
    resampling = CV(nfolds=3),
    measure = rms,
    check_measure = false,
)

e = MLJ.evaluate(multi_model, X, y, resampling = CV(nfolds=2),
                 measure=rms,
                 verbosity=6,
                 # acceleration = CPUThreads()
                 )

# PerformanceEvaluation object with these fields:
#   model, measure, operation, measurement, per_fold,
#   per_observation, fitted_params_per_fold,
#   report_per_fold, train_test_rows, resampling, repeats
# Extract:
# ┌────────────────────────┬───────────┬─────────────┬─────────┬──────────────┐
# │ measure                │ operation │ measurement │ 1.96*SE │ per_fold     │
# ├────────────────────────┼───────────┼─────────────┼─────────┼──────────────┤
# │ RootMeanSquaredError() │ predict   │ 30.2        │ 34.5    │ [39.9, 15.0] │
# └────────────────────────┴───────────┴─────────────┴─────────┴──────────────┘
```
from mlj.jl.
Closing in favour of JuliaAI/MLJTuning.jl#200
Sweet, thanks for the inputs!