Comments (11)
I added a function to insert the default values in getResults.R on the master branch. You can use this for the database extraction, @DanielKuehn87.
from omlbots.
why do you have NAs?
if it is because of subordinate params we already have a standard approach for this.
and no, coding numerics just as -1 is very certainly a bad idea (if the param can become neg. as well)
Different reasons:
- We have hierarchical hyperparameters. If we use gblinear, we do not have hyperparameters for max_depth, min_child_weight, colsample_bytree, and colsample_bylevel.
- If the booster is gbtree, it is not written in the output in OpenML and hence NA. We can also solve this manually.
- In some cases I do not know why the information of some hyperparameters is not available in OpenML, e.g. in rpart sometimes the information for maxdepth and minsplit is missing, maybe because it is the default value?
We have no hyperparameters that can get negative. What is the standard approach?
for 2) and 3): I checked it, it is because it is set to the default! We can reset this manually.
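The manual reset mentioned above could look roughly like the following. This is only a minimal sketch in Python, not the addDefaultValues function from getResults.R: the function name `add_default_values`, the `DEFAULTS` table, and the row layout are hypothetical, and the default values shown should be checked against the documentation of the learner.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Optional[Any]]

# Assumed defaults for illustration only; verify them against the learner docs.
DEFAULTS: Dict[str, Dict[str, Any]] = {
    "rpart": {"maxdepth": 30, "minsplit": 20},
    "xgboost": {"booster": "gbtree"},
}


def add_default_values(rows: List[Row], learner: str) -> List[Row]:
    """Return copies of the rows where missing (None) values that have a
    known default are replaced by that default."""
    defaults = DEFAULTS[learner]
    filled = []
    for row in rows:
        new_row = dict(row)  # copy, so the downloaded data stays untouched
        for name, value in defaults.items():
            if new_row.get(name) is None:
                new_row[name] = value
        filled.append(new_row)
    return filled


# Two hypothetical rpart runs where a default-valued setting was not reported.
runs = [
    {"maxdepth": None, "minsplit": 5},
    {"maxdepth": 12, "minsplit": None},
]
filled_runs = add_default_values(runs, "rpart")
print(filled_runs)
```

Note that this only covers values that are missing because they equal the default. Parameters that are missing because they are inactive (such as max_depth under gblinear) need a different treatment and should not be filled with a default.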
@DanielKuehn87 should we do this before we put the data into the database? I think this would be cleaner...
> We have hierarchical hyperparameters
ok that is normal, happens in MBO all the time.
> If the booster is gbtree it is not written in the output in OpenML and hence NA. We can also solve this manually.
that sounds weird and worrisome. is it because it is the default?
i really hope oml does not "swallow" important info?
> In some cases I do not know why the information of some hyperparameters is not available in OpenML.
again, have you figured out what goes wrong here?
> We have no hyperparameters that can get negative.
of course you have neg values. your model must use the logscale of params where you optimize on logscale.
that is hopefully clear?
> What is the standard approach?
please read the mbo paper and properly read ?makeMBOLearner
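The log-scale point can be illustrated with a small sketch. The parameter range and the values below are hypothetical; the only claim is that a parameter optimized on a log10 scale takes negative values whenever its original value is below 1, so a fixed code of -1 can coincide with a legitimate setting.

```python
import math

SENTINEL = -1.0  # the proposed code for "parameter not available"

# Hypothetical parameter values on the original scale, tuned on a log10 scale.
original_values = [0.001, 0.1, 1.0, 1000.0]
log_values = [math.log10(v) for v in original_values]

# Original values whose log-scale representation coincides with the sentinel.
collisions = [
    v for v, lv in zip(original_values, log_values)
    if math.isclose(lv, SENTINEL)
]

print("log10 values:", log_values)
print("collides with sentinel:", collisions)
```

Here the value 0.1 maps to the same number as the sentinel, so the model could not tell "not available" apart from a real configuration.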
> that sounds weird and worrisome. is it because it is the default?
> i really hope oml does not "swallow" important info?
Yes, it is because it is the default.
> again, have you figured out what goes wrong here?
Yes, read my post above, please. For 2) and 3) the problem is that they are the defaults.
> of course you have neg values. your model must use the logscale of params where you optimize on logscale.
OK, I was only thinking of the transformed values, you are right here.
> please read the mbo paper and properly read ?makeMBOLearner
I found this in the mbo paper, and we can do it the same way (it is something similar to my -1 approach ;)):
> For the surrogate we need a regression model that is more flexible and can handle categorical features as well as missing values to support dependent parameters. A slightly modified random forest can be used for this purpose. If a hyperparameter is not active in a design point in the training set (due to unfulfilled conditions), we will mark its value as missing. Although the random forest could potentially directly handle missing values, many implementations do not. Hence, we impute these values in the following way: For categorical parameters we code missing values as a new level, and for numerical parameters we code the imputed value out of the range of the box-constraints of the parameter under consideration. This is known as the separate-class method and was shown to perform best for decision trees in a prediction-oriented study, when missingness is related to the outcome.
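The imputation described in the quoted passage can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the mlrMBO implementation: the function name, the `__missing__` level, the max_depth bounds, the sample_type parameter, and the choice of placing numeric values one range-width above the upper bound are all hypothetical.

```python
from typing import Any, Dict, List, Optional, Sequence, Tuple

Row = Dict[str, Optional[Any]]
MISSING_LEVEL = "__missing__"  # hypothetical name for the extra category


def impute_separate_class(
    rows: List[Row],
    numeric_bounds: Dict[str, Tuple[float, float]],
    categorical_params: Sequence[str],
) -> List[Row]:
    """Separate-class imputation for inactive (None) hyperparameters:
    numeric values go outside the box constraints, categorical values
    get a new level of their own."""
    imputed = []
    for row in rows:
        new_row = dict(row)
        for name, (lower, upper) in numeric_bounds.items():
            if new_row.get(name) is None:
                # Assumption: one range-width above the upper bound. Any value
                # outside [lower, upper] would satisfy the description.
                new_row[name] = upper + (upper - lower)
        for name in categorical_params:
            if new_row.get(name) is None:
                new_row[name] = MISSING_LEVEL
        imputed.append(new_row)
    return imputed


# Hypothetical design: max_depth is inactive for gblinear, and the
# illustrative categorical parameter sample_type is only active for dart.
design = [
    {"booster": "gbtree", "max_depth": 6, "sample_type": None},
    {"booster": "gblinear", "max_depth": None, "sample_type": None},
    {"booster": "dart", "max_depth": 3, "sample_type": "weighted"},
]
imputed_design = impute_separate_class(
    design, {"max_depth": (1, 15)}, ["sample_type"]
)
for row in imputed_design:
    print(row)
```

Unlike a fixed -1, the imputed numeric value is derived from the parameter's own bounds, so it cannot coincide with a valid setting as long as the bounds are given on the same scale as the stored values.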
I updated the functions in getResults.R. Now it really works.
@PhilippPro: I just merged my database branch in PR #30.
In this PR I also updated the functions in getResults quite a bit to catch different problems. Could you check whether this breaks your addDefaultValues function? :)
It seems OK. I re-added it to your functions.