
evaluationengine's Introduction

License

OpenML: Open Machine Learning

Welcome to the OpenML GitHub page! 🎉


Who are we?

We are a group of people who are excited about open science, open data and machine learning. We want to make machine learning and data analysis simple, accessible, collaborative and open with an optimal division of labour between computers and humans.

What is OpenML?

Want to learn about OpenML or get involved? Please do and get in touch in case of questions or comments! 📨

OpenML is an online machine learning platform for sharing and organizing data, machine learning algorithms and experiments. It is designed to create a frictionless, networked ecosystem that you can readily integrate into your existing processes/code/environments, allowing people all over the world to collaborate and build directly on each other’s latest ideas, data and results, irrespective of the tools and infrastructure they happen to use.

As an open science platform, OpenML provides important benefits for the science community and beyond.

Benefits for Science

Many sciences have made significant breakthroughs by adopting online tools that help organize, structure and analyze scientific data. Indeed, any shared idea, question, observation or tool may be noticed by someone who has just the right expertise to spark new ideas, answer open questions, reinterpret observations or reuse data and tools in unexpected new ways. Therefore, sharing research results and collaborating online as a (possibly cross-disciplinary) team enables scientists to quickly build on and extend the results of others, fostering new discoveries.

Moreover, ever larger studies become feasible as a lot of data are already available. Questions such as “Which hyperparameter is important to tune?”, “Which is the best known workflow for analyzing this data set?” or “Which data sets are similar in structure to my own?” can be answered in minutes by reusing prior experiments, instead of spending days setting up and running new experiments.

Benefits for Scientists

Scientists can also benefit personally from using OpenML. For example, they can save time, because OpenML assists in many routine and tedious duties: finding data sets, tasks, flows and prior results, setting up experiments and organizing all experiments for further analysis. Moreover, new experiments are immediately compared to the state of the art without always having to rerun other people’s experiments.

Another benefit is that linking one’s results to those of others has a large potential for new discoveries (see, for instance, Feurer et al. 2015; Post et al. 2016; Probst et al. 2017), leading to more publications and more collaboration with other scientists all over the world.

Finally, OpenML can help scientists to reinforce their reputation by making their work (published or not) visible to a wide group of people and by showing how often one’s data, code and experiments are downloaded or reused in the experiments of others.

Benefits for Society

OpenML also provides a useful learning and working environment for students, citizen scientists and practitioners. Students and citizen scientists can easily explore the state of the art and work together with top minds by contributing their own algorithms and experiments. Teachers can challenge their students by letting them compete on OpenML tasks or by reusing OpenML data in assignments. Finally, machine learning practitioners can explore and reuse the best solutions for specific analysis problems, interact with the scientific community or efficiently try out many possible approaches.


Get involved

OpenML has grown into quite a big project. We could use many more hands to help us out 🔧.

  • You want to contribute?: Awesome! Check out our wiki page on how to contribute or get in touch. There may be unexpected ways in which you could help. We are open to any ideas.
  • You want to support us financially?: YES! Getting funding through conventional channels is very competitive, and we are happy about every small contribution. Please send an email to [email protected]!

GitHub organization structure

OpenML's code is distributed over different repositories to simplify development. Please see their individual READMEs and issue trackers if you would like to contribute. These are the most important ones:

  • openml/OpenML: The OpenML web application, including the REST API.
  • openml/openml-python: The Python API, to talk to OpenML from Python scripts (including scikit-learn).
  • openml/openml-r: The R API, to talk to OpenML from R scripts (including mlr).
  • openml/java: The Java API, to talk to OpenML from Java applications.
  • openml/openml-weka: The WEKA plugin, to talk to OpenML from the WEKA toolbox.

evaluationengine's People

Contributors

janvanrijn, joaquinvanschoren


evaluationengine's Issues

AutoCorrelation meta-feature: order of instances

I am looking into how AutoCorrelation is computed, and it looks like the order of instances matters for this meta-feature. Isn't this a bit strange? The order of instances is something that changes often. Or is this intended for time-based datasets? But shouldn't we then first order by the time column?
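
To illustrate the order dependence, here is a minimal sketch (not the engine's actual implementation) of a lag-1 autocorrelation over the target values in the order the instances appear; shuffling the rows changes the result:

// Minimal sketch, not the engine's actual code: lag-1 autocorrelation of the
// target values in the order the instances appear.
public class AutoCorrelationSketch {
    static double lag1AutoCorrelation(double[] target) {
        int n = target.length;
        double mean = 0.0;
        for (double v : target) mean += v;
        mean /= n;

        double num = 0.0, den = 0.0;
        for (int i = 0; i < n; i++) {
            den += (target[i] - mean) * (target[i] - mean);
            if (i < n - 1) num += (target[i] - mean) * (target[i + 1] - mean);
        }
        return num / den;
    }

    public static void main(String[] args) {
        double[] ordered  = {1, 2, 3, 4, 5, 6, 7, 8};
        double[] shuffled = {5, 1, 8, 3, 7, 2, 6, 4};
        System.out.println(lag1AutoCorrelation(ordered));   // strongly positive for the sorted order
        System.out.println(lag1AutoCorrelation(shuffled));  // very different after shuffling the rows
    }
}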

Compute feature distributions for nominal attributes

Currently the website shows no information on nominal attributes in a dataset with a numeric target.

See: https://www.openml.org/d/41022
Season, Series,... have no information on their distribution.

Looking at the code, we could extend models.AttributeStatistics
with a new function that returns something of the form
[[v1,v2,v3],[123],[234],[354]], i.e. the list of possible values and their corresponding counts.
That way they are in the same format as the class distributions.

Of course, something like [[v1,v2,v3],[123,234,354]] is also fine.
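
As a rough illustration of the proposal, here is a hypothetical sketch (the names are illustrative, not the actual models.AttributeStatistics API) that counts the occurrences of each nominal value and returns the [[values], [counts]] shape:

// Hypothetical sketch of the proposed helper (illustrative names only):
// count occurrences of each nominal value and return
// [[v1, v2, ...], [count1, count2, ...]].
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NominalDistributionSketch {
    static List<List<Object>> nominalDistribution(List<String> values) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String v : values) {
            if (v == null) continue; // skip missing values
            counts.merge(v, 1, Integer::sum);
        }
        List<List<Object>> result = new ArrayList<>();
        result.add(new ArrayList<>(counts.keySet()));   // possible values
        result.add(new ArrayList<>(counts.values()));   // corresponding counts
        return result;
    }

    public static void main(String[] args) {
        List<String> season = List.of("Summer", "Winter", "Summer", "Spring", "Summer");
        System.out.println(nominalDistribution(season)); // [[Summer, Winter, Spring], [3, 1, 1]]
    }
}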

Add MCC to evaluation measures for classification

As requested in openml/OpenML#191

I received a request to add the Matthews correlation coefficient to the evaluation engine for classification. It can be straightforwardly derived from the confusion table: MCC = (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN)).

I propose we check if it is available in Weka, in that case it should be easy to add.
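
For reference, a minimal standalone sketch of the formula above (not tied to the engine's internal classes), computed in double precision to avoid integer overflow under the square root:

// MCC from a binary confusion table, per the formula above.
public class MccSketch {
    static double mcc(long tp, long tn, long fp, long fn) {
        double num = (double) tp * tn - (double) fp * fn;
        double den = Math.sqrt((double) (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn));
        return den == 0.0 ? 0.0 : num / den; // convention: 0 when a margin of the table is empty
    }

    public static void main(String[] args) {
        System.out.println(mcc(90, 5, 3, 2)); // example confusion table
    }
}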

regression datasets handled as classification datasets

Please check the following:
7be41b0

numValues used to be 0 for regression. Is it now 1?

This trips up the AttributeStatistics, which wants to compute the class distribution of a feature if numClasses > 0.

https://github.com/openml/EvaluationEngine/blob/master/src/main/java/org/openml/webapplication/models/AttributeStatistics.java#L56

The result is that regression datasets aren't parsed correctly anymore, e.g.:

[19-06-2019 23:30:14] [OK] [Process Dataset] Processing dataset 41936 - obtaining features.
java.lang.ArrayIndexOutOfBoundsException: Index 3600 out of bounds for length 1
	at org.openml.webapplication.models.AttributeStatistics.addValue(AttributeStatistics.java:72)
	at org.openml.webapplication.features.ExtractFeatures.getFeatures(ExtractFeatures.java:79)
	at org.openml.webapplication.ProcessDataset.process(ProcessDataset.java:55)
	at org.openml.webapplication.ProcessDataset.<init>(ProcessDataset.java:32)
	at org.openml.webapplication.Main.main(Main.java:115)
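
A hypothetical sketch of the kind of guard that would avoid this (not the actual AttributeStatistics code): only allocate and index a per-class array when the target is genuinely nominal, so a numeric (regression) target can never trigger the out-of-bounds access above.

// Hypothetical illustration only, not the real AttributeStatistics class.
public class ClassDistributionGuardSketch {
    private final int[] classCounts; // null for numeric (regression) targets

    ClassDistributionGuardSketch(boolean targetIsNominal, int numClasses) {
        this.classCounts = (targetIsNominal && numClasses > 0) ? new int[numClasses] : null;
    }

    void addValue(double classValue) {
        // ... update the usual min/max/mean/stdev accumulators here ...
        if (classCounts != null) {
            classCounts[(int) classValue]++; // only reached for nominal targets
        }
    }
}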

Unclear how to create the evaluation engine jar file for the server

I tried exporting the Evaluation Engine to a jar (including all required libraries), but now I get xstream errors when trying to evaluate a run. What am I doing wrong?

[30-09-2018 00:49:42] [OK] [Process Run] Start processing run: 24360
com.thoughtworks.xstream.converters.ConversionException: name : name
---- Debugging information ----
message             : name
cause-exception     : java.lang.IllegalArgumentException
cause-message       : name
class               : org.openml.apiconnector.xml.Run
required-type       : org.openml.apiconnector.xml.Run
converter-type      : com.thoughtworks.xstream.converters.reflection.ReflectionConverter
path                : /oml:run/oml:uploader_name
version             : not available
-------------------------------
	at com.thoughtworks.xstream.core.TreeUnmarshaller.convert(TreeUnmarshaller.java:79)
	at com.thoughtworks.xstream.core.AbstractReferenceUnmarshaller.convert(AbstractReferenceUnmarshaller.java:65)
	at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:66)
	at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:50)
	at com.thoughtworks.xstream.core.TreeUnmarshaller.start(TreeUnmarshaller.java:134)
	at com.thoughtworks.xstream.core.AbstractTreeMarshallingStrategy.unmarshal(AbstractTreeMarshallingStrategy.java:32)
	at com.thoughtworks.xstream.XStream.unmarshal(XStream.java:1185)
	at com.thoughtworks.xstream.XStream.unmarshal(XStream.java:1169)
	at com.thoughtworks.xstream.XStream.fromXML(XStream.java:1040)
	at com.thoughtworks.xstream.XStream.fromXML(XStream.java:1031)
	at org.openml.apiconnector.io.HttpConnector.wrapHttpResponse(HttpConnector.java:116)
	at org.openml.apiconnector.io.HttpConnector.doApiRequest(HttpConnector.java:85)
        ...

Caused by: java.lang.IllegalArgumentException: name
	at sun.misc.URLClassPath$Loader.getResource(URLClassPath.java:729)
	at sun.misc.URLClassPath.getResource(URLClassPath.java:239)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:365)
        ...

{"error":"name : name
---- Debugging information ----
message             : name
cause-exception     : java.lang.IllegalArgumentException
cause-message       : name
class               : org.openml.apiconnector.xml.Run
required-type       : org.openml.apiconnector.xml.Run
converter-type      : com.thoughtworks.xstream.converters.reflection.ReflectionConverter
path                : /oml:run/oml:uploader_name
version             : not available
-------------------------------"}

OpenML Feature mismatch data

The OpenML Dataset Features have their nominal_values normalized, while the data itself is not:

import openml
dresses = openml.datasets.get_dataset(23381)
df, *_ = dresses.get_data(dataset_format="dataframe")
print(dresses.features[1].nominal_values)
print(df[dresses.features[1].name].unique())

output:

['Average', 'high', 'low', 'Medium', 'very-high']

['Low', 'High', 'Average', 'Medium', 'very-high', 'low', 'high', NaN]
Categories (7, object): ['Average' < 'high' < 'High' < 'low' < 'Low' < 'Medium' < 'very-high']

edit: This is a problem already in the XML as @mfeurer indicates below.

confusion matrix makes no sense as evaluation measure

There's "confusion matrix" in the evaluation measure drop-down on the frontend. I'm not sure if that's a frontend issue or a backend issue. This makes no sense in the context of the leaderboard and is not computed anyway.

NullPointerExceptions in dataset processing when target cannot be found.

Some datasets fail during processing. They all fail with a NullPointerException that occurs while setting the target feature.

Checking the datasets (uploaded by Guillaume), they have a feature such as:
@Attribute "Is Public Domain" {True, False}
while in the database this is stored as 'Is_Public_Domain'.
I assume the underscores are added during dataset upload.

What is more troubling is that there seems to be no trace of this error. The message says that the error is marked in the database but I could not find it. I assume it tries to store an error message but that fails, too?

[23-12-2018 23:23:43] [OK] [Process Dataset] Processing dataset 41249 - obtaining features.
java.lang.NullPointerException
	at weka.core.Instances.setClass(Instances.java:1532)
	at org.openml.webapplication.features.ExtractFeatures.getFeatures(ExtractFeatures.java:42)
	at org.openml.webapplication.ProcessDataset.process(ProcessDataset.java:65)
	at org.openml.webapplication.ProcessDataset.<init>(ProcessDataset.java:41)
	at org.openml.webapplication.Main.main(Main.java:120)
[23-12-2018 23:23:43] [Error] [Process Dataset] Error while processing dataset. Marking this in database.
[23-12-2018 23:23:43] [Error] [Process Dataset] Dataset 41249 - Error: null
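
A hypothetical sketch of a more defensive target lookup (not the actual ExtractFeatures code): try the stored name, fall back to a space/underscore-normalized match, and fail with a clear error message instead of a NullPointerException:

// Illustrative sketch only; uses the standard Weka Instances/Attribute API.
import weka.core.Attribute;
import weka.core.Instances;

public class TargetLookupSketch {
    static void setTarget(Instances dataset, String targetName) {
        Attribute target = dataset.attribute(targetName);
        if (target == null) {
            // Fall back: the upload may have replaced spaces with underscores (or vice versa).
            target = dataset.attribute(targetName.replace('_', ' '));
        }
        if (target == null) {
            throw new IllegalArgumentException(
                "Target feature '" + targetName + "' not found in dataset " + dataset.relationName());
        }
        dataset.setClass(target);
    }
}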

Evaluation Engine policy on datasets without a default target

Several datasets do not have a specific target. Also, multitask datasets do not have a single target, which complicates the calculation of meta-features such as: classcount, entropy, landmarkers and mean mutual information. Several things that we can do:

  • in case of no single/valid class, do not calculate these features
  • define meta-features on task level. We should do so anyway at some point. This does not solve the multitarget problem though
  • ... ?

@mfeurer @amueller @joaquinvanschoren @berndbischl @giuseppec @ja-thomas

meta-features store vector of numbers instead of aggregates

Currently we store (for numeric columns):

  • Mean of X of numeric atts
  • Stdev of X of numeric atts
  • Quartile {1, 2, 3} of X of numeric atts
  • Min of X of numeric atts
  • Max of X of numeric atts

Where X = {mean, stdev, kurtosis, skewness}. Something similar for information theoretic measures of nominal atts.

This selection is arbitrary and not well supported in the literature.

Much better would be to store a vector of each value per attribute, giving researchers the possibility to calculate these values client-side.
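
To make the difference concrete, an illustrative sketch (not the engine's code) of the two alternatives: collapsing one value per numeric attribute into a handful of aggregates versus storing the raw per-attribute vector and letting clients aggregate:

// Illustrative comparison of the current and proposed approaches.
import java.util.Arrays;

public class MetaFeatureAggregationSketch {
    public static void main(String[] args) {
        // One value per numeric attribute, e.g. the mean of each attribute.
        double[] meanPerAttribute = {0.1, 2.5, 3.7, 10.2, 0.9};

        // Current approach: store only aggregates of this vector.
        double[] sorted = meanPerAttribute.clone();
        Arrays.sort(sorted);
        double min = sorted[0];
        double max = sorted[sorted.length - 1];
        double mean = Arrays.stream(meanPerAttribute).average().orElse(Double.NaN);
        double median = sorted[sorted.length / 2]; // simplified quartile 2
        System.out.printf("min=%.2f max=%.2f mean=%.2f median=%.2f%n", min, max, mean, median);

        // Proposed approach: store the vector itself and let researchers aggregate client-side.
        System.out.println(Arrays.toString(meanPerAttribute));
    }
}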

Evaluation measures duplicated or not present / no measure for imbalanced data available

Related: #20

Currently no measure is computed that's useful for highly imbalanced classes.
Take for example sick:
https://www.openml.org/t/3021

I would like to see the "mean" measures be computed in particular (they also are helpful for comparison with D3M, cc @joaquinvanschoren).

On the other hand, the "weighted" measures are not computed but seem to be duplicates of the measure without prefix, which is also weighted by class size:
https://www.openml.org/a/evaluation-measures/mean-weighted-f-measure
https://www.openml.org/a/evaluation-measures/f-measure

Though that's not entirely clear from the documentation. If the f-measure documentation is actually accurate (which I don't think it is), that would be worse because it's unclear for which class the f-measure is reported.
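
To make the "mean" versus "weighted" distinction concrete, here is a sketch of the textbook definitions (unweighted macro average over classes versus average weighted by class size). This is not necessarily how the engine currently computes them; that is exactly what the documentation leaves unclear.

// Macro ("mean") vs. class-size-weighted F1 from per-class counts.
public class FMeasureAveragingSketch {
    static double f1(long tp, long fp, long fn) {
        double precision = tp + fp == 0 ? 0.0 : (double) tp / (tp + fp);
        double recall = tp + fn == 0 ? 0.0 : (double) tp / (tp + fn);
        return precision + recall == 0 ? 0.0 : 2 * precision * recall / (precision + recall);
    }

    public static void main(String[] args) {
        // Per-class counts for a highly imbalanced 2-class problem (like 'sick').
        long[] tp = {3500, 30}, fp = {40, 200}, fn = {200, 40}, support = {3700, 70};

        double macro = 0.0, weighted = 0.0, total = 0.0;
        for (int c = 0; c < tp.length; c++) {
            double f = f1(tp[c], fp[c], fn[c]);
            macro += f / tp.length;        // every class counts equally
            weighted += f * support[c];    // large classes dominate
            total += support[c];
        }
        weighted /= total;
        System.out.printf("macro (mean) F1 = %.3f, weighted F1 = %.3f%n", macro, weighted);
    }
}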

Compute histograms for numeric attributes

Currently the website shows a box plot for numeric attributes. This does not always look good, plus it hides a lot of information.

It would be better to store a histogram of the distribution. This can be computed beforehand.
I.e. Something like this: https://www.mathworks.com/help/examples/matlab/win64/AdjustHistogramPropertiesExample_01.png

For categorical targets we could also compute it per class value: https://3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com/wp-content/uploads/2014/03/histograms.png

Looking at the code, we could extend models.AttributeStatistics
with a new function that returns something of the form
[[b1,b2,b3],[123],[234],[354]], where b1, b2 are the bucket values.

For categorical targets, we could compute something like
[[b1,b2,b3],[123,12,23],[234,23,34],[354,34,45]] for a 3-class dataset.

What do you think would be the best way to implement this?
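
A hypothetical sketch of one way to do this (the names are illustrative, not a proposal for the exact models.AttributeStatistics API): equal-width bucketing that returns the bucket boundaries plus, per class, the count of values falling into each bucket.

// Equal-width histogram per class; illustrative names only.
import java.util.Arrays;

public class HistogramSketch {
    static int[][] histogramPerClass(double[] values, int[] classIndex, int numClasses,
                                     int numBuckets, double[] bucketEdgesOut) {
        double min = Arrays.stream(values).min().orElse(0.0);
        double max = Arrays.stream(values).max().orElse(1.0);
        double width = (max - min) / numBuckets;
        for (int b = 0; b <= numBuckets; b++) bucketEdgesOut[b] = min + b * width;

        int[][] counts = new int[numClasses][numBuckets];
        for (int i = 0; i < values.length; i++) {
            int bucket = width == 0 ? 0 : Math.min((int) ((values[i] - min) / width), numBuckets - 1);
            counts[classIndex[i]][bucket]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        double[] values = {1.0, 1.5, 2.0, 5.0, 8.0, 9.5};
        int[] classes = {0, 0, 1, 1, 0, 1};
        double[] edges = new double[4]; // 3 buckets -> 4 edges
        int[][] counts = histogramPerClass(values, classes, 2, 3, edges);
        System.out.println(Arrays.toString(edges));       // bucket boundaries
        System.out.println(Arrays.deepToString(counts));  // counts per class per bucket
    }
}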

Add deployment documentation

I feel unsure how to create a new .jar that runs correctly on the server.
Is it just a fat jar? Which versions of the dependent libraries can/should we use?

Unclear how to compile the code

Using Java 1.8 and the latest apiconnector from Maven, I get several compilation errors. For instance:

The method getNominalValues() is undefined for the type DataFeature.Feature
The constructor DataFeature(Integer, int, DataFeature.Feature[], String) is undefined
Unhandled exception type JSONException

Probably a version mismatch but it is not clear what to do exactly.
