Comments (4)

mathurinm commented on June 2, 2024

Passing options via the command line may clutter the CLI; maybe we could have plot options passed through a plot_config.yml file?

from benchopt.

tomMoral commented on June 2, 2024

There is some embryonic code for plot configuration on a per-benchmark basis:

    DEFAULT_BENCHMARK_CONFIG = {
        'plots': list(PLOT_KINDS),
    }
    """
    * ``plots``, *list*: Select the plots to display for the benchmark. Should be
      valid plot kinds. The list can simply be one item by line, with each item
      indented, as:

      .. code-block:: ini

        plots =
            suboptimality_curve
            bar_chart
    """
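
As an illustration of the ini layout described in the docstring above, here is a minimal sketch of reading such a multi-line `plots` value with Python's standard `configparser`. The section name `my_benchmark` and the helper `parse_plots` are hypothetical and not taken from benchopt; this only shows how the format can be parsed, not how benchopt actually does it.

```python
import configparser

# Hypothetical config text; the section name "my_benchmark" is an assumption.
CONFIG_TEXT = """
[my_benchmark]
plots =
    suboptimality_curve
    bar_chart
"""


def parse_plots(config_text, section):
    """Return the list of plot kinds declared under ``section``."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # An indented multi-line value comes back as one newline-joined string.
    raw = parser.get(section, "plots", fallback="")
    return [line.strip() for line in raw.splitlines() if line.strip()]


plots = parse_plots(CONFIG_TEXT, "my_benchmark")
print(plots)
```

A section without a `plots` entry yields an empty list, which a caller could treat as "use every available plot kind".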

The original goal was to select the default plot to display for the benchmark, so it is probably easy to repurpose this to configure the default plot.

One thing that is not so easy is how to distinguish bad outliers from good ones. What happens if one solver is very fast while the others are slow? I think this kind of magic filtering should be easy to enable, but not on by default. WDYT?

Another question is what should be configurable. In my list right now:

  • default plot (type, objective_column, dataset, parameters)
  • default scale (linear, semilogN, loglog)
  • default axis limits (x/ylim)

Do you see anything else?
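
To make the list above concrete, here is a rough sketch of how such defaults could be grouped and validated. Everything in it is an assumption for discussion: the `PlotDefaults` class, its field names, and the scale names (`semilog-x` and `semilog-y` stand in for "semilogN") are hypothetical and not part of benchopt.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple

# Hypothetical scale names, chosen for this sketch only.
ALLOWED_SCALES = ("linear", "semilog-x", "semilog-y", "loglog")


@dataclass(frozen=True)
class PlotDefaults:
    """Per-benchmark plot defaults (hypothetical structure)."""
    plot_kind: str = "suboptimality_curve"
    objective_column: str = "objective_value"
    dataset: Optional[str] = None  # None means "first dataset available"
    scale: str = "semilog-y"
    xlim: Optional[Tuple[float, float]] = None  # None means autoscale
    ylim: Optional[Tuple[float, float]] = None

    def __post_init__(self):
        if self.scale not in ALLOWED_SCALES:
            raise ValueError(f"unknown scale: {self.scale!r}")
        for name in ("xlim", "ylim"):
            lim = getattr(self, name)
            if lim is not None and lim[0] >= lim[1]:
                raise ValueError(f"{name} must be (low, high) with low < high")


def with_overrides(defaults, **overrides):
    """Return a copy of ``defaults`` with user overrides applied."""
    # dataclasses.replace re-runs __post_init__, so overrides are validated too.
    return replace(defaults, **overrides)


cfg = with_overrides(PlotDefaults(), scale="loglog", ylim=(1e-8, 1.0))
print(cfg)
```

Keeping the defaults immutable and applying overrides through a copy means values from a config file and values from the CLI can be layered in a fixed order without mutating shared state.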

Finally, could you share the CSV that produces the HTML above? This will help debug the functionality :)

jeandut commented on June 2, 2024

Note that 1. would only work for objective_value, and potentially not for other metrics.

jeandut commented on June 2, 2024

Yes, I agree that outlier detection is a research domain in itself, and it depends heavily on what the user wants.
From an implementation perspective, it would be much simpler to let the user provide default config parameters. The limitation is that they would apply to only one metric (the first one, or the one named in the default config file) and only to its absolute value (not distance to optimum, etc.), whereas a user might log metrics that are on different scales.

In my very narrow view of things, I would like some kind of simple mechanism for outlier detection based on quantiles of the y values (and maybe for time as well? But operating on x might create edge cases).

Since it would only apply to a view and obviously not to the parquet file itself, the only downside I see is that it could hide some information from the user, i.e. the user thinks there are no outliers. Therefore, if outlier detection is performed, the plot should clearly display that some outliers are not visible and that the user can click a button to stop the autoscale from removing them. A toggle might be a good option here, with a default that the user can configure: some users will want it on by default and others not, with the default default being off, I guess ;)
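
A minimal sketch of what I have in mind, under stated assumptions: the helper `robust_ylim`, the 0.95 cut-off, and the choice to clip only the upper tail are all hypothetical and only meant to illustrate the idea. The data is never filtered; the function only proposes view limits and reports how many points would fall outside them, so the UI can tell the user.

```python
import math


def quantile(sorted_values, q):
    """Linear-interpolation quantile of a sorted, non-empty list."""
    pos = q * (len(sorted_values) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def robust_ylim(y_values, upper_q=0.95, hide_outliers=False):
    """Return (ymin, ymax, n_hidden) for the plot view.

    With ``hide_outliers=False`` (the default), the full range is shown
    and ``n_hidden`` is 0. Otherwise the upper limit is the ``upper_q``
    quantile and ``n_hidden`` counts the points above it.
    """
    ys = sorted(y_values)
    if not ys:
        raise ValueError("y_values must not be empty")
    ymin, ymax = ys[0], ys[-1]
    if hide_outliers:
        ymax = quantile(ys, upper_q)
    n_hidden = sum(1 for y in ys if y > ymax)
    return ymin, ymax, n_hidden


# Toy data: 19 well-behaved points and one huge value.
values = [1.0] * 19 + [1000.0]
ymin, ymax, n_hidden = robust_ylim(values, hide_outliers=True)
if n_hidden:
    print(f"{n_hidden} point(s) hidden above y={ymax:.3g}; toggle to show all")
```

The returned count is what would drive the "some outliers are not visible" notice, and flipping `hide_outliers` is what the toggle would do.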

Regarding what the user could pass, I agree with your list. Filters for the displayed curves would also be nice (regexp- or hyperparameter-based). Of course, one could envision customizing labels or axes in the very, very long term, à la matplotlib.
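
For the regexp-based filter, something along these lines could be enough. The helper `filter_curves` and the curve names are made up for illustration; I am assuming each curve is identified by a single display string that includes its hyperparameters.

```python
import re


def filter_curves(curve_names, pattern):
    """Keep the curve names matching the regular expression ``pattern``."""
    regex = re.compile(pattern)
    return [name for name in curve_names if regex.search(name)]


# Hypothetical curve labels, with hyperparameters in brackets.
names = ["sklearn[alpha=0.1]", "sklearn[alpha=1.0]", "celer", "cd-numba"]

only_sklearn = filter_curves(names, r"^sklearn")
small_alpha = filter_curves(names, r"alpha=0\.1\]")
print(only_sklearn, small_alpha)
```

Matching on the label keeps the filter independent of how hyperparameters are stored, at the cost of being sensitive to label formatting.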

Here is a CSV file to reproduce the effect:
outliers.csv
