Comments (3)

baraline commented on August 17, 2024

When running the benchmark script with 3-fold cross-validation (3cv) on the Rock dataset, we notice a very high standard deviation for RDST Ensemble Prime:

Mean timings (rounded to three decimals):

n_timestamps  RDST Prime  RDST Ensemble Prime  RDST    RDST Ensemble  Rocket  MultiRocket  DrCIF   TDE      STC      HC2
474           1.063       25.053               1.440   4.771          9.273   23.354       49.787  77.796   80.434   7884.829
948           2.377       25.899               3.075   3.048          13.261  24.992       55.762  108.962  101.378  7960.439
1422          3.442       27.043               5.201   8.401          18.264  25.862       64.078  119.528  102.723  7970.519
1896          4.640       28.241               8.043   11.573         21.337  27.698       67.515  150.766  111.246  8078.189
2370          6.071       27.886               11.335  13.974         25.491  29.931       77.241  177.133  122.838  8132.092

Standard deviation of the timings (the _std columns, rounded to four decimals):

n_timestamps  RDST Prime  RDST Ensemble Prime  RDST    RDST Ensemble  Rocket  MultiRocket  DrCIF   TDE      STC      HC2
474           0.0049      16.0698              0.0182  3.4476         0.0220  0.4808       7.3540  18.2336  0.4343   3.8224
948           0.0062      16.0650              0.0689  0.0380         0.0307  0.5929       7.1618  12.5182  3.9045   0.2535
1422          0.0119      15.9089              0.0068  3.2538         0.9779  0.0021       7.8047  24.3708  5.6474   11.9299
1896          0.0842      15.7396              0.0083  4.1117         0.0445  0.7664       9.2079  18.0483  0.7535   3.2518
2370          0.0661      15.2930              0.4737  2.4806         0.0066  0.9238       9.4735  0.1030   0.4685   11.0679

A possible cause would be an issue with numba not caching the functions correctly, so that some of them have to be recompiled before each new validation step for RDST Ensemble Prime. This may also be true for RDST Ensemble.

The problem also appears in the UCR cross-validation run: only the first dataset had a high standard deviation for timing, despite an initial run on a synthetic dataset beforehand.
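One way to check whether the spread comes from a few slow folds rather than from uniformly noisy timings is to keep the per-run times instead of only their mean and standard deviation. The sketch below is a minimal, hypothetical harness: the names time_runs and summarize are illustrative and are not part of convst or of the benchmark script, and the callable is assumed to wrap one fit/predict cycle.

```python
import statistics
from timeit import default_timer


def time_runs(fn, n_runs, timer=default_timer):
    """Call fn() n_runs times and return the elapsed time of each call."""
    times = []
    for _ in range(n_runs):
        t0 = timer()
        fn()
        times.append(timer() - t0)
    return times


def summarize(times):
    """Report the median and max next to the mean and sample std."""
    return {
        "mean": statistics.mean(times),
        "std": statistics.stdev(times),
        "median": statistics.median(times),
        "max": max(times),
    }


if __name__ == "__main__":
    # Stand-in workload (assumption); replace it with a closure around
    # pipeline.fit(...) and pipeline.predict(...).
    times = time_runs(lambda: sum(range(100_000)), n_runs=5)
    print(summarize(times))
```

If the median stays close to the fastest runs while the max and the std are large, the spread is probably driven by a handful of slow runs, which would be consistent with occasional recompilation.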


baraline commented on August 17, 2024

Code to reproduce the issue:

from convst.classifiers import R_DST_Ensemble
from convst.utils.dataset_utils import load_sktime_dataset_split
from sktime.classification.kernel_based import RocketClassifier
from timeit import default_timer as timer
import pandas as pd

def time_pipe(pipeline, X_train, y_train, X_test):
    # Wall-clock time of one full fit + predict cycle, in seconds.
    t0 = timer()
    pipeline.fit(X_train, y_train)
    pipeline.predict(X_test)
    t1 = timer()
    return t1-t0


X_train, X_test, y_train, y_test, _ = load_sktime_dataset_split('GunPoint')
rdst = R_DST_Ensemble(n_shapelets_per_estimator=1, n_jobs=-1)
rkt = RocketClassifier(rocket_transform='minirocket', n_jobs=-1)
df = pd.DataFrame()

# Refit both classifiers 10 times on the same split and record each timing.
for i in range(10):
    df.loc[i,'rdst'] = time_pipe(rdst, X_train, y_train, X_test)
    df.loc[i,'rkt'] = time_pipe(rkt, X_train, y_train, X_test)

print(df)

This results in the following timings (in seconds):

        rdst       rkt
0  49.011466  0.335395
1   0.110011  0.351882
2   0.090319  0.370477
3   0.106159  0.346266
4  10.088280  0.368425
5   0.120686  0.353858
6   0.147703  0.352415
7   0.127688  0.335258
8   0.127785  0.354415
9   8.584705  0.341209

The results are similar when Rocket is not called. The problem would therefore come from some numba functions that need to be compiled again after a few runs.
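To make the effect of the slow runs on the standard deviation concrete, the rdst column of the results table can be split into slow and fast runs. The sketch below uses only the timings posted above; the helper split_slow_runs and the cutoff of 10 times the median are arbitrary choices made for illustration, not something taken from convst.

```python
import statistics

# rdst column copied from the results table above (seconds).
rdst_times = [
    49.011466, 0.110011, 0.090319, 0.106159, 10.088280,
    0.120686, 0.147703, 0.127688, 0.127785, 8.584705,
]


def split_slow_runs(times, factor=10.0):
    """Return (slow_indices, fast_times) for a factor-times-median cutoff."""
    cutoff = factor * statistics.median(times)
    slow = [i for i, t in enumerate(times) if t > cutoff]
    fast = [t for t in times if t <= cutoff]
    return slow, fast


slow_idx, fast_times = split_slow_runs(rdst_times)
print("slow runs:", slow_idx)
print("std, all runs:", statistics.stdev(rdst_times))
print("std, fast runs only:", statistics.stdev(fast_times))
```

With the posted numbers, runs 0, 4 and 9 exceed the cutoff. The remaining runs all fall between roughly 0.09 and 0.15 seconds, so nearly all of the spread comes from those three runs.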


baraline commented on August 17, 2024

Fixed by #36
