
Comments (5)

vtoliveira commented on June 6, 2024

So, I was writing this dataset implementation based on the pmdarima structure: example

However, I noticed you have a file named load_data inside utilities. I think it would be better to use a different structure that follows the design above. What do you think? @carlomazzaferro

from scikit-hts.

carlomazzaferro commented on June 6, 2024

I do prefer the structure you suggest, and I'd be in favour of that implementation. If you would like to proceed with it, keep in mind that:

  • We need to provide a cache-checking implementation, so that data is not downloaded twice
  • The refactor will be extensive, as conftest.py and some of the examples use the current data loading capability
  • The function load_hierarchical_sine_data doesn't really fit within this structure, but we might just rename it so that it can be imported as from hts.datasets import synthetic_sine_data or something like that @vtoliveira
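
The cache check mentioned in the first point could look roughly like the sketch below. This is only an illustration, not the scikit-hts implementation: the function name `fetch_cached`, the default cache directory `~/.hts_data`, and the injectable `downloader` argument are all assumptions made for the example.

```python
import os
from pathlib import Path
from urllib.request import urlretrieve


def _download(url, dest):
    # Default downloader: plain standard-library HTTP fetch.
    urlretrieve(url, str(dest))


def fetch_cached(url, filename, cache_dir=None, downloader=_download):
    """Return the local path of `filename`, downloading it only if it is missing.

    `cache_dir` defaults to a hypothetical ~/.hts_data folder (an assumption).
    """
    cache_dir = Path(cache_dir) if cache_dir is not None else Path.home() / ".hts_data"
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / filename

    if not target.exists():
        # Download to a temporary name first, so an interrupted download
        # is never mistaken for a complete cached file.
        tmp = target.with_name(target.name + ".part")
        downloader(url, tmp)
        os.replace(tmp, target)

    return target
```

Passing the downloader as an argument keeps the cache logic testable without network access, which may matter given that conftest.py depends on the data loaders.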


vtoliveira commented on June 6, 2024

Yes, I checked the structure and this is something to discuss.

For the visnights dataset, I do not have a place to download the data from. I just took the one from R and transformed it into a long data format, because I think this is the standard use case and how data usually comes out of databases. See what I did here: visnights.ipynb

I could not find many open datasets with a hierarchical structure, but I think creating a datasets folder would be best anyway. I have already started creating one based on the pmdarima folder. However, I would appreciate your guidance on the best way to do that, since this is one of my first contributions. @carlomazzaferro


vtoliveira commented on June 6, 2024

Also, we could do something like:

>>> from hts.datasets import load_visnights
>>> visnights = load_visnights(long_format=True)  # default
>>> visnights.head(5)
         date state   zone  total_visitors_nights
0  1998-01-01   NSW  Metro            9047.095397
1  1998-04-01   NSW  Metro            6962.125890
2  1998-07-01   NSW  Metro            6871.963047
3  1998-10-01   NSW  Metro            7147.292612
4  1999-01-01   NSW  Metro            7956.922814

And

>>> visnights = load_visnights(long_format=False)
>>> visnights.head(5)

| state_zone | NSW_Metro | NSW_NthCo | NSW_NthIn | NSW_SthCo | NSW_SthIn | OTH_Metro | OTH_NoMet | QLD_Cntrl | QLD_Metro | QLD_NthCo | ... | WAU_Coast | WAU_Inner | WAU_Metro | NSW | QLD | SAU | VIC | WAU | OTH | total |
| -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
| | 9047.095397 | 8565.677890 | 2977.506662 | 5818.028804 | 2679.537963 | 3437.923945 | 2073.469372 | 2748.373689 | 12106.052376 | 2137.234012 | ... | 3066.555070 | 694.995372 | 3075.778941 | 29087.846716 | 16991.660077 | 6368.246298 | 18641.013782 | 6837.329383 | 5511.393317 | 83437.489573 |
| | 6962.125890 | 7124.468362 | 3477.702717 | 2466.436706 | 3010.732155 | 2677.081107 | 1787.938519 | 4040.915256 | 7786.686688 | 2269.595619 | ... | 3334.405408 | 557.679575 | 2154.928814 | 23041.465829 | 14097.197563 | 4479.766589 | 12427.741460 | 6047.013796 | 4465.019627 | 64558.204863 |
| | 6871.963047 | 4716.893116 | 3014.770331 | 1928.052834 | 3328.869005 | 3793.742887 | 2345.020634 | 5343.964347 | 11380.023616 | 4890.227020 | ... | 4365.844091 | 1006.184417 | 2787.286174 | 19860.548333 | 21614.214983 | 4344.740630 | 11167.789164 | 8159.314681 | 6138.763522 | 71285.371313 |
| | 7147.292612 | 6269.299065 | 3757.972112 | 2797.555974 | 2417.772236 | 3304.231082 | 1943.688721 | 4260.418878 | 9311.460272 | 2621.548165 | ... | 4521.995729 | 1172.551447 | 2752.909841 | 22389.891997 | 16193.427315 | 4792.987183 | 12898.831518 | 8447.457017 | 5247.919803 | 69970.514833 |
| | 7956.922814 | 9493.901336 | 3790.759900 | 4853.680680 | 3224.285428

Or something similarly suggestive, to get the data in the format that scikit-hts requires. (This would help us in tests as well.)
Sorry for the formatting above; I just had the idea. I will add images later and edit the comment.
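
To make the idea concrete, here is a minimal sketch of how the long format could be turned into the wide format with pandas. The helper name `long_to_wide` is hypothetical, the column names are taken from the long-format output above, and the numbers are toy values rather than the real visnights data.

```python
import pandas as pd


def long_to_wide(long_df, date_col="date", state_col="state", zone_col="zone",
                 value_col="total_visitors_nights"):
    """Pivot long data to one column per state_zone, plus state and grand totals.

    Hypothetical helper for illustration only; not part of scikit-hts.
    """
    df = long_df.copy()
    # Bottom-level node name, e.g. "NSW_Metro".
    df["state_zone"] = df[state_col] + "_" + df[zone_col]

    # One column per bottom-level series.
    bottom = df.pivot_table(index=date_col, columns="state_zone",
                            values=value_col, aggfunc="sum")
    # One column per state (middle level of the hierarchy).
    states = df.pivot_table(index=date_col, columns=state_col,
                            values=value_col, aggfunc="sum")

    wide = bottom.join(states)
    wide["total"] = states.sum(axis=1)  # top level of the hierarchy
    return wide


# Toy values, chosen only to make the aggregation easy to follow.
toy_long = pd.DataFrame({
    "date": ["1998-01-01"] * 3 + ["1998-04-01"] * 3,
    "state": ["NSW", "NSW", "QLD"] * 2,
    "zone": ["Metro", "NthCo", "Metro"] * 2,
    "total_visitors_nights": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})
toy_wide = long_to_wide(toy_long)
```

With a helper along these lines, load_visnights(long_format=False) could simply call it on the long data, so only one copy of the dataset would need to be stored.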


carlomazzaferro commented on June 6, 2024

#70
