
[FEATURE REQUEST]: about superduperdb HOT 8 CLOSED

makkarss929 commented on May 20, 2024
[FEATURE REQUEST]:

from superduperdb.

Comments (8)

anitaokoh commented on May 20, 2024

Hey @makkarss929 ,

Thank you for reaching out.

Yes, Darts does look promising based on our quick scan.

An integration makes sense as well.

However, at the moment we do not support a time dimension in our data layer.

In principle, we would need to define a function that "initializes" a table or a collection as a time-series object, and the .predict API call would need some customization.

How do you plan on integrating with our framework?


makkarss929 commented on May 20, 2024

It’s simple. We can add a time dimension to our data layer.

In Darts, we need two things in a table or collection to specify a series: time and value. Darts will parse and sort by time for us.

Different models take different parameters:

  1. Deep learning models generally need input_chunk_length and output_chunk_length when the model is initialized.
  2. .fit() needs epochs, the series, and covariates.
  3. .predict() needs n (the number of steps to forecast) and covariates.

NOTE: covariates are variables that are not themselves forecast but help with the forecast, such as day, week, month, temperature, or sales. They can be anything that helps to forecast.
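To make the "time and value" requirement concrete, here is a minimal sketch of the parsing and sorting step using only the Python standard library. It does not call Darts; the record layout and the `to_sorted_series` helper are hypothetical and only illustrate the idea.

```python
from datetime import datetime


def to_sorted_series(records, time_key="time", value_key="value"):
    """Parse ISO-format timestamps and return (timestamp, value) pairs ordered by time."""
    parsed = [(datetime.fromisoformat(r[time_key]), r[value_key]) for r in records]
    return sorted(parsed, key=lambda pair: pair[0])


# Hypothetical records as they might sit, unordered, in a table or collection.
records = [
    {"time": "2024-01-03", "value": 5},
    {"time": "2024-01-01", "value": 1},
    {"time": "2024-01-02", "value": 3},
]

series = to_sorted_series(records)
values = [value for _, value in series]
```

After this step the values are in chronological order, which is the form a forecasting model expects as input.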


See the LSTM example below:



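from darts.models import RNNModel

# Assumption: train_transformed, val_transformed and covariates are
# Darts TimeSeries objects prepared beforehand (not shown in this comment).

# Configure an LSTM-based recurrent forecasting model.
my_model = RNNModel(
    model="LSTM",
    hidden_dim=20,
    dropout=0,
    batch_size=16,
    n_epochs=300,
    optimizer_kwargs={"lr": 1e-3},
    model_name="Air_RNN",
    log_tensorboard=True,
    random_state=42,
    training_length=20,
    input_chunk_length=14,
    force_reset=True,
    save_checkpoints=True,
)

# Train on the training series and validate against the held-out series.
my_model.fit(
    train_transformed,
    future_covariates=covariates,
    val_series=val_transformed,
    val_future_covariates=covariates,
    verbose=True,
)

# Forecast the next 26 time steps.
pred_series = my_model.predict(n=26, future_covariates=covariates)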

Darts has many models, and its documentation for all of them is neat and clear, much like scikit-learn's.

  1. We can start with simple models that have fewer parameters.
  2. Later we can add deep learning models.

What are your thoughts? @anitaokoh


blythed commented on May 20, 2024

Hi @makkarss929 it's a great idea to potentially add a time-dimension to the Datalayer, but how would you do this concretely?

Currently, when we do predictions, we use single data points. So the documents look like this:

{
    "input_data": [0, 1, 3, 6],
    "_outputs": {"input_data": {"my_model": {"0": <output-of-model>}}}
}

However, with time series, I would think you have multiple inputs relevant to a prediction. How would you handle that?


makkarss929 commented on May 20, 2024

Hi @blythed, we can do something like this. See the example below:

{
    "input_data": [
        {"time": "2024-01-01", "values": [0, 1, 3, 6]},
        {"time": "2024-01-02", "values": [1, 2, 4, 7]},
        {"time": "2024-01-03", "values": [2, 3, 5, 8]}
    ],
    "_outputs": {
        "my_model": {
            "2024-01-01": <output-of-model-at-2024-01-01>,
            "2024-01-02": <output-of-model-at-2024-01-02>,
            "2024-01-03": <output-of-model-at-2024-01-03>
        }
    }
}


blythed commented on May 20, 2024

Ok @makkarss929, that's fine, but it doesn't really reflect the real-world scenario, in which new time-series data is probably inserted into new records.
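The concern above can be sketched as follows: each observation lives in its own record, and the model input is assembled at prediction time from the most recent records. This is only an illustration under assumed names; the `latest_window` helper, the `time` and `value` fields, and the in-memory list standing in for a collection are all hypothetical and not part of superduperdb.

```python
from datetime import date


def latest_window(documents, input_chunk_length):
    """Return the values of the most recent `input_chunk_length` documents, oldest first."""
    ordered = sorted(documents, key=lambda d: date.fromisoformat(d["time"]))
    if len(ordered) < input_chunk_length:
        raise ValueError("not enough observations for one input window")
    return [d["value"] for d in ordered[-input_chunk_length:]]


# Hypothetical collection with one document per observation.
collection = [
    {"time": "2024-01-01", "value": 0},
    {"time": "2024-01-02", "value": 1},
    {"time": "2024-01-03", "value": 3},
]

# A new observation arrives as a new record; existing documents are untouched.
collection.append({"time": "2024-01-04", "value": 6})

window = latest_window(collection, input_chunk_length=3)
```

With this layout, inserting data never rewrites an existing document, and the window is recomputed from whatever records exist when a prediction is requested.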


makkarss929 commented on May 20, 2024

We can do something like this: the user provides input_chunk_length and output_chunk_length, and we shape the documents accordingly.

{
    "data": [
        {
            "time": "2024-01-01",
            "input_data": [0, 1, 3, 6],
            "_outputs": {
                "my_model":   [4, 5] # <output-of-model-at-2024-01-01>
            }
        },
        {
            "time": "2024-01-02",
            "input_data": [1, 2, 4, 7],
            "_outputs": {
                "my_model":  [10, 17]  # <output-of-model-at-2024-01-02>
            }
        },
        {
            "time": "2024-01-03",
            "input_data": [2, 3, 5, 8],
            "_outputs": {
                "my_model": [1, 5] # <output-of-model-at-2024-01-03>
            }
        },
    ]
}
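As a rough illustration of how input_chunk_length and output_chunk_length relate inputs to outputs, the sketch below slides over a list of values and produces (input, target) pairs. It is a generic sliding-window example, not Darts' internal sampling logic, and the `make_chunks` helper is a hypothetical name.

```python
def make_chunks(values, input_chunk_length, output_chunk_length):
    """Slide over `values` and return (input, target) pairs of the given lengths."""
    span = input_chunk_length + output_chunk_length
    pairs = []
    for start in range(len(values) - span + 1):
        split = start + input_chunk_length
        pairs.append((values[start:split], values[split:start + span]))
    return pairs


# Four input steps are used to forecast the following two steps.
pairs = make_chunks(
    [0, 1, 3, 6, 10, 15],
    input_chunk_length=4,
    output_chunk_length=2,
)
```

If the series is shorter than the combined chunk length, the helper returns an empty list, so a caller can tell that no complete window is available yet.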


blythed commented on May 20, 2024

That won't solve the problem. Imagine you have new data coming in. What do you do with it? Do you add it to an existing document, or put it in a new document? What would happen if you kept putting everything into one document?


makkarss929 commented on May 20, 2024

Hi @blythed,

We could split the data into separate documents based on a time-range criterion.
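One way to read this suggestion is to bucket observations so that each document covers a fixed time range. The sketch below groups by calendar month; the monthly range, the `bucket_by_month` helper, and the `bucket` and `data` field names are assumptions for illustration, since the thread does not specify the criterion.

```python
from collections import defaultdict
from datetime import date


def bucket_by_month(observations):
    """Group observations into one document per calendar month, ordered by time."""
    buckets = defaultdict(list)
    for obs in observations:
        day = date.fromisoformat(obs["time"])
        buckets[f"{day.year:04d}-{day.month:02d}"].append(obs)
    # ISO date strings sort chronologically, so plain string sorting is enough here.
    return [
        {"bucket": key, "data": sorted(items, key=lambda o: o["time"])}
        for key, items in sorted(buckets.items())
    ]


observations = [
    {"time": "2024-01-31", "value": 7},
    {"time": "2024-02-01", "value": 8},
    {"time": "2024-01-01", "value": 0},
]

documents = bucket_by_month(observations)
```

Each document then stays bounded in size, because new data outside the current range starts a new document instead of extending an old one.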

