Comments (8)
Hey @makkarss929,
Thank you for reaching out.
Yes, Darts does look promising based on our quick scan.
An integration makes sense as well.
However, at the moment, we do not support time dimensions in our data layer.
In theory, we would need to define a function that "initializes" a table or collection as a time-series object. The .predict API call would also need some customization.
How do you plan on integrating with our framework?
from superduperdb.
It’s simple. We can add a time dimension to our data layer.
In Darts, we need two things in a table or collection to specify a series: time and value. Darts will do the parsing and sorting by time for us.
There will be different parameters for different models.
- Generally, deep learning models need input_chunk_length and output_chunk_length when initializing the model.
- .fit() needs epochs, series, and covariates.
- .predict() needs n (number of forecasts) and covariates.
NOTE: covariates are variables that are not themselves forecast but help with the forecast, such as day, week, month, temperature, or sales. They can be anything that helps to forecast.
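To make the roles of input_chunk_length and output_chunk_length concrete, here is a minimal sketch in plain Python. It is not the Darts API: make_windows is a hypothetical helper, and the records are made-up sample data. It only illustrates the idea of sorting time/value records by time and cutting them into (input, target) pairs of the given lengths.

```python
from datetime import date


def make_windows(records, input_chunk_length, output_chunk_length):
    """Sort time/value records by time and cut them into (input, target) pairs."""
    ordered = sorted(records, key=lambda r: r["time"])
    values = [r["value"] for r in ordered]
    span = input_chunk_length + output_chunk_length
    windows = []
    # Slide a window of `span` values over the series, one step at a time.
    for start in range(len(values) - span + 1):
        inputs = values[start:start + input_chunk_length]
        targets = values[start + input_chunk_length:start + span]
        windows.append((inputs, targets))
    return windows


# Hypothetical sample data, deliberately out of order.
records = [
    {"time": date(2024, 1, 3), "value": 5},
    {"time": date(2024, 1, 1), "value": 1},
    {"time": date(2024, 1, 4), "value": 7},
    {"time": date(2024, 1, 2), "value": 3},
    {"time": date(2024, 1, 5), "value": 9},
]
windows = make_windows(records, input_chunk_length=3, output_chunk_length=1)
```

With three input steps and one output step, five observations yield two training pairs.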
See the LSTM example below:
    from darts.models import RNNModel

    # train_transformed, val_transformed, and covariates are Darts TimeSeries
    # objects prepared beforehand (not shown here).
    my_model = RNNModel(
        model="LSTM",
        hidden_dim=20,
        dropout=0,
        batch_size=16,
        n_epochs=300,
        optimizer_kwargs={"lr": 1e-3},
        model_name="Air_RNN",
        log_tensorboard=True,
        random_state=42,
        training_length=20,
        input_chunk_length=14,
        force_reset=True,
        save_checkpoints=True,
    )
    my_model.fit(
        train_transformed,
        future_covariates=covariates,
        val_series=val_transformed,
        val_future_covariates=covariates,
        verbose=True,
    )
    # Forecast the next 26 time steps.
    pred_series = my_model.predict(n=26, future_covariates=covariates)
There are lots of models, and Darts has neat and clear documentation for all of them, much like scikit-learn.
- We can start with simple models with fewer parameters.
- Later we can add deep learning models.
What are your thoughts, @anitaokoh?
from superduperdb.
Hi @makkarss929, it's a great idea to potentially add a time dimension to the Datalayer, but how would you do this concretely?
Currently, when we do predictions, we use single data points. So the documents look like this:
{
"input_data": [0, 1, 3, 6],
"_outputs": {"input_data": {"my_model": {"0": <output-of-model>}}}
}
However, with time series, I would think you have multiple inputs relevant to a single prediction. How would you handle that?
from superduperdb.
Hi @blythed, we can do something like this; see the example below:
{
  "input_data": [
    {"time": "2024-01-01", "values": [0, 1, 3, 6]},
    {"time": "2024-01-02", "values": [1, 2, 4, 7]},
    {"time": "2024-01-03", "values": [2, 3, 5, 8]}
  ],
  "_outputs": {
    "my_model": {
      "2024-01-01": <output-of-model-at-2024-01-01>,
      "2024-01-02": <output-of-model-at-2024-01-02>,
      "2024-01-03": <output-of-model-at-2024-01-03>
    }
  }
}
from superduperdb.
OK @makkarss929, that's fine, but it doesn't reflect the real-world scenario, where new time-series data is most likely inserted as new records.
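As a rough illustration of that scenario, the sketch below assumes one document per timestamp and rebuilds the model input from the most recent documents at prediction time. This is plain Python over a list, not the superduperdb or Darts API; latest_window and the sample documents are hypothetical.

```python
def latest_window(documents, input_chunk_length):
    """Return the values of the most recent documents, oldest first."""
    # ISO-8601 date strings sort chronologically when sorted as text.
    ordered = sorted(documents, key=lambda d: d["time"])
    if len(ordered) < input_chunk_length:
        raise ValueError("not enough history for one input window")
    return [d["value"] for d in ordered[-input_chunk_length:]]


# Hypothetical collection: each observation lives in its own record.
collection = [
    {"time": "2024-01-01", "value": 0},
    {"time": "2024-01-02", "value": 1},
    {"time": "2024-01-03", "value": 3},
]
# A new observation arrives as a new record, not as an update to an old one.
collection.append({"time": "2024-01-04", "value": 6})

window = latest_window(collection, input_chunk_length=3)
```

The point of the sketch is that a single prediction depends on several records, so the data layer would need a way to gather them in time order.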
from superduperdb.
We can do something like this. The user will provide input_chunk_length and output_chunk_length, and we modify the structure accordingly:
{
  "data": [
    {
      "time": "2024-01-01",
      "input_data": [0, 1, 3, 6],
      "_outputs": {
        "my_model": [4, 5]  # <output-of-model-at-2024-01-01>
      }
    },
    {
      "time": "2024-01-02",
      "input_data": [1, 2, 4, 7],
      "_outputs": {
        "my_model": [10, 17]  # <output-of-model-at-2024-01-02>
      }
    },
    {
      "time": "2024-01-03",
      "input_data": [2, 3, 5, 8],
      "_outputs": {
        "my_model": [1, 5]  # <output-of-model-at-2024-01-03>
      }
    }
  ]
}
from superduperdb.
That won't solve the problem. Imagine you have new data coming in. What do you do with it? Do you add it to an existing document, or put it in a new document? What would happen if you kept putting everything into one document?
from superduperdb.
Hi @blythed,
We could split the data into separate documents based on a time-range criterion.
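A minimal sketch of that idea, assuming a calendar month as the time-range criterion and ISO-8601 date strings for the time field. The function name bucket_by_month and the document layout ("range", "data") are hypothetical and not part of superduperdb or Darts.

```python
from collections import defaultdict


def bucket_by_month(records):
    """Group time/value records into one document per calendar month."""
    buckets = defaultdict(list)
    for record in sorted(records, key=lambda r: r["time"]):
        month = record["time"][:7]  # "YYYY-MM" prefix of an ISO date
        buckets[month].append(record)
    # One document per month, ordered chronologically.
    return [{"range": month, "data": rows} for month, rows in sorted(buckets.items())]


# Hypothetical sample data spanning two months.
records = [
    {"time": "2024-02-01", "value": 4},
    {"time": "2024-01-31", "value": 3},
    {"time": "2024-01-01", "value": 0},
]
documents = bucket_by_month(records)
```

New data then either extends the document for the current range or starts a new document once the range rolls over, which keeps any single document bounded in size.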
from superduperdb.