Comments (9)
Gotcha, all the above makes sense. Thanks again! Might check in again with you a bit later when I've been able to play around more :)
from autots.
@catchlui
Most models should be retrained when re-run in the future. This is because most time series datasets are continually evolving - this isn't image classification where a cat is still a cat. You could pickle the model, but only if you plan to use it to generate forecasts for the exact same time period.
My usual plan is to export the top dozen or so models, then at each new forecast time, run a much smaller number of generations. This way the model used is always the best fit for whatever the data currently looks like. This is especially true when using seasonal cross validation.
You can import and export results like so:
```python
# train your model, then:
# set n=1 if you only want your best model
model.export_template("my_models.csv", models='best', n=15, max_per_model_class=3)

# later, in a new session:
# you can set `max_generations=0` in AutoTS, and then it will only attempt the imported models
model = model.import_template("my_models.csv", method='only')
```
Thanks a lot! So this is the sequence:
1. Initialize AutoTS
2. Fit the model on the data
3. Export the best models
4. Initialize AutoTS again with max_generations=0
5. Fit on the new data
6. Forecast?
It would be good if you could post a code snippet; that would be very helpful.
I have plans to build a production code example with non-proprietary data at some point soon, see #45.
Yes, your sequence sounds correct: fit, export best ... import best, run with 0 generations (or more than 0 is fine, if you want active learning), and output the prediction/forecast.
@catchlui
Hey! I also have the same question.
The above points are understood about the online-training aspects of time series.
A pickled model can still be useful though - for instance, training a model on system1 and inferencing on system2 (this is my current use case). System2 is specialized and can really only do inference.
It would be awesome if you could still provide guidance on pickling here.
The .fit(result_file) option only saves the model result values, not the actual models. If the run crashes for some reason, or you want to restart your computer partway through, the results can be reloaded.
I understand your concern about inference, probably on an edge device. For perspective, I have run a full .fit() on a Raspberry Pi 3 with 1 GB of RAM, and it worked fine, although only with the less memory-intensive models. .fit() can be run just fine for most models on tiny devices. I haven't tried running any of these on a microcontroller, but I've got a Pi Pico and an ESP32 here; if that's your use case, I'd be happy to test on those.
The exceptions are GluonTS and the ~Regression models (which sometimes use Tensorflow). I have a Coral Edge board that can only do Tensorflow inference, not training. It's best to just use one of the excellent non-neural-network models; the Edge board will run NumPy and pandas just fine. But here's the thing: the neural network models are... not that great. At best they equal the other models, but ten times slower.
The ultimate problem is that most time series models here (from other packages like Statsmodels, as well as some I have written) simply can't be pickled and refreshed on new data. They will only forecast jumping off from the most recent data point given in training, which makes it impossible to write a generalizable picklable API. I could do so for a small subset of models, but that is a niche and not much use. And it stands that it is always best to retrain on fresh data for time series, because the markets and the world are always changing and models drift to unusable very quickly - unlike image recognition or something, where a picture of a cat is always just a cat.
Long response, sorry. Happy to try to make something work, but I feel that the current API style is the best for time series and other approaches will just lead to problems.
I should add: have you seen the model_forecast function? It's in the extended_tutorial under Running Just One Model.
Basically, you do the AutoTS.fit() on a more powerful device, then take the model parameters and run model_forecast. Super fast and lightweight for all but the neural nets. @asgeorges
Hey! Thanks for such a comprehensive answer!
I'm looking to deploy this model into production - not to an edge device. I'm not a pro at deploying to prod, but I'd imagine the solution you listed above won't suffice (a lot more moving parts than a static binarized file). My approach would be to:
- Batch train (for instance every day)
- Re-pickle model
- Run QA/sanity checks on dev system
- If model passes, send pickle to prod system
- Run online inference using pickle for 1 day
The above is how I'm currently thinking about it, but I'll think a bit more on whether your approach can be used instead.
Sidenotes
- I'm a big fan of this project...it's pretty awesome
- I've found a few nuggets in the code :)
I should be clear: .predict() is entirely determined by the best_model params and internally calls the model_forecast function. Using a wide-style dataframe, your best_model params, and any other keyword args, you can exactly duplicate AutoTS.predict() with the model_forecast function and save space and time. You don't need to pickle the entire AutoTS object - that would be potentially massive, because the AutoTS class includes an entire copy of the original dataset, among other things.
Here's a simplified approach that should do the same as the above.
- Batch train (on dev or otherwise) to select your best_model params (model_name, model_params, transformation_params). Make sure there are plenty of validations so that the chosen model looks good for the entire year.
- Set up a production script with model_forecast and drop in your best model params. You could pickle the best model params, but a simple plain-text JSON file will do just fine for those.
- Include sanity checks on inputs and outputs. Be aware there are constraints (about to get a major update in 0.4.1) and no_negatives that can help enforce expected forecasts. I would say the top sanity check to perform for inputs is that there isn't a bunch of missing data, or massive shifts (like a definition change) in the most recent data. You could use something like the Great Expectations package, but usually I do something much simpler.
- Whenever you feel a refresh is needed, manually train in dev and then update the model parameters in prod when ready.
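A minimal sketch of the kind of "much simpler" input check described above. Hedged: the helper name, thresholds, and data are made up for illustration, not from AutoTS or this thread.

```python
import pandas as pd

def input_sanity_check(df, recent_days=7, max_missing_frac=0.1, max_shift_ratio=5.0):
    """Flag obvious input problems before forecasting: too much recent
    missing data, or a massive level shift (e.g. a definition change)."""
    problems = []
    recent = df.tail(recent_days)
    # 1) too many missing values in the most recent window?
    if recent.isna().mean().max() > max_missing_frac:
        problems.append("too much recent missing data")
    # 2) massive level shift: recent mean vs. historical mean, per column
    hist_mean = df.iloc[:-recent_days].mean()
    ratio = ((recent.mean() + 1e-9) / (hist_mean + 1e-9)).abs()
    if (ratio > max_shift_ratio).any() or (ratio < 1.0 / max_shift_ratio).any():
        problems.append("massive shift in recent level")
    return problems

# usage with made-up data: a stable series and one with a sudden jump
idx = pd.date_range("2021-01-01", periods=60, freq="D")
ok = pd.DataFrame({"a": [10.0] * 60}, index=idx)
shifted = pd.DataFrame({"a": [10.0] * 53 + [500.0] * 7}, index=idx)
print(input_sanity_check(ok))       # []
print(input_sanity_check(shifted))  # ['massive shift in recent level']
```

If the returned list is non-empty, the production script can skip the forecast and alert instead of publishing garbage.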
Otherwise, I personally use a slight variation on the production_example.py for my own production code. It's a different philosophy (requiring more compute), but it works for me.
Glad you're enjoying the easter eggs. Please post feedback of anything you find annoying or difficult!