felipeangelimvieira / prophetverse

A multiverse of Prophet models for timeseries

Home Page: https://prophetverse.com

License: Apache License 2.0

prophetverse's Introduction

Prophetverse

Prophetverse leverages the theory behind the Prophet model for time series forecasting and expands it into a more general framework, enabling custom priors, non-linear effects for exogenous variables and other likelihoods. Built on top of sktime and numpyro, Prophetverse aims to provide a flexible and easy-to-use library for time series forecasting with a focus on interpretability and customizability. It is particularly useful for Marketing Mix Modeling, where understanding the effect of different marketing channels on sales is crucial.

🚀 Installation

To install with pip:

pip install prophetverse

Or with poetry:

poetry add prophetverse

📊 Forecasting with default hyperparameters

The Prophetverse model provides an sktime-compatible interface. Here's how to use it:

from prophetverse.sktime import Prophetverse

# Create the model
model = Prophetverse()

# Fit the model
model.fit(y=y, X=X)

# Forecast four steps ahead (fh is relative to the end of the training data)
y_pred = model.predict(X=X, fh=[1,2,3,4])

🌟 Features & Comparison with Meta's Prophet

Prophetverse is similar to the original Prophet model in many aspects, but it has some differences and new features. The following table summarizes the main features of Prophetverse and compares them with the original Prophet model:

| Feature | Prophetverse | Original Prophet | Motivation |
|---|---|---|---|
| Logistic trend | Capacity as a random variable | Capacity as a hyperparameter, user input required | The capacity is usually unknown to users. Having it as a variable is useful for Total Addressable Market inference. |
| Custom trend | Customizable trend functions | Not available | Users can create custom trends and leverage their knowledge about the timeseries to enhance long-term accuracy. |
| Likelihoods | Gaussian, Gamma and Negative Binomial | Gaussian only | Gaussian likelihood fails to provide good forecasts for positive-only and count data (sales, for example). |
| Custom priors | Supports custom priors for model parameters and exogenous variables | Not supported | Forcing positive coefficients, using prior knowledge to model the timeseries. |
| Custom exogenous effects | Non-linear and customizable effects for exogenous variables, shared coefficients between time series | Not available | Users can create any kind of relationship between exogenous variables and the timeseries, which can be useful for Marketing Mix Modeling and other applications. |
| Changepoints | Uses changepoint interval | Uses changepoint number | The changepoint number is not stable: as the timeseries grows, its impact on the forecast changes. Consider setting a changepoint number with 6 months of data and then forecasting with 2 years of data (4x the original size); re-tuning would be required. Prophetverse is expected to be more stable. |
| Scaling | Time series scaled internally, exogenous variables scaled by the user | Time series scaled internally | Scaling y is needed to enhance user experience with hyperparameters. On the other hand, not scaling the exogenous variables gives more control to users, who can leverage sktime's transformers to handle that. |
| Seasonality | Fourier terms for seasonality passed as exogenous variables | Built-in seasonality handling | Setting up seasonality requires almost zero effort by using LinearFourierSeasonality in Prophetverse. The idea is to allow the user to create custom seasonalities easily, without hardcoding them. |
| Multivariate model | Hierarchical model with multivariate normal likelihood and LKJ prior, bottom-up forecast | Not available | Having shared coefficients, using global information to enhance individual forecasts. |
| Implementation | Numpyro | Stan | |
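
The changepoint row can be made concrete with a little arithmetic. The sketch below is a hypothetical illustration, not Prophetverse's implementation: it assumes changepoints are spread evenly over the whole series, and the helper names are made up for the example.

```python
def spacing_for_fixed_number(n_obs: int, n_changepoints: int) -> float:
    """Average distance between changepoints when their number is fixed."""
    return n_obs / n_changepoints


def count_for_fixed_interval(n_obs: int, interval: int) -> int:
    """Number of changepoints when one is placed every `interval` observations."""
    return n_obs // interval


six_months, two_years = 180, 720  # daily observations (assumed sizes)

# Fixed number: the spacing stretches as the series grows.
spacing_short = spacing_for_fixed_number(six_months, 25)
spacing_long = spacing_for_fixed_number(two_years, 25)

# Fixed interval: the spacing stays at 7 days and the count grows instead.
count_short = count_for_fixed_interval(six_months, 7)
count_long = count_for_fixed_interval(two_years, 7)
```

With a fixed number, the same setting means a changepoint roughly every week on the short series but roughly every four weeks on the long one, which is why re-tuning would be needed.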

๐Ÿค Contributing to Prophetverse

We welcome contributions! Check out our contributing guidelines to get started.

📚 Documentation

Detailed documentation is available on the project home page (https://prophetverse.com).

prophetverse's People

Contributors

dependabot[bot], felipeangelimvieira, felipeffm, fkiraly

prophetverse's Issues

[MNT] Modules at production don't pass pre-commit

Problem

Few modules pass the pre-commit checks.

Solution

Refactor modules to pass pre-commit requirements.

Impact

Makes it easier for new contributors to create PRs and increases code quality.

[ENH] Add FourierSeriesSeasonality effect to improve user experience

Currently, a user needs to pass Fourier terms in the X dataframe or specify FourierFeatures in feature_transformer, for example:

model = ProphetNegBinomial(
    trend="linear",
    inference_method="mcmc",
    optimizer_steps=100_000,
    feature_transformer=FourierFeatures(
        sp_list=[7, 365.25],
        fourier_terms_list=[3, 8],
        freq="D",
        keep_original_columns=True,
    ),
    default_effect=LinearEffect(
        prior=dist.Normal(0, 0.1),
        effect_mode="multiplicative",
    ),
    noise_scale=2000,
    mcmc_samples=500,
    mcmc_warmup=1000,
    mcmc_chains=2,
)

Creating an Effect that already includes FourierFeatures could improve the experience here.

model = ProphetNegBinomial(
    trend="linear",
    inference_method="mcmc",
    optimizer_steps=100_000,
    exogenous_effects=[
        # Proposed effect: Fourier features and their prior in one object
        FourierSeasonality(
            sp_list=[7, 365.25],
            fourier_terms_list=[3, 8],
            freq="D",
            prior=dist.Normal(0, 0.1),
            effect_mode="multiplicative",
        ),
    ],
    noise_scale=2000,
    mcmc_samples=500,
    mcmc_warmup=1000,
    mcmc_chains=2,
)

We may also consider using skbase and making Effect inherit from BaseEstimator, so that parameters can be set through a sklearn-like API.
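
For reference, the Fourier terms discussed in this issue are sine and cosine columns evaluated at each time step. Here is a minimal plain-Python sketch, assuming the usual sin(2πkt/sp) and cos(2πkt/sp) construction. `fourier_terms` is a hypothetical helper, not part of Prophetverse or sktime.

```python
import math
from typing import List


def fourier_terms(t: int, sp: float, n_terms: int) -> List[float]:
    """Return [sin_1, cos_1, ..., sin_K, cos_K] for time index t and period sp."""
    features = []
    for k in range(1, n_terms + 1):
        angle = 2.0 * math.pi * k * t / sp
        features.extend([math.sin(angle), math.cos(angle)])
    return features


# Mirror sp_list=[7, 365.25] and fourier_terms_list=[3, 8] from the issue:
# 2 * 3 weekly columns plus 2 * 8 yearly columns for one time step.
row = fourier_terms(10, 7, 3) + fourier_terms(10, 365.25, 8)
```

An effect that bundles this step would spare the user from wiring the feature transformer and the effect separately.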

[BUG] LiftExperimentLikelihood should use numpyro.handlers.mask instead of obs_mask to handle out-of-sample predictions

Following @fehiepsi's explanation in pyro-ppl/numpyro#1847 and the numpyro documentation, LiftExperimentLikelihood should use numpyro.handlers.mask instead of the obs_mask argument of numpyro.sample. What obs_mask actually does is impute missing values by introducing a latent sample site. That is not needed here: the sample site in LiftExperimentLikelihood is not used in subsequent calculations, and the goal is only to disregard dates for which we have no A/B test results.
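
The intended behaviour can be illustrated without numpyro. Masking a likelihood means that masked entries contribute nothing to the log-probability, and no extra latent variables are created for them. The plain-Python sketch below is only an analogy for that behaviour. `masked_normal_loglik` and the sample values are hypothetical, not numpyro or Prophetverse code.

```python
import math
from typing import Sequence


def normal_logpdf(y: float, mu: float, sigma: float) -> float:
    """Log-density of a Normal(mu, sigma) evaluated at y."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def masked_normal_loglik(
    y: Sequence[float], mu: Sequence[float], sigma: float, mask: Sequence[bool]
) -> float:
    """Sum the log-density only over entries where mask is True."""
    return sum(
        normal_logpdf(yi, mi, sigma)
        for yi, mi, keep in zip(y, mu, mask)
        if keep
    )


# Dates without an A/B test result hold NaN and are masked out.
lift = [0.5, float("nan"), 1.2]
predicted = [0.4, 0.9, 1.0]
has_result = [True, False, True]
loglik = masked_normal_loglik(lift, predicted, 0.1, has_result)
```

The masked date is simply skipped, so the NaN never reaches the likelihood and nothing is imputed for it.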

[ENH] `sktime` indexing in `all_estimators` and search function

This issue discusses how best to add indexing for prophetverse estimators in sktime.

Here is a draft PR as a basis for discussion: sktime/sktime#6614

I am trying to think of the best pattern for the user and for directly indexed packages.

Currently the PR uses a delegation pattern, which leads to a "copy" of the estimator in sktime. This is unsatisfactory imo as the copy is basically just a database entry.

What we could do is replace the copy with a direct import from prophetverse if the package is installed, avoiding the duplication of classes. However, in the current state that would not work because:

  • not all tags are present in prophetverse, e.g., python_dependencies or authors and maintainers tag
  • the docstrings are formatted differently. sktime requires reStructuredText, and the code must pass formatting checks (see #60)

I also worry about the impact of having different classes depending on whether prophetverse is installed. Though I cannot see a problem, because the constructor of the sktime class will fail if prophetverse is not present.

Thoughts, @felipeangelimvieira?
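
The "direct import if the package is installed" idea can be sketched with importlib. This only illustrates the pattern under discussion, not what the sktime PR does. `is_installed` and `resolve_estimator_source` are hypothetical helpers.

```python
from importlib.util import find_spec


def is_installed(package: str) -> bool:
    """True if a top-level package can be found without importing it."""
    return find_spec(package) is not None


def resolve_estimator_source(package: str = "prophetverse") -> str:
    """Decide which class the index would expose for a given package."""
    if is_installed(package):
        return "direct import"
    return "delegating placeholder"
```

The check does not import the package, so it stays cheap when used while building an estimator index.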

[ENH] Refactor BaseBayesianForecaster

Description:

The BaseBayesianForecaster class currently has several responsibilities beyond forecasting using Bayesian inference. These responsibilities include:

Plotting
Selecting an optimizer
Managing exogenous effects
Scaling

It would improve maintainability and documentation if these responsibilities were separated into distinct components. This refactor would allow each component to focus on a single responsibility, making the codebase cleaner and easier to understand.

Proposal:

Create separate classes or modules for:
    Plotting
    Optimizer selection
    Exogenous effects management
    Scaling

Refactor the BaseBayesianForecaster class to utilize these new components, ensuring that it focuses solely on the Bayesian forecasting logic.

Update the documentation to reflect the new structure, providing clear guidelines on how to use and extend each component.

Write unit tests for each new component to ensure they work correctly both in isolation and when integrated with the BaseBayesianForecaster.
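
As a rough illustration of the proposal, the sketch below moves scaling into its own component and lets the forecaster delegate to it. All class names are hypothetical, and the "inference" step is a placeholder that just stores a mean. It is not the BaseBayesianForecaster code.

```python
from typing import List, Optional


class MaxAbsScaler:
    """Hypothetical scaling component: divides by the largest absolute value."""

    def __init__(self) -> None:
        self.scale_ = 1.0

    def fit(self, y: List[float]) -> "MaxAbsScaler":
        # Fall back to 1.0 so an all-zero series does not divide by zero.
        self.scale_ = max(abs(v) for v in y) or 1.0
        return self

    def transform(self, y: List[float]) -> List[float]:
        return [v / self.scale_ for v in y]

    def inverse_transform(self, y: List[float]) -> List[float]:
        return [v * self.scale_ for v in y]


class SketchForecaster:
    """Stand-in forecaster that delegates scaling instead of owning it."""

    def __init__(self, scaler: Optional[MaxAbsScaler] = None) -> None:
        self.scaler = scaler or MaxAbsScaler()

    def fit(self, y: List[float]) -> "SketchForecaster":
        y_scaled = self.scaler.fit(y).transform(y)
        # Placeholder for the Bayesian inference step.
        self.mean_ = sum(y_scaled) / len(y_scaled)
        return self

    def predict(self, horizon: int) -> List[float]:
        return self.scaler.inverse_transform([self.mean_] * horizon)
```

With this split, the scaler can be unit-tested on its own and swapped without touching the forecasting logic.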

Benefits:

Improved code readability and maintainability
Easier to document and understand each component's functionality
Enhanced ability to test individual components independently
Facilitates future enhancements and modifications

Additional Information:
If there are any specific considerations or constraints that need to be taken into account during this refactor, please share them in the comments.

[BUG] The "|" operator for type hints is not supported in Python 3.9

When using the package with Python 3.9, the following error occurs:

def match_columns(self, columns: pd.Index | List[str]) -> pd.Index:
TypeError: unsupported operand type(s) for |: 'type' and '_GenericAlias'

The | operator for unions in type hints is only supported from Python 3.10 onward.
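
A signature that also imports on Python 3.9 can spell the union with typing.Union. The sketch below replaces pd.Index with a hypothetical ColumnIndex alias to stay free of third-party imports, and the function body is made up. Only the annotation style is the point.

```python
from typing import List, Tuple, Union

# Hypothetical stand-in for pd.Index so the sketch needs no third-party imports.
ColumnIndex = Tuple[str, ...]


def match_columns(columns: Union[ColumnIndex, List[str]]) -> ColumnIndex:
    """Accept either container and return the tuple form."""
    return tuple(columns)
```

Another option is to add `from __future__ import annotations` at the top of the module. Annotations are then not evaluated when the function is defined, so the | syntax no longer fails at import time.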

[BUG] Model does not raise an error when MAP fails to converge

Some hyperparameters may cause unstable optimization and lead to NaNs. The code currently ignores this and returns the NaNs without raising an error. An exception with an informative message (suggesting a change of learning rate, for example) could improve the user experience.
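
A minimal sketch of such a check is shown below, assuming the optimizer exposes its loss history as a sequence of floats. `ConvergenceError` and `check_losses` are hypothetical names, not existing Prophetverse API.

```python
import math
from typing import Sequence


class ConvergenceError(RuntimeError):
    """Hypothetical exception for a failed optimization."""


def check_losses(losses: Sequence[float]) -> None:
    """Raise as soon as the loss history contains a NaN or infinite value."""
    for step, loss in enumerate(losses):
        if math.isnan(loss) or math.isinf(loss):
            raise ConvergenceError(
                f"Loss became {loss} at step {step}; "
                "try a smaller learning rate or different priors."
            )
```

Calling a check like this after fitting would turn silent NaN forecasts into an explicit failure with a hint on what to change.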

[ENH] modularization of effects - design discussion

Design issue related to modularization of effects objects.

Transplanting some discussion from sktime here in relation to modularization of effects, from sktime/sktime#6639 (comment)


@felipeangelimvieira:

Should effects implement get_params() behavior so that their parameters can be set with set_params()?


@fkiraly:

That is not required by the API, but I would see it as a good practice suggestion; otherwise users will not have access to tuning and AutoML for these in a more granular way than the entire hyperparameter (that is, not for the nested ones).

If the objects have a fixed set of parameters, or are a list, you can inherit from various scikit-base classes to give the API a nice sklearn-like flavour with get/set-params.

Concretely, what you could do in this case:

  • have AbstractEffect inherit from skbase BaseObject - that will give it set_params, get_params and a couple other things. You can test conformance with sktime check_estimator, too.
    • idea: probably skbase should have a check_object too, which checks the minimal skbase API - but you can do via sktime atm.
  • For "list of AbstractEffect", you can inherit from BaseMetaObject. If you want lists of abstract effects to also be abstract effects, you can inherit from BaseMetaObjectMixin and AbstractEffect.
    • would be keen to hear about the developer experience here, if you are going to try.
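
To show what get_params and set_params buy the user, here is a hand-rolled sketch of that interface. It is a simplified stand-in written for this example, not skbase's BaseObject, and `SketchLinearEffect` is a hypothetical class.

```python
import inspect


class ParamsMixin:
    """Hand-rolled stand-in for a sklearn-like get_params/set_params API."""

    def get_params(self) -> dict:
        # Parameters are the constructor arguments, stored under the same names.
        signature = inspect.signature(type(self).__init__)
        names = [name for name in signature.parameters if name != "self"]
        return {name: getattr(self, name) for name in names}

    def set_params(self, **params):
        valid = self.get_params()
        for name, value in params.items():
            if name not in valid:
                raise ValueError(f"Unknown parameter: {name}")
            setattr(self, name, value)
        return self


class SketchLinearEffect(ParamsMixin):
    """Hypothetical effect with two tunable parameters."""

    def __init__(self, prior_scale: float = 0.1, effect_mode: str = "multiplicative"):
        self.prior_scale = prior_scale
        self.effect_mode = effect_mode


effect = SketchLinearEffect()
effect.set_params(prior_scale=1.0)
```

A tuner can then discover and change `prior_scale` on the effect itself, without replacing the whole effect object.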

[DOCS] Getting started

I think a smoother introduction to the library, with examples, can help new users get used to the Prophetverse interface.

Suggestion:
Create a getting started guide that briefly explains how to apply the methods, gives the bigger picture of the main parameters, and references the notebooks with examples.

├── 1-univariate_prediction.md
├── 1.1-change_priors.md
├── 1.2-exogenous_variables copy.md
├── 1.3-non_linear_exogenous_variables.md

├── 2-multivariate_prediction.md
├── 2.1-change_priors.md
├── 2.2-exogenous_variables.md
├── 2.3-non_linear_exogenous_variables copy.md
├── 2.4-shared_coefficients.md

└── 5-hierarchical-forecasting.md

├── 4-optimization.md

[BUG] _to_positive returning zeros due to float precision.

The function _to_positive, which is used to force a positive mean for the Gamma and Negative Binomial likelihoods, can return zeros due to float precision. This then breaks the model, leading to NaNs in the loss. A solution would be to add a clip to avoid this error.

Code for reproducing the bug

from prophetverse.sktime import Prophetverse
import pandas as pd
import numpy as np


min_val = 3
_y = np.clip(np.linspace(0, 10, 1000), min_val, None) - min_val + 1e-6
y = pd.Series(
    _y,
    index=pd.date_range("2022-01-01", periods=len(_y), freq="D"),
)
Prophetverse(likelihood="gamma").fit(y)
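
The clipping idea can be sketched in plain Python. The actual definition of _to_positive is not shown in this issue, so the sketch uses softplus as an assumed stand-in for a smooth positive mapping. The function names and the floor of 1e-10 are hypothetical choices.

```python
import math


def softplus(x: float) -> float:
    """Smooth positive mapping, used here as a stand-in for _to_positive."""
    if x > 30.0:
        # softplus(x) is numerically equal to x here, and exp(x) could overflow.
        return x
    return math.log1p(math.exp(x))


def softplus_clipped(x: float, floor: float = 1e-10) -> float:
    """Same mapping, but never smaller than a strictly positive floor."""
    return max(softplus(x), floor)


underflowed = softplus(-800.0)       # exp(-800) underflows to 0.0
clipped = softplus_clipped(-800.0)   # stays strictly positive
```

A mean of exactly zero is invalid for a Gamma or Negative Binomial likelihood, so the floor keeps the log-probability finite.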

[ENH] Add trend offset prior scale as hyperparameter

The piecewise trend models are sensitive to the offset prior scale. Testing the model with some data showed that increasing the offset prior scale can have a significant positive effect on accuracy. Currently, the offset prior scale cannot be tuned unless a trend model object is passed manually instead of the string. It would be useful to be able to pass that parameter through __init__, and perhaps to add **kwargs to the TrendModels to handle unknown arguments.
