Coder Social home page Coder Social logo

saritmaitra / historical-price-index Goto Github PK

View Code? Open in Web Editor NEW
7.0 2.0 3.0 4.7 MB

Predictive Analytics

Jupyter Notebook 100.00%
forecasting-model predictive-analytics predictive-modeling python-3-6 data-science linear-regression arima-model sarimax time-series-analysis feature-engineering

historical-price-index's Introduction

Predictive modeling using Regression Techniques and ARIMA

The project used NYSE stock index of Gold price (2000-2019) to create a predictive model to forecast future price.

In the 1st pahse project used a simple auto correlation and partial auto correlation to forecast future price. In the 2nd phase project used linear regression and ARIMA using grid search technique. Different features.

Data is measured in USD and are linked to Close price index. Therefore related to inflation. Much of the trend is merely due to inflation. Therefore, series has been defalated to de-trend the data. Eliminated seasonal patterns by substracting periodical values. Data here is daily scale and converted to monthly scale for computational ease. Therefore convetrted to 12-month seasonal and used 12-lag difference to get a flatter series. Also used logging to linearise with an exponential trend.

Linear regression

In linear regression vilatility has been taken care with added features:

f01 = (High - Low)/(Low)*100

f02 = (Close - Open)/(Open)*100

f03 = (Close) / (Open)-1

Moreover, moving averages of long and short window was taken:

s01 = Close.rolling(window=30)

s02 = Close.rolling(window=100)

ARIMA

Performed logarithm transofrmation to stabilise (heteroscedasticity) the series. Logarithm transformation equals the sum of the logarithms, i.e., LOG(XY) = LOG(X) + LOG(Y). This converts multiplicative to additive and random walk to geometric random walk i.e.exponential trends to linear trends. This also helps starightening out exponential growth patterns to fit into a linear model. Logging helps to incorporate an explicit forecast of future inflation into the model.

That's typically a series of daily returns where each return is expressed in continually compounded terms. For each day, we take the natural log of the ratio of stock prices (i.e., price today divided by price yesterday, and so on). This series of daily returns where each return is expressed in continually compounded terms. For each day, a natural log was taken of the ratio of stock prices (i.e., price today divided by price yesterday, and so on).

Rt = ln(St/St-1), Rt = return of the day, St = price on day t, St-1 = price day before t.

This produces a series of daily returns, from ui to Rt-n, depending niumber of days (n = days) measuring. This is simple moving average of the squared periodic returns and each squared return is given an equal weight.

Also performed exponentially weighted moving average (EWM). Here, instead of equal weights in moving average, each squared return is weighted by a multiplier. The weakness of simple moving average is that all returns earn the same weight. Recent return has no more influence on the variance than last month's return. Therefore in EWM more recent returns have greater weight on the variance. This ensures a variance that is weighted toward more recent data.

The unpredictable amplitude of seasonal variations suggests multiplicative pattern.

Gold price = trend* seasonal*residual(multiplicative)

Multivariate regression

Univariate time series=

            Yt = m + ɸYt-1 + ɛ

• Yt= stationary time series and a linear combination of its past values t-1, t-2….

• m=coefficient

• ɸ = slope coefficient

• ɛ = error terms(white noise)

Structural model-

            Yt = β0 + β1X

• Y= dependent variable

• X = independent variable

• β0 = intercept

• β1 = slope coefficient

Structural model do not have time component because of auto correlation issue. With time component, the values will be correlated with time, which impacts error term. Therefore, variables here added with cross section.

Multivariate time series which combines above-

Structural model do not have time component because of auto correlation issue. With time component, the values will be correlated with time, which impacts error term. Therefore, variables here added with cross section.

Multivariate time series which combines above-

            Yt = μ + ɸ1Yt-1 + ɸ2Xt-1 + ɛt, 
            both Yt and Xt are stationary time series

historical-price-index's People

Contributors

saritmaitra avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.