Coder Social home page Coder Social logo

forecasting-real-estate-prices's Introduction

Overview

Made for a Flatiron School data science course, the purpose of this project was to find areas of potential investment for investors in housing real estate markets.

There are four Jupyter notebooks in this repository:

  1. EDA, for Exploratory Data Analysis
  2. checking_trends, to examine data and see if data could be made stationary
  3. modeling, to experiment with various time series model types and parameters
  4. forecasting, to predict housing prices for the best-performing models

Data

This project uses time series data from Zillow, an online real estate marketplace.

  • Almost 15,000 zip codes in dataset. About 35% of all zip codes in United States.
    • Data set consists of mean housing prices for each zip code, April 1996 - April 2018
  • For this project, I looked at zip codes in the New Orleans area to investigate and model
    • Ended up using five zip codes

Process

Checking Data for Stationarity

  • Time series models only work if data is already or can be made stationary
    • Meaning the mean, variance and autocorrelation structure do not change over time.
  • Used Dickey-Fuller Test to check various data manipulation strategies
    • Root Transformations (didn’t pass)
    • Rolling Mean Subtraction (passed)
    • Differencing (passed)

Went with differencing as it works well with ARIMA models

Modeling

Process

Found ARMA Baseline

Ran ARMA on each zip code with minimum differencing d and tried various (p, q) parameters to get an idea of minimum AIC

Performed Train-Test Split

Made training set all data except for final 12 months which were reserved as test data

Ran ARIMA models

For some zip codes, experimented with ARIMA and for loops of various (p, d, q) parameters on training data

Ran auto-ARIMA models

Let the module automatically find which parameters yielded lowest AIC on training data. In the end, went with best auto-ARIMA parameters for all five zip codes

Re-ran parameters on SARIMAX models

SARIMAX offers diagnostic and forecasting functionality

Model Assessment

To choose the best models that would yield the most predictable forecasts, I compared the AIC metric and root mean squared errors (between forecasted training data and reserved test data).

Forecasting

I ultimately chose the time series models of three zip codes to run on the full datasets for forecasting.

  • Two zip codes were predicted to yield low ROIs (neither higher than 1.4%) with relatively low risk
  • The third zip code was predicted to fall sharply and was a high risk investment

Conclusion and Next Steps

None of the five zip codes could be recommended as good investment opportunities. Nevertheless, as a student working through this project, my knowledge of time series models was deepened.

Possible next steps were this project to continue:

  • Experiment with other types of time series modeling like Facebook Prophet to see if they yielded different results

  • Try fitting models with different ranges of training data to see if it improves results

  • Investigate other zip codes for investment opportunities for this hypothetical client

forecasting-real-estate-prices's People

Contributors

halpert3 avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.