inogii / lacasademickeymouse

Regression models for housing price prediction

Python 100.00%
advanced-regression advanced-regression-techniques housing prediction regression regression-algorithms regression-models

lacasademickeymouse's Introduction

Housing Price Prediction Model

This repository contains an exploration and implementation of various machine learning models to predict housing prices. The project was developed as part of the Machine Learning course at the Carlos III University of Madrid.

Overview

The project was centered around understanding the relationship between various input variables and housing prices. The primary goal was to create a model that could predict the price of houses based on these input features.

Data Preprocessing

  • One-Hot Encoding: Applied to all categorical variables. Irrelevant variables were discarded.
  • Normalization: Z-score normalization was applied to real-valued (float) and integer features. The price data was treated separately because of its distribution.
  • Visualizations: Functions to visualize distributions and relations between variables were utilized.
  • Correlation Matrix: Established to visualize relationships among variables.
  • Data Splitting: The dataset was divided into training (80%) and testing (20%) samples.
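The preprocessing steps above can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: the toy data and the column names (`surface`, `rooms`, `district`, `price`) are hypothetical, and fitting the scaler on the training split only is an assumption about ordering that the description does not confirm.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data; column names are illustrative, not the project's schema.
df = pd.DataFrame({
    "surface": [50.0, 80.0, 120.0, 65.0, 95.0, 150.0, 70.0, 110.0, 40.0, 200.0],
    "rooms": [1, 2, 3, 2, 3, 4, 2, 3, 1, 5],
    "district": ["A", "B", "A", "C", "B", "C", "A", "B", "C", "A"],
    "price": [150000.0, 240000.0, 360000.0, 200000.0, 290000.0,
              450000.0, 210000.0, 330000.0, 120000.0, 600000.0],
})

# One-hot encode the categorical column; the price target is kept apart.
X = pd.get_dummies(df.drop(columns="price"), columns=["district"])
y = df["price"]

# Cast numeric features to float so the scaled values fit the column dtype.
numeric_cols = ["surface", "rooms"]
X = X.astype({col: float for col in numeric_cols})

# 80% training / 20% testing split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
X_train, X_test = X_train.copy(), X_test.copy()

# Z-score normalization: statistics come from the training split only
# (an assumption) and are then reused on the test split.
scaler = StandardScaler()
X_train[numeric_cols] = scaler.fit_transform(X_train[numeric_cols])
X_test[numeric_cols] = scaler.transform(X_test[numeric_cols])
```

Fitting the scaler on the training split alone keeps test-set statistics out of the model inputs; scaling before the split would also match the order in which the steps are listed.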

Feature Engineering

What set our model apart was the exploration of non-linear relations between features in the dataset. This exploration led to a custom set of input features. In particular, a combination of bathrooms, bedrooms, and rating (aseos+hab*rating) showed a linear correlation with the price.
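As a rough sketch of how such a combined feature could be built with pandas: the values below are made up, the name `aseos_hab_rating` is hypothetical, and the expression is read with standard operator precedence, that is, `aseos + (hab * rating)`.

```python
import pandas as pd

# Made-up rows; only the column names aseos, hab and rating come from the text.
df = pd.DataFrame({
    "aseos": [1, 2, 2, 3],           # bathrooms
    "hab": [2, 3, 4, 5],             # bedrooms
    "rating": [3.0, 4.0, 4.5, 5.0],
    "price": [100000.0, 190000.0, 260000.0, 370000.0],
})

# Combined feature, read as aseos + (hab * rating).
df["aseos_hab_rating"] = df["aseos"] + df["hab"] * df["rating"]

# Pearson correlation between the engineered feature and the price.
corr = df["aseos_hab_rating"].corr(df["price"])
print(df[["aseos_hab_rating", "price"]])
print(f"correlation with price: {corr:.3f}")
```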

Model Exploration

  • Baseline Model: Linear regression from scikit-learn was used as the baseline. Its performance was surprisingly competitive.
  • Other Models Tested: RandomForestRegressor, ElasticNet, Lasso, Ridge, DecisionTreeRegressor, KNeighborsRegressor, GradientBoostingRegressor, AdaBoostRegressor, and CatBoostRegressor.
  • Model Optimization: Hyperparameter tuning focused on GradientBoostingRegressor and CatBoostRegressor.
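A comparison of this kind can be sketched with scikit-learn as below. The data is synthetic, the parameter grid is an arbitrary illustration rather than the grid used in the project, and CatBoostRegressor is left out to avoid the extra dependency.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_absolute_percentage_error
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the housing data: strictly positive prices with
# one mildly non-linear term.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(300, 3))
y = (100_000 + 200_000 * X[:, 0] + 50_000 * X[:, 1] * X[:, 2]
     + rng.normal(0, 5_000, size=300))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Baseline versus one of the boosted models.
models = {
    "LinearRegression": LinearRegression(),
    "GradientBoostingRegressor": GradientBoostingRegressor(random_state=0),
}
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "MAE": mean_absolute_error(y_test, pred),
        # scikit-learn returns MAPE as a fraction, so scale it to percent.
        "MAPE_%": 100 * mean_absolute_percentage_error(y_test, pred),
    }
for name, scores in results.items():
    print(f"{name}: MAE={scores['MAE']:.0f}, MAPE={scores['MAPE_%']:.2f}%")

# Small illustrative grid search, scored on MAPE via cross-validation.
param_grid = {"n_estimators": [50, 100], "max_depth": [2, 3]}
search = GridSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_grid,
    scoring="neg_mean_absolute_percentage_error",
    cv=3,
)
search.fit(X_train, y_train)
print("best parameters:", search.best_params_)
```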

Metrics

  • MAE (Mean Absolute Error): Used to measure the average of the absolute difference between the predicted and actual values.
  • MAPE (Mean Absolute Percentage Error): Offered a percentage error which is more interpretable and scale-independent.
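Both metrics follow directly from their definitions. The sketch below uses plain Python and made-up prices; the function names `mae` and `mape` are illustrative, and MAPE is only defined when no actual value is zero.

```python
def mae(y_true, y_pred):
    """Mean of the absolute differences between actual and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute error relative to each actual value, in percent."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up prices: two predictions are off by 10%, one is exact.
actual = [200000.0, 300000.0, 400000.0]
predicted = [220000.0, 270000.0, 400000.0]

print(f"MAE:  {mae(actual, predicted):.2f}")
print(f"MAPE: {mape(actual, predicted):.2f}%")
```

MAE is expressed in the units of the price, while MAPE divides each error by the actual price, which is what makes it comparable across price ranges.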

Results

The standout model in terms of performance was CatBoostRegressor, achieving MAPE values of around 10-12% on the sample test set. In the competition held in class, its MAPE was 12.91%, the best result among all groups and 0.64% better than the second-best group.

Usage

To test the model yourself, run the following commands:

cd validator
python validator.py

Contributors

fermartinsb, inogii
