denadai2 / real-estate-neighborhood-prediction

Code to repeat the experiments of "The economic value of neighborhoods: Predicting real estate prices from the urban environment"

Home Page: http://www.marcodena.it

The economic value of neighborhoods: Predicting real estate prices from the urban environment

This repository contains all the code required to reproduce the results presented in the following paper:

  • M. De Nadai, B. Lepri. The economic value of neighborhoods: Predicting real estate prices from the urban environment, 2018.

Input, intermediary and source data can be downloaded from figshare.

Dependencies

Dependencies are listed in the requirements.txt file at the root of the repository. With Python 3.6 and pip, all required dependencies can be installed automatically:

pip3 install -r requirements.txt

Data

Due to storage constraints, input data are not included in this repository. However, the input and intermediary files required to run the analysis can be downloaded from figshare. To run the following code, the input and/or intermediary files must be downloaded and placed in the folder. Then run:

createdb dsaa
gunzip < intermediate_db_backup.sql.gz | psql dsaa
tar -xf data.tar -C data/

Then place the content of dsaa_census_areas.zip into data/generated_files/.
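
Before running the restore commands, it can help to confirm that the downloaded archives are where the commands expect them. The following is a minimal Python sketch, not part of the repository: the helper name missing_inputs and the assumption that all three archives sit in one directory are illustrative only.

```python
from pathlib import Path

# File names are taken from the commands above; their location is an assumption.
EXPECTED_FILES = (
    "intermediate_db_backup.sql.gz",
    "data.tar",
    "dsaa_census_areas.zip",
)


def missing_inputs(directory="."):
    """Return the expected download files that are absent from `directory`."""
    base = Path(directory)
    return [name for name in EXPECTED_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_inputs()
    if missing:
        print("Missing downloads:", ", ".join(missing))
    else:
        print("All expected downloads are present.")
```

Running the script from the directory that holds the downloads lists any archive that still has to be fetched from figshare.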

To produce the intermediary files, go to the section "DIY Instructions".

Code

The code of the analysis is divided into two parts: the Python scripts and modules used to support the analysis, and the notebooks where the outputs of the analysis have been produced.

Scripts

  • data_processing_houses.ipynb : script used for the pre-processing of Immobiliare.it data.
  • compute_walkability.py : script used to generate the walkability scores for each census area.
  • data_processing_neighborhood.py : script used to create the whole dataset.
  • predict.py : script used to predict the housing value from the intermediary files.
  • plots.ipynb : script used to produce the images of the manuscript.

License

This code is licensed under the MIT license.

DIY Instructions

Here we generate the entire database from scratch. To do so, first create the minimal setup with these commands:

psql dsaa < data/SQL/minimal.sql
psql dsaa < data/SQL/minimal_materialize.sql

Additional dependencies

Census data

Census data have to comply with the format of the census_areas_onfocus table. You can proceed with the remaining steps only after importing data into this table. Once the data are imported, generate the spatial matrix with:

psql dsaa < data/SQL/first-DIY-step.sql

Walkability

An OpenStreetMap file has to be downloaded (preferably from here) and placed in data/OSM. It is then imported into PostGIS with:

osm2pgsql -c -d dsaa --create --style "config/osm2pgsql.style" --multi-geometry --number-processes 5 --latlong -C 30000 [FILENAME].osm.pbf

The same OSM file can then be used to produce the OSRM database:

osrm-extract -p config/profiles/foot.lua [FILENAME].osm.pbf
osrm-contract [FILENAME].osrm

To run the server, use the command

osrm-routed [FILENAME].osrm

After this, everything is set up to create the intermediate data in the database. Import all the materialized views, then run the script. Before running it, adapt lines 13 and 35 of compute_walkability.py to your setup.

psql dsaa < data/SQL/walkability.sql
python3 compute_walkability.py
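
To check that the routing server answers before launching the full computation, a single walking-distance request can be sent to it. The sketch below is not taken from compute_walkability.py. It assumes osrm-routed is listening on its default address (localhost, port 5000) and uses the standard OSRM HTTP route service; the function names are hypothetical.

```python
import json
from urllib.request import urlopen

# Assumption: osrm-routed listens on its default address.
OSRM_BASE_URL = "http://localhost:5000"


def route_url(origin, destination, base_url=OSRM_BASE_URL):
    """Build an OSRM route request; points are (longitude, latitude) pairs."""
    coords = ";".join(f"{lon:.6f},{lat:.6f}" for lon, lat in (origin, destination))
    # overview=false skips the route geometry, which is not needed for a distance.
    return f"{base_url}/route/v1/foot/{coords}?overview=false"


def walking_distance(response):
    """Extract the distance in metres of the first route from a parsed OSRM reply."""
    if response.get("code") != "Ok" or not response.get("routes"):
        raise ValueError(f"OSRM returned no route: {response.get('code')}")
    return response["routes"][0]["distance"]


def query_osrm(origin, destination):
    """Ask the running server for the walking distance between two points."""
    with urlopen(route_url(origin, destination)) as reply:
        return walking_distance(json.load(reply))
```

Calling query_osrm with two (longitude, latitude) pairs inside the extracted area should return a distance in metres. A connection error suggests that the server from the previous step is not running.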

Security perception

To create the security perception scores, we use the code and weights of the following paper:

  • De Nadai, M., Vieriu, R. L., Zen, G., Dragicevic, S., Naik, N., Caraviello, M., ... & Lepri, B. Are safer looking neighborhoods more lively?: A multimodal investigation into urban life. In ACM MM 2016.

Everything is available here. All the predictions should be placed in the placepulse table in PostgreSQL. Then you can import or refresh the materialized view defined here:

psql dsaa < data/SQL/security.sql

Companies

You can insert a dataset with the census areas (geoid) and a proxy of company earnings (fatturato) in data/companies.csv. Note that this is included only in the non-open version of the model.

Land value

You can insert a dataset with the census areas (geoid) and a proxy of land value (assessed_land_value) in data/land_value.csv. Note that this is included only in the non-open version of the model.
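
Since the companies and land value files are user-supplied, a quick header check can catch a misnamed column early. The sketch below is illustrative and not part of the repository. It assumes comma-separated files with a header row, which the text does not state, and the helper name missing_columns is hypothetical. Only the column names come from the descriptions above.

```python
import csv
import io

# Column names come from the text above; everything else is illustrative.
REQUIRED_COLUMNS = {
    "companies": ("geoid", "fatturato"),
    "land_value": ("geoid", "assessed_land_value"),
}


def missing_columns(csv_file, kind):
    """Return the required columns absent from the header of an open CSV file."""
    header = next(csv.reader(csv_file), [])
    present = {name.strip() for name in header}
    return [col for col in REQUIRED_COLUMNS[kind] if col not in present]


if __name__ == "__main__":
    # In-memory stand-in for data/companies.csv with made-up values.
    sample = io.StringIO("geoid,fatturato\n1001,25000\n")
    print(missing_columns(sample, "companies"))
```

For a real file, pass an open handle, for example open("data/companies.csv", newline=""), together with the matching kind.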

Census

Census data have to be inserted in the same format as the files placed in data/census and data/census/industry. To use a different format, change the corresponding code in data_processing_housing.py.

Land use

Download satellite shapefiles from https://land.copernicus.eu/local/urban-atlas/urban-atlas-2012/view. Import them into the urban_atlas PostgreSQL table. Then run:

psql dsaa < data/SQL/urban_atlas.sql

Some additional notes to the repository

  • XGBoost 0.72 is, for some reason, no longer available. I changed the requirement to 0.71 because many users contacted me about this issue.
