immoeliza-webscraping's Introduction

challenge-collecting-data ImmoEliza

Date of project

26/06/2023-30/06/2023

Project

A fictional real estate company, "ImmoEliza", wants to create a Machine Learning model to predict prices of real estate sales in Belgium. To that end, a dataset must be created with information on at least 10,000 properties all over Belgium. This dataset will later be used as the training set for the prediction model.

Data Collection and Scraping

We scraped the data from "Immoweb", a widely used real estate platform in Belgium for listing newly available properties. We gathered the following data for each property:
  • Price
  • Address
  • Building Condition
  • Construction Year
  • Bedrooms
  • Terrace (surface)
  • Shower rooms
  • Office
  • Toilets
  • Energy Class
  • Type of Kitchen
  • Furnished
  • Parking Space
  • Garden Area

Installation

To run the code, you will need to install/import the following:
  • Requests
  • BeautifulSoup
  • ThreadPoolExecutor (from `concurrent.futures`, standard library)
  • Regex (`re`, standard library)
  • Pandas
  • Time (`time`, standard library)
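
As a rough illustration of how the libraries listed above can fit together, the sketch below fetches pages concurrently with `ThreadPoolExecutor` and extracts a price with `re`. It is not the project's actual code: the function names (`fetch_html`, `parse_price`, `scrape_all`), the worker count, and the assumption that a price appears as a whole euro amount such as `€ 350.000` are all hypothetical. In a real scraper, BeautifulSoup would likely replace the regular expression for locating fields in the page.

```python
import re
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_html(url, delay=0.0):
    """Download one page. requests is imported lazily so the rest runs without it."""
    import requests  # third-party: pip install requests

    time.sleep(delay)  # optional pause between requests to be polite to the server
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def parse_price(html):
    """Hypothetical parser: return the first whole euro amount found, or None."""
    match = re.search(r"€\s*([\d.,]+)", html)
    if match is None:
        return None
    # Assumption: dots and commas are thousands separators (no cents).
    return int(re.sub(r"[.,]", "", match.group(1)))


def scrape_all(urls, fetch=fetch_html, max_workers=8):
    """Fetch and parse every URL in a thread pool, preserving input order."""

    def scrape_one(url):
        try:
            return {"url": url, "price": parse_price(fetch(url))}
        except Exception as error:  # one failed listing should not stop the run
            return {"url": url, "price": None, "error": str(error)}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(scrape_one, urls))
```

The list of dictionaries returned by `scrape_all` can be passed directly to `pandas.DataFrame(...)` and saved as a CSV file. Passing a custom `fetch` function makes the sketch easy to try without network access.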

Criteria

  • Contains a minimum of 10,000 inputs: yes
  • Contains data for all of Belgium: yes
  • Non-numeric values have been minimized: yes
  • Used threading to speed up the collection: yes
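
To make the "non-numeric values have been minimized" criterion concrete, here is a minimal sketch of one way scraped text fields could be converted to numbers. The helper names (`to_number`, `clean_record`), the 1/0 encoding of yes/no answers, and the sample values are assumptions for illustration only, not the team's actual cleaning rules.

```python
import re

# Assumption: yes/no answers (e.g. "Furnished") are encoded as 1/0.
YES_NO = {"yes": 1, "no": 0}


def to_number(value):
    """Convert a scraped text value to an int, or None when nothing numeric is found."""
    if value is None:
        return None
    text = str(value).strip().lower()
    if text in YES_NO:
        return YES_NO[text]
    # Match a whole number, allowing "." or "," as thousands separators.
    match = re.search(r"\d+(?:[.,]\d{3})*", text)
    if match is None:
        return None
    return int(re.sub(r"[.,]", "", match.group()))


def clean_record(record, numeric_fields):
    """Return a copy of the record with the given fields converted to numbers."""
    cleaned = dict(record)
    for field in numeric_fields:
        cleaned[field] = to_number(record.get(field))
    return cleaned
```

For example, `to_number("12 m²")` yields `12` and `to_number("No")` yields `0`, while a value with no digits becomes `None`, which pandas later treats as a missing value.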

Personal situation

  • Repository : `challenge-collecting-data`
  • Type of Challenge : `Consolidation`
  • Team Challenge : `Group`
  • Team Members : `Fré Van Oers`, `Jonathan_Rab`, `Mythili`