
A Python-based web scraping tool for extracting tailored real estate data from Realtor.com. Customize parameters to filter and collect property listings, ideal for investors, researchers, and market enthusiasts.


Realtor Web Scraper

The Realtor Web Scraper is a powerful tool for extracting real estate data from Realtor.com. This Python-based web scraping project allows you to define custom parameters to filter and collect specific real estate listings and related information according to your preferences. Whether you're a real estate investor, researcher, or simply curious about the market, this scraper simplifies the process of obtaining valuable data for your analysis.

Tech Stack

The Realtor Web Scraper utilizes a combination of Python libraries and tools to provide powerful web scraping capabilities and data analysis. Here's an overview of the tech stack used in this project:

  • Python: The core programming language for building and running the scraper.

  • Beautiful Soup 4 (bs4): A Python library for web scraping and parsing HTML and XML documents. It enables you to extract structured data from web pages.

  • NumPy (numpy): A fundamental library for numerical computing in Python. While not directly related to web scraping, it is valuable for data manipulation and analysis after data extraction.

  • Pandas (pandas): A popular Python library for data manipulation and analysis. It's used for cleaning, transforming, and analyzing the scraped data.

  • Selenium (selenium): A web testing framework that allows you to automate web interactions, such as navigating websites and filling out forms. In this project, Selenium is used to interact with web pages when necessary.

  • Streamlit (streamlit): A Python library for creating interactive web applications with minimal code. You can use Streamlit to build user-friendly interfaces for displaying and analyzing the scraped data.

  • Undetected Chromedriver (undetected_chromedriver): A patched ChromeDriver wrapper that helps Selenium-driven Chrome sessions avoid being flagged by websites' bot detection mechanisms. This is particularly useful when using Selenium for web scraping.

Together, these libraries and tools cover browser automation, data extraction, and data analysis for the Realtor Web Scraper.
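As a rough illustration of how Beautiful Soup turns markup into structured rows, here is a minimal sketch. The HTML snippet, the class names (`listing`, `price`, `address`), and the `parse_listings` helper are all invented for this example; they are not taken from the project's code or from Realtor.com's actual page structure.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: these class names are invented for the example
# and do not reflect Realtor.com's real page structure.
SAMPLE_HTML = """
<ul>
  <li class="listing">
    <span class="price">$250,000</span>
    <span class="address">123 Main St, London, KY</span>
  </li>
  <li class="listing">
    <span class="price">$310,500</span>
    <span class="address">45 Oak Ave, London, KY</span>
  </li>
</ul>
"""


def parse_listings(html):
    """Return one dict per listing card found in the HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.select("li.listing"):
        price_text = card.select_one(".price").get_text(strip=True)
        address = card.select_one(".address").get_text(strip=True)
        # Strip the currency symbol and thousands separators before converting.
        price = int(price_text.replace("$", "").replace(",", ""))
        rows.append({"address": address, "price": price})
    return rows


listings = parse_listings(SAMPLE_HTML)
```

A list of dicts like this can be passed straight to `pandas.DataFrame(listings)` for cleaning and analysis.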

Run Locally

I'm having trouble deploying this to Streamlit Cloud, but you can still run it locally. Make sure you have Streamlit installed on your device.

Clone the project

  git clone https://github.com/kevinfernaando/realtor-web-scraper

Go to the project directory

  cd realtor-web-scraper

Start the server

  streamlit run app.py

Screenshots

This is the first page you see when the app starts

Screenshot 2023-09-27 at 11 38 46

Enter the location in [City, State] format, for example London, KY, then set the filters as needed

Screenshot 2023-09-27 at 11 31 34
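The [City, State] format could be checked before scraping with something like the sketch below. The `parse_location` helper and the rule it enforces (a city name, a comma, then a two-letter state code) are assumptions made for illustration; the app's actual validation, if any, may differ.

```python
import re

# Assumed rule: a city name, a comma, then a two-letter state code.
LOCATION_PATTERN = re.compile(
    r"^\s*([A-Za-z][A-Za-z .'-]*?)\s*,\s*([A-Za-z]{2})\s*$"
)


def parse_location(text):
    """Split 'City, ST' into (city, state); raise ValueError if malformed."""
    match = LOCATION_PATTERN.match(text)
    if match is None:
        raise ValueError(f"Expected 'City, State' (e.g. 'London, KY'), got {text!r}")
    city, state = match.groups()
    # Normalize the state code to upper case; leave the city as typed.
    return city, state.upper()


city, state = parse_location("London, KY")
```

Rejecting malformed input up front avoids launching a browser session for a search that cannot succeed.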

Click the "Scrape Data" button, and the app scrapes the listings and returns them as a dataframe. You can also download the scraped data as a CSV file by pressing the "Download Data" button

Screenshot 2023-09-27 at 11 50 08
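One plausible way to prepare a dataframe for a CSV download is sketched below. The `to_csv_bytes` helper and the sample columns are hypothetical and are not taken from the project's `app.py`.

```python
import pandas as pd


def to_csv_bytes(df):
    """Serialize a dataframe to UTF-8 encoded CSV without the index column."""
    return df.to_csv(index=False).encode("utf-8")


# Made-up sample rows standing in for scraped listings.
sample = pd.DataFrame(
    {
        "address": ["123 Main St", "45 Oak Ave"],
        "price": [250000, 310500],
    }
)

csv_bytes = to_csv_bytes(sample)
```

In a Streamlit app, bytes like these can be handed to `st.download_button(label="Download Data", data=csv_bytes, file_name="listings.csv", mime="text/csv")`; the file name here is only a placeholder.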

Authors

  • kevinfernaando
