
harry-s-grewal / mls-real-estate-scraper-for-realtor.ca


Python MLS and Real-Estate Data Scraper for the Realtor.ca Website

License: MIT License

Python 100.00%
housing housing-prices real-estate scraper webscraping etl-framework canada mls property

mls-real-estate-scraper-for-realtor.ca's Introduction

Realtor.ca API Wrapper and Scraper

Python wrapper and scraper for the Realtor.ca website. Use it to scrape Canadian real-estate listings easily.

Installation

Use the package manager pip to install the package directly from GitHub.

pip install git+https://github.com/harry-s-grewal/mls-real-estate-scraper-for-realtor.ca.git

Local Development

git clone https://github.com/harry-s-grewal/mls-real-estate-scraper-for-realtor.ca.git
python -m venv venv
. venv/bin/activate
pip install -r ./mls-real-estate-scraper-for-realtor.ca/requirements.txt

Context

Realtor.ca has two API endpoints: PropertySearch_Post and PropertyDetails. Querying PropertySearch_Post returns a list of properties in JSON format, with limited details for each. Querying PropertyDetails returns detailed information on each property. Depending on what you're looking for, you can query one or the other, but be aware that getting details on each property is slow, because Realtor.ca is rate limited (boo). If you make too many queries too quickly, you'll receive an HTTP 403 (Forbidden) error. It's not clear what the rate limit is, but waiting an hour or so before sending more requests lifts the freeze-out.
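
One way to cope with the rate limit is to retry a request after a growing delay whenever the server answers 403. The following is a minimal sketch of that idea, not code from this package: the RateLimited exception, the call_with_backoff helper, and the delay values are all assumptions made for illustration. Since the real limit is unknown, much longer waits may be needed in practice.

```python
import time


class RateLimited(Exception):
    """Hypothetical exception a caller raises when it sees an HTTP 403."""


def call_with_backoff(fetch, max_attempts=4, base_delay=60.0, sleep=time.sleep):
    """Call fetch() and retry with exponentially growing waits on RateLimited.

    fetch is any zero-argument callable; sleep is injectable so the
    waiting can be replaced in tests.
    """
    for attempt in range(max_attempts):
        try:
            return fetch()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller decide what to do
            sleep(base_delay * 2 ** attempt)  # 60 s, 120 s, 240 s, ...
```

The caller is expected to wrap its own request so that a 403 response raises RateLimited; any other error propagates immediately.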

Usage

In queries.py you will find queries to Realtor.ca for both the PropertySearch_Post endpoint and the PropertyDetails endpoint. It also contains a query to get the coordinate bounding box of a city, as that's what Realtor.ca uses to determine which properties to list.
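
To illustrate the bounding-box step, here is a sketch that is not taken from queries.py. It assumes the geocoder is Nominatim (the traceback in the issues on this page shows a request to nominatim.openstreetmap.org) and that each JSON result carries a boundingbox list of strings ordered as south latitude, north latitude, west longitude, east longitude. The function name and the returned keys are hypothetical.

```python
def bounding_box_from_nominatim(results):
    """Turn the first Nominatim search result into a dict of float bounds."""
    if not results:
        raise ValueError("no geocoding results for this query")
    # Assumed order: [south, north, west, east], each value a string.
    south, north, west, east = (float(v) for v in results[0]["boundingbox"])
    return {
        "lat_min": south,
        "lat_max": north,
        "lon_min": west,
        "lon_max": east,
    }
```

The four bounds are what a property search would then be restricted to.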

In realtorca.py there are two functions to automate the scraping of Realtor.ca.

get_property_list_by_city() will scrape a list of properties by city and save it as a .csv.

get_property_list_by_city("Calgary, AB")

Result:

CalgaryAB.csv

get_property_details_from_csv() will use that .csv file to get property listing details to enhance the data already available.

get_property_details_from_csv("CalgaryAB.csv")
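
The enrichment step, reading the saved listing rows and merging in details for each property, can be sketched as follows. This is an illustration under stated assumptions, not the code in realtorca.py: the enrich_rows helper, the "Id" column name, and the fetch_details callable (which stands in for a PropertyDetails query) are all hypothetical.

```python
import csv
import io


def enrich_rows(csv_text, fetch_details, id_column="Id"):
    """Return the CSV rows as dicts, each merged with its fetched details."""
    enriched = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        details = fetch_details(row[id_column])  # one request per property
        enriched.append({**row, **details})      # detail fields win on key clashes
    return enriched
```

Because this makes one detail request per row, it is the step most likely to run into the rate limit described under Context.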

License

MIT License

The code follows the PEP 8 style guide.


mls-real-estate-scraper-for-realtor.ca's Issues

New to this!

Hey there,

I'm new to scraping and would like to learn how to run both queries.py and realtorca.py.

I ran queries.py first and then realtorca.py, but an HTTPError came up. Any pointers on how I can get this up and running to export to a .csv file?

get_property_list_by_city("Calgary, AB")
Traceback (most recent call last):

Cell In[3], line 1
get_property_list_by_city("Calgary, AB")

File c:\users\kelvi\onedrive\desktop\web scrapper\mls-real-estate-scraper-for-realtor.ca-main\realtorca.py:13 in get_property_list_by_city
coords = get_coordinates(city) # Creates bounding box for city

File ~\OneDrive\Desktop\Web Scrapper\mls-real-estate-scraper-for-realtor.ca-main\queries.py:10 in get_coordinates
response.raise_for_status()

File ~\anaconda3\Lib\site-packages\requests\models.py:1021 in raise_for_status
raise HTTPError(http_error_msg, response=self)

HTTPError: 400 Client Error: Bad Request for url: https://nominatim.openstreetmap.org/search?q=Calgary,%20AB&format=json&country=Canada
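
One possible cause, not confirmed by the repository: Nominatim's search endpoint does not accept the free-form q parameter together with structured parameters such as country, and can answer such a request with 400 Bad Request. Below is a sketch of building the URL with countrycodes instead, which Nominatim allows alongside q. The helper name is hypothetical and this is not the code in queries.py.

```python
from urllib.parse import urlencode

NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def build_search_url(city):
    """Build a free-form Nominatim search URL restricted to Canada."""
    params = {
        "q": city,             # free-form query, e.g. "Calgary, AB"
        "format": "json",
        "countrycodes": "ca",  # ISO 3166-1 alpha-2 filter, valid with q
        "limit": 1,            # only the best match is needed
    }
    return f"{NOMINATIM_SEARCH}?{urlencode(params)}"
```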

Question: is it possible to pull data by municipality name, property type and land size?

As I was looking for a way to access realtor.ca data, I stumbled upon this awesome repo! I work for the Nature Conservancy of Canada where one of our goals is to purchase private land for conservation. Apologies for the specific question, but with this scraper you built, is it possible to pull data by municipality name, property type and land size? This API has serious potential to help us improve how we prioritize available private lands.

Thanks!

CurrentPage and MaxRecordsPerPage

Hi, thanks for this. Haven't tried this, but I'm using the RapidApi/ApiDojo endpoint for this data.

It looks like the CurrentPage value gets reset to 0 after 50 pages.

And the maximum records per page is 50. Anything above 50 is ignored. Anything below 50 is respected.

Is your experience the same as well?
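
If the limits reported above hold (at most 50 records per page and at most 50 pages, which is this issue's observation on the RapidAPI endpoint and is not verified here), a single search can return at most 2,500 listings. Below is a sketch of planning page requests under that assumption. The plan_pages helper and its defaults are hypothetical, and pages are assumed to be numbered from 1.

```python
import math


def plan_pages(total_records, per_page=50, max_pages=50):
    """Return (pages_to_request, truncated) for a result set of total_records."""
    per_page = min(per_page, 50)  # values above 50 are reportedly ignored
    needed = math.ceil(total_records / per_page)
    pages = list(range(1, min(needed, max_pages) + 1))
    return pages, needed > max_pages  # truncated: more results than reachable
```

When truncated is true, one option is to split the search area into smaller bounding boxes and query each one separately.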
