AirbnbScrape

Python Function To Scrape Airbnb

Purpose: As Airbnb hosts, we wanted to optimize the price of our listing and to understand things like:

  • How do other hosts around us price their listings relative to dimensions such as amenities, reviews, and instant booking status?
  • Can we learn something from other properties that are "successful" on Airbnb, with success defined as having many reviews and being able to charge competitive prices?
  • How can we optimize the price of our listings by studying the data of similar properties around us?
  • What interesting things can we learn from outliers?

We wanted to be able to study this data, visualize it, and see whether we could glean insights beyond what is available on Airbnb.

###Table of Contents

####Scraping

  • ScrapingAirbnb.py: this is the code that was used to scrape Airbnb.com. The code is modular and can be reused to scrape Airbnb data for any location.

####Analysis

  • DataCleanAirbnb.py: this file contains supporting functions for AirbnbWrapup.ipynb. They are used to clean the dataset and to parse or remove features as appropriate.
  • DummyOneHot.py: this file contains code that also appears in AirbnbWrapup.ipynb. It is used to dummy-code categorical variables.
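As a rough illustration of what dummy coding means here, the sketch below turns one categorical field into 0/1 indicator columns using only the standard library. It is not taken from DummyOneHot.py: the function name `dummy_code`, the `room_type` field, and the sample values are all made up for the example.

```python
def dummy_code(rows, column):
    """Return copies of `rows` with `column` replaced by 0/1 indicator columns.

    Hypothetical helper for illustration; not the code in DummyOneHot.py.
    """
    # Collect the distinct categories in a stable, sorted order.
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Keep every field except the categorical one being encoded.
        new_row = {key: value for key, value in row.items() if key != column}
        # Add one indicator column per category.
        for category in categories:
            new_row[f"{column}_{category}"] = int(row[column] == category)
        encoded.append(new_row)
    return encoded


# Made-up sample listings, only to show the shape of the result.
listings = [
    {"price": 120, "room_type": "Entire home"},
    {"price": 60, "room_type": "Private room"},
    {"price": 95, "room_type": "Entire home"},
]
encoded = dummy_code(listings, "room_type")
print(encoded[0])
```

Each listing keeps its other fields and gains one column per category, which is the form most regression and clustering tools expect for categorical inputs.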

###How To Use The Scraping Code (ScrapingAirbnb.py)

The main functions are:

  1. IterateMainPage(): this function takes a location string and a page limit as parameters and returns a list of dictionaries, one for each distinct listing found for that location. For example, calling IterateMainPage('Cambridge--MA', 10) will scrape all of the distinct listings that appear on pages 1-10 of the search results for that location. The location string is in the format 'City--State', as that is how the URL is structured.

  2. iterateDetail(): this function reads in the output of IterateMainPage() and visits each specific listing to get more detailed information. If more detailed information is found, the listing's dictionary is updated to contain the additional values.

  3. writeToCSV(): this function writes the output to a CSV file.
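To make the data flow between the first two steps concrete, here is a minimal sketch of the "update the dictionary when detail is found" idea. It is an assumption about the general pattern, not the code in ScrapingAirbnb.py: `merge_details`, `fake_fetch`, the `id` key, and all field values are hypothetical, and the sketch copies each dictionary rather than updating it in place.

```python
def merge_details(listings, fetch_detail):
    """Return new listing dicts, each enriched with whatever detail was found.

    `fetch_detail` is a hypothetical callable that takes one listing dict and
    returns a dict of extra fields, or None when nothing could be retrieved.
    """
    enriched = []
    for listing in listings:
        merged = dict(listing)  # copy so the input list is left untouched
        detail = fetch_detail(listing)
        if detail:
            merged.update(detail)  # add the extra keys to the listing
        enriched.append(merged)
    return enriched


# Stand-in for a real page visit; the ids and fields are made up.
FAKE_DETAILS = {"101": {"bedrooms": 2, "instant_book": True}}


def fake_fetch(listing):
    return FAKE_DETAILS.get(listing["id"])


main_results = [{"id": "101", "price": 120}, {"id": "102", "price": 60}]
detail_results = merge_details(main_results, fake_fetch)
print(detail_results)
```

Listings without extra detail pass through unchanged, so the output always has one dictionary per input listing, though not every dictionary has the same keys.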

Example of how to run this code:

    #Iterate Through Main Page To Get Results
    MainResults = IterateMainPage('Cambridge--MA', 1)
    
    #Take The Main Results From Previous Step and Iterate Through Each Listing
    #To add more detail
    DetailResults = iterateDetail(MainResults)
    
    #Write Out Results To CSV File, using function I defined
    writeToCSV(DetailResults, 'CambridgeResults.csv')
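Because some listings may carry more fields than others after the detail step, a CSV writer has to settle on one set of columns. The sketch below shows one way to do that with the standard library's `csv.DictWriter`. It is not the repository's writeToCSV(): the name `write_dicts_to_csv`, the sample rows, and the temporary output path are assumptions made for illustration.

```python
import csv
import os
import tempfile


def write_dicts_to_csv(rows, path):
    """Write a list of dicts to `path`, one column per key seen in any row.

    Hypothetical helper; returns the column order that was written.
    """
    # Build the union of keys, preserving first-seen order.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)

    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        # restval fills in an empty cell when a row lacks a column.
        writer = csv.DictWriter(handle, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return fieldnames


# Made-up rows: the second listing has no "bedrooms" value.
rows = [
    {"id": "101", "price": 120, "bedrooms": 2},
    {"id": "102", "price": 60},
]
out_path = os.path.join(tempfile.mkdtemp(), "example.csv")
columns = write_dicts_to_csv(rows, out_path)
print(columns)
```

Missing values come out as empty cells, which keeps every row the same width when the file is loaded for analysis.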
