Coder Social home page Coder Social logo

customer-review-scraper's Introduction

This repository contains the code for a web scraping engine that scrapes customer review data from Yelp. For a google colaboratory or local implementation the scraped data is stored in a csv file. For AWS cloud implementation the scraped data is stored in a dynamodb table.

Below is a guide to implement the scraping engine on google colaboratory, local computer or on the cloud using AWS.

Google colaboratory implementation of Yelp scraper:

For a simple implementation of Yelp scraper visit the following google colaboratory link:

https://colab.research.google.com/drive/1tg5S5zII2eMw9JRCbM1sZvTsalHciwCJ?usp=sharing

Local desktop implementation of Yelp scraper:

The libraries needed to run the code are listed in the requirements.txt file.

The following is a guide to install the necessary libraries and run the code in a virtual environemnt:

  • Open a terminal application and clone this repository in your current directory using the following command:

$ git clone https://github.com/Erfangithub/Customer-Review-Scraper.git

Change directory into the downloaded (or cloned) folder Customer-Review-Scraper.

Alternatively you can download the zipfile of the repository, open the zipfile and change directory into the main folder: Customer-Review-Scraper-main

  • create a virtual environement using the command:

$ virtualenv v-env

  • activate the virtual environment:

$ source v-env/bin/activate

  • upgrade pip to latest version:

$ pip install --upgrade pip

  • install the packages using the command:

$ pip3 install -r requirements.txt

  • set the scraping parameters in lambda_function.py and run the code:

$ python3 lambda_function.py

After the execution of the code is finished the scraping results are stored in the file scrapeddata.csv.

AWS cloud implementation

The AWS diagram of this web scraper is as follows:



To implement this web scraper on AWS visit the following notion link for a step by step guide:

https://erfan-shalchian.notion.site/Web-Scraping-Engine-for-Collecting-Customer-Review-Data-from-Yelp-475e314412934094bc66e3e42b645243

customer-review-scraper's People

Contributors

erfangithub avatar

Watchers

James Cloos avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.