akotovets1 / amazon_vine_analysis

Amazon_Vine_Analysis

Big Data using PySpark, Amazon Web Services (AWS), Google Colaboratory, and pgAdmin

Overview

Data analysts were tasked with analyzing Amazon reviews written by members of the paid Amazon Vine program. Reviews from Amazon's Shoes department were analyzed to determine whether a paid Vine review makes a difference in the percentage of 5-star reviews.
An Extract, Transform, and Load (ETL) process was applied to the Amazon Shoes dataset.
pgAdmin was used to connect to AWS, and PySpark and PostgreSQL were used to split the dataset into four separate DataFrames that match the table schema in pgAdmin.
The transformed data was then uploaded to AWS RDS.
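
The ETL flow described above can be sketched roughly as follows. The project itself used PySpark, PostgreSQL, and AWS RDS; this sketch swaps in Python's standard-library `csv` and `sqlite3` modules so that it runs without any cloud setup. The column names, the table name `vine_table`, and the sample rows are all assumptions made for illustration, and only one of the four tables is shown.

```python
import csv
import io
import sqlite3

# HYPOTHETICAL two-row sample; the column names are assumed, not taken
# from the project.
SAMPLE_TSV = (
    "review_id\tstar_rating\thelpful_votes\ttotal_votes\tvine\n"
    "R1\t5\t18\t20\tY\n"
    "R2\t3\t2\t25\tN\n"
)


def extract(tsv_text):
    """Extract: parse tab-separated text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))


def transform(rows):
    """Transform: keep the review columns and cast the counts to integers."""
    return [
        (
            row["review_id"],
            int(row["star_rating"]),
            int(row["helpful_votes"]),
            int(row["total_votes"]),
            row["vine"],
        )
        for row in rows
    ]


def load(records, connection):
    """Load: create the target table if needed and insert the records."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS vine_table ("
        "review_id TEXT PRIMARY KEY, star_rating INTEGER, "
        "helpful_votes INTEGER, total_votes INTEGER, vine TEXT)"
    )
    connection.executemany(
        "INSERT INTO vine_table VALUES (?, ?, ?, ?, ?)", records
    )
    connection.commit()


# An in-memory SQLite database stands in for the AWS RDS instance.
connection = sqlite3.connect(":memory:")
load(transform(extract(SAMPLE_TSV)), connection)
row_count = connection.execute("SELECT COUNT(*) FROM vine_table").fetchone()[0]
print(row_count)
```

Keeping extract, transform, and load as separate functions mirrors the three ETL stages, so each stage could be replaced by its PySpark or PostgreSQL counterpart without touching the others.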

Results

Pic 1

How many Vine reviews and non-Vine reviews were there?

The total number of reviews was 4,366,916.

The total number of 5-star reviews was 2,639,935.

How many Vine reviews were 5 stars? How many non-Vine reviews were 5 stars?

The total number of Vine 5-star reviews was 13.

The total number of non-Vine 5-star reviews was 14,475.

What percentage of Vine reviews were 5 stars? What percentage of non-Vine reviews were 5 stars?

The percentage of Vine 5-star reviews was 59%.

The percentage of non-Vine 5-star reviews was 54%.
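
The percentages above are the 5-star count divided by the total count for each group. A minimal sketch of that arithmetic follows. The function name `five_star_percentage` is hypothetical, and because the per-group totals are not listed here, the total of 22 is an illustrative assumption chosen only because it is consistent with the reported 59%.

```python
def five_star_percentage(five_star_count, total_count):
    """Return the share of 5-star reviews as a percentage of total_count."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    return 100.0 * five_star_count / total_count


# 13 is the reported Vine 5-star count; 22 is a HYPOTHETICAL group total
# used for illustration, not a figure from the analysis.
vine_percentage = five_star_percentage(13, 22)
print(round(vine_percentage))  # 59
```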

Summary

Based on analysis of the sample selected from the Amazon Shoes reviews, paid Vine reviews did not appear to substantially affect the share of 5-star reviews. The percentage of 5-star reviews was slightly higher among Vine reviews (59%) than among non-Vine reviews (54%), and the Vine figure rests on only 13 five-star reviews.

Additional analysis could be performed on the other Amazon review datasets using the same parameters that were used for the Shoes dataset.
