FaultFindr is a web app located at http://faultfindr.xyz that uses a predictive model to identify laptop failures in Amazon product reviews, in order to summarize the failure modes of various laptops and to estimate failure rates. The web app is built with Flask, Bootstrap, and D3, and is hosted on AWS. On the backend, text processing is done using regular expressions, nltk, and textblob, and the predictive models are based on sklearn. The web application relies on a PostgreSQL database that is not supplied here. The application can be run with the faultfindr.py script.
- faultfindr.py: main application
- faultfindr: application module
  - __init__.py
  - views.py: primary Flask function for generating web pages
  - review_processing.py: a script containing objects and functions used for text cleaning and processing
    - Reviews: an object that imports reviews, cleans the text, removes stopwords, identifies n-grams above a certain frequency threshold, and tokenizes the text with respect to these n-grams
  - review_analysis.py: a script containing the objects that house the predictive models
    - ClassificationModel: an object that wraps a binary sklearn classification model together with a bag-of-words vectorizer, and includes methods for evaluating model performance and plotting
    - FaultFindr: an object that trains and houses an array of ClassificationModel objects
  - product_results.py: a script containing objects used to obtain information about laptops, and to obtain the results of the failure mode analysis for the set of reviews of a given laptop for the web application
    - Laptops: houses a list of all of the laptops, and includes a search function that returns the laptops matching a given search query
    - ProductResults: houses the set of reviews for a given laptop, applies a predictive model to classify each sentence within each review, and has methods for generating JSON data to feed into D3 plots
  - models: a directory with pickled pretrained FaultFindr models
  - static: a directory that contains mostly standard Bootstrap files and the Agency Bootstrap template used to create the webpage
  - templates: a directory containing index.html, a Flask/Jinja HTML template for generating the webpage with Flask
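As a rough illustration of the kind of processing the Reviews object is described as doing (cleaning, stopword removal, finding n-grams above a frequency threshold, and retokenizing with them), here is a minimal standard-library sketch. It is not the project's code: the function names, the tiny stopword list, the restriction to bigrams, and the underscore joining are all assumptions made for the example, and the project itself relies on nltk and textblob.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (assumption for this sketch).
STOPWORDS = {"the", "a", "an", "is", "it", "my", "and", "after", "of", "to", "was"}


def clean(text):
    """Lowercase the text and keep only runs of letters as unigram tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stopwords(tokens):
    """Drop every token that appears in the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]


def frequent_bigrams(token_lists, min_count):
    """Return the set of adjacent token pairs seen at least min_count times."""
    counts = Counter()
    for tokens in token_lists:
        counts.update(zip(tokens, tokens[1:]))
    return {pair for pair, n in counts.items() if n >= min_count}


def merge_bigrams(tokens, vocab):
    """Greedily join adjacent tokens that form a bigram in vocab."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in vocab:
            out.append(tokens[i] + "_" + tokens[i + 1])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


# Invented example reviews.
reviews = [
    "The hard drive failed after a month.",
    "My hard drive is making noise.",
    "Great screen, and the battery is fine.",
]
unigrams = [remove_stopwords(clean(r)) for r in reviews]
vocab = frequent_bigrams(unigrams, min_count=2)
tokenized = [merge_bigrams(tokens, vocab) for tokens in unigrams]
```

Here only the pair ("hard", "drive") reaches the threshold, so it is the only bigram merged into a single token; everything else stays a unigram.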
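The relationship between ClassificationModel and FaultFindr described above might look roughly like the following sketch. The class names ending in Sketch, the choice of LogisticRegression, the one-model-per-outcome layout, and the toy data are assumptions for illustration; the source only states that a binary sklearn classifier is wrapped with a bag-of-words vectorizer and that FaultFindr holds an array of such models.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression


class ClassificationModelSketch:
    """Hypothetical stand-in: a bag-of-words vectorizer plus one binary classifier."""

    def __init__(self):
        self.vectorizer = CountVectorizer()
        # The classifier type is an assumption; the source does not name one.
        self.classifier = LogisticRegression()

    def fit(self, sentences, labels):
        features = self.vectorizer.fit_transform(sentences)
        self.classifier.fit(features, labels)
        return self

    def predict(self, sentences):
        return self.classifier.predict(self.vectorizer.transform(sentences))


class FaultFindrSketch:
    """Hypothetical stand-in: trains and holds one binary model per outcome label."""

    def __init__(self, label_names):
        self.models = {name: ClassificationModelSketch() for name in label_names}

    def fit(self, sentences, labels_by_name):
        for name, model in self.models.items():
            model.fit(sentences, labels_by_name[name])
        return self

    def predict(self, sentences):
        return {name: model.predict(sentences) for name, model in self.models.items()}


# Toy training data, invented for illustration only.
sentences = [
    "battery died after one month",
    "battery will not hold a charge",
    "screen has dead pixels",
    "screen flickers constantly",
]
labels = {
    "battery": [1, 1, 0, 0],
    "screen": [0, 0, 1, 1],
}
finder = FaultFindrSketch(labels.keys()).fit(sentences, labels)
predictions = finder.predict(["the battery died"])
```

Keeping a separate vectorizer inside each wrapper means each binary model can be pickled and reloaded on its own, which fits the description of a models directory holding pickled pretrained models, though the actual serialization layout is not documented here.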
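To make the ProductResults description more concrete, the sketch below splits reviews into sentences, labels each sentence, and serializes per-mode counts as JSON that a D3 bar chart could consume. The keyword rules stand in for the trained classifiers, and the function names, the naive sentence splitter, and the "mode"/"count" field names are all assumptions; the JSON shape the real D3 plots expect is not documented here.

```python
import json
import re
from collections import Counter

# Hypothetical keyword rules standing in for the trained classifiers.
KEYWORDS = {
    "battery": ("battery", "charge"),
    "screen": ("screen", "pixel"),
    "keyboard": ("keyboard", "keys"),
}


def split_sentences(review):
    """Naive splitter on '.', '!' and '?' (an assumption, not the project's method)."""
    return [s.strip() for s in re.split(r"[.!?]+", review) if s.strip()]


def classify(sentence):
    """Return the failure modes whose keywords appear in the sentence."""
    lowered = sentence.lower()
    return [mode for mode, words in KEYWORDS.items() if any(w in lowered for w in words)]


def failure_mode_json(reviews):
    """Count flagged sentences per failure mode and serialize them as JSON."""
    counts = Counter()
    for review in reviews:
        for sentence in split_sentences(review):
            counts.update(classify(sentence))
    data = [{"mode": mode, "count": n} for mode, n in sorted(counts.items())]
    return json.dumps(data)


# Invented example reviews for a single laptop.
reviews = [
    "The battery died in a week. The screen has dead pixels!",
    "Two keys fell off the keyboard. Battery will not charge.",
]
payload = failure_mode_json(reviews)
```

A list of flat records like this is easy to bind with D3's data join, which is why it was chosen for the sketch.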
Here is a rough overview of the database structure for reference, since access to the database is not provided.
- Laptops: a list of laptops with properties
  - asin (str): Amazon product ID (ASIN)
  - title (str): name of the laptop
  - description (str): description of the laptop
  - price (float): price in USD
  - brand (str): brand of the laptop
  - related (int): number of related items
  - imUrl (str): URL of the image of the laptop
  - refurbished (boolean): whether or not it is refurbished
  - reviews (int): number of reviews
  - screen_size (double): screen size in inches
- Messages: a table of all of the reviews
  - asin (str): Amazon product ID of the reviewed laptop
  - overall (int): Amazon star rating
  - Time (timestamp): time of the review
  - summary (str): title of the review
  - Message (str): text of the review combined with the title
  - reviewerName (str): name of the reviewer
  - Tokenized (str): cleaned text tokenized by unigrams
  - TokenizedNSW (str): cleaned text tokenized by unigrams with stopwords removed
  - MWTokenized (str): cleaned text tokenized by n-grams with stopwords removed
- Vocab: a table of the n-grams in the n-gram vocabulary used for tokenization
  - Word (str): text of the word
  - Frequency (str): number of occurrences in the whole corpus
  - Length (str): number of unigrams in the n-gram
- Related: a table of mappings between laptops and related laptops (as specified by Amazon)
  - asin (str): Amazon product ID
  - related_asin (str): Amazon product ID of a related item
- TrainSentences: a table with the training data
  - sentence (str): n-gram tokenized sentence with stop words removed
  - readable (str): untokenized text
  - Columns with outcomes (all boolean): audio, battery, build_design_quality, camera, charger, cooling_system_fan_noise, cost, customer_service_returns, disk_drive, failure, freeze_crash_boot_issue, hard_drive, keyboard, motherboard_gpu_memory_processor, operating_system_bios, ports, screen
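Since the database itself is not provided, the following sketch shows how a few of the tables above could relate to one another. It uses an in-memory SQLite database purely so that it runs anywhere; the project uses PostgreSQL, and the primary key, the reduced column set, and the sample rows are assumptions made for the example.

```python
import sqlite3

# In-memory SQLite is only a stand-in; the project's database is PostgreSQL
# and its real DDL is not shown in this README.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Laptops (asin TEXT PRIMARY KEY, title TEXT, brand TEXT, price REAL);
    CREATE TABLE Related (asin TEXT, related_asin TEXT);
    CREATE TABLE Messages (asin TEXT, overall INTEGER, Message TEXT);
    """
)

# Invented sample rows.
conn.executemany(
    "INSERT INTO Laptops VALUES (?, ?, ?, ?)",
    [("A1", "Laptop One", "BrandX", 499.0), ("A2", "Laptop Two", "BrandY", 899.0)],
)
conn.execute("INSERT INTO Related VALUES ('A1', 'A2')")
conn.executemany(
    "INSERT INTO Messages VALUES (?, ?, ?)",
    [("A1", 1, "Battery died."), ("A1", 5, "Works great."), ("A2", 4, "Nice screen.")],
)

# Titles of laptops that are related to laptop A1.
related = conn.execute(
    "SELECT l.title FROM Related r JOIN Laptops l ON l.asin = r.related_asin "
    "WHERE r.asin = ?",
    ("A1",),
).fetchall()

# Number of reviews per laptop, keyed by asin.
review_counts = dict(
    conn.execute("SELECT asin, COUNT(*) FROM Messages GROUP BY asin").fetchall()
)
```

The asin column is what ties the tables together: Messages and Related both refer back to rows in Laptops through it.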