
Iron Hack Final Project: Book Recommender

This project is a book recommender: given an input book title, it outputs three recommended titles along with their respective authors, genres, and book covers. The data was gathered from two Kaggle datasets and by web scraping with Selenium and BeautifulSoup. Here is the link to my Google Slides presentation.

Motivation

I decided to explore this topic because, before the Iron Hack Data Analytics Bootcamp, my friends and I were organising a book club, and during the bootcamp I didn't have time to enjoy those books. Now that the bootcamp is over, I would very much like to restart the book club and get recommendations based on the titles and genres of books we previously liked.

Build Status

The code in the Jupyter Notebook is divided into the following parts:

  • Web Scraping: data was scraped from the Barnes & Noble website (link below) using Selenium; fields such as title, author, year_published, isbn, image_link, genre, description, publisher, page_count, and rating were pulled and organized into a DataFrame (a hedged sketch of this step follows this list). The other two datasets were pulled from the Kaggle website (links below).
  • Data Cleaning and EDA: the individual book databases were cleaned separately and then concatenated into one dataset with the columns title, author, year_published, isbn, image_link, genre, description, publisher, page_count, and rating.
  • Analysis/Recommender: a CountVectorizer-based model was used to analyse, cluster, and recommend books based on a user-inputted book title (a sketch of the similarity step appears under Code Examples).
  • Note: a large part of the notebook consists of repeated code.
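
As a rough illustration of the scraping step, here is a minimal sketch in the spirit of the notebook. The listing URL and CSS selectors are placeholders/assumptions (the actual Barnes & Noble page structure and the notebook's selectors may differ), and the notebook may use the older webdriver.Chrome(ChromeDriverManager().install()) call rather than the Selenium 4 Service style shown here. Additional fields (isbn, genre, rating, and so on) were pulled the same way.

    import pandas as pd
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By
    from webdriver_manager.chrome import ChromeDriverManager

    # Start a Chrome session managed by webdriver_manager
    driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
    driver.get('https://www.barnesandnoble.com/b/books/_/N-29Z8q8')  # placeholder listing URL

    # Collect basic fields from each product card (selectors are assumptions)
    books = []
    for card in driver.find_elements(By.CSS_SELECTOR, 'div.product-shelf-info'):
        books.append({
            'title': card.find_element(By.CSS_SELECTOR, 'h3 a').text,
            'author': card.find_element(By.CSS_SELECTOR, '.product-shelf-author').text,
        })
    driver.quit()

    scraped_df = pd.DataFrame(books)
    print(scraped_df.head())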

Code Style

Python 3, written in a Jupyter Notebook.

Screenshots

(Three screenshots of the recommender output, taken 2023-02-03.)

Tech/Framework used

I used Python and the following libraries to execute the code:

    import time
    import requests
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns
    import imageio
    from bs4 import BeautifulSoup as bs4
    from sklearn.metrics.pairwise import cosine_similarity
    from sklearn.feature_extraction.text import CountVectorizer
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys
    from webdriver_manager.chrome import ChromeDriverManager

Features

The features that were the main focus for the book recommender are: title, author, rating, genre, description, isbn, publisher, year_published, page_count, and image_link.

Code Examples

Example of building the combined text column that feeds the recommender:

    %%time
    # Drop columns not used for the text-based recommendation
    bookdata = final_data.drop(['isbn', 'year_published', 'page_count', 'description'], axis=1)

    # Combine the remaining text columns into one string per book
    bookdata['data'] = bookdata[bookdata.columns[1:]].apply(
        lambda x: ' '.join(x.dropna().astype(str)),
        axis=1
    )
    print(bookdata['data'].head())
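
To show how the combined 'data' column could feed the recommender, here is a minimal, hedged sketch of the CountVectorizer / cosine-similarity step. It assumes the bookdata frame from the example above; the exact vectorizer settings and variable names in the notebook may differ. The resulting DataFrame df (one similarity column per title, plus a 'title' column) is what the recommender in the Tests section queries with nlargest.

    import pandas as pd
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    # Turn the combined text column into a bag-of-words matrix
    vectorizer = CountVectorizer(stop_words='english')
    counts = vectorizer.fit_transform(bookdata['data'])

    # Pairwise cosine similarity between every pair of books
    similarity = cosine_similarity(counts)

    # One column per book title, so df.nlargest(4, input_title) returns the
    # rows (books) most similar to the input title
    df = pd.DataFrame(similarity, columns=bookdata['title'])
    df['title'] = bookdata['title'].values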

Installation

I already had Jupyter Notebook installed through Anaconda Navigator, but for the web scraping I had to run "pip install selenium" and "pip install webdriver-manager".

API reference

I didn't use an API for this project, as the data was gathered by web scraping with Selenium.

Tests

As previously mentioned, I used CountVectorizer for my recommender/model.

Here is an example of the code used to run the recommender:

    %%time
    try:
        # Pick the 4 most similar titles to the input book and drop the input itself
        recommendations = pd.DataFrame(df.nlargest(4, input_title)['title'])
        recommendations = recommendations[recommendations['title'] != input_title]
        a = recommendations.index.values.tolist()
        print('Here are some fun recommendations for you:')
        # Show the recommended titles alongside their author and genre
        display(pd.concat([recommendations, final_data.iloc[a][['author', 'genre']]], axis=1))
        # Fetch and display the book covers from their image links
        b = bookdata.iloc[a]['image_link'].values.tolist()
        for i in b:
            plt.imshow(imageio.imread(i))
            plt.show()
    except:
        print("Sorry, I don't have any book recommendations. You should go for a walk instead!")

How to use?

I recommend reading this README before opening the .ipynb file. The notebook makes clear how and why certain values were kept or dropped during data cleaning. For the analysis, the data was concatenated and then fed to the recommender to generate book suggestions. The whole process, from scraping to recommending, is broken into three phases: Phase 1: Web Scraping, Phase 2: Data Cleaning and EDA, and Phase 3: Recommender.

Contribute

You can contribute by opening the "Readme" file, clicking "Edit", and leaving a comment at the bottom of the document with your GitHub link and suggestions. Here is the link to my GitHub repository.

Credits

I used the following websites for datasets:

License

Used Python 3 License.
