Amazon-Scrapping

A project that scrapes information about Amazon products, using the ASINs and country codes provided in a spreadsheet.

Workflow:

  • Using the openpyxl library, the code converts the ASIN column of the spreadsheet into a Python list.
  • It then does the same for the country code column.
  • Using the zip function and the Amazon product link format ("https://www.amazon.{country}/dp/{asin}"), a list of URLs to look up is created.
  • Request headers are added so that Amazon doesn't mistake the code for a bot and block it from accessing the content.
  • A while loop is started to control the number of URLs to be processed.
  • The URLs in the URL list are passed one by one to the get function.
  • The code checks whether the page is valid by checking the response status code.
  • The Soup1 variable holds the HTML pulled from the URL.
  • The Soup2 variable is a prettified version of Soup1.
  • A try block attempts to pull the product name from the page; an AttributeError is raised if there is none.
  • Three more try blocks attempt to pull the product price, details and image URL from the page; an AttributeError is raised if any of them is missing.
  • If no AttributeError occurred, the price, details and image URL are written to a dictionary, and these dictionaries are collected into a list as the loop runs.
  • A CSV file is written inside the loop for better readability.
  • A JSON file is created from the list named content, which is a list of dictionaries, after the while loop ends.
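
The first three steps above (reading the two columns and zipping them into URLs) could look roughly like the sketch below. The function names, the column letters and the assumption that the first row is a header are illustrative and not taken from the project; only the URL format comes from the description above.

```python
from typing import Iterable, List

# Product link format quoted in the workflow above.
URL_TEMPLATE = "https://www.amazon.{country}/dp/{asin}"


def load_column(path: str, column: str) -> List[str]:
    """Read one spreadsheet column into a list of strings.

    Assumes (hypothetically) that row 1 is a header and that the data
    lives on the active sheet; empty cells are skipped.
    """
    from openpyxl import load_workbook  # third-party, imported on demand

    sheet = load_workbook(path).active
    return [
        str(cell.value).strip()
        for cell in sheet[column][1:]  # [1:] skips the assumed header row
        if cell.value is not None
    ]


def build_urls(asins: Iterable[str], countries: Iterable[str]) -> List[str]:
    """Pair each ASIN with its country code and fill in the URL template."""
    return [
        URL_TEMPLATE.format(country=country, asin=asin)
        for asin, country in zip(asins, countries)
    ]
```

With a workbook whose ASINs are in column A and country codes in column B (again an assumption), the list would be built with `build_urls(load_column("products.xlsx", "A"), load_column("products.xlsx", "B"))`. Note that zip stops at the shorter of the two columns, so a missing country code silently drops the trailing ASINs.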
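
The request and extraction steps above can be sketched as follows, using requests and Beautiful Soup. This is a minimal illustration, not the project's actual code: the header values, the element ids and class name (`productTitle`, `a-offscreen`, `feature-bullets`, `landingImage`) and both function names are assumptions, and the four separate try blocks described above are collapsed into a single one for brevity.

```python
from typing import Dict, Optional

from bs4 import BeautifulSoup

# Browser-like headers; the exact values here are placeholders.
HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64)",
    "Accept-Language": "en-US,en;q=0.9",
}


def fetch_html(url: str) -> Optional[str]:
    """Return the page HTML, or None if the response code is not 200."""
    import requests  # imported here so extract_product works offline

    response = requests.get(url, headers=HEADERS, timeout=10)
    if response.status_code != 200:
        return None
    return response.text


def extract_product(html: str) -> Optional[Dict[str, str]]:
    """Pull name, price, details and image URL out of a product page.

    soup.find() returns None when an element is absent, so calling a
    method on the result raises AttributeError. That is the error the
    workflow relies on to detect missing fields.
    """
    soup = BeautifulSoup(html, "html.parser")
    try:
        return {
            "name": soup.find(id="productTitle").get_text(strip=True),
            "price": soup.find(class_="a-offscreen").get_text(strip=True),
            "details": soup.find(id="feature-bullets").get_text(" ", strip=True),
            "image_url": soup.find(id="landingImage").get("src"),
        }
    except AttributeError:
        return None  # at least one field was missing
```

Keeping the parsing in its own function means it can be tried on a saved HTML file without sending any requests to Amazon.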
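
The loop and output steps above (a while loop that limits how many URLs are processed, a CSV written inside the loop, and a JSON file written afterwards) might be arranged as in this sketch. The `scrape` callable, the `limit` parameter, the field names and the file paths are hypothetical; only the list name `content` comes from the description above.

```python
import csv
import json
from typing import Callable, Dict, List, Optional

# Hypothetical column names for the CSV file.
FIELDS = ["name", "price", "details", "image_url"]


def process(
    urls: List[str],
    scrape: Callable[[str], Optional[Dict[str, str]]],
    limit: int,
    csv_path: str,
    json_path: str,
) -> List[Dict[str, str]]:
    """Scrape at most `limit` URLs, writing CSV rows as it goes and JSON at the end.

    `scrape` stands in for the fetch-and-parse step and is expected to
    return a dictionary keyed by FIELDS, or None when a page is unusable.
    """
    content: List[Dict[str, str]] = []
    with open(csv_path, "w", newline="", encoding="utf-8") as csv_file:
        writer = csv.DictWriter(csv_file, fieldnames=FIELDS)
        writer.writeheader()
        index = 0
        while index < min(limit, len(urls)):
            record = scrape(urls[index])
            index += 1
            if record is None:
                continue  # skip pages with missing data
            content.append(record)
            writer.writerow(record)  # CSV row is written inside the loop
    # The JSON file is written once, after the while loop has finished.
    with open(json_path, "w", encoding="utf-8") as json_file:
        json.dump(content, json_file, indent=2, ensure_ascii=False)
    return content
```

Writing each CSV row as soon as it is scraped means partial results survive if a later request fails, whereas the JSON file only exists once the whole run has completed.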
