zoranpandovski / prodirectscraper

Web scraper for http://www.prodirectselect.com/

License: MIT License

Python 100.00%
scrapy-spider scrapy-crawler python webscraping scrapy scraper spider webscraper

ProdirectScraper

Installation

Installing Scrapy inside a virtual environment on all platforms.

Python packages can be installed either globally (a.k.a. system-wide) or in user space. We do not recommend installing Scrapy system-wide.

Instead, we recommend that you install Scrapy within a so-called “virtual environment” (virtualenv).

A virtualenv keeps your packages from conflicting with Python packages already installed on the system (a conflict could break some of your system tools and scripts), while still letting you install packages normally with pip (without sudo and the like).

To install virtualenv itself globally (having it installed globally actually helps here), run:

$ [sudo] pip install virtualenv

Inside the virtual environment, install the ProdirectScraper dependencies:

pip install -r requirements.txt

Config Settings

These are the basic options:

# available currency EUR,USD,GBP
currency =
# number of items to display in the email
pp =

# mailer configuration options
smtp_host =
mail_from =
mail_to =
smtp_user =
smtp_pass =
smtp_port =
smtp_tls =
smtp_ssl =
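
The options above can be read and checked with Python's standard configparser module. The following is only a minimal sketch: the section name "settings", the sample values, and the helper load_basic_options are assumptions made for illustration, since the README does not show the section headers or how the project itself loads the file.

```python
import configparser

# Currencies listed as available in the configuration comments.
ALLOWED_CURRENCIES = {"EUR", "USD", "GBP"}

# Hypothetical sample content. The "[settings]" section name and all
# values are assumptions; the real configuration.ini may be laid out differently.
SAMPLE = """
[settings]
currency = EUR
pp = 5
smtp_host = smtp.example.com
smtp_port = 587
smtp_tls = true
smtp_ssl = false
"""


def load_basic_options(text, section="settings"):
    """Parse INI text and return the basic options as typed values."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    opts = parser[section]

    currency = opts.get("currency", "").strip().upper()
    if currency not in ALLOWED_CURRENCIES:
        raise ValueError(f"currency must be one of {sorted(ALLOWED_CURRENCIES)}")

    return {
        "currency": currency,
        "pp": opts.getint("pp"),
        "smtp_host": opts.get("smtp_host"),
        "smtp_port": opts.getint("smtp_port"),
        "smtp_tls": opts.getboolean("smtp_tls"),
        "smtp_ssl": opts.getboolean("smtp_ssl"),
    }


options = load_basic_options(SAMPLE)
print(options["currency"], options["pp"])
```

Checking the currency early means a typo such as "EURO" fails with a clear message before any crawling starts.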

After that, edit the configuration specific to the category of product you would like to scrape.

For trainers:

# available sizes are from 4 to 12, e.g. 4 or 4,5,10
size =

For men's clothing:

# available options: One size, ONE-SIZE, S/M, L/XL, S, M, L, XL, XXL
size =

For women's clothing:

# available options: OSFM,One Size,8,10,12,14,16,7 - 10,4½ - 7½,ONE-SIZE,32C,6,3½,4,4½,5,5½,6½,7,7½,XXS,XS,S,M,L,XL
size =
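
A size value such as 4,5,10 is a comma-separated list. As a rough sketch of how such a value could be split and, for trainers, checked against the 4 to 12 range, consider the helpers below. The names parse_sizes and validate_trainer_sizes are hypothetical and not part of the project, and the README does not say whether half sizes are accepted, so the range check simply treats each entry as a number.

```python
def parse_sizes(raw):
    """Split a comma-separated size value into a list of trimmed entries."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def validate_trainer_sizes(sizes, low=4, high=12):
    """Check that every entry is a number between low and high (inclusive).

    Assumption: trainer sizes are plain numbers; the real spider may
    accept other formats.
    """
    for size in sizes:
        try:
            value = float(size)
        except ValueError:
            raise ValueError(f"not a numeric trainer size: {size!r}")
        if not low <= value <= high:
            raise ValueError(f"trainer size outside {low}-{high}: {size!r}")
    return sizes


trainer_sizes = validate_trainer_sizes(parse_sizes("4,5,10"))
clothing_sizes = parse_sizes("S/M, L/XL")
print(trainer_sizes, clothing_sizes)
```

Clothing sizes are left as plain strings because values such as S/M or 7 - 10 are labels rather than numbers.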

Running the Spiders

To put our spider to work, go to the project’s top-level directory and run:

scrapy crawl SCRAPER

where "SCRAPER" must be one of the following:

  • trainers
  • mensclothing
  • womensclothing

For example, scrapy crawl trainers runs the spider named trainers, which crawls the http://www.prodirectselect.com/ website and sends an email with the lowest prices, model descriptions, and links for trainers in the sizes specified in configuration.ini.
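
If you prefer to start a spider from a Python script (for instance from a scheduled job), the command above can be wrapped with the standard subprocess module. This is a sketch only: build_crawl_command and run_spider are hypothetical helpers, not part of the project, and they assume the scrapy executable is on the PATH of the active virtual environment.

```python
import subprocess

# Spider names listed in this README.
SPIDERS = ("trainers", "mensclothing", "womensclothing")


def build_crawl_command(spider):
    """Return the argument list for 'scrapy crawl <spider>'."""
    if spider not in SPIDERS:
        raise ValueError(f"unknown spider {spider!r}; expected one of {SPIDERS}")
    return ["scrapy", "crawl", spider]


def run_spider(spider, project_dir="."):
    """Run the spider from the project's top-level directory.

    check=True raises CalledProcessError if scrapy exits with an error.
    """
    return subprocess.run(build_crawl_command(spider), cwd=project_dir, check=True)


print(build_crawl_command("trainers"))
```

Calling run_spider("trainers") from the project's top-level directory would then behave like typing the command by hand.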
