
zoranpandovski / prodirectscraper


:necktie: Web scraper for http://www.prodirectselect.com/ :shoe:

License: MIT License

Python 100.00%
scrapy-spider scrapy-crawler python webscraping scrapy scraper spider webscraper

prodirectscraper's Introduction


ProdirectScraper

Installation

Install Scrapy inside a virtual environment; this works on all platforms.

Python packages can be installed either globally (system-wide) or in user space. We do not recommend installing Scrapy system-wide.

Instead, we recommend that you install Scrapy within a so-called “virtual environment” (virtualenv).

A virtualenv keeps your packages from conflicting with already-installed Python system packages (which could break some of your system tools and scripts), while still letting you install packages normally with pip (without sudo and the like).

To install virtualenv itself globally (having it globally installed actually helps here), it should be a matter of running:

$ [sudo] pip install virtualenv

Inside the virtual environment, install the ProdirectScraper dependencies:

pip install -r requirements.txt

Config Settings

These are the basic options in configuration.ini:

# available currency EUR,USD,GBP
currency =
# Number of pieces to display in the email
pp =

# mailer configuration options
smtp_host =
mail_from =
mail_to =
smtp_user =
smtp_pass =
smtp_port =
smtp_tls =
smtp_ssl =
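
As a rough illustration of how these options might be read, here is a minimal sketch using Python's standard configparser module. The section name "settings", the sample values, and the load_settings helper are assumptions made for this example; only the option names and the three currencies come from the list above, and the project's own loading code may differ.

```python
import configparser

# Hypothetical sample: the "[settings]" section name and every value below
# are assumptions for illustration; only the option names come from the README.
SAMPLE_INI = """
[settings]
# available currency EUR,USD,GBP
currency = EUR
# Number of pieces to display in the email
pp = 5
smtp_host = smtp.example.com
mail_from = scraper@example.com
mail_to = me@example.com
smtp_user = scraper
smtp_pass = secret
smtp_port = 587
smtp_tls = true
smtp_ssl = false
"""

VALID_CURRENCIES = {"EUR", "USD", "GBP"}


def load_settings(ini_text, section="settings"):
    """Parse the basic options and convert them to typed values."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    options = parser[section]

    # Reject anything outside the three currencies the README lists.
    currency = options.get("currency", "").strip().upper()
    if currency not in VALID_CURRENCIES:
        raise ValueError(f"currency must be one of {sorted(VALID_CURRENCIES)}")

    return {
        "currency": currency,
        "pp": options.getint("pp"),
        "smtp_host": options.get("smtp_host"),
        "smtp_port": options.getint("smtp_port"),
        "smtp_tls": options.getboolean("smtp_tls"),
        "smtp_ssl": options.getboolean("smtp_ssl"),
    }


settings = load_settings(SAMPLE_INI)
```

Converting pp and smtp_port to integers and the TLS/SSL flags to booleans up front means a mistyped value fails at load time rather than partway through a crawl.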

After that, edit the configuration specific to the category of product you would like to scrape.

For trainers:

# available sizes are from 4 to 12, e.g. 4 or 4,5,10
size =

For men's clothing:

# available options:  One size, ONE-SIZE, S/M, L/XL, S, M, L, XL, XXL
size =

For women's clothing:

# available options: OSFM,One Size,8,10,12,14,16,7 - 10,4½ - 7½,ONE-SIZE,32C,6,3½,4,4½,5,5½,6½,7,7½,XXS,XS,S,M,L,XL
size =
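
The size options above can be checked before a crawl starts. The following is a minimal sketch, not the project's own code: the parse_sizes helper and the category keys (which reuse the spider names) are hypothetical, treating trainer sizes as whole numbers is an assumption, and the README only shows comma-separated lists for trainers, so accepting several values for clothing is also an assumption.

```python
# Allowed values copied from the comments in the README. Treating trainer
# sizes as whole numbers from 4 to 12 is an assumption of this sketch.
ALLOWED_SIZES = {
    "trainers": {str(n) for n in range(4, 13)},
    "mensclothing": {
        "One size", "ONE-SIZE", "S/M", "L/XL", "S", "M", "L", "XL", "XXL",
    },
    "womensclothing": {
        "OSFM", "One Size", "8", "10", "12", "14", "16", "7 - 10", "4½ - 7½",
        "ONE-SIZE", "32C", "6", "3½", "4", "4½", "5", "5½", "6½", "7", "7½",
        "XXS", "XS", "S", "M", "L", "XL",
    },
}


def parse_sizes(category, raw):
    """Split a comma-separated size option and reject unknown values."""
    allowed = ALLOWED_SIZES[category]
    # Strip whitespace around each entry and ignore empty entries.
    sizes = [part.strip() for part in raw.split(",") if part.strip()]
    unknown = [size for size in sizes if size not in allowed]
    if unknown:
        raise ValueError(f"unsupported {category} sizes: {unknown}")
    return sizes
```

For example, parse_sizes("trainers", "4,5,10") returns the three sizes as strings, while a value such as "13" raises a ValueError.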

Running the Spiders

To put our spider to work, go to the project’s top level directory and run:

scrapy crawl SCRAPER

where "SCRAPER" must be one of the following:

  • trainers
  • mensclothing
  • womensclothing

For example, scrapy crawl trainers runs the spider named trainers, which crawls the http://www.prodirectselect.com/ website and sends an email with the lowest prices, model descriptions, and links for trainers in the sizes specified in configuration.ini.
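
If the crawl is started from a script (for instance a scheduled job), the spider name can be validated before the command is run. This is a minimal sketch under that assumption: build_crawl_command and run_spider are hypothetical helpers, not part of the project, and they only wrap the scrapy crawl command shown above.

```python
import subprocess

# The three spider names listed in the README.
SPIDERS = ("trainers", "mensclothing", "womensclothing")


def build_crawl_command(spider):
    """Return the argument list for `scrapy crawl <spider>`."""
    if spider not in SPIDERS:
        raise ValueError(f"unknown spider {spider!r}; expected one of {SPIDERS}")
    return ["scrapy", "crawl", spider]


def run_spider(spider, project_dir="."):
    """Run the crawl from the project's top level directory.

    Raises subprocess.CalledProcessError if scrapy exits with an error.
    """
    return subprocess.run(build_crawl_command(spider), cwd=project_dir, check=True)
```

Calling run_spider("trainers") from the project's top level directory is then equivalent to typing the command by hand, with a clear error for a misspelled spider name.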

prodirectscraper's People

Contributors

alecj, codacy-badger, dependabot[bot], filo01, gigkokman, horstmannmat, krishnaprasanthg, mayela, moonpatroller, ranc58, sector-f, snyk-bot, tanmay17061, vinschess, zoranpandovski


prodirectscraper's Issues

Add CONTRIBUTING.md file

Subject of the issue

A CONTRIBUTING.md file would encourage other maintainers to submit well-formed PRs.

Upgrade Scrapy

Upgrade Scrapy to the newest version available and test the implementation.

Need more tests

Currently there are tests only for the helpers class. PRs with tests for other parts of the implementation are welcome.

Generalize parse code

I noticed that all the pages that display the products use the same template, so it would be possible to use the same code for all of them, with different starting URLs and options.

What do you think?
