Scrapy DockerHub

Deploy, run and monitor your Scrapy spiders.

It uses the Fabric command-line utility to manage remote Docker containers that run Scrapy spiders.

Installation

To use it in your Scrapy project, create a fabfile.py in your project directory with the following content:

from fabric.api import env

env.hosts = ['my.scrapy.server.net']
env.project = 'my-project'
env.projects_path = '~/scrapy/projects'
env.items_path = '~/scrapy/items'
env.logs_path = '~/scrapy/logs'
env.jobs_path = '~/scrapy/jobs'
env.files_path = '~/scrapy/files'

from scrapy_dockerhub.fabfile import *

To change the deploy hosts and paths, edit the corresponding env settings in this file. All directories are created automatically.

The Dockerfile is generated automatically by the deploy command (see below), but you can also generate it manually with the prepare_dockerfile command.

If you need extra build steps or dependencies, adjust the generated Dockerfile to your needs.

Commands

prepare_dockerfile(force=True)

Generates a sample Dockerfile in the current directory. If force=True, an existing file is replaced.
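The force behaviour described above can be illustrated with a minimal sketch. This is not the project's actual implementation: the function name write_dockerfile and the template contents are hypothetical, chosen only to show the "replace only when forced" rule.

```python
import os
import tempfile

# Hypothetical template; the real generated Dockerfile is not shown in this README.
SAMPLE_DOCKERFILE = "FROM python:3\n"


def write_dockerfile(directory, force=True, content=SAMPLE_DOCKERFILE):
    """Write a Dockerfile into `directory`; return True if the file was written."""
    path = os.path.join(directory, "Dockerfile")
    if os.path.exists(path) and not force:
        # Keep an existing (possibly customized) Dockerfile untouched.
        return False
    with open(path, "w") as handle:
        handle.write(content)
    return True


with tempfile.TemporaryDirectory() as workdir:
    print(write_dockerfile(workdir))               # no file yet, so it is written
    print(write_dockerfile(workdir, force=False))  # file exists, so it is kept
```

If you have customized your Dockerfile, a non-forcing call of this kind is what protects your changes from being overwritten.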

deploy()

Uploads the project to the remote server and builds the Docker image.

schedule(spider, args)

Runs a new container with the given Scrapy spider. args are appended to the scrapy command line.

Example:

fab schedule:spider=dmoz.com
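As a rough illustration of what "appended to the scrapy command line" could mean, the sketch below builds an argument list for scrapy crawl. The helper build_crawl_command is hypothetical and is not taken from scrapy_dockerhub; only the scrapy crawl command and its -a and -s options are standard Scrapy.

```python
import shlex


def build_crawl_command(spider, args=""):
    """Return the argv list for `scrapy crawl <spider>` with extra args appended."""
    command = ["scrapy", "crawl", spider]
    # shlex.split honours shell-style quoting, so quoted values stay in one piece.
    command.extend(shlex.split(args))
    return command


print(build_crawl_command("dmoz.com"))
print(build_crawl_command("dmoz.com", "-a category=books -s LOG_LEVEL=INFO"))
```

Under this assumption, spider arguments (-a) and setting overrides (-s) would reach Scrapy unchanged.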

stop(spider, job)

Stops the Scrapy container for the given spider and job.

Example:

fab stop:spider=dmoz.com,job=18
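The task:key=value,key=value form used in these examples is Fabric's per-task argument syntax. The sketch below is a simplified illustration of how such a string maps to a task name and keyword arguments. It is not Fabric's own parser (it ignores escaped commas, for instance), and parse_task_spec is a hypothetical name.

```python
def parse_task_spec(spec):
    """Split 'task:a,key=value' into (task_name, positional_args, keyword_args)."""
    name, _, raw_args = spec.partition(":")
    args, kwargs = [], {}
    for part in raw_args.split(",") if raw_args else []:
        key, sep, value = part.partition("=")
        if sep:
            kwargs[key] = value
        else:
            args.append(part)
    return name, args, kwargs


print(parse_task_spec("stop:spider=dmoz.com,job=18"))
```

Note that every value arrives as a string, so job is "18" rather than the integer 18.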

jobs()

Lists all jobs with their stats (number of requests, items, and errors).

Example output:

project          spider   job     state   items   requests   errors
abstracts    nature.com    10   running      54        113        0
abstracts    nature.com     9   running      82        163        0
abstracts    nature.com     8   running   1,346      2,498        0
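If you want to post-process this listing, for example to total the requests across jobs, a small parser is enough. The sketch below assumes the column layout of the example output above, that no field contains whitespace, and that numbers may use thousands separators. parse_jobs is a hypothetical helper, not part of scrapy_dockerhub.

```python
SAMPLE = """\
project          spider   job     state   items   requests   errors
abstracts    nature.com    10   running      54        113        0
abstracts    nature.com     9   running      82        163        0
abstracts    nature.com     8   running   1,346      2,498        0
"""

NUMERIC_COLUMNS = ("job", "items", "requests", "errors")


def parse_jobs(text):
    """Turn the whitespace-aligned jobs table into a list of dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        for key in NUMERIC_COLUMNS:
            # Strip thousands separators such as "1,346" before converting.
            row[key] = int(row[key].replace(",", ""))
        rows.append(row)
    return rows


jobs = parse_jobs(SAMPLE)
print(sum(job["requests"] for job in jobs))
```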

logs(spider, job)

Follows the spider's log for the given job (like tail -f).

Example:

fab logs:spider=dmoz.com,job=66

Contributors

shirk3y, hyp3ri0n-ng
