wanghaisheng / weibo_scrapy

This project forked from yoyzhou/weibo_scrapy

WEIBO_SCRAPY is a multi-threaded SINA WEIBO data extraction framework in Python.

Home Page: http://yoyzhou.github.io/blog/2013/04/08/weibo-scrapy-framework-with-multi-threading/

WEIBO_SCRAPY

WEIBO_SCRAPY is a multi-threaded SINA WEIBO data extraction framework written in Python. It provides a WEIBO login simulator and an interface for multi-threaded WEIBO data extraction, so users can focus on their own extraction logic instead of writing a login simulator from scratch or dealing with multi-threaded programming.

###WEIBO_SCRAPY Provides

1. WEIBO Login Simulator

2. Multi-Threading Extraction Framework

3. Extraction Task Interface

4. Easy Way of Parameters Configuration
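To make the second and third items more concrete, here is a minimal sketch of a uid-based, multi-threaded task loop. It is not WEIBO_SCRAPY's actual implementation: the names `crawl`, `do_task`, `num_threads`, and `max_uids` are assumptions made for illustration, and the "task" is a stand-in that looks uids up in a small in-memory dictionary instead of fetching anything from WEIBO.

```python
import queue
import threading


def crawl(seed_uids, do_task, num_threads=4, max_uids=100):
    """Run do_task(uid) on worker threads until the queue is drained.

    do_task may return an iterable of newly discovered uids (or None).
    Unseen uids are queued, up to max_uids in total.
    """
    tasks = queue.Queue()
    seen = set()
    lock = threading.Lock()

    def enqueue(uids):
        # Only queue uids that have not been seen before.
        with lock:
            for uid in uids:
                if uid not in seen and len(seen) < max_uids:
                    seen.add(uid)
                    tasks.put(uid)

    def worker():
        while True:
            uid = tasks.get()
            if uid is None:  # sentinel: shut this worker down
                tasks.task_done()
                return
            try:
                enqueue(do_task(uid) or [])
            except Exception as exc:
                print('task for uid %s failed: %s' % (uid, exc))
            finally:
                tasks.task_done()

    enqueue(seed_uids)
    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    tasks.join()  # returns once every queued uid has been processed
    for _ in threads:
        tasks.put(None)
    for t in threads:
        t.join()
    return seen


# Hypothetical follow graph standing in for real WEIBO data.
FOLLOWS = {'1': ['2', '3'], '2': ['3', '4'], '3': [], '4': ['1']}


def fake_task(uid):
    return FOLLOWS.get(uid, [])


visited = crawl(['1'], fake_task)
print(sorted(visited))  # ['1', '2', '3', '4']
```

The point of the design is that the task function only deals with one uid at a time and optionally returns new uids, while queueing, de-duplication, and thread management stay in the surrounding loop.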

###How to Use WEIBO_SCRAPY

    #!/usr/bin/env python
    #coding=utf8

    from weibo_scrapy import scrapy

    class my_scrapy(scrapy):

        def __init__(self, **kwds):
            # Pass uids_file, config, etc. through to the framework's constructor.
            super(my_scrapy, self).__init__(**kwds)

        def scrapy_do_task(self, uid=None):
            '''
            Override this method to perform the uid-based scraping task.
            @param uid: weibo uid
            @return: a list of uids gained from this task, optional
            '''
            # Do what you want with uid here. This scrapy is uid-based, so make sure
            # the task queue holds uids, or return new uids from this function.
            print('WOW...')
            return 'replace this string with uid list which gained from this task'

    if __name__ == '__main__':

        s = my_scrapy(uids_file = 'uids_all.txt', config = 'my.ini')
        s.scrapy()
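The example above passes a uid file and an INI file to the constructor, but this README does not describe the format of either. The sketch below shows one plausible way such inputs could be read with the standard library. Everything about the formats is an assumption: one uid per line with `#` comments, a `[scrapy]` section, and the keys `thread_number` and `wanted` are hypothetical, not WEIBO_SCRAPY's documented options.

```python
import configparser
import os
import tempfile


def load_uids(path):
    """Return the uids in path, skipping blank lines and '#' comments.

    Assumes one uid per line; this layout is a guess, not a documented format.
    """
    uids = []
    with open(path, encoding='utf-8') as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith('#'):
                uids.append(line)
    return uids


def load_settings(path):
    """Read hypothetical crawl settings, falling back to defaults."""
    parser = configparser.ConfigParser()
    parser.read(path, encoding='utf-8')
    return {
        # Section and key names are illustrative assumptions.
        'thread_number': parser.getint('scrapy', 'thread_number', fallback=4),
        'wanted': parser.getint('scrapy', 'wanted', fallback=1000),
    }


# Demo with throwaway files so the sketch runs on its own.
with tempfile.TemporaryDirectory() as tmp:
    uids_path = os.path.join(tmp, 'uids_all.txt')
    ini_path = os.path.join(tmp, 'my.ini')
    with open(uids_path, 'w', encoding='utf-8') as f:
        f.write('# seed uids (placeholders)\n1234567890\n\n2345678901\n')
    with open(ini_path, 'w', encoding='utf-8') as f:
        f.write('[scrapy]\nthread_number = 8\n')
    uids = load_uids(uids_path)
    settings = load_settings(ini_path)

print(uids)
print(settings)
```

Check the framework's source or the linked blog post for the real file formats before relying on any of these names.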

###Readings

WEIBO_SCRAPY: A UID-Based WEIBO Data Extraction Framework
