Spiderman

Environment

OS: Kali Linux 2016

Software: Python 3.5, built on two modules: PyQuery and Requests

Purpose

I wrote this spider script mainly to learn these two modules.

Okay, honestly, I wrote it so I could look at pictures of girls.

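Highlights

  1. The standout feature is automatic proxying. I could not get past Jandan's anti-scraping measures, so the only option was to distribute the crawl across proxies.
  2. Nothing else comes to mind yet.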
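
The idea of spreading requests over several proxies can be sketched roughly as follows. This is a minimal illustration and not the code in DownHtml.py: `fetch_with_proxies`, its `fetch` callable, and `max_tries` are hypothetical names introduced here. The sketch tries proxies in random order until one of them returns a page.

```python
import random


def fetch_with_proxies(url, proxies, fetch, max_tries=5):
    """Try up to max_tries randomly ordered proxies and return the first success.

    `fetch(url, proxy)` is a hypothetical callable: it returns the page body,
    or raises an exception when the proxy fails or is blocked.
    """
    candidates = list(proxies)
    random.shuffle(candidates)  # avoid always hitting the same proxy first
    last_error = None
    for proxy in candidates[:max_tries]:
        try:
            return fetch(url, proxy)
        except Exception as exc:  # any failure just moves on to the next proxy
            last_error = exc
    raise RuntimeError("all proxies failed for {}".format(url)) from last_error
```

Passing `fetch` in as a parameter keeps the retry logic separate from the HTTP library, so the same loop works with Requests or with a stub during testing.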

Features

The core functionality lives in DownHtml.py.

GetProxy function

Scrapes http://www.xicidaili.com/nn

and stores the proxies it finds in a list.

CheckProxy function

Checks whether each proxy is usable, with a timeout of 10 seconds.
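
A check like this might look as follows with Requests. This is a sketch under assumptions: the test URL, the name `check_proxy`, and the injectable `get` parameter are all hypothetical. Only the 10-second timeout comes from the description above.

```python
import requests


def check_proxy(proxy, test_url="http://example.com/", timeout=10, get=requests.get):
    """Return True if a request routed through `proxy` ('ip:port') succeeds in time.

    `test_url` is a placeholder; `get` is injectable so the check can be
    exercised without network access.
    """
    proxies = {"http": "http://" + proxy, "https": "http://" + proxy}
    try:
        response = get(test_url, proxies=proxies, timeout=timeout)
    except requests.RequestException:
        # Covers connection errors, proxy errors, and timeouts
        return False
    return response.status_code == 200
```

Usable proxies can then be kept with something like `[p for p in proxy_list if check_proxy(p)]`.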

GetHtml function

Its main job is saving web pages. In meizi.py the flow is: save the HTML first, then parse the image URLs out of it, then download the images.

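GetPageNum function

Gets the number of pages on the girl pictures (妹子图) board. I found that pages before 1500 are no longer displayed, so the script can only crawl from page 1500 up to the latest page.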
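
The resulting page range can be expressed as a small helper. The name `pages_to_crawl` is hypothetical, and the latest page number is assumed to have been read from the site already.

```python
def pages_to_crawl(latest_page, first_page=1500):
    """Return the page numbers from first_page up to and including latest_page."""
    if latest_page < first_page:
        return []  # nothing visible to crawl yet
    return list(range(first_page, latest_page + 1))
```

For example, `pages_to_crawl(1503)` gives `[1500, 1501, 1502, 1503]`.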

GetImageUrl function

Extracts the image URLs from the HTML that has already been saved locally.

GetImage function

Downloads the images to local disk.

Contributors

hgz6536
