
shumes_spider's Introduction

SHUMES_Spider

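  • Database connection settings live in settings and need to be adjusted for your own environment
  • The pipeline implements asynchronous database inserts
  • main can be used to debug or run whichever spider you need directly
  • utils contains the login code:
    • 成就系统 (achievement system): the captcha step is not implemented because it is troublesome; entering captchas by hand would make a later Scrapy deployment difficult
    • JSON retrieval from 熟知网: simple enough that it was not written as a spider, but it will need to be converted into one before deploying with Scrapy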
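
The asynchronous database import mentioned for the pipeline can be illustrated with a minimal sketch. It is not the project's code: Scrapy projects commonly use Twisted's adbapi.ConnectionPool for this, while the version below substitutes a standard-library thread pool and SQLite so that it runs without extra dependencies. The class name AsyncInsertPipeline, the news table, and the url/title/tag fields are all assumptions made for illustration.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


class AsyncInsertPipeline:
    """Hypothetical pipeline: queues each insert on a background thread."""

    def __init__(self, db_path=":memory:"):
        # check_same_thread=False lets the worker thread reuse this connection.
        self.conn = sqlite3.connect(db_path, check_same_thread=False)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS news "
            "(url TEXT PRIMARY KEY, title TEXT, tag TEXT)"
        )
        self.conn.commit()
        # A single worker serializes writes, so no extra locking is needed.
        self.executor = ThreadPoolExecutor(max_workers=1)

    def process_item(self, item, spider=None):
        # Submit the write and return at once; the crawl is not blocked.
        future = self.executor.submit(self._insert, dict(item))
        future.add_done_callback(self._report_error)
        return item

    def _insert(self, row):
        # Re-crawled URLs overwrite the earlier row instead of failing.
        self.conn.execute(
            "INSERT OR REPLACE INTO news (url, title, tag) VALUES (?, ?, ?)",
            (row["url"], row["title"], row["tag"]),
        )
        self.conn.commit()

    @staticmethod
    def _report_error(future):
        # Surface failures instead of silently dropping them.
        error = future.exception()
        if error is not None:
            print(f"insert failed: {error!r}")

    def close_spider(self, spider=None):
        # Wait for queued inserts to finish before closing the connection.
        self.executor.shutdown(wait=True)
        self.conn.close()
```

The method names process_item and close_spider follow Scrapy's item pipeline interface, so the same structure would apply if the thread pool were swapped for a real connection pool.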
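Fairly complete spiders
  • Library (图书馆), full-site crawl
    • librarynews
  • Academic Affairs Office (教务处)
    • jwc_tzgg
    • jwc_xw
  • Student Affairs Office (学生工作办公室)
    • stu_affairs_office_tzgg
    • stu_affairs_office_xgxw
  • SHUnews news site
  • Notices (通知公告) on the undergraduate admissions site; there is a second series, 工作动态 (work updates), which currently has only one valid entry
    • The spider must be modified to crawl the other series
    • enrolnews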
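Spiders in progress
  • SHUnews: the tag in the spider must be changed to crawl the corresponding series
    • The code is being adjusted so that it can fetch news from the whole site directly
  • workSHU employment information service site: this site's format is somewhat different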
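Abandoned sites
  • Media coverage (媒体关注) section of the news site: entries jump straight to external sites, so crawling them takes too much manual effort; abandoned
  • Achievement system szSHU: most of it has been handed over to 熟知网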
Elasticsearch deployment in progress
  • Database integration is complete so far
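
As a rough idea of how rows from the database could later be pushed into Elasticsearch, the sketch below builds the newline-delimited body expected by Elasticsearch's _bulk endpoint using only the standard library. It is an illustration under assumptions, not the project's indexing code: the index name shu_news, the helper build_bulk_payload, and the use of the url field as document ID are all hypothetical.

```python
import json


def build_bulk_payload(rows, index_name="shu_news"):
    """Serialize rows into the newline-delimited body of a _bulk request."""
    lines = []
    for row in rows:
        # Action line: names the target index and the document ID.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": row["url"]}}))
        # Source line: the document itself; ensure_ascii=False keeps Chinese text readable.
        lines.append(json.dumps(row, ensure_ascii=False))
    # Every line, including the last, must end with a newline.
    return "".join(line + "\n" for line in lines)
```

The resulting string would be sent as the request body with the Content-Type header set to application/x-ndjson; using the URL as the ID means a re-crawled page replaces its earlier document rather than creating a duplicate.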

tags and webnames

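# 本科招生网 (undergraduate admissions site), 教务处 (Academic Affairs Office)
"通知公告", "新闻",
# 新闻网 (news site)
"科研动态", "重要新闻", "综合新闻", "文化信息", "图片新闻",
# 图书馆 (library)
"资源动态", "公告信息", "图书馆新闻",
# 学生工作办公室 (Student Affairs Office)
"学工新闻", "通知通告", "近期工作", "党建简报", "学工视窗"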

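The tag and webname lists above can be held in a single lookup structure, which may help when a spider has to decide which site a tag belongs to. The sketch below is one possible arrangement, not code from the repository: the names TAGS_BY_WEBNAME and webnames_for_tag are hypothetical, and it assumes that 本科招生网 and 教务处 both use the two tags listed under their shared comment.

```python
# Assumed mapping from webname to its tags, transcribed from the lists above.
TAGS_BY_WEBNAME = {
    "本科招生网": ("通知公告", "新闻"),
    "教务处": ("通知公告", "新闻"),
    "新闻网": ("科研动态", "重要新闻", "综合新闻", "文化信息", "图片新闻"),
    "图书馆": ("资源动态", "公告信息", "图书馆新闻"),
    "学生工作办公室": ("学工新闻", "通知通告", "近期工作", "党建简报", "学工视窗"),
}


def webnames_for_tag(tag):
    """Return every webname that lists the given tag, in declaration order."""
    return [name for name, tags in TAGS_BY_WEBNAME.items() if tag in tags]
```

A tag shared by several sites returns all of them, so callers need the webname as well as the tag to identify a series uniquely.
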
shumes_spider's People

Contributors

yihuizhou-zyh
