
zxsq's Introduction

知识星球 (Zhishi Xingqiu) Data Scraper

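This tool connects automatically to the 知识星球 groups you have already paid for and downloads all of their posts. The downloaded data can then be filtered as needed and exported to a Word document for printing and study.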

The source code targets Python 3.6. Install the required third-party libraries yourself with pip3. The packages needed are requests, pymongo, and python-docx.
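As a quick sanity check before running the tool, the short sketch below reports which of the three packages are not yet installed. The helper name `missing_packages` is hypothetical and not part of the original code. The one fact to keep in mind is that python-docx is imported under the module name `docx`.

```python
from importlib.util import find_spec

# Maps the pip package name to the module name it is imported under.
REQUIRED = {"requests": "requests", "pymongo": "pymongo", "python-docx": "docx"}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose modules cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages, install with: pip3 install " + " ".join(missing))
    else:
        print("All required packages are installed.")
```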

To learn about personal finance and economics, search 知识星球 for “老齐的读书圈” and “齐俊杰的粉丝群”; both are very good. The code uses these two groups as its examples.

If you have questions, send an email to [email protected]

headers.txt

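This file is the most important one. It stores the cookies and the other request header fields; without valid cookies, no data can be downloaded. First log in to 知识星球 in a web browser, then find the corresponding request in the Network panel of the developer tools and copy its Request Headers into this file.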
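For illustration, here is a minimal sketch of how such a file could be turned into a dictionary for requests. It assumes that headers.txt holds one `Name: value` pair per line, as copied from the browser; the actual format expected by Zsxq.py may differ. The function names `parse_headers` and `load_headers` are hypothetical.

```python
def parse_headers(text):
    """Turn 'Name: value' lines copied from the browser into a dict."""
    headers = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and HTTP/2 pseudo-headers such as ':authority'.
        if not line or line.startswith(":"):
            continue
        # Split on the first colon only, so values such as URLs stay intact.
        name, sep, value = line.partition(":")
        if not sep:
            continue
        headers[name.strip()] = value.strip()
    return headers


def load_headers(path="headers.txt"):
    """Read the header file (assumed format) and return a header dict."""
    with open(path, encoding="utf-8") as fh:
        return parse_headers(fh.read())
```

The resulting dictionary can be passed to `requests.get(url, headers=load_headers())`, where `url` stands for whichever endpoint is being requested.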

group.ini

Records the time of the last download for each group, so that data is not downloaded twice.
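The idea of remembering a per-group timestamp can be sketched with the standard `configparser` module. The layout used here (one section per group ID with a `last_download_time` key) and the time format are assumptions made for the example, not the layout of the real group.ini.

```python
import configparser
from datetime import datetime

# Assumed storage format for timestamps; the real file may use another one.
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S"


def read_last_download(path, group_id):
    """Return the stored datetime for a group, or None if nothing is stored."""
    cfg = configparser.ConfigParser()
    cfg.read(path, encoding="utf-8")  # a missing file is silently ignored
    value = cfg.get(group_id, "last_download_time", fallback=None)
    return datetime.strptime(value, TIME_FORMAT) if value else None


def write_last_download(path, group_id, when):
    """Store the download time for a group, keeping other groups untouched."""
    cfg = configparser.ConfigParser()
    cfg.read(path, encoding="utf-8")
    if not cfg.has_section(group_id):
        cfg.add_section(group_id)
    cfg.set(group_id, "last_download_time", when.strftime(TIME_FORMAT))
    with open(path, "w", encoding="utf-8") as fh:
        cfg.write(fh)
```

A downloader could call `read_last_download` before fetching, skip anything older than the returned time, and call `write_last_download` once the run succeeds.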

Zsxq.ini

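Configures the various 知识星球 URLs; the version number in them tends to change fairly often. DOWNLOAD_FILE_FLAG controls whether the files attached to a post (if any) are downloaded along with the post.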
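A minimal sketch of reading such settings with `configparser` follows. Only the name DOWNLOAD_FILE_FLAG comes from this README; the section names, the `BASE_URL` key, and the placeholder URL are invented for the example and will not match the real Zsxq.ini.

```python
import configparser

# Hypothetical file contents; only DOWNLOAD_FILE_FLAG is a real option name.
SAMPLE = """
[api]
BASE_URL = https://api.example.invalid/v1.10

[download]
DOWNLOAD_FILE_FLAG = true
"""


def load_settings(text):
    """Parse INI text and return the settings the downloader would need."""
    cfg = configparser.ConfigParser()
    cfg.read_string(text)
    return {
        "base_url": cfg.get("api", "BASE_URL"),
        # getboolean accepts true/false, yes/no, on/off and 1/0.
        "download_files": cfg.getboolean(
            "download", "DOWNLOAD_FILE_FLAG", fallback=False
        ),
    }
```

Keeping the version number inside the configured URL means that a version bump only requires editing the INI file, not the code.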

Zsxq.py

Downloads the data. Edit the group name, group ID, and group owner ID to suit your needs.

DataHandler

Processes the downloaded data. In this code, it saves the conversations involving the group owner as a Word document.
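The filtering and export step might look roughly like the sketch below. The topic fields (`author_id`, `answered_by`, `title`, `text`) and both function names are hypothetical, since the README does not describe the stored data. The python-docx calls (`Document`, `add_heading`, `add_paragraph`, `save`) are part of that library.

```python
def owner_related(topics, owner_id):
    """Keep only topics written by or answered by the group owner."""
    kept = []
    for topic in topics:
        people = {topic.get("author_id"), topic.get("answered_by")}
        if owner_id in people:
            kept.append(topic)
    return kept


def save_as_docx(topics, path):
    """Write each topic as a heading plus a paragraph in a Word document."""
    from docx import Document  # provided by the python-docx package

    doc = Document()
    for topic in topics:
        doc.add_heading(topic.get("title", "Untitled"), level=2)
        doc.add_paragraph(topic.get("text", ""))
    doc.save(path)
```

Separating the filter from the export keeps the selection logic testable without python-docx installed, and makes it easy to swap in a different filter later.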

zxsq's People

Contributors

stefanzhong
