Coder Social home page Coder Social logo

code-spider's Introduction

Code-Spider

Help

 ____                __                   ____                    __                  
/\  _`\             /\ \                 /\  _`\           __    /\ \                 
\ \ \/\_\    ___    \_\ \     __         \ \,\L\_\  _____ /\_\   \_\ \     __   _ __  
 \ \ \/_/_  / __`\  /'_` \  /'__`\ _______\/_\__ \ /\ '__`\/\ \  /'_` \  /'__`\/\`'__\
  \ \ \L\ \/\ \L\ \/\ \L\ \/\  __//\______\ /\ \L\ \ \ \L\ \ \ \/\ \L\ \/\  __/\ \ \/ 
   \ \____/\ \____/\ \___,_\ \____\/______/ \ `\____\ \ ,__/\ \_\ \___,_\ \____\\ \_\ 
    \/___/  \/___/  \/__,_ /\/____/          \/_____/\ \ \/  \/_/\/__,_ /\/____/ \/_/ 
                                                      \ \_\                           
                                                       \/_/                           
Usage of ./code_spider_darwin_amd64:
  -cookie string
    	Cookie
  -file string
    	URL File
  -gb
    	是否拉取gitlab分支代码
  -p int
    	PageSize 需要爬取的语雀公开搜索页数, 默认第一页 (default 1)
  -pwd string
    	confluence登陆密码
  -q string
    	keyword 需要搜索的语雀内容
  -repo string
    	代码仓库或知识文档 . 目前支持: gitea, gogs, gitlab, gitblit, jenkins, confluence, yuque_open, shimo
  -tar string
    	目标 . example: http://1.1.1.1
  -user string
    	confluence登陆账号

使用须知

Gogs
    /explore/repos 存在鉴权,不存在未授权
    /user/sign_up 注册接口,<label for="user_name">判断是否存在
     鉴权需要cookie i_like_gogs=
     输入cookie时直接下源码,未输入cookie时判断注册接口是否存在
     注册存在验证码无法绕过
    
Gitea
    /explore/repos 存在未授权,无需cookie也可以跑
    鉴权需要cookie i_like_gitea=
    未输入cookie时判断注册接口是否存在
    接口存在->自动注册账号->爬取url->自动删除账号
    注册分为需要邮箱验证以及无需邮箱验证

GitLab
    /explore/projects 存在未授权,无需cookie也可以跑
    鉴权需要_gitlab_session=xxx
    注册接口都存在,但注册分为需要管理员验证以及无需管理员验证,而且存在验证token无法自动注册
    
Gitblit
    /repositories/ 存在未授权
    通过<a class="list"查找节点
    通过http://x.x.x.x/zip/?r=xx.git&format=zip下载
    鉴权需要Gitblit=
    登陆可爆破,自动尝试弱口令admin:admin登陆,若登陆成功后复用cookie获取节点
    
Jenkins
    当未设置鉴权时,dashboard存在未授权,即可访问到接口/api/json?pretty=true
    鉴权的话需要JSESSIONID.xxx=xxx
    注册为admin账号后台管理,无法前台注册

Confluence
    /login.action 登陆

语雀
    公开搜索 https://www.yuque.com/api/zsearch?q=%s&type=content&scope=/&tab=public&p=%s&sence=searchPage&time_horizon=
    知识库内文档列表 https://www.yuque.com/api/docs?book_id=%s
    markdown下载 https://www.yuque.com%s/markdown?attachment=true&latexcode=true&anchor=false&linebreak=false&&book_name=%s
石墨
    企业信息列表 https://shimo.im/lizard-api/org/departments/1/users?perPage=100&page=1
    团队空间列表 https://shimo.im/panda-api/file/spaces?orderBy=updatedAt
        空间文件列表 https://shimo.im/lizard-api/files?folder=%s
        导出markdown接口 https://shimo.im/lizard-api/office-gw/files/export?fileGuid=%s&type=md
        获取下载链接 https://shimo.im/lizard-api/office-gw/files/export/progress?taskId=%s
金山文档

使用指南

指定文件 自动识别指纹后进行相关操作

go run main.go -file url.txt

Gitea

实例1:  拉取代码(含分支)
go run main.go --repo gitea --tar http://x.x.x.x

实例2:  拉取代码(不含分支, 仅master/main)

go run main.go --repo gitea --tar http://x.x.x.x -gb

Gogs

go run main.go --tar http://x.x.x.x --repo gogs

go run main.go --tar http://x.x.x.x -cookie i_like_gogs=xxx

GitLab

实例1:  拉取代码(含分支)
go run main.go --repo gitlab --tar http://x.x.x.x

实例2:  拉取代码(不含分支, 仅master)

go run main.go --repo gitlab --tar http://x.x.x.x -gb

Gitblit

实例1:  拉取代码(含分支)
go run main.go --repo gitblit --tar http://x.x.x.x

实例2:  拉取代码(不含分支, 仅master)

go run main.go --repo gitblit --tar http://x.x.x.x -gb

Jenkins

go run main.go --tar http://x.x.x.x --repo jenkins

Confluence

go run main.go --tar http://x.x.x.x --repo confluence

示例: 登陆后爬取
go run main.go --tar http://x.x.x.x --repo confluence -user xx -pwd xx

语雀

go run main.go --repo yuque_open -q "搜索词" -p pageSize -cookie "cookie"

石墨

go run main.go -cookie "cookie" -repo shimo

Update

  • 2023.12.14 v1.4.3.1 gitlab逻辑更新
  • 2023.11.27 v1.4.3 适配gitea分页, 自动获取代码分支以及是否忽略分支
  • 2023.11.16 v1.4.2 适配gitblit通用下载接口,自动获取代码分支以及是否忽略分支
  • 2023.11.13 v1.4.1 适配gitlab多个版本差异代码逻辑, 自动获取代码分支以及是否忽略分支
  • 2023.10.13 v1.4.0 新增石墨企业通讯录获取、团队空间以及个人空间markdown下载
  • 2023.10.12 v1.4.0 新增语雀公开搜索下载markdown
  • 2023.10.12 v1.3.1 Confluence爬取逻辑修复,新增Confluence登录session后爬取
  • 2023.9.18 v1.3 新增Confluence未授权页面爬取
  • 2023.9.14 v1.2 异步下载文件
  • 2023.9.13 v1.1 新增-file指定文件,自动循环遍历url,以及自动识别url应用
  • 2023.9.12 v1.0 发布

todolist

code-spider's People

Contributors

saltyfishyu avatar

Stargazers

 avatar KZTTTTAZ avatar Xander He avatar Donald John Trump avatar h0ld1rs avatar  avatar  avatar  avatar  avatar pumpkin avatar Dusti Goodwin avatar fdsjklfweopip avatar  avatar woaiqiukui avatar

Watchers

 avatar Mstar avatar  avatar

Forkers

helloyw

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.