gospider007 / gospider

🚀Gospider is a powerful Golang web crawler that includes all the necessary libraries for transitioning from Python to Golang. It provides a fast and seamless transition for Python web crawlers to Golang.

License: GNU Lesser General Public License v3.0

Go 99.43% JavaScript 0.37% Python 0.20%
requests spider golang ja3

gospider's Issues

An error occurs when Ja3Id is set to ja3.HelloAndroid_11_OkHttp

The code is as follows:

package main

import (
	"context"
	"gitee.com/baixudong/gospider/ja3"
	"gitee.com/baixudong/gospider/requests"
	"log"
)

func main() {
	reqCli, err := requests.NewClient(context.TODO())
	if err != nil {
		log.Panic(err)
	}
	response, err := reqCli.Request(context.TODO(), "get", "https://tools.scrapfly.io/api/fp/ja3?extended=1", requests.RequestOption{Http2: true, Ja3Id: ja3.HelloAndroid_11_OkHttp})
	if err != nil {
		log.Panic(err)
	}
	log.Print(response.Text())
}

The error is as follows:
2023/03/10 16:44:30 Get "https://tools.scrapfly.io/api/fp/ja3?extended=1": remote error: tls: protocol version not supported
panic: Get "https://tools.scrapfly.io/api/fp/ja3?extended=1": remote error: tls: protocol version not supported
Switching to any other ID works without problems.

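Feature request: custom JA3 fingerprints

This is a very useful library, thanks to the author for the work. Are there any plans to support custom JA3 fingerprints?
For example, a ja3string parameter could be added to requests.RequestOption, so that users could pass in a captured standard fingerprint ("771,49195....") to spoof it.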

How to install

Could you provide a prebuilt binary?
go build does not work, and
go get -u gitee.com/baixudong/gospider does not work either.

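Two handshakes take a long time

With the same proxy in use, the total request time is 2 to 3 times that of an ordinary net/http or fasthttp request.
From what I can see, the two handshakes here seem to be what takes so long: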

if err = utlsConn.HandshakeContext(ctx); err != nil {
	if strings.HasSuffix(err.Error(), "bad record MAC") {
		// Message: "Anomaly detected with extension 22; remove this extension and retry."
		err = tools.WrapError(err, "检测到22扩展异常,请删除此扩展后重试")
	}
}

Is there any way to optimize this?
