Coder Social home page Coder Social logo

Comments (18)

seefan avatar seefan commented on July 27, 2024

暂时没有发现你说的情况,你可以在显示错误信息时,调用goerr.Error(err).Trace()来显示更多的内容,我们可以根据仔细的信息来分析。

from gossdb.

zut avatar zut commented on July 27, 2024

暂时没有发现你说的情况,你可以在显示错误信息时,调用goerr.Error(err).Trace()来显示更多的内容,我们可以根据仔细的信息来分析。

详细错误是: Trace:pool is busy,can not get new client in 5 seconds,wait count is 1

本地测试, 基本没有并发.配置也是默认的.

c := &conf.Config{
		Host:      host,
		Port:      port,
		AutoClose: true,
	}

client info信息我打印出一些. 请看最后的回复

from gossdb.

seefan avatar seefan commented on July 27, 2024

把你的代码发我一份吧,我测试一下

from gossdb.

zut avatar zut commented on July 27, 2024

把你的代码发我一份吧,我测试一下

在一个项目中用了. 无法单独出来发.

from gossdb.

zut avatar zut commented on July 27, 2024

为了方便测试,我timeout设置成2s
最后的回复,有 client info 更详细的信息.

image
是不是应为 AutoClose 导致的.你看这张图, cc.index 18 的时候没有自动关闭.

这时候响应时间变成了2s.

这个请求之后的 3分钟都没有问题.


3分钟后继续出现
image

再过1分钟,报错

image

如果可以,只指导排查方向,谢谢

from gossdb.

zut avatar zut commented on July 27, 2024

有更详细的信息了:

特征是:

  1. 时间变长,
  2. 再过一会,就会错误. Trace:pool is busy,can not get new client in 2 seconds,wait count is 1

image
image

GetClient true 19 available is 8,active is 60,wait is 0,connection count is 60,status is 1
Close 1 true 19 available is 8,active is 60,wait is 0,connection count is 60,status is 1
Close 2 false 19 available is 8,active is 59,wait is 0,connection count is 60,status is 1
GetClient true 19 available is 9,active is 60,wait is 0,connection count is 60,status is 1
2020-06-23 16:15:42.866 [WARN] [elapsed > 100ms] 200 /api/meas/dashboard 2.002142606s < 1.middleware.ProcessTime middleware.go:14  2.middleware.CORS mwCors.go:7  3.middleware.Limit mwLimit.go:17 
2020-06-23 16:15:42.867 [DEBU] 200 GET /api/meas/dashboard                       < 1.middleware.Log mwLog.go:17 
GetClient true 19 available is 1,active is 61,wait is 0,connection count is 100,status is 1
Close 1 true 19 available is 1,active is 61,wait is 0,connection count is 100,status is 1
Close 2 false 19 available is 1,active is 60,wait is 0,connection count is 100,status is 1
2020-06-23 16:15:49.364 [DEBU] 200 OPTIONS /api/meas/dashboard                   < 1.middleware.Log mwLog.go:17 
GetClient true 19 available is 1,active is 61,wait is 0,connection count is 100,status is 1
Close 1 true 19 available is 1,active is 61,wait is 0,connection count is 100,status is 1
Close 2 false 19 available is 1,active is 60,wait is 0,connection count is 100,status is 1
2020-06-23 16:15:51.418 [WARN] [elapsed > 100ms] 401 /api/meas/dashboard 2.000636574s < 1.middleware.ProcessTime middleware.go:14  2.middleware.CORS mwCors.go:7  3.middleware.Limit mwLimit.go:17 
2020-06-23 16:15:51.418 [WARN] 401 GET /api/meas/dashboard                       < 1.middleware.Log mwLog.go:15 
GetClient true 0 available is 1,active is 61,wait is 0,connection count is 100,status is 1
Close 1 true 0 available is 1,active is 61,wait is 0,connection count is 100,status is 1
Close 2 false 0 available is 1,active is 60,wait is 0,connection count is 100,status is 1
2020-06-23 16:15:56.840 [DEBU] 200 OPTIONS /api/meas/dashboard                   < 1.middleware.Log mwLog.go:17 
GetClient true 0 available is 2,active is 61,wait is 0,connection count is 100,status is 1
Close 1 true 0 available is 2,active is 61,wait is 0,connection count is 100,status is 1
Close 2 false 0 available is 2,active is 60,wait is 0,connection count is 100,status is 1
2020-06-23 16:15:58.865 [WARN] [elapsed > 100ms] 401 /api/meas/dashboard 2.001977589s < 1.middleware.ProcessTime middleware.go:14  2.middleware.CORS mwCors.go:7  3.middleware.Limit mwLimit.go:17 
2020-06-23 16:15:58.865 [WARN] 401 GET /api/meas/dashboard                       < 1.middleware.Log mwLog.go:15 
GetClient true 1 available is 1,active is 61,wait is 0,connection count is 80,status is 1
Close 1 true 1 available is 1,active is 61,wait is 0,connection count is 80,status is 1
Close 2 false 1 available is 1,active is 60,wait is 0,connection count is 80,status is 1
2020-06-23 16:16:04.832 [DEBU] 200 OPTIONS /api/meas/dashboard                   < 1.middleware.Log mwLog.go:17 
GetClient true 2 available is 1,active is 61,wait is 0,connection count is 100,status is 1
Close 1 true 2 available is 1,active is 61,wait is 0,connection count is 100,status is 1
Close 2 false 2 available is 1,active is 60,wait is 0,connection count is 100,status is 1
GetClient true 1 available is 2,active is 61,wait is 0,connection count is 100,status is 1
Close 1 true 1 available is 2,active is 61,wait is 0,connection count is 100,status is 1
Close 2 false 1 available is 2,active is 60,wait is 0,connection count is 100,status is 1
GetClient true 3 available is 1,active is 61,wait is 0,connection count is 100,status is 1
Close 1 true 3 available is 1,active is 61,wait is 0,connection count is 100,status is 1
Close 2 false 3 available is 1,active is 60,wait is 0,connection count is 100,status is 1
GetClient true 2 available is 2,active is 61,wait is 0,connection count is 100,status is 1
Close 1 true 2 available is 2,active is 61,wait is 0,connection count is 100,status is 1
Close 2 false 2 available is 2,active is 60,wait is 0,connection count is 100,status is 1
GetClient true 4 available is 1,active is 61,wait is 0,connection count is 100,status is 1
Close 1 true 4 available is 1,active is 61,wait is 0,connection count is 100,status is 1
Close 2 false 4 available is 1,active is 60,wait is 0,connection count is 100,status is 1
2020-06-23 16:16:10.863 [WARN] [elapsed > 100ms] 200 /api/meas/dashboard 4.007022137s < 1.middleware.ProcessTime middleware.go:14  2.middleware.CORS mwCors.go:7  3.middleware.Limit mwLimit.go:17 
2020-06-23 16:16:10.863 [DEBU] 200 GET /api/meas/dashboard                       < 1.middleware.Log mwLog.go:17 
GetClient true 3 available is 2,active is 61,wait is 0,connection count is 100,status is 1
Close 1 true 3 available is 2,active is 61,wait is 0,connection count is 100,status is 1
Close 2 false 3 available is 2,active is 60,wait is 0,connection count is 100,status is 1
2020-06-23 16:16:12.899 [DEBU] T2 elapsed =  2.00168s                            < 1.db.HGetTo hash.go:51  2.s.GetPlotData sMeas.go:157  3.handler.(*Handler).GetImg aImg.go:16 
2020-06-23 16:16:12.900 [ERRO] Text: Hget PlotData/100001_101__1_S2401_EIRP_20_LCH_1_1M__NTNV_SISO_2_0_0 error   Trace:pool is busy,can not get new client in 2 seconds,wait count is 1 < 1.db.HGetTo hash.go:53  2.s.GetPlotData sMeas.go:157  3.handler.(*Handler).GetImg aImg.go:16 
2020-06-23 16:16:12.901 [WARN] Hget PlotData/100001_101__1_S2401_EIRP_20_LCH_1_1M__NTNV_SISO_2_0_0 error < 1.handler.(*Handler).GetImg aImg.go:17  2.middleware.Auth mwAuth.go:24  3.middleware.ProcessTime middleware.go:11 
2020-06-23 16:16:12.902 [WARN] [elapsed > 100ms] 400 /api/img/100001_101__1_S2401_EIRP_20_LCH_1_1M__NTNV_SISO_2_0_0 2.004009257s < 1.middleware.ProcessTime middleware.go:14  2.middleware.CORS mwCors.go:7  3.middleware.Limit mwLimit.go:17 
2020-06-23 16:16:12.902 [WARN] 400 GET /api/img/100001_101__1_S2401_EIRP_20_LCH_1_1M__NTNV_SISO_2_0_0 < 1.middleware.Log mwLog.go:15 
GetClient true 5 available is 1,active is 61,wait is 0,connection count is 100,status is 1
Close 1 true 5 available is 1,active is 61,wait is 0,connection count is 100,status is 1
Close 2 false 5 available is 1,active is 60,wait is 0,connection count is 100,status is 1
2020-06-23 16:16:18.108 [DEBU] 200 OPTIONS /api/meas/dashboard                   < 1.middleware.Log mwLog.go:17 
GetClient true 4 available is 2,active is 61,wait is 0,connection count is 100,status is 1
Close 1 true 4 available is 2,active is 61,wait is 0,connection count is 100,status is 1
Close 2 false 4 available is 2,active is 60,wait is 0,connection count is 100,status is 1
2020-06-23 16:16:20.183 [WARN] [elapsed > 100ms] 401 /api/meas/dashboard 2.003259654s < 1.middleware.ProcessTime middleware.go:14  2.middleware.CORS mwCors.go:7  3.middleware.Limit mwLimit.go:17 
2020-06-23 16:16:20.183 [WARN] 401 GET /api/meas/dashboard                       < 1.middleware.Log mwLog.go:15 
GetClient true 6 available is 1,active is 61,wait is 0,connection count is 100,status is 1
Close 1 true 6 available is 1,active is 61,wait is 0,connection count is 100,status is 1
Close 2 false 6 available is 1,active is 60,wait is 0,connection count is 100,status is 1
2020-06-23 16:16:25.514 [DEBU] 200 OPTIONS /api/meas/dashboard                   < 1.middleware.Log mwLog.go:17 
GetClient true 5 available is 2,active is 61,wait is 0,connection count is 100,status is 1
Close 1 true 5 available is 2,active is 61,wait is 0,connection count is 100,status is 1
Close 2 false 5 available is 2,active is 60,wait is 0,connection count is 100,status is 1
2020-06-23 16:16:27.558 [WARN] [elapsed > 100ms] 401 /api/meas/dashboard 2.001992644s < 1.middleware.ProcessTime middleware.go:14  2.middleware.CORS mwCors.go:7  3.middleware.Limit mwLimit.go:17 
2020-06-23 16:16:27.558 [WARN] 401 GET /api/meas/dashboard                       < 1.middleware.Log mwLog.go:15 
GetClient true 7 available is 1,active is 61,wait is 0,connection count is 80,status is 1
Close 1 true 7 available is 1,active is 61,wait is 0,connection count is 80,status is 1
Close 2 false 7 available is 1,active is 60,wait is 0,connection count is 80,status is 1
2020-06-23 16:16:32.844 [DEBU] 200 OPTIONS /api/meas/dashboard                   < 1.middleware.Log mwLog.go:17 
GetClient true 6 available is 1,active is 61,wait is 0,connection count is 100,status is 1
Close 1 true 6 available is 1,active is 61,wait is 0,connection count is 100,status is 1
Close 2 false 6 available is 1,active is 60,wait is 0,connection count is 100,status is 1
2020-06-23 16:16:36.894 [DEBU] T2 elapsed =  2.000896s                           < 1.db.HGetTo hash.go:51  2.s.GetMeas sMeas.go:18  3.handler.(*Handler).GetMeasDashboard aMeas.go:48 
2020-06-23 16:16:36.895 [ERRO] Text: Hget Meas/100001 error      Trace:pool is busy,can not get new client in 2 seconds,wait count is 1 < 1.db.HGetTo hash.go:53  2.s.GetMeas sMeas.go:18  3.handler.(*Handler).GetMeasDashboard aMeas.go:48 
2020-06-23 16:16:36.896 [WARN] Hget Meas/100001 error                            < 1.handler.(*Handler).GetMeasDashboard aMeas.go:49  2.middleware.Auth mwAuth.go:24  3.middleware.ProcessTime middleware.go:11 
2020-06-23 16:16:36.896 [WARN] [elapsed > 100ms] 200 /api/meas/dashboard 2.003202975s < 1.middleware.ProcessTime middleware.go:14  2.middleware.CORS mwCors.go:7  3.middleware.Limit mwLimit.go:17 
2020-06-23 16:16:36.896 [DEBU] 200 GET /api/meas/dashboard                       < 1.middleware.Log mwLog.go:17 
GetClient true 8 available is 1,active is 61,wait is 0,connection count is 100,status is 1
Close 1 true 8 available is 1,active is 61,wait is 0,connection count is 100,status is 1
Close 2 false 8 available is 1,active is 60,wait is 0,connection count is 100,status is 1

然后
我用简单代码测试发现, active 和 connection count 都不会上涨
image

from gossdb.

seefan avatar seefan commented on July 27, 2024

我刚测试了一个小时,10个纯程在跑,并没有出现你说的问题。
我想确认以下内容,
1、你是如何获取连接的,是用NewClient还是GetClient
2、是否持有了获取的连接。

from gossdb.

zut avatar zut commented on July 27, 2024

@seefan 用了GetClient (是调用gossdb.Client())

请看代码:

func HSet(name, key, value interface{}) error {
	name2, err := cc(name)
	if err != nil {
		return err
	}
	key2, err := cc(key)
	if err != nil {
		return err
	}
	b, err := xx.Pack(value)
	if err != nil {
		return err
	}
	return gossdb.Client().HSet(name2, key2, b)
}


func HGetTo(name, key, value interface{}) error {
	name2, err := cc(name)
	if err != nil {
		return err
	}
	key2, err := cc(key)
	if err != nil {
		return err
	}
	xx.IsPointer(value)
	vPack, err := gossdb.Client().HGet(name2, key2)
	if err != nil {
		xlog.Error(goerr.Error(err).Trace())
		return err
	} else if len(vPack.Bytes()) == 0 {
		return ErrNotFound
	}
	return xx.UnpackTo(vPack.Bytes(), &value)
}
  1. 用单个测试没有这样的问题,
  2. 我是 GoFrame 的ghttp 搭建的 API服务, 你看我打印的日志, active 和 connection count 都不断上涨.

from gossdb.

seefan avatar seefan commented on July 27, 2024

1、我看每次出错附近都这类似这样[WARN] [elapsed > 100ms]的提示,能大概描述一下这个错误是什么吗。
2、从日志中可以看到,每秒并发不高,只有1到2个,但有大量的active连接,这个说明连接是直在用,且没有被回收,是有这样的业务逻辑么。还是说是因为连接无法回收导致的active居高不下。
3、每一次获取范围保存的kv的值有多大,是否达到几M

Trace:pool is busy,can not get new client in 5 seconds,wait count is 1
这个提示是说连接池内没有可以使用的连接了,所以我们需要检查一下为什么连接在使用后无法回收。

from gossdb.

zut avatar zut commented on July 27, 2024

1 这个错误是每个请求的处理时间,你可以忽略。

2没有长时间连接的业务。

3 大约几百KB?我可以具体打印出来看看。而且是slob/zstd 压缩过的。

from gossdb.

zut avatar zut commented on July 27, 2024

请问你是如何判断回收的?
@seefan

from gossdb.

seefan avatar seefan commented on July 27, 2024

通过Avaliable来控制,它是一个队列,我会把连接的位置放到队列里管理

from gossdb.

zut avatar zut commented on July 27, 2024

通过Avaliable来控制,它是一个队列,我会把连接的位置放到队列里管理

我看了一段时间log, 还没能解决.
目前已经关闭AutoClose, 还没有发现问题.

from gossdb.

seefan avatar seefan commented on July 27, 2024

看你说的情况,感觉是是autoclose在释放时有问题,我还在分析
另外,你关闭AutoClose后,用普通方式是否存在问题呢?

from gossdb.

zut avatar zut commented on July 27, 2024

看你说的情况,感觉是是autoclose在释放时有问题,我还在分析
另外,你关闭AutoClose后,用普通方式是否存在问题呢?

使用2天情况来看, 没有出现了

from gossdb.

zut avatar zut commented on July 27, 2024

@seefan 最近一次的更新, 我看到 closeClient的更新, 是不是回收方面原来有 BUG, (自动回收)

from gossdb.

seefan avatar seefan commented on July 27, 2024

只是调整回收后复用的规则,把原来的平均复用改成优先头部利用了,不影响回收。

自动回收方面你出现的问题还没有找到原因。

另外我看到你有Close1和Close2两处打印,能告诉我一下大概的位置么?

from gossdb.

zut avatar zut commented on July 27, 2024

image

@seefan 最近我关闭AutoClose,再也没有出现过了问题了, 所以应该是 AutoClose的问题。

我目前的做法是, 做了一个操作(例如Hset),就关闭。

from gossdb.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.