
grpc-stream-ping-research's Introduction

Troubleshooting gRPC "too many pings from client"

Background

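Rainbow3.0 recently went live as a phased replacement. The rollout coincided with a major sales promotion, during which the operations team scaled out machines in bulk. During these concentrated scale-outs we occasionally saw "too many pings from client" errors. Each time, the stream was disconnected and then re-established by Rainbow's reconnect mechanism.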

Investigation

The message comes from handlePing in grpc-go's server-side HTTP/2 transport:

func (t *http2Server) handlePing(f *http2.PingFrame) {
 if f.IsAck() {
  if f.Data == goAwayPing.data && t.drainChan != nil {
   close(t.drainChan)
   return
  }
  // Maybe it's a BDP ping.
  if t.bdpEst != nil {
   t.bdpEst.calculate(f.Data)
  }
  return
 }
 pingAck := &ping{ack: true}
 copy(pingAck.data[:], f.Data[:])
 t.controlBuf.put(pingAck)

 now := time.Now()
 defer func() {
  t.lastPingAt = now
 }()
 // A reset ping strikes means that we don't need to check for policy
 // violation for this ping and the pingStrikes counter should be set
 // to 0.
 if atomic.CompareAndSwapUint32(&t.resetPingStrikes, 1, 0) {
  t.pingStrikes = 0
  return
 }
 t.mu.Lock()
 ns := len(t.activeStreams)
 t.mu.Unlock()
 if ns < 1 && !t.kep.PermitWithoutStream {
  // Keepalive shouldn't be active thus, this new ping should
  // have come after at least defaultPingTimeout.
  if t.lastPingAt.Add(defaultPingTimeout).After(now) {
   t.pingStrikes++
  }
 } else {
  // Check if keepalive policy is respected.
  if t.lastPingAt.Add(t.kep.MinTime).After(now) {
   t.pingStrikes++
  }
 }

 if t.pingStrikes > maxPingStrikes {
  // Send goaway and close the connection.
  if logger.V(logLevel) {
   logger.Errorf("transport: Got too many pings from the client, closing the connection.")
  }
  t.controlBuf.put(&goAway{code: http2.ErrCodeEnhanceYourCalm, debugData: []byte("too_many_pings"), closeConn: true})
 }
}

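The code shows that the error is triggered when the client sends keepalive pings more often than the server's EnforcementPolicy MinTime allows. Every ping that arrives less than MinTime after the previous one adds a strike, and once pingStrikes exceeds maxPingStrikes the server sends GOAWAY (ENHANCE_YOUR_CALM, "too_many_pings") and closes the connection.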
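The strike counting in handlePing can be replayed offline. The following is a minimal sketch, not the grpc-go implementation: it covers only the MinTime branch (an active stream exists or PermitWithoutStream is set) and ignores resetPingStrikes. The names goAwayIndex and pingArrivals are hypothetical, and the value of maxPingStrikes is an assumption based on grpc-go's defaults.

```go
package main

import (
	"fmt"
	"time"
)

// maxPingStrikes mirrors the grpc-go constant of the same name. The value 2
// is an assumption based on grpc-go's defaults and may differ across versions.
const maxPingStrikes = 2

// pingArrivals holds hypothetical arrival offsets of client pings, measured
// from the start of the connection: one ping every 12s.
var pingArrivals = []time.Duration{
	0, 12 * time.Second, 24 * time.Second, 36 * time.Second, 48 * time.Second,
}

// goAwayIndex replays the arrivals against minTime and returns the index of
// the ping that would push the strike count past maxPingStrikes, or -1 if no
// ping does. A ping counts as a strike when it arrives less than minTime
// after the previous ping, matching lastPingAt.Add(MinTime).After(now).
func goAwayIndex(arrivals []time.Duration, minTime time.Duration) int {
	strikes := 0
	for i, at := range arrivals {
		// The first ping has no predecessor, so it can never be a strike.
		if i > 0 && at-arrivals[i-1] < minTime {
			strikes++
		}
		if strikes > maxPingStrikes {
			return i
		}
	}
	return -1
}

func main() {
	// MinTime above the 12s ping interval: every ping after the first strikes.
	fmt.Println(goAwayIndex(pingArrivals, 15*time.Second))
	// MinTime below the 12s ping interval: no strikes are recorded.
	fmt.Println(goAwayIndex(pingArrivals, 10*time.Second))
}
```

Under these assumptions a single early ping is tolerated; the connection is only closed after the strike count exceeds the limit.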

Experiments

We debugged the behavior with a sample server/client pair, varying the keepalive settings:

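| No. | client: ClientParameters | server: ServerParameters | server: EnforcementPolicy | Observed behavior |
|-----|--------------------------|--------------------------|---------------------------|-------------------|
| 1 | Time: 12 * time.Second | Time: 20 * time.Second | MinTime: 10 * time.Second | Client writes a PING every 12s (the client's shorter ping interval applies); the stream stays healthy |
| 2 | Time: 12 * time.Second | Time: 20 * time.Second | MinTime: 15 * time.Second | Client writes a PING every 12s (the client's shorter ping interval applies); the server writes GOAWAY, and the client disconnects and reconnects |
| 3 | Time: 20 * time.Second | Time: 15 * time.Second | MinTime: 10 * time.Second | Server writes a PING every 15s (the server's shorter ping interval applies); the stream stays healthy |
| 4 | Time: 10 * time.Second | Time: 10 * time.Second | MinTime: 10 * time.Second | Server and client take turns writing a PING every 10s; the stream stays healthy |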
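For reference, the three settings in each row map onto grpc-go's keepalive package roughly as follows. This is a configuration sketch using the values from row 1, not the original test program: the helper names dialOptions and serverOptions are hypothetical, and fields the table does not mention (such as Timeout) are left at their defaults.

```go
package main

import (
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/keepalive"
)

// Values taken from row 1 of the table; change them to reproduce other rows.
var (
	// Client side: send a keepalive ping after this much time without activity.
	clientKeepalive = keepalive.ClientParameters{
		Time: 12 * time.Second,
	}
	// Server side: send a keepalive ping after this much time without activity.
	serverKeepalive = keepalive.ServerParameters{
		Time: 20 * time.Second,
	}
	// Server side: minimum interval the server tolerates between client pings.
	serverPolicy = keepalive.EnforcementPolicy{
		MinTime: 10 * time.Second,
	}
)

// dialOptions returns the client options (hypothetical helper name).
func dialOptions() []grpc.DialOption {
	return []grpc.DialOption{grpc.WithKeepaliveParams(clientKeepalive)}
}

// serverOptions returns the server options (hypothetical helper name).
func serverOptions() []grpc.ServerOption {
	return []grpc.ServerOption{
		grpc.KeepaliveParams(serverKeepalive),
		grpc.KeepaliveEnforcementPolicy(serverPolicy),
	}
}

func main() {
	// In a real program these would be passed to the dial call and grpc.NewServer.
	_ = dialOptions()
	_ = serverOptions()
}
```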

Conclusion

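Based on the experiments: the client and server keepalive Time are currently both 10s, and the server's EnforcementPolicy MinTime is also 10s, so the two sides take turns sending pings while the connection is idle. When the server is under load, it may fail to send its own pings, and its handling of two consecutive client pings may be delayed so that the interval it measures between them falls below MinTime. That can produce the error we observed. A possible fix is to lower the EnforcementPolicy MinTime from 10s to 7s (10 - 3), which leaves a 3s margin.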
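The effect of the margin can be checked against measured ping timings. The sketch below uses made-up arrival offsets for a client that nominally pings every 10s but whose pings are processed with some jitter; sampleArrivals and minGap are hypothetical names, and the numbers are illustrative rather than taken from production.

```go
package main

import (
	"fmt"
	"time"
)

// sampleArrivals holds hypothetical times at which the server processed
// consecutive client pings, measured from the start of the connection.
var sampleArrivals = []time.Duration{
	0,
	10000 * time.Millisecond,
	19200 * time.Millisecond,
	29500 * time.Millisecond,
	38900 * time.Millisecond,
}

// minGap returns the smallest interval between consecutive arrivals.
// It returns 0 when fewer than two arrivals are given.
func minGap(arrivals []time.Duration) time.Duration {
	if len(arrivals) < 2 {
		return 0
	}
	smallest := arrivals[1] - arrivals[0]
	for i := 2; i < len(arrivals); i++ {
		if gap := arrivals[i] - arrivals[i-1]; gap < smallest {
			smallest = gap
		}
	}
	return smallest
}

func main() {
	gap := minGap(sampleArrivals)
	// Compare the current MinTime (10s) with the proposed one (7s).
	for _, minTime := range []time.Duration{10 * time.Second, 7 * time.Second} {
		fmt.Printf("MinTime=%v smallest gap=%v violates=%t\n", minTime, gap, gap < minTime)
	}
}
```

With these sample timings the smallest gap falls just under 10s, so a MinTime of 10s records strikes while 7s does not. The right margin depends on how much delay the server actually shows under load.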

References

https://pandaychen.github.io/2020/09/01/GRPC-CLIENT-CONN-LASTING/

