
Comments (11)

iamqizhao commented on June 20, 2024

Yup, this is a known issue and one of the major things we want to improve in the ongoing performance benchmarking and optimization work. I discussed this with @bradfitz a couple of months ago: we could either i) use bufio, as you mentioned, or ii) add a flush API to the http2 framer and implement buffering inside the http2 package, if i) still introduces unnecessary data copies.

Contributions on this are welcome. But please make the changes incrementally, because I expect them to be somewhat convoluted and error-prone.
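
To illustrate option i), here is a minimal sketch of how putting a bufio.Writer in front of a connection batches many small frame writes into fewer underlying writes. It is not grpc-go's actual transport code: countingWriter, writeFrames and countWrites are hypothetical names, and the 16-byte frames and 4096-byte buffer are arbitrary assumptions chosen for the demonstration.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
)

// countingWriter is a hypothetical stand-in for a network connection.
// It counts how many Write calls actually reach it.
type countingWriter struct {
	writes int
	bytes  int
}

func (c *countingWriter) Write(p []byte) (int, error) {
	c.writes++
	c.bytes += len(p)
	return len(p), nil
}

// writeFrames writes n copies of frame to w, one Write call per frame.
func writeFrames(w io.Writer, n int, frame []byte) error {
	for i := 0; i < n; i++ {
		if _, err := w.Write(frame); err != nil {
			return err
		}
	}
	return nil
}

// countWrites reports how many Write calls reach the underlying writer
// when n frames are sent directly or through a bufio.Writer.
func countWrites(n int, frame []byte, buffered bool) (int, error) {
	cw := &countingWriter{}
	if !buffered {
		err := writeFrames(cw, n, frame)
		return cw.writes, err
	}
	bw := bufio.NewWriterSize(cw, 4096) // buffer size is an assumption
	if err := writeFrames(bw, n, frame); err != nil {
		return cw.writes, err
	}
	// Nothing reaches cw until the buffer fills or Flush is called.
	err := bw.Flush()
	return cw.writes, err
}

func main() {
	frame := make([]byte, 16)
	direct, _ := countWrites(100, frame, false)
	batched, _ := countWrites(100, frame, true)
	fmt.Println(direct, batched) // prints "100 1"
}
```

The trade-off is the one mentioned above: bufio copies every frame into its buffer before writing, which is the extra data copy that option ii) would try to avoid.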

from grpc-go.

kenkeiter commented on June 20, 2024

I'll see if I can find some time to dig into it in the next couple of days.

As an aside: this is an excellent project. Really looking forward to using it when we get these perf issues sorted. Thank you for your hard work 👍


iamqizhao commented on June 20, 2024

Thanks. Per my reply in #89, I am going to try to have a basic benchmark framework ready this week so that you can use it to show the improvement from your pull request. :)


iamqizhao commented on June 20, 2024

I made some improvements to the client (not checked in yet) and already see a significant improvement:

http client:
2015/03/11 15:39:40 1: 5736.642904 op/sec @ p99=0.336000ms
2015/03/11 15:39:50 2: 11015.068698 op/sec @ p99=0.321000ms
2015/03/11 15:40:00 3: 17261.079718 op/sec @ p99=0.424000ms
2015/03/11 15:40:10 4: 20806.615173 op/sec @ p99=0.674000ms
2015/03/11 15:40:20 5: 917.775790 op/sec @ p99=7.762000ms

improved grpc client:
2015/03/11 15:28:20 1: 3474.533592 op/sec @ p99=0.453000ms
2015/03/11 15:28:30 2: 8111.900305 op/sec @ p99=0.657000ms
2015/03/11 15:28:40 3: 14023.490286 op/sec @ p99=0.663000ms
2015/03/11 15:28:50 4: 21179.714876 op/sec @ p99=0.617000ms
2015/03/11 15:29:00 5: 27564.232005 op/sec @ p99=0.612000ms
2015/03/11 15:29:10 6: 33875.344691 op/sec @ p99=0.603000ms
2015/03/11 15:29:20 7: 39543.547520 op/sec @ p99=0.616000ms
2015/03/11 15:29:30 8: 43567.770711 op/sec @ p99=0.668000ms
2015/03/11 15:29:40 9: 46393.655755 op/sec @ p99=0.711000ms
2015/03/11 15:29:50 10: 47250.902113 op/sec @ p99=0.796000ms
2015/03/11 15:30:00 11: 47733.268011 op/sec @ p99=0.961000ms
2015/03/11 15:30:10 12: 47531.488503 op/sec @ p99=1.130000ms
2015/03/11 15:30:20 13: 50293.363756 op/sec @ p99=0.938000ms
2015/03/11 15:30:30 14: 50614.134504 op/sec @ p99=1.006000ms
2015/03/11 15:30:40 15: 50902.922158 op/sec @ p99=1.061000ms
2015/03/11 15:30:50 16: 51342.060561 op/sec @ p99=1.103000ms
2015/03/11 15:31:00 17: 51172.659114 op/sec @ p99=1.200000ms
2015/03/11 15:31:10 18: 51340.048872 op/sec @ p99=1.230000ms
2015/03/11 15:31:20 19: 51575.161160 op/sec @ p99=1.277000ms
2015/03/11 15:31:30 20: 51478.429739 op/sec @ p99=1.376000ms
2015/03/11 15:31:40 21: 49980.940379 op/sec @ p99=1.738000ms
2015/03/11 15:31:50 22: 51071.717198 op/sec @ p99=1.681000ms
2015/03/11 15:32:00 23: 52140.282287 op/sec @ p99=1.553000ms
2015/03/11 15:32:10 24: 52124.439668 op/sec @ p99=1.593000ms
2015/03/11 15:32:20 25: 52222.698417 op/sec @ p99=1.646000ms
2015/03/11 15:32:30 26: 52365.708092 op/sec @ p99=1.795000ms
2015/03/11 15:32:40 27: 52629.151019 op/sec @ p99=1.771000ms
2015/03/11 15:32:50 28: 52815.300753 op/sec @ p99=1.857000ms
2015/03/11 15:33:00 29: 53304.403154 op/sec @ p99=1.839000ms
2015/03/11 15:33:10 30: 53081.078234 op/sec @ p99=1.986000ms
2015/03/11 15:33:20 31: 53571.344040 op/sec @ p99=1.945000ms
2015/03/11 15:33:30 32: 53348.888919 op/sec @ p99=2.008000ms
2015/03/11 15:33:40 33: 53429.124904 op/sec @ p99=2.147000ms
2015/03/11 15:33:50 34: 53687.364050 op/sec @ p99=2.196000ms
2015/03/11 15:34:00 35: 54029.536526 op/sec @ p99=2.184000ms
2015/03/11 15:34:10 36: 53837.964570 op/sec @ p99=2.264000ms
2015/03/11 15:34:20 37: 53558.114539 op/sec @ p99=2.358000ms
2015/03/11 15:34:31 38: 54276.048533 op/sec @ p99=2.395000ms
2015/03/11 15:34:41 39: 54714.463152 op/sec @ p99=2.424000ms
2015/03/11 15:34:51 40: 54343.651778 op/sec @ p99=2.501000ms
2015/03/11 15:35:01 41: 54334.642340 op/sec @ p99=2.557000ms
2015/03/11 15:35:11 42: 52279.954783 op/sec @ p99=3.224000ms
2015/03/11 15:35:21 43: 54623.572538 op/sec @ p99=2.759000ms
2015/03/11 15:35:31 44: 55033.028392 op/sec @ p99=2.729000ms
2015/03/11 15:35:41 45: 55097.426098 op/sec @ p99=2.797000ms
2015/03/11 15:35:51 46: 54736.133552 op/sec @ p99=2.973000ms
2015/03/11 15:36:01 47: 54809.898913 op/sec @ p99=2.926000ms
2015/03/11 15:36:11 48: 55465.691515 op/sec @ p99=3.009000ms
2015/03/11 15:36:21 49: 55142.363729 op/sec @ p99=3.053000ms
2015/03/11 15:36:31 50: 54523.644725 op/sec @ p99=3.341000ms
2015/03/11 15:36:41 51: 51818.090431 op/sec @ p99=3.874000ms
2015/03/11 15:36:51 52: 55958.880606 op/sec @ p99=3.230000ms
2015/03/11 15:37:02 53: 55770.858429 op/sec @ p99=3.363000ms
2015/03/11 15:37:12 54: 55643.909388 op/sec @ p99=3.520000ms
2015/03/11 15:37:22 55: 53500.566081 op/sec @ p99=3.894000ms
2015/03/11 15:37:32 56: 55789.631428 op/sec @ p99=3.722000ms
2015/03/11 15:37:42 57: 56027.047875 op/sec @ p99=3.556000ms
2015/03/11 15:37:52 58: 56038.120343 op/sec @ p99=3.638000ms
2015/03/11 15:38:02 59: 56292.927614 op/sec @ p99=3.793000ms
2015/03/11 15:38:12 60: 55018.049146 op/sec @ p99=4.073000ms
2015/03/11 15:38:22 61: 55992.929617 op/sec @ p99=4.024000ms
2015/03/11 15:38:32 62: 55822.072635 op/sec @ p99=4.004000ms
2015/03/11 15:38:42 63: 50557.821204 op/sec @ p99=5.170000ms

I am going to
i) investigate the p99 latency increase when concurrency is 1, 2, 3;
ii) improve server side IO.


iamqizhao commented on June 20, 2024

BTW, GOMAXPROCS = 8, running on my desktop (Intel 6-Core Xeon CPU, 32GB Quad-Channel).


iamqizhao commented on June 20, 2024

And Go 1.4.1 on Ubuntu (with a Google-customized kernel).


iamqizhao commented on June 20, 2024

Made another tiny change to the client. Peak throughput is now 63301 QPS, and it also beats http when concurrency is 1. Concurrency levels 2 and 3 are left for further investigation (probably a benchmark warm-up issue).

2015/03/11 16:18:18 1: 8287.741866 op/sec @ p99=0.259000ms
2015/03/11 16:18:28 2: 8740.703689 op/sec @ p99=0.460000ms
2015/03/11 16:18:38 3: 13895.774170 op/sec @ p99=0.664000ms
2015/03/11 16:18:49 4: 20081.196002 op/sec @ p99=0.653000ms
2015/03/11 16:18:59 5: 26879.486075 op/sec @ p99=0.626000ms
2015/03/11 16:19:09 6: 32999.525010 op/sec @ p99=0.627000ms
2015/03/11 16:19:19 7: 39121.779852 op/sec @ p99=0.621000ms
2015/03/11 16:19:29 8: 45110.061786 op/sec @ p99=0.627000ms
2015/03/11 16:19:39 9: 49552.546522 op/sec @ p99=0.657000ms
2015/03/11 16:19:49 10: 51632.747219 op/sec @ p99=0.711000ms
2015/03/11 16:19:59 11: 52957.872081 op/sec @ p99=0.780000ms
2015/03/11 16:20:09 12: 54336.444169 op/sec @ p99=0.829000ms
2015/03/11 16:20:19 13: 55156.122747 op/sec @ p99=0.864000ms
2015/03/11 16:20:29 14: 55970.795801 op/sec @ p99=0.913000ms
2015/03/11 16:20:39 15: 56392.698372 op/sec @ p99=0.958000ms
2015/03/11 16:20:49 16: 56639.934694 op/sec @ p99=1.006000ms
2015/03/11 16:20:59 17: 57212.360536 op/sec @ p99=1.052000ms
2015/03/11 16:21:09 18: 56732.254707 op/sec @ p99=1.102000ms
2015/03/11 16:21:19 19: 57302.967797 op/sec @ p99=1.149000ms
2015/03/11 16:21:29 20: 57268.604606 op/sec @ p99=1.189000ms
2015/03/11 16:21:39 21: 56108.716471 op/sec @ p99=1.436000ms
2015/03/11 16:21:49 22: 57034.746842 op/sec @ p99=1.389000ms
2015/03/11 16:21:59 23: 57868.233547 op/sec @ p99=1.332000ms
2015/03/11 16:22:09 24: 56552.212083 op/sec @ p99=1.563000ms
2015/03/11 16:22:19 25: 57846.229820 op/sec @ p99=1.507000ms
2015/03/11 16:22:29 26: 56325.985601 op/sec @ p99=1.684000ms
2015/03/11 16:22:39 27: 57354.014531 op/sec @ p99=1.692000ms
2015/03/11 16:22:49 28: 58479.597553 op/sec @ p99=1.570000ms
2015/03/11 16:22:59 29: 57459.811004 op/sec @ p99=1.774000ms
2015/03/11 16:23:09 30: 57723.544929 op/sec @ p99=1.831000ms
2015/03/11 16:23:19 31: 58054.148338 op/sec @ p99=1.802000ms
2015/03/11 16:23:29 32: 58245.871014 op/sec @ p99=1.904000ms
2015/03/11 16:23:39 33: 57989.680830 op/sec @ p99=1.999000ms
2015/03/11 16:23:49 34: 58629.714685 op/sec @ p99=1.986000ms
2015/03/11 16:23:59 35: 58676.437061 op/sec @ p99=2.052000ms
2015/03/11 16:24:09 36: 58030.467134 op/sec @ p99=2.235000ms
2015/03/11 16:24:19 37: 58978.829010 op/sec @ p99=2.166000ms
2015/03/11 16:24:29 38: 58860.499125 op/sec @ p99=2.233000ms
2015/03/11 16:24:39 39: 59112.461256 op/sec @ p99=2.242000ms
2015/03/11 16:24:49 40: 58102.794015 op/sec @ p99=2.554000ms
2015/03/11 16:24:59 41: 58533.033816 op/sec @ p99=2.582000ms
2015/03/11 16:25:10 42: 57432.107985 op/sec @ p99=2.877000ms
2015/03/11 16:25:20 43: 57441.592555 op/sec @ p99=2.862000ms
2015/03/11 16:25:30 44: 59182.285817 op/sec @ p99=2.679000ms
2015/03/11 16:25:40 45: 60218.583441 op/sec @ p99=2.514000ms
2015/03/11 16:25:50 46: 59339.214638 op/sec @ p99=2.635000ms
2015/03/11 16:26:00 47: 59887.236005 op/sec @ p99=2.724000ms
2015/03/11 16:26:10 48: 60498.420696 op/sec @ p99=2.759000ms
2015/03/11 16:26:20 49: 59776.344567 op/sec @ p99=2.876000ms
2015/03/11 16:26:30 50: 58319.649181 op/sec @ p99=3.157000ms
2015/03/11 16:26:40 51: 59763.456368 op/sec @ p99=3.000000ms
2015/03/11 16:26:50 52: 59949.755663 op/sec @ p99=2.936000ms
2015/03/11 16:27:00 53: 60523.324912 op/sec @ p99=3.022000ms
2015/03/11 16:27:10 54: 61237.016441 op/sec @ p99=2.900000ms
2015/03/11 16:27:20 55: 60443.541833 op/sec @ p99=3.110000ms
2015/03/11 16:27:31 56: 61355.069920 op/sec @ p99=3.002000ms
2015/03/11 16:27:41 57: 61220.384786 op/sec @ p99=3.256000ms
2015/03/11 16:27:51 58: 61390.398581 op/sec @ p99=3.121000ms
2015/03/11 16:28:01 59: 61282.281079 op/sec @ p99=3.324000ms
2015/03/11 16:28:11 60: 61879.932352 op/sec @ p99=3.189000ms
2015/03/11 16:28:21 61: 62044.597061 op/sec @ p99=3.139000ms
2015/03/11 16:28:31 62: 61517.119104 op/sec @ p99=3.518000ms
2015/03/11 16:28:41 63: 62252.693242 op/sec @ p99=3.356000ms
2015/03/11 16:28:51 64: 62464.251294 op/sec @ p99=3.363000ms
2015/03/11 16:29:01 65: 61825.208182 op/sec @ p99=3.604000ms
2015/03/11 16:29:11 66: 58922.860966 op/sec @ p99=4.134000ms
2015/03/11 16:29:22 67: 61939.896746 op/sec @ p99=3.701000ms
2015/03/11 16:29:32 68: 61960.670654 op/sec @ p99=3.775000ms
2015/03/11 16:29:42 69: 61829.558540 op/sec @ p99=3.928000ms
2015/03/11 16:29:52 70: 62892.344351 op/sec @ p99=3.711000ms
2015/03/11 16:30:02 71: 61878.997245 op/sec @ p99=3.989000ms
2015/03/11 16:30:12 72: 62742.805674 op/sec @ p99=3.801000ms
2015/03/11 16:30:22 73: 63161.662421 op/sec @ p99=3.925000ms
2015/03/11 16:30:32 74: 62578.722089 op/sec @ p99=4.087000ms
2015/03/11 16:30:42 75: 62649.500442 op/sec @ p99=4.234000ms
2015/03/11 16:30:53 76: 62845.936641 op/sec @ p99=3.981000ms
2015/03/11 16:31:03 77: 62615.585104 op/sec @ p99=4.138000ms
2015/03/11 16:31:13 78: 62959.456672 op/sec @ p99=4.205000ms
2015/03/11 16:31:23 79: 63169.017659 op/sec @ p99=4.314000ms
2015/03/11 16:31:33 80: 63301.385304 op/sec @ p99=4.133000ms
2015/03/11 16:31:43 81: 63278.517512 op/sec @ p99=4.333000ms
2015/03/11 16:31:53 82: 62771.565187 op/sec @ p99=4.552000ms
2015/03/11 16:32:03 83: 62730.621970 op/sec @ p99=4.504000ms
2015/03/11 16:32:13 84: 61752.486115 op/sec @ p99=4.856000ms
2015/03/11 16:32:24 85: 61467.460666 op/sec @ p99=4.877000ms
2015/03/11 16:32:34 86: 61068.068302 op/sec @ p99=5.045000ms


iamqizhao commented on June 20, 2024

When I started to quantify the improvement from my changes, I ran your load test on my desktop. I cannot reproduce your results when GOMAXPROCS is 8 (or anything > 1). With GOMAXPROCS = 8 (for both client and server), the existing grpc without any performance tuning reaches 50K QPS (> 70K with my changes), while the http client peaks at 20K QPS. I understand you used two EC2 instances and I used my desktop, but this still should not happen. I suspect GOMAXPROCS was not actually set to 8 in your load test, because I also ran the load test with GOMAXPROCS = 1 and the results seemed to match what you saw.


codahale commented on June 20, 2024

The profiler graphs I created were created with GOMAXPROCS=8.


codahale commented on June 20, 2024

I strongly recommend using two separate machines, as your loopback interface has very little in common with your network interface.


iamqizhao commented on June 20, 2024

I enabled IO batching in #123. Closing this one; I will track and tackle the remaining perf-related issues in #89.
