Comments (37)
I can confirm, I have the same problem.
from grpc-node.
I have the same problem. Version 1.8.4.
from grpc-node.
Me too!
from grpc-node.
Please, where is the author?
from grpc-node.
I handle this specific error (code 14) by doing this:
oldClient.close()
const newClient = new Service(addr, grpc.credentials.createInsecure())
But newClient cannot connect to the server in some situations, so how do I fully renew a client?
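For readers trying the same approach, here is a minimal sketch of renewing a client so that the old one is only dropped once the replacement reports a ready channel. It assumes the client exposes close() and waitForReady(deadline, callback) as grpc-node clients do; renewClient, shouldRenew and createClient are hypothetical names, not part of the library. It would not work around a bug inside the library itself.

```javascript
// Sketch only: swap clients after an UNAVAILABLE (code 14) error.
const UNAVAILABLE = 14; // gRPC status code reported in this thread

// True when the error looks like the "error 14" case discussed above.
function shouldRenew(err) {
  return Boolean(err) && err.code === UNAVAILABLE;
}

// createClient is a hypothetical factory supplied by the caller, e.g.
//   () => new Service(addr, grpc.credentials.createInsecure())
function renewClient(oldClient, createClient, timeoutMs, done) {
  const candidate = createClient();
  const deadline = new Date(Date.now() + timeoutMs);
  // Wait until the new channel is ready before switching over.
  candidate.waitForReady(deadline, (err) => {
    if (err) {
      candidate.close(); // replacement never connected: keep the old client
      done(err, oldClient);
      return;
    }
    oldClient.close(); // replacement is ready: retire the old client
    done(null, candidate);
  });
}
```

The callback receives whichever client should be used from then on, so the caller never ends up holding a client that has already been closed.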
from grpc-node.
It seems we are hitting something like this as well. If a server is shut down, clients are not able to reconnect to a new instance. Instead, they keep trying to complete requests against the old host and fail with error 14.
Currently our only workaround is to restart the client service so that it connects to the new server hosts.
Is there anything we can do to assist with this?
from grpc-node.
Thanks for your reply.
What do you mean by "the new server hosts"?
I just restart the server, and in some situations client requests fail with error 14. In that situation, calling new Service has no effect. I can confirm that I have switched to the new client for requests, but I keep getting error 14 after that.
from grpc-node.
We have services running in a cluster where we deploy new versions regularly. When a new version of the server is started and the old one is shut down, clients of the old service get into the above-mentioned state.
from grpc-node.
Yes, your situation is the same as mine. So you don't have a solution either?
from grpc-node.
Currently we've reverted to version 1.7.3 and we will investigate some more.
from grpc-node.
Thanks for sharing.
from grpc-node.
This is affecting me as well, especially while I'm developing.
I have nodemon restarting the server on every change.
Once the server comes back and accepts connections, the client returns Connection Failed 14 Failed to read endpoint.
This is very annoying, as I now need to restart the client too.
Is there any workaround for the time being?
from grpc-node.
Does v1.9.0 fix this?
from grpc-node.
Yes, it is fixed in version 1.9.0! Thanks a lot!
from grpc-node.
@Crevil Can you confirm that v1.9.0 fixes this problem like @rhelenagh said?
from grpc-node.
Ok, I believe you. Thanks.
from grpc-node.
@Crevil Hello, after a lot of testing, I have found another problem in v1.9.1.
Sometimes when I restart the gRPC server, client requests no longer reach the server, and no TCP connection shows up in ss -atpn either. At the same time, I no longer see any error that would tell me to restart the client, as I did before.
from grpc-node.
@zyf0330 Actually, we experienced something similar just yesterday. A client silently stopped receiving responses from the server, but restarting the client service fixed it.
I'm currently looking into what happened and how we can detect it.
from grpc-node.
Thanks a lot. I can also fix it by restarting the client.
from grpc-node.
The error appeared somewhat differently from the issues mentioned above. Last time we got error code 14 on requests when the server had restarted. This time we got nothing. I suspect an exception was thrown, but we didn't expect this and therefore had no try/catch around the client calls.
I was not able to reproduce the issue against the grpc package. In the initial post you mention you could replicate this. Do you have this setup available anywhere, e.g. GitHub?
from grpc-node.
I use a client that sends a request to the server every second, and I restart the server to make it happen. But the probability is low, and I have no way to reproduce this problem reliably. Sorry.
from grpc-node.
We just ran a couple of tests that do not indicate an issue with the gRPC layer, although it is still my number one suspect. The test was as follows.
Run an HTTP Node.js server (service A) containing a gRPC client.
On each inbound HTTP request, a gRPC request is sent to a Go gRPC server (service B), and service A then responds to the HTTP request.
We set up siege with 25 concurrent users and no delay between requests. This resulted in around 100 requests/sec.
siege \
-H 'Cache-Control: no-cache' \
-H 'Content-Type: application/json' \
https://endpoint.to.service.a.com/handler
Once siege was running, we shut down service B and let it start up on another IP in the cluster.
When only a single replica of service B was running, service A would hang during the transition, but once the new replica was up, service A resumed its requests.
The above was repeated several times with 1-3 replicas of service B.
Every time, service A was able to recover from the dropout and continue to handle inbound requests.
from grpc-node.
When you shut the servers down, did you do a graceful shutdown or just kill the process? Did you observe a significant time delay between starting up the service B replica and having your requests start to complete again?
from grpc-node.
Killed the process. There was no noticeable delay in the startup. We are running a Kubernetes setup, so there is a delay of around 10 seconds between a service starting and it actually receiving traffic. When running a single replica, service A was able to complete requests around that time.
With multiple replicas I noticed no delay at all. Traffic shifted nicely to the other replicas, and only the requests actually running on the service B instance when it was killed failed.
from grpc-node.
From your description, gRPC seems to be working as expected. When the existing server goes away, new requests start trying to connect to a new server, and when that server becomes available, those new requests start completing again.
If possible, the best way to handle that kind of situation would be to start up the new replica and wait for it to be ready to accept requests, then do a graceful shutdown of the existing server. If you do that, existing requests will complete, and new requests will go to the new server.
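As an illustration of the graceful shutdown described above, here is a minimal sketch. It assumes the server object exposes tryShutdown(callback) and forceShutdown() as grpc-node servers do; shutdownGracefully and the grace period are hypothetical choices, not something prescribed by the library.

```javascript
// Sketch only: let pending calls finish, but force shutdown after graceMs.
function shutdownGracefully(server, graceMs, done) {
  let finished = false;
  // Fallback: if pending calls do not drain in time, cancel them.
  const timer = setTimeout(() => {
    if (finished) return;
    finished = true;
    server.forceShutdown();
    done('forced');
  }, graceMs);
  // tryShutdown stops accepting new calls and calls back once
  // the pending ones have completed.
  server.tryShutdown(() => {
    if (finished) return;
    finished = true;
    clearTimeout(timer);
    done('graceful');
  });
}

// Possible wiring (assumption: `server` is the process's gRPC server):
//   process.on('SIGINT', () =>
//     shutdownGracefully(server, 10000, () => process.exit(0)));
```

The timeout is there so that a single stuck call cannot keep the old replica alive indefinitely during a rollout.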
from grpc-node.
Yeah, that was my conclusion as well. I tested a non-graceful shutdown on purpose to see how the layers would react. Normally we roll new deployments out as you describe.
The reason we decided to test this again was an issue yesterday where we had a service that stopped responding to traffic and the only possibility we can come up with is that the grpc request hangs.
Do you know of any edge-cases where the Node client could get stuck?
(To mitigate this in the future we will make sure that the requests have deadlines.)
from grpc-node.
If a server dies abruptly, then pending calls against that server will probably not end until either that call's deadline or the TCP timeout, or maybe an HTTP/2 ping timeout. And the client probably also waits for the TCP timeout or the HTTP/2 ping timeout before attempting to reconnect to the new server. That may look like a hang, depending on how long those timeouts are.
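To make the deadline part concrete, here is a minimal sketch of giving a unary call a deadline so that it cannot appear to hang forever. It assumes a grpc-node style client whose unary methods take (request, metadata, options, callback) and accept a deadline in options; callWithDeadline, isDeadlineExceeded and the method name used in the example are hypothetical.

```javascript
// Sketch only: attach a deadline to a unary call.
const DEADLINE_EXCEEDED = 4; // gRPC status code for an expired deadline

// metadata is supplied by the caller, e.g. new grpc.Metadata().
function callWithDeadline(client, methodName, request, metadata, timeoutMs, callback) {
  const deadline = new Date(Date.now() + timeoutMs);
  // The call ends with DEADLINE_EXCEEDED if no response arrives in time.
  client[methodName](request, metadata, { deadline }, callback);
}

// True when a call failed because its deadline passed.
function isDeadlineExceeded(err) {
  return Boolean(err) && err.code === DEADLINE_EXCEEDED;
}

// Possible usage (assumption: the service defines a unary method getThing):
//   callWithDeadline(client, 'getThing', { id: 1 }, new grpc.Metadata(), 5000,
//     (err, response) => {
//       if (isDeadlineExceeded(err)) { /* retry or fall back */ }
//     });
```

With a deadline in place, a call against a server that died abruptly fails after the chosen timeout instead of waiting for the TCP or HTTP/2 ping timeout.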
from grpc-node.
@Crevil In my situation, I restart the server by stopping it normally; the signal should be SIGINT.
As for request timeouts, they are useless if I have only one server, even if the client can get a timeout error.
from grpc-node.
Thanks for clarifying, @murgatroid99. I suspect that to be the case then. We've been using this package with Protobuf.js for the proto implementations, but the TypeScript typings generated by pbts effectively hide the additional parameters for service methods, e.g. those passed to Client.makeUnaryRequest(). Because of this I had no deadlines set up. (Sure thing, I should have thought about it.)
@zyf0330 Deadlines let you retry the request or find an alternative result for the clients. That should be useful. But it seems that the problem we experienced is not consistent with this one after all. As mentioned, under high load we could not make the error appear.
from grpc-node.
When I can reproduce it, I will help.
from grpc-node.
Was there any luck, @zyf0330 ?
from grpc-node.
Sorry, I haven't worked on this problem recently.
from grpc-node.
Alright, I'll close this one for now, as this might even be two separate issues anyway. If you come around and manage to have a reproduction case for us, please open a new issue with the details of the reproduction.
from grpc-node.
I use TensorFlow 1.4 and had the same issue.
from grpc-node.
@anancds please don't bump older issues like these. This one is confusing anyway because several different people added unrelated problems on top of it, so this isn't a useful comment. Please open a new issue detailing what's going on using our issue template.
from grpc-node.