Comments (11)

RanVaknin commented on June 12, 2024

Hi @alesk20,

Thanks for reaching out. The behavior is indeed odd. Since the return value from the await call to .send() is hanging, it might be because the server did not close the connection and the SDK is still awaiting a response.

Without seeing more detailed logs it would be very difficult to diagnose. This could be due to different httphandler defaults with regard to connection management that you might need to change.

For example, in the v2 SDK the default timeout was 60 seconds; in v3 we use the default provided by Node's http client, which is 0:

> requestTimeout: The number of milliseconds a request can take before automatically being terminated. Defaults to 0, which disables the timeout.

My guess is that this issue where the server hangs is also happening on v2, but the default behavior of the older version makes this more transparent. You might want to dial down the timeout to be more aggressive, perhaps to 60 seconds to align it with v2's behavior, and see if this solves your issue.

Thanks,
Ran~

from aws-sdk-js-v3.

alesk20 commented on June 12, 2024

Hi @RanVaknin,

Thank you for the response. I'll try setting the timeout explicitly to 60 seconds, but it's still strange that all the messages get published with the V2 SDK, whereas with the V3 SDK they don't get published when the SNS client hangs.
Shouldn't messages handled with the V2 SDK also fail to be published if they reach the default 60-second timeout?
What I observe is that I don't lose any messages with the V2 SDK, but with the V3 SDK I lose them when the SNS client hangs and I forcefully throw a timeout.

Thanks

alesk20 commented on June 12, 2024

Hi @RanVaknin,

I want to add another question after reading your response: in the V2 SDK, what happens when the default requestTimeout is reached? Is an error thrown, or is the promise just resolved?

The timeout of 180 seconds I mentioned in my first message was not set on client-sns; it is an external timeout that drops the process and retries. So in my actual implementation, after what you said, I think the connection to the SNS topic still hangs even if I drop the process.

It still doesn't explain why the V3 SDK has these slowdowns when publishing messages to the SNS topic, while the V2 SDK delivers them immediately, even under heavy load, without missing any delivery.

Thanks

RanVaknin commented on June 12, 2024

Hi @alesk20, requestTimeout means that the connection will be terminated from the client side. It does not mean a retry.

> Shouldn't messages handled with the V2 SDK also fail to be published if they reach the default 60-second timeout?

Not necessarily, the server might receive and process your request but it might not be responding with the status to inform the client that the message was / wasn't processed.
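
A minimal sketch of what a client-side deadline does and does not give you, assuming a Node.js version that provides the global AbortController. The helper name withDeadline and the stand-in operation neverResponds are hypothetical and not part of either SDK. With v3 the operation could be something like (signal) => client.send(command, { abortSignal: signal }), since send accepts an abortSignal option. Aborting only makes the client stop waiting: nothing is retried, and the rejection says nothing about whether the server already processed the request.

```javascript
// Hypothetical helper: run an abortable operation under a hard client-side deadline.
// `operation` receives an AbortSignal and returns a promise.
async function withDeadline(operation, ms) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    return await operation(controller.signal);
  } finally {
    clearTimeout(timer); // never leave the timer pending once the operation settles
  }
}

// Stand-in for a request whose response never arrives: it settles only when aborted.
const neverResponds = (signal) =>
  new Promise((_resolve, reject) => {
    signal.addEventListener("abort", () => reject(new Error("aborted by client")), { once: true });
  });

// The caller gets a rejection and must decide what to do next (retry, re-queue, alert).
withDeadline(neverResponds, 50).catch((err) => console.log(err.message)); // prints "aborted by client"
```

Because the rejection surfaces in the caller, a message is only lost if that rejection is swallowed or the process exits before handling it.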

It's hard to say why you are only experiencing this with v3. It might be because of differences in connection management, or something you did differently in your code.
Without seeing an end to end example it will be very difficult to root cause this.

Can you set up a minimal github repository that can reliably (intermittently reliably is also ok) reproduce this behavior?
Ideally this reproduction would have the working v2, and the non working v3 code so we can compare these as well.

Thanks,
Ran~

alesk20 commented on June 12, 2024

Hi @RanVaknin,

Unfortunately it's very difficult to replicate this case. It only happens after 1-2 hours and only in the production environment, where I have a lot of traffic on the SQS queue. I also tried to replicate it in a test environment, but couldn't manage to do it.

As I said in the first message, I didn't change anything in the code. I just migrated from the V2 SDK to the V3 SDK and upgraded Node.js 16 to Node.js 18; these are the only two things I changed. I don't think the problem is the Node.js 18 version.

Can you tell me what happens in the V2 SDK when the default requestTimeout is reached? Does the promise get resolved, or is an error thrown?

Thanks.

RanVaknin commented on June 12, 2024

Hi @alesk20 ,

> Can you tell me what happens in the V2 SDK when the default requestTimeout is reached? Does the promise get resolved, or is an error thrown?

When the v2 requestTimeout (called timeout in v2) is reached, the client kills the connection and an error is thrown, as shown here: https://github.com/aws/aws-sdk-js/blob/36e3f6d5c27adf522b7517f095f060f4581d9b03/lib/http/node.js#L86. You might be handling it in v2 and not doing so in v3?

> As I said in the first message, I didn't change anything in the code. I just migrated from the V2 SDK to the V3 SDK and upgraded Node.js 16 to Node.js 18; these are the only two things I changed. I don't think the problem is the Node.js 18 version.

I understand your concern, however I cannot point to a single place in the SDK and say "this is why your code is not working like it did in v2". There are about 8 years of development between when v2 was first introduced and when v3 was released; the architecture of the two is very different and evolved with the JS language itself and the ecosystem's best practices.

I tried to strip down all of the http configurations used by the v2 SDK and found that the only http option we explicitly override is indeed timeout. However, I was wrong initially: we actually set it to 120000 ms (2 min) by default:

console.log(sns.config.httpOptions)
// prints: { timeout: 120000 }

I don't think it will be helpful for us to keep comparing the two; instead we should focus on how to help with your current setup.

Are you running your application from something like a Docker container? I'm asking because Docker has decent support for tcpdump, which allows you to inspect TCP-level networking events. You could use that, or any other network diagnostic tool, to find what closes those connections.

I understand that your current repro code does not raise the reported behavior, but can you please share it anyway? Right now we are doing a lot of theorizing which is not helpful. By you sharing your code we can better visualize the architecture and do a simple visual check of certain things you might be missing to get this to work correctly (this is not to suggest that your code is wrong). If you have the v2 code handy, feel free to share that too.

Thanks again for your cooperation.

All the best,
Ran~

github-actions commented on June 12, 2024

This issue has not received a response in 1 week. If you still think there is a problem, please leave a comment to avoid the issue from automatically closing.

alesk20 commented on June 12, 2024

Hi @RanVaknin,
After further investigation, it seems that the problem lies in Node 18, which shows hanging HTTP requests in other places as well, not only with the AWS SDK. I will investigate more and try to release my project with Node 20, which does not seem to have these hanging problems.

alesk20 commented on June 12, 2024

Hi @RanVaknin,
I think I found the problem, and it's not the Node.js version. The problem is with the S3 client from "@aws-sdk/client-s3": I managed to replicate the issue, and I see that the SDK never closes the sockets opened for S3 requests, which eventually leads to a bottleneck in the server's socket pool.
I think I solved the problem by forcing "requestTimeout" on the S3 client:

const s3 = new S3({
  ...options.s3,
  requestHandler: new NodeHttpHandler({
    httpAgent: new Agent({ keepAlive: true, keepAliveMsecs: 1000 }),
    requestTimeout: 5000
  })
});

By doing this, I see that the S3 sockets are closed after 5 seconds and no connection hangs.
Isn't this an SDK bug? With aws-sdk v2 the connections to S3 were closed automatically after the response.

Kind regards.

RanVaknin commented on June 12, 2024

Hi @alesk20 ,

I don't know the S3 operation you are using since it was not mentioned in the original issue description, but if I had to guess it's with the actual response from getObject. In v3 it returns a stream, and in NodeJS if you don't consume a stream the underlying connection might stay open.

This is covered here:

> Because keepAlive is defaulted to true, if you acquire a streaming response, such as S3::getObject's Body field, you must read the stream to completion in order for the socket to close naturally.
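
A minimal sketch of reading a body to completion when its contents are not needed, assuming the Body is a Node.js Readable stream (as it is in the Node.js runtime) and is therefore async iterable. The helper name drain and the stand-in fakeBody are hypothetical and not part of the SDK. When the contents are needed, the transformToByteArray call shown elsewhere in this thread already reads the stream fully.

```javascript
// Hypothetical helper: read an async-iterable body to the end and discard the bytes.
// Node.js Readable streams are async iterable, so `for await` consumes them fully.
async function drain(body) {
  let bytes = 0;
  for await (const chunk of body) {
    bytes += chunk.length;
  }
  return bytes; // number of bytes discarded
}

// Stand-in for a response body, so the sketch runs without AWS credentials.
async function* fakeBody() {
  yield Buffer.from("ab");
  yield Buffer.from("cde");
}

drain(fakeBody()).then((n) => console.log(n)); // prints 5
```

The point is to apply this on every code path that receives a streaming Body but does not otherwise read it, for example an early return after inspecting only the response metadata.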

Thanks,
Ran~

alesk20 commented on June 12, 2024

Hi @RanVaknin,
Yes, I publish and retrieve different objects to/from S3. When I use the getObject operation I always consume the body like this:

const s3ObjectBody = await s3Object.Body.transformToByteArray();

Am I missing something?

Thanks.

EDIT: There was actually a point in the code where I was not consuming the Body stream. I fixed that, and from my tests it seems to resolve the issue, even after removing the "requestTimeout" I had added as a workaround. I'll let you know if the problem remains.

Thank you again.

Kind regards.
