
Comments (10)

jchristgit commented on September 27, 2024

Please retry with the latest master. I haven't seen this error myself before nor could I reproduce it, but it should deal with automatically requeueing requests that were closed abruptly.


jchristgit commented on September 27, 2024

I pushed another commit to master, with the same caveats: I cannot really test it.

Are you on an unreliable network? Or do you have another clue why these may keep appearing?


jchristgit commented on September 27, 2024

Sorry, but I have no environment where I can reproduce this at all.

Since you mentioned that the only difference between stage and prod is the AWS NAT Gateway, I cannot help but assume that the NAT Gateway is at fault and nostrum works properly here. Perhaps the NAT Gateway is messing around with HTTP/2 connections, or it is doing some other strange form of inspection of the request. In nostrum, the bot gateway request does not do anything out of the ordinary in terms of HTTP requests.


tspenov commented on September 27, 2024

Please retry with the latest master. I haven't seen this error myself before nor could I reproduce it, but it should deal with automatically requeueing requests that were closed abruptly.

Thanks for the quick reply! I'll test it out and let you know.


tspenov commented on September 27, 2024

Please retry with the latest master. I haven't seen this error myself before nor could I reproduce it, but it should deal with automatically requeueing requests that were closed abruptly.

@jchristgit I tested it, and I now see this error in the logs:

Date: 2023-06-29T14:03:02.570
Level: error

State machine 'Elixir.Nostrum.Api.Ratelimiter' terminating

Last event = {
  info, {
    gun_down, <0.5653.0>, http2, closed, [#Ref<0.2297931708.4051173377.228586>]
  }
}

Server state = {
  connected, #{
    conn => <0.5653.0>,
    inflight => #{},
    outstanding => #{
      <<"/users/@me">> => {
        initial, {
          [{
            #{body => <<>>,
              headers => [{<<"content-type">>, <<"application/json">>}],
              method => get,
              params => [],
              route => <<"/users/@me">>
            },
            {<0.6601.0>, #Ref<0.2297931708.4051173377.228585>}
          }], []
        }
      }
    },
    remaining_in_window => 48,
    running => #{}
  }
}

Reason for termination = error: {badkey, #Ref<0.2297931708.4051173377.228586>}

Callback modules = ['Elixir.Nostrum.Api.Ratelimiter']

Callback mode = state_functions

Stacktrace = [
  {erlang, map_get, [#Ref<0.2297931708.4051173377.228586>, #{}], [{error_info, #{module => erl_erts_errors}}]},
  {'Elixir.Nostrum.Api.Ratelimiter', '-connected/3-fun-1-', 3, [{file, "lib/nostrum/api/ratelimiter.ex"}, {line, 881}]},
  {'Elixir.Enum', '-map/2-lists^map/1-0-', 2, [{file, "lib/enum.ex"}, {line, 1658}]},
  {'Elixir.Nostrum.Api.Ratelimiter', connected, 3, [{file, "lib/nostrum/api/ratelimiter.ex"}, {line, 878}]},
  {gen_statem, loop_state_callback, 11, [{file, "gen_statem.erl"}, {line, 1426}]},
  {proc_lib, init_p_do_apply, 3, [{file, "proc_lib.erl"}, {line, 240}]}
]

Time-outs: {1, [{{timeout, reset_bot_calls_window}, expired}]}
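
For context, the {badkey, ...} crash boils down to the :gun_down message carrying stream references that are looked up in a tracking map which is already empty. A minimal reduction of that pattern, plus a defensive variant (hypothetical variable names, not Nostrum's actual code):

# Hypothetical reduction: the :gun_down message lists killed stream
# references, but the map tracking running requests is already empty,
# so the raw lookup raises a {:badkey, ref} error like the one above.
killed_streams = [make_ref()]
running = %{}

# Roughly what blows up:
# Enum.map(killed_streams, fn ref -> :erlang.map_get(ref, running) end)

# Defensive variant: silently skip references that are no longer tracked.
requeueable =
  Enum.flat_map(killed_streams, fn ref ->
    case Map.fetch(running, ref) do
      {:ok, request} -> [request]
      :error -> []
    end
  end)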


tspenov commented on September 27, 2024

I pushed another commit to master, with the same caveats: I cannot really test it.

Are you on an unreliable network? Or do you have another clue why these may keep appearing?

@jchristgit Thank you! I'll test it out tomorrow and let you know.

The Elixir app is deployed to AWS, so there shouldn't be any problem with the network.
I'm not sure why the WS connection to Discord gets closed so often.
Another thing the app runs into very often is {:stream_error, :closed} from gun, in which case I just retry the command/interaction.
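
The retry itself is just a thin wrapper around the call. A simplified sketch (hypothetical module, not part of nostrum, assuming the failure surfaces as an {:error, {:stream_error, :closed}} tuple):

defmodule MyBot.Retry do
  # Hypothetical retry wrapper for the {:stream_error, :closed} case.
  # Assumes the failure surfaces as an {:error, {:stream_error, :closed}} tuple.
  def with_stream_retry(fun, retries \\ 3) do
    case fun.() do
      {:error, {:stream_error, :closed}} when retries > 0 ->
        with_stream_retry(fun, retries - 1)

      other ->
        other
    end
  end
end

# Example:
# MyBot.Retry.with_stream_retry(fn -> Nostrum.Api.create_message(channel_id, "hello") end)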


tspenov commented on September 27, 2024

I pushed another commit to master, with the same caveats: I cannot really test it.

Are you on an unreliable network? Or do you have another clue why these may keep appearing?

@jchristgit Sometimes after Nostrum.Api.get_current_user() is executed I see "Request to \"/users/@me\" was closed abnormally, requeueing" in the logs, but the call neither returns nor raises an exception. It seems to block indefinitely.

To work around this, I have wrapped it in a Task like this:

  defp warm_up(msg_or_interaction, retries \\ 3) do
    t1 = System.monotonic_time(:millisecond)
    log(msg_or_interaction, "WARM UP STARTING...")

    # Run the API call in a task so it cannot block the caller indefinitely.
    task = Task.async(fn -> Nostrum.Api.get_current_user() end)

    case Task.yield(task, 5000) || Task.shutdown(task) do
      nil ->
        log(msg_or_interaction, "WARM UP ERROR: Timeout reached.", type: :error)

        if retries > 0 do
          log(msg_or_interaction, "WARM UP TIMEOUT: Retrying...", type: :error)

          warm_up(msg_or_interaction, retries - 1)
        else
          log(msg_or_interaction, "WARM UP TIMEOUT: No more retries.", type: :error)
        end

      {:exit, reason} ->
        # The task itself crashed; log and give up on this attempt.
        log(msg_or_interaction, "WARM UP ERROR: #{inspect(reason)}", type: :error)

      {:ok, result} ->
        # handle the result
        handle_result(result, msg_or_interaction, retries)
    end

    t2 = System.monotonic_time(:millisecond)
    log(msg_or_interaction, "Time spent warming up #{t2 - t1}ms.")
  end

Here are some log excerpts:

{"level":"info","message":"[id=1124357812546457661] WARM UP STARTING... msg.content=<@1039814526708764742> hi metadata=%
{"level":"debug","message":"Accounting for request with 50 remaining user calls (initial)","timestamp":"2023-06-30T15:16:32.178"}
{"level":"debug","message":"Accounting for request with 49 remaining user calls","timestamp":"2023-06-30T15:16:32.178"}
{"level":"warning","message":"Request to \"/users/@me\" was closed abnormally, requeueing","timestamp":"2023-06-30T15:16:32.183"}
{"level":"info","message":"[id=1124357812546457661] WARM UP ERROR: Timeout reached. msg.content=<@1039814526708764742> {"level":"info","message":"[id=1124357812546457661] WARM UP TIMEOUT: Retrying...\"852836083381174282\"}","timestamp":"2023-06-30T15:16:37.178"}
{"level":"info","message":"[id=1124357812546457661] WARM UP STARTING... \"852836083381174282\"}","timestamp":"2023-06-30T15:16:37.178"}
{"level":"debug","message":"Accounting for request with 50 remaining user calls (initial)","timestamp":"2023-06-30T15:16:37.202"}
{"level":"debug","message":"Accounting for request with 49 remaining user calls","timestamp":"2023-06-30T15:16:37.202"}
{"level":"info","message":"[id=1124357812546457661] WARM UP SUCCESS ….
{"level":"info","message":"[id=1124357812546457661] Time spent warming up 190ms.
{"level":"info","message":"[id=1124357812546457661] Time spent warming up 5191ms …


jchristgit commented on September 27, 2024

I cannot reproduce this here, not with spamming the endpoint nor with running it one by one like you do. Do you have a proxy sitting in front of nostrum? Some outgoing firewall?

tspenov commented on September 27, 2024

I cannot reproduce this here, not with spamming the endpoint nor with running it one by one like you do. Do you have a proxy sitting in front of nostrum? Some outgoing firewall?

@jchristgit The Elixir app is deployed in a Kubernetes cluster on AWS.
The problem occurs only on the production cluster; locally and on the stage cluster everything is fine.
On production I can reproduce it every time the WS connection has been idle for 5-10 minutes (no other API call to Discord has been made).
The only difference between the stage and prod clusters is that the prod cluster sends outgoing requests through an AWS NAT Gateway. The default idle timeout for TCP connections on the AWS NAT Gateway is 350 seconds (~6 minutes).
But I wonder why this happens, since there are HEARTBEAT_ACKs in the logs roughly every 40 seconds.
They should be keeping the connection alive, so this AWS timeout should not kick in.
Do you have any insight on this?


tspenov commented on September 27, 2024

@jchristgit My current workaround is a job that makes an API call every couple of minutes so the connection never goes idle. A rough sketch is below.
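
Sketch of that keep-alive job (hypothetical module name; the interval is chosen to stay under the NAT Gateway's ~350-second idle timeout):

defmodule MyBot.ConnectionKeepAlive do
  # Hypothetical keep-alive process: fires a cheap authenticated API call
  # every few minutes so the HTTP connection to Discord never sits idle
  # long enough for the NAT Gateway to silently drop it.
  use GenServer

  # Well below the ~350 s NAT Gateway idle timeout.
  @interval :timer.minutes(4)

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  @impl true
  def init(_opts) do
    schedule()
    {:ok, %{}}
  end

  @impl true
  def handle_info(:ping, state) do
    _ = Nostrum.Api.get_current_user()
    schedule()
    {:noreply, state}
  end

  defp schedule, do: Process.send_after(self(), :ping, @interval)
end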

Thank you for the quick responses and the fixes, really appreciated! 💙

I am closing the issue.

