Coder Social home page Coder Social logo

Comments (19)

jan-auer avatar jan-auer commented on June 18, 2024 1

Thank you, this in fact points out the problem: transfer-encoding: chunked. The sentry web container is not able to handle this, and we should not be creating such a request. We'll submit a bugfix for this, thank you for uncovering it. (Edit: #890).

The configuration option you're setting, _client: reqwest, is unstable and still being developed. I'm curious what @untitaker wanted to test with this. In the meanwhile, I would suggest to switch back to the regular client.

As this is likely an issue with the way the containers are being run or networking is set up, I would propose to move this issue to getsentry/onpremise. Any objections, @BYK?

from relay.

untitaker avatar untitaker commented on June 18, 2024

Can you try adding

http:
  _client: reqwest

to your config?

from relay.

mvaled avatar mvaled commented on June 18, 2024

@untitaker

Thanks for you response. After that change, things got bad turn:

2020-12-07T22:15:14Z [relay_server::actors::upstream] INFO: registering with upstream (http://web.sentry.svc:9190/)
2020-12-07T22:15:14Z [relay_server::actors::upstream] ERROR: authentication encountered error: upstream request returned error 400 Bad Request
  caused by: No valid json body
2020-12-07T22:15:31Z [relay_server::actors::upstream] INFO: registering with upstream (http://web.sentry.svc:9190/)
2020-12-07T22:15:31Z [relay_server::actors::upstream] ERROR: authentication encountered error: upstream request returned error 400 Bad Request
  caused by: No valid json body

from relay.

untitaker avatar untitaker commented on June 18, 2024

Cool, that's worse. Do you have capability to inspect the response body of sentry to relay? I wonder why the request is invalid.

from relay.

untitaker avatar untitaker commented on June 18, 2024

Context: We intend to switch to reqwest as HTTP client (the option to switch back/forth is temporary), so I think it's wasted time to investigate bugs in the old client.

from relay.

mvaled avatar mvaled commented on June 18, 2024

Do you have capability to inspect the response body of sentry to relay?

Probably, but right now I have no clue. I would have to look the at the code to see if I could trap the request/response or put DEBUG=True in the webserver: Would it help?

Changing relay's logging level to DEBUG didn't helped much, though.

from relay.

mvaled avatar mvaled commented on June 18, 2024

I'm looking at the code of the upstream actor, and I noticed I've never spotted the info log at:

                log::info!("relay successfully registered with upstream");
                slf.auth_state = AuthState::Registered;
                slf.auth_backoff.reset();

But I've received events from clients, so... what's happening here? Am I missing something?

from relay.

mvaled avatar mvaled commented on June 18, 2024

Well I'm not getting events from clients right. I'm triggering events in some the applications and I don't see them appear in Sentry :*(.

Still the logs just show:

2020-12-08T00:17:19Z [relay_server::actors::upstream] INFO: registering with upstream (http://web.sentry.svc:9190/)
2020-12-08T00:17:19Z [relay_server::actors::upstream] ERROR: authentication encountered error: could not send request to upstream
  caused by: IO error: Connection reset by peer (os error 104)
  caused by: IO error: Connection reset by peer (os error 104)
  caused by: IO error: Connection reset by peer (os error 104)
  caused by: Connection reset by peer (os error 104)
2020-12-08T00:17:43Z [relay_server::actors::events] ERROR: error processing event: event exceeded its configured lifetime
2020-12-08T00:17:43Z [relay_server::actors::events] ERROR: error processing event: event exceeded its configured lifetime

from relay.

mvaled avatar mvaled commented on June 18, 2024

Hum... I ran docker pull for all the images snuba, rely, sentry and symbolicator. Now I'm getting these logs:

2020-12-08T01:33:42Z [relay_server::actors::upstream] INFO: registering with upstream (http://web.sentry.svc:9190/)
2020-12-08T01:33:42Z [relay_server::actors::upstream] INFO: relay successfully registered with upstream
2020-12-08T01:33:42Z [relay_server::actors::project_upstream] ERROR: error fetching project states: could not send request to upstream
  caused by: IO error: Connection reset by peer (os error 104)
  caused by: IO error: Connection reset by peer (os error 104)
  caused by: IO error: Connection reset by peer (os error 104)
  caused by: Connection reset by peer (os error 104)
2020-12-08T01:33:43Z [relay_server::actors::events] ERROR: error processing event: failed to resolve project information
  caused by: failed to fetch project state from upstream
2020-12-08T01:33:43Z [relay_server::actors::events] ERROR: error processing event: failed to resolve project information
  caused by: failed to fetch project state from upstream
2020-12-08T01:33:43Z [relay_server::actors::events] ERROR: error processing event: failed to resolve project information
  caused by: failed to fetch project state from upstream
2020-12-08T01:33:43Z [relay_server::actors::events] ERROR: error processing event: failed to resolve project information
  caused by: failed to fetch project state from upstream
2020-12-08T01:33:43Z [relay_server::actors::events] ERROR: error processing event: failed to resolve project information
  caused by: failed to fetch project state from upstream
2020-12-08T01:33:43Z [relay_server::actors::events] ERROR: error processing event: failed to resolve project information
  caused by: failed to fetch project state from upstream

from relay.

mvaled avatar mvaled commented on June 18, 2024

I tried once more to use reqwest, but it failed again with 400 Bad request. But I wasn't able to catch the request to why sentry's complaining about it. And now I'm back to the cycle of ERROR: authentication encountered error...

I'll report back tomorrow if I find anything.

Best regards.

from relay.

untitaker avatar untitaker commented on June 18, 2024

@mvaled I have opened #878 for the 400 bug, let me know if that fixes the issue

from relay.

mvaled avatar mvaled commented on June 18, 2024

@untitaker

As commented there the issue still happens. As soon as I switch to reqwest, I get the 400 error. This is the full log for the first seconds after restarting relay with reqwest:

2020-12-17T12:31:43Z [relay::setup] INFO: launching relay from config folder /work/.relay
2020-12-17T12:31:43Z [relay::setup] INFO:   relay mode: managed
2020-12-17T12:31:43Z [relay::setup] INFO:   relay id: <redacted>
2020-12-17T12:31:43Z [relay::setup] INFO:   public key: <redacted>
2020-12-17T12:31:43Z [relay::setup] INFO:   log level: INFO
2020-12-17T12:31:43Z [relay_server::actors::outcome::processing] INFO: OutcomeProducer started.
2020-12-17T12:31:43Z [relay_server::actors::events] INFO: starting 8 event processing workers
2020-12-17T12:31:43Z [relay_server::actors::upstream] INFO: upstream relay started
2020-12-17T12:31:43Z [relay_server::actors::upstream] INFO: registering with upstream (http://web.sentry.svc:9190/)
2020-12-17T12:31:43Z [relay_server::actors::store] INFO: store forwarder started
2020-12-17T12:31:43Z [relay_server::actors::project_redis] INFO: redis project cache started
2020-12-17T12:31:43Z [relay_server::actors::project_redis] INFO: redis project cache started
2020-12-17T12:31:43Z [relay_server::actors::project_redis] INFO: redis project cache started
2020-12-17T12:31:43Z [relay_server::actors::project_redis] INFO: redis project cache started
2020-12-17T12:31:43Z [relay_server::actors::project_redis] INFO: redis project cache started
2020-12-17T12:31:43Z [relay_server::actors::project_redis] INFO: redis project cache started
2020-12-17T12:31:43Z [relay_server::actors::project_redis] INFO: redis project cache started
2020-12-17T12:31:43Z [relay_server::service] INFO: spawning http server
2020-12-17T12:31:43Z [relay_server::actors::project_redis] INFO: redis project cache started
2020-12-17T12:31:43Z [relay_server::service] INFO:   listening on: http://0.0.0.0:3000/
2020-12-17T12:31:43Z [actix_net::server::server] INFO: Starting 8 workers
2020-12-17T12:31:43Z [relay_server::actors::upstream] ERROR: authentication encountered error: upstream request returned error 400 Bad Request
  caused by: No valid json body
2020-12-17T12:31:43Z [relay_server::actors::upstream] INFO: registering with upstream (http://web.sentry.svc:9190/)
2020-12-17T12:31:43Z [actix_net::server::server] INFO: Starting server on 0.0.0.0:3000
2020-12-17T12:31:43Z [relay_server::actors::controller] INFO: relay server starting
2020-12-17T12:31:43Z [relay_server::actors::connector] INFO: metered connector started
2020-12-17T12:31:43Z [relay_server::actors::events] INFO: event manager started
2020-12-17T12:31:43Z [relay_server::actors::project_local] INFO: project local cache started
2020-12-17T12:31:43Z [relay_server::actors::project_upstream] INFO: project upstream cache started
2020-12-17T12:31:43Z [relay_server::actors::project_cache] INFO: project cache started
2020-12-17T12:31:43Z [relay_server::actors::relays] INFO: key cache started
2020-12-17T12:31:43Z [relay_server::actors::upstream] ERROR: authentication encountered error: upstream request returned error 400 Bad Request
  caused by: No valid json body
2020-12-17T12:31:44Z [relay_server::actors::upstream] INFO: registering with upstream (http://web.sentry.svc:9190/)
2020-12-17T12:31:44Z [relay_server::actors::upstream] ERROR: authentication encountered error: upstream request returned error 400 Bad Request
  caused by: No valid json body

from relay.

untitaker avatar untitaker commented on June 18, 2024

Fair enough, we merged this because it fixed the issue for another customer. I'm not really sure what to try then

Probably, but right now I have no clue. I would have to look the at the code to see if I could trap the request/response or put DEBUG=True in the webserver: Would it help?

assuming you're sure you're running the correct commit, let's try request capturing: I was thinking more of using wireshark assuming you have unencrypted traffic.

from relay.

mvaled avatar mvaled commented on June 18, 2024

Hi @untitaker,

I'm rebuilding the image from master:

$ git log -1 --oneline
33f0dfbb feat: Enable use of multiple value types per field (#882)

$ build docker . -t getsentry/relay:nightly

So, I'll be running relay with the right commit.

As for using wireshark, I'm not really sure if I'm allowed. Let's find out.

from relay.

mvaled avatar mvaled commented on June 18, 2024

Unfortunately, using wireshark is not possible. I don't have physical access to the server and X11 forwarding via SSH is not allowed :(

from relay.

mvaled avatar mvaled commented on June 18, 2024

I'm gonna use tcpdump (or something like that) in the server and import it in wireshark locally.

from relay.

mvaled avatar mvaled commented on June 18, 2024

@untitaker

Using tcpdump I could sniff the traffic going to the upstream. Once with actix, and once with reqwest. I think I have spotted a difference:

Screenshot from 2020-12-18 11-31-30

On the left is the actix trace, on the right is the reqwest. The JSON payload using reqwest is Array (seemingly empty) instead of an object.

from relay.

mvaled avatar mvaled commented on June 18, 2024

Looking again at the traces I noticed my previous comment was a response coming from the server, not the POST send by the relay. So I looked again.

In the good trace, I observe the following:

Relay sends a POST to upstream with a JSON payload {relay_id: ..., public_key: ..., version: ...}. There are two custom HTTP headers: x-sentry-relay-id: bcfd9d5a-... which matches the relay_id in the payload; and x-sentry-relay-signature, which I assume is a signature of the HTTP request with headers included. Other headers are:

  content-type: application/json
  accept-encoding: gzip, deflate
  user-agent: actix-web/0.7.19
  content-length: 130
  date: Fri, 18 Dec 2020 16:11:52 GMT

The upstream returns 200, with a JSON payload {relay_id: ..., token: ...}

In the "bad" trace:

A similar POST is done. The JSON payload is the same as before. But some HTTP headers are different:

content-type: application/json
content-encoding: identity
accept: */*
accept-encoding: gzip
transfer-encoding: chunked

I hope this may provide some clues about what's happening.

from relay.

jan-auer avatar jan-auer commented on June 18, 2024

Closing since both self-hosted and Relay have changed substantially. If a similar bug is still occurring, please re-submit to the self-hosted repository. Thanks!

from relay.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.