Comments (10)
Hey @wangzlei
The content in the baggage is not important. In the case that triggered this for us:
- it is simply multiple key-value pairs, as is allowed in the W3C baggage spec
- it was a Step Functions (SfN) call from application code running on EC2, not a Lambda
Note that my description above seems pretty clear about what the problem is. To reiterate:
- at lines https://github.com/open-telemetry/opentelemetry-java-contrib/blob/main/aws-xray-propagator/src/main/java/io/opentelemetry/contrib/awsxray/propagator/AwsXrayPropagator.java#L116-L127 the trace header is initialised with a bunch of bytes for the current trace context
- at line https://github.com/open-telemetry/opentelemetry-java-contrib/blob/main/aws-xray-propagator/src/main/java/io/opentelemetry/contrib/awsxray/propagator/AwsXrayPropagator.java#L145 the trace header has baggage appended, up to a maximum of 256 additional chars
- the length check at the second location does not take into account the characters already added at the first, leading to an overall header that is well over 256 characters and that fails validation by (at least) the SfN API
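To make the mismatch concrete, here is a minimal sketch of the behaviour described in the bullets above. It is not the propagator's actual code: `HeaderOvershootDemo`, `buildHeader` and `sampleBaggage` are hypothetical names, and the trace context string is just an illustrative value in the X-Ray header format.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class HeaderOvershootDemo {
  // Budget applied to the baggage portion only, as described above.
  static final int MAX_BAGGAGE_CHARS = 256;

  static String buildHeader(String traceContext, Map<String, String> baggage) {
    // The header starts with the trace context (Root, Parent, Sampled).
    StringBuilder header = new StringBuilder(traceContext);
    int baggageChars = 0;
    for (Map.Entry<String, String> e : baggage.entrySet()) {
      // Size of "key=value", ignoring the ';' delimiter.
      int size = e.getKey().length() + 1 + e.getValue().length();
      if (baggageChars + size > MAX_BAGGAGE_CHARS) {
        continue; // pair does not fit the baggage-only budget
      }
      header.append(';').append(e.getKey()).append('=').append(e.getValue());
      baggageChars += size;
    }
    // Nothing above looked at traceContext.length(), so the total can exceed 256.
    return header.toString();
  }

  // Builds illustrative baggage: keys k0, k1, ... each with a 20-character value.
  static Map<String, String> sampleBaggage(int pairs) {
    Map<String, String> baggage = new LinkedHashMap<>();
    for (int i = 0; i < pairs; i++) {
      baggage.put("k" + i, "v".repeat(20));
    }
    return baggage;
  }

  public static void main(String[] args) {
    String traceContext =
        "Root=1-5759e988-bd862e3fe1be46a994272793;Parent=53995c3f42cd8ad8;Sampled=1";
    String header = buildHeader(traceContext, sampleBaggage(10));
    System.out.println("header length: " + header.length());
  }
}
```

With ten such pairs every pair passes the baggage-only check, yet the combined header ends up longer than 256 characters.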
For us, this makes AWS X-Ray entirely unusable and we've had to drop it. Also (separate from this bug), the truncation algorithm here is a little simplistic for our needs. We might want to prioritise retaining certain key-value pairs in the baggage (e.g. keys with specific prefixes or specific names), rather than just arbitrarily dropping all baggage after a certain point.
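As a rough illustration of the kind of prioritised truncation meant here, the sketch below keeps pairs whose keys match a configured prefix first and only then fills the remaining budget with other pairs. Everything in it (`PrioritizedBaggage`, `select`, the prefix list, the budget parameter) is hypothetical and not an existing propagator option.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PrioritizedBaggage {
  // Returns the pairs to keep, prioritised keys first, within a character budget.
  static Map<String, String> select(
      Map<String, String> baggage, List<String> priorityPrefixes, int budget) {
    Map<String, String> kept = new LinkedHashMap<>();
    int used = 0;
    // Pass 0 considers prioritised keys, pass 1 considers everything else.
    for (int pass = 0; pass < 2; pass++) {
      for (Map.Entry<String, String> e : baggage.entrySet()) {
        boolean prioritised =
            priorityPrefixes.stream().anyMatch(p -> e.getKey().startsWith(p));
        if (prioritised != (pass == 0)) {
          continue; // not this pass's turn for this key
        }
        // Count ';' + key + '=' + value, i.e. what appending the pair would cost.
        int size = 1 + e.getKey().length() + 1 + e.getValue().length();
        if (used + size > budget) {
          continue; // does not fit; a shorter later pair still might
        }
        kept.put(e.getKey(), e.getValue());
        used += size;
      }
    }
    return kept;
  }
}
```

For example, with a budget of 40, a long `noise` entry that comes first and a short `tenant.id` entry with prefix `tenant.` prioritised, only `tenant.id` is retained, whereas a plain first-come truncation would keep `noise` and drop it.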
from aws-otel-java-instrumentation.
This issue is stale because it has been open 90 days with no activity. If you want to keep this issue open, please just leave a comment below and auto-close will be canceled
This is still an issue, please correct, thanks!
@wangzlei @srprash As the owners of this component, do you mind providing some input here?
May I know where the baggage comes from?
When this issue happens, what is the content of the baggage? Is the Step Functions state machine triggered by an AWS Lambda function?
The X-Ray propagator seems to essentially corrupt baggage due to its incompatibility with the W3C baggage spec. Its behaviour is also quite unpredictable, because where it truncates appears to depend on the length of the existing header.
For this reason I suggest that baggage should be ignored by the X-Ray propagator to prevent further issues. This should at least be the default behaviour. Unfortunately the spec does not specify this, or anything else regarding the X-Ray propagator.
Other implementations already do this, for example opentelemetry-js xray propagator: https://github.com/open-telemetry/opentelemetry-js-contrib/blob/main/propagators/opentelemetry-propagator-aws-xray/src/AWSXRayPropagator.ts
Thanks for providing comments.
@jamesmoessis
The X-Ray trace header does not differentiate between trace context and baggage like W3C does. In the X-Ray header, everything except for the trace ID, parent ID, and sample flag is treated as baggage. Through the AWS X-Ray propagator, we can convert between the X-Ray header and W3C trace context + W3C baggage. This is proposed as a short-term solution. In the long-term future, X-Ray aims to move forward to W3C and fully adopt W3C trace context and baggage.
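A small sketch of the split described above, assuming the usual `Root`, `Parent` and `Sampled` keys of the X-Ray trace header: anything else in the header is collected as baggage. `XrayHeaderParser` and `Parsed` are hypothetical names for illustration, not the propagator's API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class XrayHeaderParser {
  // Holds the three trace-context fields plus everything else as baggage.
  static final class Parsed {
    String root;
    String parent;
    String sampled;
    final Map<String, String> baggage = new LinkedHashMap<>();
  }

  static Parsed parse(String header) {
    Parsed parsed = new Parsed();
    for (String part : header.split(";")) {
      int eq = part.indexOf('=');
      if (eq <= 0) {
        continue; // skip empty or malformed segments
      }
      String key = part.substring(0, eq).trim();
      String value = part.substring(eq + 1).trim();
      switch (key) {
        case "Root":
          parsed.root = value;
          break;
        case "Parent":
          parsed.parent = value;
          break;
        case "Sampled":
          parsed.sampled = value;
          break;
        default:
          // Everything that is not trace context is treated as baggage.
          parsed.baggage.put(key, value);
      }
    }
    return parsed;
  }
}
```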
@davidconnard
Regarding the length of the X-Ray header, I did not find a definition in the X-Ray spec. However, based on some clues, it can be inferred that Step Functions likely has a contract with X-Ray, hence the length check in the Step Functions client API. So it is reasonable to truncate the baggage to (256 - the length of the existing trace context).
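A minimal sketch of that truncation rule, assuming 256 characters is the limit for the whole header (inferred from the Step Functions validation, not from an X-Ray spec). `BudgetedHeaderBuilder` and `buildHeader` are hypothetical names, and this is not the change that was eventually merged.

```java
import java.util.Map;

final class BudgetedHeaderBuilder {
  // Assumed limit for the complete header, trace context included.
  static final int MAX_HEADER_CHARS = 256;

  static String buildHeader(String traceContext, Map<String, String> baggage) {
    StringBuilder header = new StringBuilder(traceContext);
    for (Map.Entry<String, String> e : baggage.entrySet()) {
      // Cost of appending ";key=value", delimiters included.
      int size = 1 + e.getKey().length() + 1 + e.getValue().length();
      // Check against the current total length, so the trace context
      // already in the header counts toward the limit.
      if (header.length() + size > MAX_HEADER_CHARS) {
        continue; // this pair does not fit; a shorter later pair still might
      }
      header.append(';').append(e.getKey()).append('=').append(e.getValue());
    }
    return header.toString();
  }
}
```

Because the check uses the running header length, the remaining baggage budget is effectively 256 minus the length of the trace context.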
The reason for asking about the baggage content is that this issue is likely related to a feature released by Lambda in the latter half of 2023. I need to confirm once again the length of the content Lambda injects into the X-Ray trace header and whether truncating it will affect Lambda's functionality. If you can print the baggage that triggers the issue, that would be helpful.
To clarify, this issue is different from open-telemetry/opentelemetry-java-contrib#1178.
This issue is that AwsXrayPropagator tries to inject a trace header longer than 256 characters into a Step Functions request. Issue #1178 is that AwsXrayPropagator conflicts with the baggage propagator because it truncates the baggage value.