Comments (9)
Thank you everyone for reporting and investigating this issue, in special @asmadsen for providing a fix.
The fix will be available in the next minor release (v1.29.0) of the ADOT Java Agent.
from aws-otel-java-instrumentation.
Release v1.29.0 is available for usage.
from aws-otel-java-instrumentation.
We experienced the same issue in our environments with corrupted trace-ids after upgrading AWS OpenTelemetry Java Agent from v1.20.0 to v1.23.0 sometime last April.
After reverting to v1.20.0, everything started working normally again. We also tried v1.26.0 released mid-May with equally unsuccessful results.
It would seem some changes were introduced that affected propagation of trace-ids sometime after v1.20.0.
from aws-otel-java-instrumentation.
That's good to hear, because that makes it less likely that we are doing something stupid which causes this 😉
I just tested and the problem persists with v1.27.0.
from aws-otel-java-instrumentation.
I can confirm the problem with v1.26.0 and v1.27.0.
We also discovered that not every log entry contains the attribute "AWS-XRAY-TRACE-ID"
from aws-otel-java-instrumentation.
@felixscheinost To understand this further, I have a few questions
The reason for the failure is evident when viewing the log events in CloudWatch: the AWS-XRAY-TRACE-ID and the trace_id fields of the logged JSON objects are completely different (not just the usual 1- and inserted -).
Do you mean, that in your CW logs, you see the trace id as xxxxxxx
instead of the expected xray formatted 1-xxx-xxxxx
? Or are they entirely different?
On the other hand there is also one log stream where for the first 15 minutes the two IDs match (modulo the 1- and -) but suddenly they break and seem to stay broken for all further logs.
Do you mean that initially the trace ids differ only by the1-
and-
and then afterwards they differ completely in their values?
Can you provide an example of the trace id that you see in your trace and the one logged in CW logs?
Are you seeing this in your root service (that generates a trace) or is it receiving the trace context from some upstream service?
from aws-otel-java-instrumentation.
@felixscheinost To understand this further, I have a few questions
Do you mean, that in your CW logs, you see the trace id asxxxxxxx
instead of the expected xray formatted1-xxx-xxxxx
? Or are they entirely different?
We started testing v1.28.0
and have come to the following conclusion: AWS-XRAY-TRACE-ID
still does not reliably match trace_id
in the MDC context. We are using logback as per ADOT Logging Guide and configured PatternLayout to be '%-5p %X{AWS-XRAY-TRACE-ID} %X'
so that we can see both MDC key-value pairs side-by-side:
2023-07-25T10:26:12.977+10:00 DEBUG 1-64bf1632-122b76764ff0867839928046@ea4d0bbd5a946345 AWS-XRAY-TRACE-ID=1-64bf1632-122b76764ff0867839928046@ea4d0bbd5a946345, trace_id=64bf16a47e128818703e67e9253173ac, trace_flags=01, span_id=ec9a94155c77f5fb 1 --- [ parallel-2] c.b.i.commons.logging.LoggingWebFilter : Handling HTTP request: GET http://mytest-dev-ecs-nlb-5487598345723.elb.ap-southeast-2.amazonaws.com/api/v1/test/data
2023-07-25T10:26:12.982+10:00 INFO 1-64bf1632-122b76764ff0867839928046@ea4d0bbd5a946345 AWS-XRAY-TRACE-ID=1-64bf1632-122b76764ff0867839928046@ea4d0bbd5a946345, trace_id=64bf16a47e128818703e67e9253173ac, trace_flags=01, span_id=d73e4a75dc689b8b 1 --- [ parallel-2] c.b.i.test.controller.TestControllerV1 : Received a call to fetch Tenant data..
2023-07-25T10:26:12.998+10:00 DEBUG 1-64bf1632-122b76764ff0867839928046@e2639b97695436b0 AWS-XRAY-TRACE-ID=1-64bf1632-122b76764ff0867839928046@e2639b97695436b0, trace_id=64bf16a47e128818703e67e9253173ac, trace_flags=01, span_id=b4daf8f1ed65136d 1 --- [ctor-http-nio-2] c.b.i.commons.logging.WebClientLogger : HTTP call: GET http://servicenow-adapter-service.internal.dev.acme.com:8080/api/v2/tenants Response: statusCode=200 OK contentLength=0 timeTaken=14ms
2023-07-25T10:26:13.226+10:00 DEBUG trace_id=64bf16a47e128818703e67e9253173ac, trace_flags=01, span_id=ec9a94155c77f5fb 1 --- [ctor-http-nio-1] c.b.i.commons.logging.LoggingWebFilter : Handled HTTP request: GET http://mytest-dev-ecs-nlb-5487598345723.elb.ap-southeast-2.amazonaws.com/api/v1/test/data statusCode=200 OK
Can you provide an example of the trace id that you see in your trace and the one logged in CW logs? Are you seeing this in your root service (that generates a trace) or is it receiving the trace context from some upstream service?
In the logging output above, we can see that the value for AWS-XRAY-TRACE-ID
: 1-64bf1632-122b76764ff0867839928046
does not match that of trace_id
: 64bf16a47e128818703e67e9253173ac
and span_id
: d73e4a75dc689b8b
.
We can also see that the values for trace_id
/span_id
match that of the parent context upstream and the trace is rendered properly in the AWS XRay Traces UI.
from aws-otel-java-instrumentation.
Sorry, for the late reply. I wanted to prepare a basic example project to reproduce the issue but didn't find time to do that until now.
Seems like there is already progress on fixing the issue, thanks @asmadsen
I'll provide the link to the example project nevertheless: https://github.com/felixscheinost/test-aws-otel-java-agent
Here is the log output with v1.20.0: https://gist.github.com/felixscheinost/9b1f5d601da916cab1adba3743404552
And here with v1.28.0: https://gist.github.com/felixscheinost/e25e256a780ffbe1879dfff929d69885
In those logs there is easily visible what @asmadsen figured out:
First request, handled on thread exec-1
has trace ID 1-64c12479-9781b7613289b8b04750c6da@80025d3e95138300
. The next request which is handled on the same thread has the same trace ID.
For the first time a request is handled on another thread the AWS-XRAY-TRACE-ID is empty, afterwards it stays the same for all subsequent requests on that thread.
from aws-otel-java-instrumentation.
Closing the issue
from aws-otel-java-instrumentation.
Related Issues (20)
- Lack of a docker image HOT 3
- OTEL trace id mismatches with XRay trace id HOT 5
- Visual presentation and Resource Detection for eks HOT 4
- How to suppress export logs in open-telemetry agent? HOT 4
- Update actions to not use Node.js 12 HOT 1
- AWS Lambda: Unable to debug the issue HOT 1
- xray tracing stops after some hours HOT 7
- Does the xray remote sampler respect parent sampling decision? HOT 4
- FRAME_SIZE_ERROR: 4740180 HOT 2
- Using more than one tracing systems with AWS Java Agent #2333 HOT 2
- Upgrade opentelemetry-java to 1.30.x HOT 1
- AwsXrayPropagator can create invalid trace headers during baggage propagation HOT 10
- AWS OTEL agents in Docker images HOT 3
- [ci] Add new EKS cluster targets HOT 1
- Support for OTEL Synchronous Gauge HOT 2
- http.server.duration has exceeded the maximum allowed cardinality HOT 3
- ADOT Java Agent v1.31.1 - High Cardinality Metrics
- Java autoinstrumentation does not support PKIX algorithm HOT 2
- Clarification about how to autoinstrumenting a spring-boot 3 with micrometer tracing and ADOT agent HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from aws-otel-java-instrumentation.