odigos-io / odigos
Distributed tracing without code changes. Instantly monitor any application using OpenTelemetry and eBPF
Home Page: https://odigos.io
License: Apache License 2.0
Describe the bug
When uninstalling Odigos via the CLI, the label odigos-instrumentation=enabled is not removed from instrumented namespaces. This causes issues with discovering services in the affected namespaces when Odigos is installed again from scratch.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
No issues with discovering previously instrumented services during odigos reinstallation
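A minimal sketch of the cleanup step this report asks for, assuming the namespace labels are available as a plain dictionary. The label key is taken from the report; the function name is hypothetical and this is not Odigos' actual uninstall code.

```python
# Label key taken from the bug report above.
ODIGOS_LABEL = "odigos-instrumentation"


def strip_odigos_label(labels):
    """Return a copy of the namespace labels without the Odigos label."""
    return {key: value for key, value in labels.items() if key != ODIGOS_LABEL}
```

Applying this to every instrumented namespace during uninstall would leave all other labels untouched.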
Hello,
Could you allow advanced users to set the full URL of the endpoints, and perhaps show the port and protocol used by the gateway directly on the web page?
This would allow us to have,
This would integrate well with a more complex architecture such as this one:
Kube cluster A odigos-gateway --> Kube cluster B loki ingress :
Kube cluster A odigos-gateway --> Kube cluster B tempo ingress :
What do you think about it?
Destination name: Elastic Observability
Support OpenTelemetry format: yes, see: https://www.elastic.co/guide/en/apm/guide/8.10/open-telemetry-direct.html#connect-open-telemetry-collector
Supported signals: traces, metrics, logs
When running odigos --help, the following output is generated:
A longer description that spans multiple lines and likely contains
examples and usage of using your application. For example:
Cobra is a CLI library for Go that empowers applications.
This application is a tool to generate the needed files
to quickly create a Cobra application.
Usage:
odigos [command]
Available Commands:
add A brief description of your command
completion Generate the autocompletion script for the specified shell
help Help about any command
install Install Odigos
ui Start the Odigos UI
uninstall A brief description of your command
version A brief description of your command
Flags:
-h, --help help for odigos
--kubeconfig string (optional) absolute path to the kubeconfig file (default "/Users/amirblum/.kube/config")
Use "odigos [command] --help" for more information about a command.
Odigos should replace all the default texts such as "A brief description of your command" with meaningful information
Describe the bug
Current UI requires a "http" or "https" prefix for OTLP endpoint. While in fact, it is using otlp/grpc instead of otlp/http. According to the OpenTelemetry Collector naming convention, we should use ip:port to indicate the use of the otlp/grpc protocol and http://ip:port or https://ip:port to indicate the use of the otlp/http protocol.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
The UI should not require an "http" or "https" prefix for the OTLP endpoint.
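The naming convention described in this report can be sketched as a small helper. This is only an illustration of the rule as stated above (scheme prefix means otlp/http, bare ip:port means otlp/grpc); the function name is hypothetical and it is not taken from the Odigos UI.

```python
def otlp_protocol(endpoint):
    """Infer the OTLP transport from the endpoint string.

    http://ip:port or https://ip:port -> otlp/http
    ip:port (no scheme)               -> otlp/grpc
    """
    lowered = endpoint.strip().lower()
    if lowered.startswith(("http://", "https://")):
        return "otlp/http"
    return "otlp/grpc"
```

With a rule like this, the UI would accept 10.0.0.1:4317 as a gRPC endpoint instead of rejecting it for lacking a prefix.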
The Odigos UI binary should include the git tag version.
This can be done using ldflags in goreleaser, similar to the way that we do it for the CLI:
https://github.com/keyval-dev/odigos/blob/02639256a750cfa8e0e7555954af046fc8140c23/.goreleaser.yaml#L19-L22
Destination name: HyperDX
Support OpenTelemetry format: yes, see: https://www.hyperdx.io/docs/install/opentelemetry
Supported signals: traces, metrics, logs
In the create destination page (for example at /overview/destinations/create/form?dest=splunk), pressing Enter should be the same as clicking the create destination button.
In the Odigos CLI, when running odigos uninstall, the user is prompted to confirm the uninstall.
We should treat pressing Enter as writing y.
Current state:
➜ odigos uninstall
About to uninstall Odigos from namespace odigos-system
Are you sure? [Y/n]:
Aborting uninstall
Desired:
➜ odigos uninstall
About to uninstall Odigos from namespace odigos-system
Are you sure? [Y/n]:
Uninstalling Odigos Deployments ✔
Uninstalling Odigos DaemonSets ✔
Uninstalling Odigos ConfigMaps ✔
Uninstalling Odigos CRDs ✔
Uninstalling Odigos RBAC ✔
Uninstalling Odigos Secrets ✔
Uninstalling Namespace odigos-system ✔
Waiting for namespace to be deleted ✔
Rolling back odigos changes to pods ✔
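The desired prompt behavior can be sketched as follows. This is a hedged illustration of how a [Y/n] prompt is usually interpreted (the capital Y marks the default), not the CLI's actual code; the function name is hypothetical.

```python
def confirmed(answer, default=True):
    """Interpret a reply to a [Y/n] prompt.

    An empty reply (the user just pressed Enter) falls back to the default,
    which is "yes" for a prompt written as [Y/n].
    """
    reply = answer.strip().lower()
    if reply == "":
        return default
    return reply in ("y", "yes")
```

Any reply other than an empty line, y, or yes is treated as a refusal, so a stray keypress does not trigger the uninstall.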
Use a generated name for InstrumentedApplication instead of a fixed name
In logz.io destination, the region field should be a dropdown that contains the following values: https://docs.logz.io/user-guide/accounts/account-region.html
Similar to the site field in Datadog destination.
Describe the bug
The pod of the demo app crashes when instrumentation is enabled.
To Reproduce
Steps to reproduce the behavior:
Run command according to the doc:
kubectl apply -f https://raw.githubusercontent.com/keyval-dev/microservices-demo/master/release/kubernetes-manifests.yaml
Install Odigos
helm repo add odigos https://keyval-dev.github.io/odigos-charts/
helm install my-odigos odigos/odigos --namespace odigos-system --create-namespace
Enable instrumentation with the UI
Observe that the pods have crashed
kubectl get pod -n default
Expected behavior
The pod should not crash
Screenshots
The following screenshot is after I enable instrumentation for frontend pod:
Environment:
Additional context
It seems the problem is caused by some compatibility issue of Golang packages:
kubectl logs -f frontend-54d7ddf7f4-54tcn -p -c server-instrumentation
... ...
{"level":"info","ts":1675158604.7114024,"caller":"grpc/probe.go:213","msg":"closing gRPC instrumentor"}
{"level":"info","ts":1675158604.7114253,"caller":"server/probe.go:223","msg":"closing gRPC server instrumentor"}
{"level":"info","ts":1675158604.7114346,"caller":"server/probe.go:179","msg":"closing net/http instrumentor"}
{"level":"info","ts":1675158604.7114527,"caller":"mux/probe.go:179","msg":"closing gorilla/mux instrumentor"}
{"level":"error","ts":1675158604.7114713,"caller":"cli/main.go:79","msg":"error while running instrumentors","error":"field UprobeClientConnInvoke: program uprobe_ClientConn_Invoke: map .rodata: map create: read- and write-only maps not supported (requires >= v5.2)","stacktrace":"main.main\n\t/app/cli/main.go:79\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250"}
Currently, logs are extracted using the filelog receiver. Moving to OTLP for exporting has the following advantages:
Getting logs by scraping pod files will still be in use for unrecognized pods or for SDKs without a logs exporter.
For example, there is no SDK for MySQL but we still want to let the users ship MySQL logs to the selected destinations.
Implementing this issue includes two tasks:
A. configure all the SDKs to export logs via OTLP exporter
B. configure the include attribute of the collectors' filelog receiver to only include unrecognized applications or known languages without a logs exporter.
At the time of writing this issue, the following SDKs implement an OTLP logs exporter:
Language | OTLP Logs Exporter Exists |
---|---|
Java | Exists |
Python | Missing implementation for OTLP over HTTP |
Go | Not Exists |
Javascript | Not Exists |
.NET | Not Exists |
We would like to have the documentation in the same repository as Odigos' code.
Currently, all documentation files are stored at: https://github.com/keyval-dev/odigos-docs/
We should create a new subdirectory called docs and put everything there.
Change Odigos logo background to be transparent instead of black.
The list of applications on the source selection page defaults to the applications of the default namespace. The problem occurs when there is a namespace whose name starts with a lowercase "d" and that namespace is displayed first in the dropdown list.
To Reproduce
Steps to reproduce the behavior:
kubectl create ns <name>
Expected behavior
The apps listed should correspond to the information displayed in the dropdown menu.
Hello,
When we set a URL starting with https://, the config in the gateway ConfigMap has an http:// prefix.
Thanks for your work, this sounds very promising.
helm install my-odigos odigos/odigos --namespace odigos-system --create-namespace
ends with the following pods installed, but odiglet is not coming up.
odiglet-gg9tw 0/1 CrashLoopBackOff 69 (4m35s ago) 6h9m
odigos-autoscaler-6f98b9f9-86m9s 2/2 Running 13 (5h15m ago) 6h9m
odigos-instrumentor-66bd6d6f5b-ckl7q 2/2 Running 13 (5h15m ago) 6h9m
odigos-scheduler-6d694ff89f-vhlxv 2/2 Running 14 (5h15m ago) 6h9m
odigos-ui-fb4485899-mrs9p 1/1 Running 2 (5h15m ago) 6h9m
Docker Desktop 4.17.0 (99724).
on a M1 Macbook MacOS Ventura 13.3.1 (22E261)
The error via k logs odiglet-gg9tw -n odigos-system is:
failed to try resolving symlinks in path "/var/log/pods/odigos-system_odiglet-gg9tw_3ca139e5-35f1-48d3-a748-afe2fbcf40e8/odiglet/69.log": lstat /var/log/pods/odigos-system_odiglet-gg9tw_3ca139e5-35f1-48d3-a748-afe2fbcf40e8/odiglet/69.log: no such file or directory%
k version -o yaml shows:
clientVersion:
buildDate: "2022-12-08T19:58:30Z"
compiler: gc
gitCommit: b46a3f887ca979b1a5d14fd39cb1af43e7e5d12d
gitTreeState: clean
gitVersion: v1.26.0
goVersion: go1.19.4
major: "1"
minor: "26"
platform: darwin/arm64
kustomizeVersion: v4.5.7
serverVersion:
buildDate: "2022-11-09T13:29:58Z"
compiler: gc
gitCommit: 872a965c6c6526caa949f0c6ac028ef7aff3fb78
gitTreeState: clean
gitVersion: v1.25.4
goVersion: go1.19.3
major: "1"
minor: "25"
platform: linux/arm64
thanks
Issue
Problem Statement
Currently, our application displays scrollbars in certain areas of the app, which can be visually distracting and affect the overall user experience. We need to implement a solution to remove these scrollbars entirely by using CSS.
Expected Behavior
When users interact with our app, there should be no visible scrollbars on the page, even when the content exceeds the viewport height or width. Instead, the content should seamlessly flow without any scroll indicators.
Steps to Reproduce
Current Implementation
Currently, scrollbars are visible and are handled individually.
Desired Implementation
We need to apply CSS styles that hide the scrollbars throughout the entire app, ensuring a clean and seamless user experience. This should be achieved without breaking any functionality or causing layout issues.
Possible Solution
One approach to remove scrollbars is by using CSS properties such as overflow: hidden on specific elements or globally. We should carefully consider where and how this CSS is applied to ensure it doesn't interfere with the app's functionality.
Acceptance Criteria
Note
Before implementing any changes, it's essential to thoroughly test the proposed solution to ensure it doesn't introduce any unforeseen issues or negatively impact the user experience.
Please feel free to assign this issue to a developer who can work on implementing the solution and provide updates as progress is made. If you have any questions or need further clarification, don't hesitate to ask.
Greetings from the ZincObserve team - https://github.com/zinclabs/zincobserve . We would love to see support for ZincObserve in Odigos.
Is your feature request related to a problem? Please describe.
Support for sending logs, metrics and traces to ZincObserve
Describe the solution you'd like
A direct selection for ZincObserve in odigos that allows people to send logs, metrics and traces to ZincObserve. ZincObserve supports:
Logs - JSON array or elasticsearch _bulk format
Metrics - Prometheus remote write
Traces - OTLP
Here are details around it - https://zinc.dev/docs/guide/ingestion/
There should not be much for the Odigos team to do to achieve this, as all the elements are already there.
Describe alternatives you've considered
Use the elasticsearch, opentelemetry, and prometheus endpoints separately.
Additional context
No
2022-10-18T08:18:24.473Z error helper/transformer.go:110 Failed to process entry {"kind": "receiver", "name": "filelog", "pipeline": "logs", "operator_id": "parser-crio", "operator_type": "regex_parser", "error": {"description": "time parser: parsing time \"2022-10-18T16:18:24.42145571+08:00\" as \"2006-01-02T15:04:05.000000000-07:00\": cannot parse \"08:00\" as \".000000000\""}, "action": "send", "entry": {"observed_timestamp":"2022-10-18T08:18:24.473080079Z","timestamp":"0001-01-01T00:00:00Z","body":"2022-10-18T16:18:24.42145571+08:00 stdout F [GIN] 2022/10/18 - 08:18:24 | 500 | 20.148µs | 10.42.0.4 | POST \"/break?http_code=500\"","attributes":{"log":"[GIN] 2022/10/18 - 08:18:24 | 500 | 20.148µs | 10.42.0.4 | POST \"/break?http_code=500\"","log.file.path":"/var/log/pods/default_historicaltruth-5d8c48c95c-2gqnm_0b6a83d1-2f18-414d-ad37-c161ea393804/historicaltruth/0.log","logtag":"F","stream":"stdout","time":"2022-10-18T16:18:24.42145571+08:00"},"severity":0,"scope_name":""}}
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/helper.(*TransformerOperator).HandleEntryError
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.55.0/operator/helper/transformer.go:110
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/helper.(*ParserOperator).ParseWith
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.55.0/operator/helper/parser.go:173
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/helper.(*ParserOperator).ProcessWithCallback
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.55.0/operator/helper/parser.go:116
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/helper.(*ParserOperator).ProcessWith
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.55.0/operator/helper/parser.go:102
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/parser/regex.(*Parser).Process
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.55.0/operator/parser/regex/regex.go:103
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/transformer/router.(*Transformer).Process
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.55.0/operator/transformer/router/router.go:135
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/helper.(*WriterOperator).Write
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.55.0/operator/helper/writer.go:65
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/file.(*Input).emit
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.55.0/operator/input/file/file.go:65
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer.(*Reader).ReadToEnd
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.55.0/fileconsumer/reader.go:138
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer.(*Input).poll.func1
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.55.0/fileconsumer/file.go:159
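The error above is consistent with a timestamp that carries eight fractional digits (.42145571) being matched against a layout that expects exactly nine (.000000000). As a hedged illustration of that reading only, and not a change to the collector's parser, the Python sketch below pads the fractional part to nine digits so the string fits a fixed nine-digit layout. The function name and regular expression are assumptions made for this example.

```python
import re

# Timestamp with an optional fractional part of 1 to 9 digits and a
# trailing zone that is either Z or a +hh:mm / -hh:mm offset.
_TIMESTAMP = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d{1,9}))?(Z|[+-]\d{2}:\d{2})$"
)


def pad_fraction(timestamp):
    """Right-pad the fractional seconds with zeros to exactly nine digits."""
    match = _TIMESTAMP.match(timestamp)
    if match is None:
        raise ValueError(f"unrecognized timestamp: {timestamp!r}")
    seconds, fraction, zone = match.groups()
    return f"{seconds}.{(fraction or '').ljust(9, '0')}{zone}"
```

A layout that accepts a variable number of fractional digits would avoid the need for this kind of padding.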
The current language detection mechanism (creating a pod, detecting the language, and then terminating the pod) is error-prone. I saw some OutOfPods errors when running on EKS clusters.
Moving the same language detection logic to odiglet will make the language detection process faster and more robust.
Build and increment agents' container images as part of the CI/CD workflow
This log line appears frequently and does not indicate an error. We should either log it at a higher debug level or remove the line.
Is your feature request related to a problem? Please describe.
We have an ELK-like stack (FLOOD stack?) with OpenSearch for monitoring and observability. This project would be a game changer if OpenSearch was supported as a destination.
Describe the solution you'd like
OpenSearch as log/metric/trace destination for odigos
Describe the bug
Minor duplication in README - "Getting Started Guide", "Documentation", and "quickstart guide" all end up at https://docs.odigos.io/intro as it is the default docs page
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Links lead to different information/user flows, or links are deduplicated and have consistent labeling.
GitHub just announced new ARM-based runners:
https://github.blog/2023-10-02-introducing-the-new-apple-silicon-powered-m1-macos-larger-runner-for-github-actions/
Currently, the majority of the build time is spent on building ARM images on x64 runners.
Moving to building the ARM images on ARM runners should make our build time much faster.
Currently, the only validation the frontend performs is that the destination URL is a valid URL.
We can help users avoid typing the wrong URLs by performing more validations in the frontend before persisting the destination.
Examples of possible additional validations:
9200
4317
The first step in implementing this issue may be just validating a single destination in addition to thinking about a good design for validations in frontend.
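A minimal sketch of one such validation, assuming the check is about well-known default ports. The two port numbers come from the list above; mapping 9200 to Elasticsearch and 4317 to OTLP over gRPC reflects those services' usual defaults, and the destination type keys and function name are hypothetical. It is written in Python for illustration only, not as frontend code.

```python
from urllib.parse import urlsplit

# Hypothetical destination type keys mapped to their usual default ports.
EXPECTED_PORTS = {"elasticsearch": 9200, "otlp-grpc": 4317}


def port_warning(dest_type, url):
    """Return a warning string if the port looks wrong, otherwise None."""
    expected = EXPECTED_PORTS.get(dest_type)
    if expected is None:
        return None  # no known default port for this destination type
    # urlsplit only recognizes host:port when a "//" authority marker is present.
    target = url if "://" in url else "//" + url
    try:
        port = urlsplit(target).port
    except ValueError:
        return f"{url!r} does not contain a valid port"
    if port is None:
        return f"no port given; {dest_type} usually listens on {expected}"
    if port != expected:
        return f"port {port} is unusual for {dest_type}; expected {expected}"
    return None
```

Returning a warning rather than rejecting the value keeps the check advisory, since a backend may legitimately run on a non-default port.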
The Odigos UI binary should have a flag that will print the current version of the UI.
Add to GitHub actions workflow: build and push container images of all the instrumentation containers
Describe the solution you'd like
Currently, instrumentation is achieved through device plugins, which are deployed through daemonsets. This approach does not work well in serverless services.
I would like to know which URL to set for the destination when using Loki as the backend. I can't find any example configuration or detailed docs on this, and my configuration failed. Please advise, thanks.
My Loki deployment is as below:
root@xxxxx:# kubectl -n logging get pods
NAME READY STATUS RESTARTS AGE
loki-0 1/1 Running 0 4m32s
loki-grafana-7595cff4f5-4szxs 2/2 Running 0 4m31s
loki-promtail-2n28d 1/1 Running 0 4m32s
loki-promtail-46q2l 1/1 Running 0 4m32s
loki-promtail-6fjfm 1/1 Running 0 4m32s
loki-promtail-8fzzp 1/1 Running 0 4m32s
loki-promtail-8ndgj 1/1 Running 0 4m32s
loki-promtail-h7nfx 1/1 Running 0 4m32s
loki-promtail-nn25b 1/1 Running 0 4m32s
loki-promtail-r7nmv 1/1 Running 0 4m32s
loki-promtail-xbxpg 1/1 Running 0 4m32s
root@xxxxx:# kubectl -n logging get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
loki ClusterIP 10.68.124.157 3100/TCP 4m9s
loki-grafana ClusterIP 10.68.25.148 80/TCP 4m9s
loki-headless ClusterIP None 3100/TCP 4m9s
loki-memberlist ClusterIP None 7946/TCP 4m9s
It seems that the odigos add CLI command does not do anything.
I think it can be safely removed.
As VictoriaMetrics gains popularity, we would like to add it as a new destination in Odigos.
Describe the bug
While installing in a cluster with existing deployments, installing and using the opt-out option for odigos leads to a situation where cluster services end up in crashloopbackoff with no easy way to fix other than reapplying all deployment manifests.
I understand that the documentation states uninstalling is a work in progress, but the mutation of deployments and statefulsets leaves them in a bad state after uninstall, and users without gitops or similar deployment methods will face challenges in fixing their deployments. A workaround and a prominent warning should be displayed on this repo until there is a clean uninstall path.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Changes by odigos to deployments should be reverted when odigos is disabled and uninstalled.
Currently, every time a new Kubernetes version comes out or when a new OpenTelemetry instrumentation is updated, dependabot opens many pull requests for every dependency. We can use the group version updates feature of dependabot to open a single pull request that contains all the relevant dependencies changes.
As more and more destinations are added, it will be harder for users to find the exact destination they are looking for.
A common way to solve this problem is by implementing a search.
Search can probably happen on the client side.
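A client-side search can be as simple as a case-insensitive substring match over the destination names. The sketch below shows the idea in Python for illustration; the function name is hypothetical and the real UI would implement this in its own frontend code.

```python
def search_destinations(destinations, query):
    """Return the destination names that contain the query, ignoring case."""
    needle = query.strip().lower()
    if not needle:
        return list(destinations)  # an empty query shows everything
    return [name for name in destinations if needle in name.lower()]
```

Because the full destination list is already loaded in the browser, no backend request is needed per keystroke.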
Destination name: Axiom
Support OpenTelemetry format: yes, see: https://axiom.co/docs/send-data/opentelemetry
Supported signals: traces, metrics, logs
The main steps involved when debugging Odigos locally:
TAG=<CURRENT-ODIGOS-VERSION> make build-images load-to-kind
odigos version
odigos-system namespace: kubectl delete pods --all -n odigos-system
Destination name: Highlight.io
Support OpenTelemetry format: yes, see: https://www.highlight.io/docs/getting-started/backend-logging/file
Supported signals: logs
Describe the bug
Java applications in bank-of-athnos always crash when instrumentation is enabled
To Reproduce
Steps to reproduce the behavior:
# install bank-of-athnos
kubectl apply -f https://raw.githubusercontent.com/keyval-dev/bank-of-athnos/main/release/kubernetes-manifests.yaml
# install odigos
helm repo add odigos https://keyval-dev.github.io/odigos-charts/
helm install my-odigos odigos/odigos --namespace odigos-system --create-namespace
Enable instrumentation with the UI and observe that the pods crash
The current categories are only managed / open source.
We plan to add many more observability backends in the future, which will require more granular categories such as Cloud Providers, Security, Finance, etc.
When installing Odigos via the CLI by running odigos install, the CLI reports a successful installation even if Odiglet (the deployed DaemonSet) is not running. Kubernetes may refuse to start the Odiglet pods (for example, if there is a PodSecurityPolicy that blocks hostPID).
The arePodsReady function should also check that, for any DaemonSet in the ns namespace, the number of running pods equals the desired number and is greater than 1.
https://github.com/keyval-dev/odigos/blob/2f7d45c9a2deb1751bab3e120fe86a1ea1fdc7f5/cli/cmd/install.go#L86
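The extra readiness check can be sketched against the DaemonSet status fields that Kubernetes reports (desiredNumberScheduled and numberReady). This is a hedged Python illustration of the condition, not the Go code in install.go; the function name is hypothetical, and the minimum parameter is an assumption that should be set to the bound intended above.

```python
def daemonset_ready(status, minimum=1):
    """Check a DaemonSet status dict for full readiness.

    Ready means: enough pods are desired in the first place, and every
    desired pod is reported ready.
    """
    desired = status.get("desiredNumberScheduled", 0)
    ready = status.get("numberReady", 0)
    return desired >= minimum and ready == desired
```

A DaemonSet whose pods were all rejected would typically fail this check, either because no pods are scheduled or because none become ready, so the installer would not report success.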
Currently, our README.md contains a hard-coded table of all the destinations supported by Odigos.
The same data is available in destination/data/*.yaml files.
The destination table could be automatically generated from the YAML files, making it always in sync with the destinations actually supported.
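A rough sketch of the generation step, assuming the YAML files have already been parsed into dictionaries. The field names (name, traces, metrics, logs) and the column layout are assumptions for illustration; the actual schema of the files in destination/data may differ.

```python
def destinations_table(destinations):
    """Render a Markdown table from already-parsed destination entries."""

    def mark(flag):
        return "yes" if flag else "no"

    lines = ["| Destination | Traces | Metrics | Logs |", "|---|---|---|---|"]
    # Sort by name so the generated table is stable between runs.
    for dest in sorted(destinations, key=lambda d: d["name"].lower()):
        lines.append(
            f"| {dest['name']} | {mark(dest.get('traces'))} "
            f"| {mark(dest.get('metrics'))} | {mark(dest.get('logs'))} |"
        )
    return "\n".join(lines)
```

Running a step like this in CI and failing when the README differs from the generated output would keep the two in sync.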
The current guide for adding a new destination is out of date.
We changed the way new destinations are added.
Here is an example of a PR that added a destination the new way: #535
The relevant doc file is: docs/adding-new-dest.mdx
Kubernetes clusters with PodSecurityPolicy enabled may refuse to deploy Odiglet due to the privileged permissions required (hostPID, hostNetwork, hostPath, etc).
We should detect if PodSecurityPolicy is enabled in this cluster (maybe by finding if the resource exists?) and, if so, create a new policy and bind it to the odiglet service account.
Describe the bug
Installing odigos stack without version specified via cli results in odigos pods failure.
To Reproduce
Steps to reproduce the behavior:
Run the cli install command without the --version tag specified.
Observe the InvalidImageName status for odigos pods.
Expected behavior
If the version isn't explicitly set, latest should be applied, even if this Docker image tag doesn't exist yet.
“Failed to apply default image tag "keyval/odigos-odiglet:": couldn't parse image reference "keyval/odigos-odiglet:": invalid reference format”
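The expected fallback can be sketched as below. The error message shows an image reference with an empty tag (keyval/odigos-odiglet:), so the sketch substitutes latest when no version is given. This is an illustrative Python helper with a hypothetical name, not the CLI's code.

```python
def image_reference(repository, version=""):
    """Build an image reference, falling back to the latest tag."""
    tag = version.strip() or "latest"
    return f"{repository}:{tag}"
```

With this fallback, an install without a version yields keyval/odigos-odiglet:latest instead of an unparseable reference.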