bakdata / streams-explorer
Explore Apache Kafka data pipelines in Kubernetes.
License: MIT License
Showing Kafka topic configs such as compact/delete in the context of whole pipelines can help to find errors and improvements.
Some issues I noticed with the current Docker image:
- requirements.txt
- Run npm audit
Currently, a sink for Elasticsearch is added if the config includes the transforms.changeTopic.replacement setting. A sink for this connector should only be added if the connector.class is ElasticsearchSinkConnector.
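A guard along these lines could restrict the sink node to the right connector class. This is a minimal sketch; the function name and config shape are assumptions, not the actual streams-explorer code:

```python
def is_elasticsearch_sink(connector_config: dict) -> bool:
    """Only treat a connector as an Elasticsearch sink if its class says so,
    regardless of whether transforms.changeTopic.replacement is set."""
    connector_class = connector_config.get("connector.class", "")
    return connector_class.endswith("ElasticsearchSinkConnector")
```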
Add a /health endpoint and liveness/readiness probes in the Helm chart.
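A framework-agnostic sketch of what such a /health handler could return; the response shape and the check names are assumptions, and the actual endpoint would be wired into the FastAPI app:

```python
def health_status(checks: dict) -> dict:
    """Aggregate component checks (e.g. Kubernetes API reachable,
    metric provider responding) into a /health response body."""
    healthy = all(checks.values())
    return {"status": "ok" if healthy else "unhealthy", "checks": checks}
```

The liveness/readiness probes in the chart would then simply issue HTTP GETs against this endpoint.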
Using the metrics, we could visually differentiate edges where data is currently being written into or read out of a topic. For the differentiation, we could use an animation: https://g6.antv.vision/en/examples/scatter/edge#edge
Repeatedly update the graph in the backend. We could use https://fastapi-utils.davidmontague.xyz/user-guide/repeated-tasks/ to trigger the update endpoint every X minutes.
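fastapi-utils' repeated-tasks decorator is one option; the same pattern can be sketched with plain asyncio (the names here are illustrative, not the actual backend code):

```python
import asyncio

async def repeat(interval_seconds: float, func, max_runs=None):
    """Call func, then sleep, in a loop -- the pattern behind the
    proposed periodic graph update. max_runs only exists so the
    example terminates."""
    runs = 0
    while max_runs is None or runs < max_runs:
        func()
        runs += 1
        await asyncio.sleep(interval_seconds)

# Example: trigger a (hypothetical) update callback three times.
calls = []
asyncio.run(repeat(0.01, lambda: calls.append("update"), max_runs=3))
```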
To make nodes easy to find, a search over all nodes that allows selecting and focusing a node could enhance usability.
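Such a search could be a simple case-insensitive substring match over node labels; the node shape below is an assumption:

```python
def search_nodes(nodes: list, query: str) -> list:
    """Return nodes whose label contains the query, case-insensitively."""
    q = query.lower()
    return [node for node in nodes if q in node.get("label", "").lower()]

nodes = [{"id": "t1", "label": "my-topic"}, {"id": "a1", "label": "demo-app"}]
```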
Steps to reproduce:
Expected behavior: redirect to the overview of all pipelines and display a warning
Currently, there is whitespace below the graph that is reserved for the details pane. I think it would be better to use the full height and display the details "above" the graph.
Currently, the status of replicas is displayed; next to it, we could show the number of available replicas.
Possible relevant use cases:
The graph canvas should always fill the whole window, including when the window is resized.
2021-02-03 16:36:13.848 | DEBUG | streams_explorer.core.services.metric_providers:refresh_data:100 - Pulling metrics from Prometheus
10.0.1.141:30050 - "GET /api/metrics HTTP/1.1" 500
[2021-02-03 16:36:21 +0000] [7] [ERROR] Exception in ASGI application
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/uvicorn/protocols/http/httptools_impl.py", line 396, in run_asgi
result = await app(self.scope, self.receive, self.send)
File "/usr/local/lib/python3.8/site-packages/uvicorn/middleware/proxy_headers.py", line 45, in __call__
return await self.app(scope, receive, send)
File "/usr/local/lib/python3.8/site-packages/fastapi/applications.py", line 199, in __call__
await super().__call__(scope, receive, send)
File "/usr/local/lib/python3.8/site-packages/starlette/applications.py", line 111, in __call__
await self.middleware_stack(scope, receive, send)
File "/usr/local/lib/python3.8/site-packages/starlette/middleware/errors.py", line 181, in __call__
raise exc from None
File "/usr/local/lib/python3.8/site-packages/starlette/middleware/errors.py", line 159, in __call__
await self.app(scope, receive, _send)
File "/usr/local/lib/python3.8/site-packages/starlette/exceptions.py", line 82, in __call__
raise exc from None
File "/usr/local/lib/python3.8/site-packages/starlette/exceptions.py", line 71, in __call__
await self.app(scope, receive, sender)
File "/usr/local/lib/python3.8/site-packages/starlette/routing.py", line 566, in __call__
await route.handle(scope, receive, send)
File "/usr/local/lib/python3.8/site-packages/starlette/routing.py", line 227, in handle
await self.app(scope, receive, send)
File "/usr/local/lib/python3.8/site-packages/starlette/routing.py", line 41, in app
response = await func(request)
File "/usr/local/lib/python3.8/site-packages/fastapi/routing.py", line 201, in app
raw_response = await run_endpoint_function(
File "/usr/local/lib/python3.8/site-packages/fastapi/routing.py", line 148, in run_endpoint_function
return await dependant.call(**values)
File "/app/streams_explorer/api/routes/metrics.py", line 18, in metrics
return streams_explorer.get_metrics()
File "/app/streams_explorer/streams_explorer.py", line 62, in get_metrics
return self.data_flow.get_metrics()
File "/app/streams_explorer/core/services/dataflow_graph.py", line 92, in get_metrics
return self.metrics_provider.get()
File "/app/streams_explorer/core/services/metric_providers.py", line 80, in get
self.update()
File "/app/streams_explorer/core/services/metric_providers.py", line 61, in update
self.metrics = [
File "/app/streams_explorer/core/services/metric_providers.py", line 65, in <listcomp>
self.get_consumer_group(node_id, node)
File "/app/streams_explorer/core/services/metric_providers.py", line 54, in get_consumer_group
node_type: NodeTypesEnum = node["node_type"]
KeyError: 'node_type'
I deleted a connector and the corresponding dead-letter topic and consumer group; maybe it is related.
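One defensive fix would be to tolerate nodes that lack a node_type instead of raising, e.g. by skipping them in get_consumer_group. This is a sketch against the line shown in the traceback; the surrounding lookup logic is assumed:

```python
def get_consumer_group(node_id, node):
    """Skip nodes without a node_type (e.g. a stale node left over after a
    connector was deleted) instead of raising KeyError."""
    node_type = node.get("node_type")
    if node_type is None:
        return None
    return node_id, node_type  # placeholder for the real lookup
```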
The namespaced rbac resources should be decoupled from the deployment namespace of the streams-explorer.
For example, in our cluster, we deploy streams-explorer in infra and plan to discover stream apps in data and app.
I feel this is a valid use case and should be supported by the helm chart:
Proposal:
rbac:
  enabled: true
  clusterScope:
    enabled: true
    namespaces: ["apps", "data"]
Source connector nodes should be shown in the graph
Thanks for the nice work! I have a couple of questions I would like to clarify, as we are considering a PoC of the streams-explorer.
Streams bootstrap chart/lib optional
It is not actually required to use the streams-bootstrap chart or the streams library, as long as our Deployments are labeled accordingly and expose the simplified topology via environment variables. Right?
Exporters needed to visualize simple pipeline
Is the presence of all metrics really needed to build a first visualization of a pipeline? In particular, the Connect exporter would not be justified if Kafka Connect is not used at all.
Dataflow between Kafka stream apps visualized
Consider two distinct Kafka Streams apps, where the second one consumes the output of the first. Is that dataflow visualized accordingly, or are there two separate "apps" without a connecting link?
Replacing exporters
As we use Burrow for lag monitoring, I wonder if we could plug in our own extractor for the lag metrics, and if so, where the appropriate place in the code would be.
Other than that, I would like to understand exactly which metrics are required, so that we could potentially trick the streams-explorer by rewriting our existing metrics to match the required ones.
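Based on the module and method names visible in the traceback above (streams_explorer.core.services.metric_providers with update() and get()), a Burrow-backed provider might look roughly like this. Everything here, including the base-class shape, the method signatures, and the Burrow response mapping, is an assumption rather than the actual streams-explorer API:

```python
class BurrowMetricProvider:
    """Hypothetical sketch: plug Burrow lag data into a custom metric provider."""

    def __init__(self, burrow_url: str):
        self.burrow_url = burrow_url
        self.metrics: list = []

    def fetch_lag(self) -> dict:
        # In a real implementation this would call Burrow's HTTP API,
        # e.g. GET {burrow_url}/v3/kafka/<cluster>/consumer/<group>/lag,
        # and return {consumer_group: total_lag}.
        raise NotImplementedError

    def update(self) -> None:
        # Map the Burrow response onto the metric names the frontend expects.
        self.metrics = [
            {"consumer_group": group, "consumer_lag": lag}
            for group, lag in self.fetch_lag().items()
        ]

    def get(self) -> list:
        self.update()
        return self.metrics

class FakeBurrow(BurrowMetricProvider):
    """Stand-in that returns canned lag data instead of calling Burrow."""
    def fetch_lag(self) -> dict:
        return {"group-a": 5}

result = FakeBurrow("http://burrow.example:8000").get()
```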
Use topology endpoint for topic extraction
We'd prefer to read the topology dynamically through a REST endpoint exposed by our streaming apps instead of relying on environment variables being set via Helm. Do you consider this a reasonable alternative, and could you give a hint where that logic would be supposed to be implemented?
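On the extraction side, this could boil down to parsing a topology document returned by each app. The endpoint path and JSON schema below are purely hypothetical:

```python
import json

def parse_topology(payload: str):
    """Extract input/output topics from a (hypothetical) /topology response."""
    data = json.loads(payload)
    return data.get("input_topics", []), data.get("output_topics", [])

inputs, outputs = parse_topology('{"input_topics": ["in"], "output_topics": ["out"]}')
```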
Looking forward to your response!
streams-explorer 1.1.5 seems to classify a few apps (e.g. Redis and Keycloak) as streaming apps, even though they carry neither topic environment variables nor a pipeline selector or similar.
These are the log messages we are observing:
This is not observed in 1.1.4, so I suspect it is a side effect introduced with the StatefulSet support.
We should check all our dependencies for support of React 17 and see if there are any breaking changes.
For now we need to wait for restful-react to add support for it. contiamo/restful-react#333
Using Prometheus, the number of pods currently running in a Deployment could be fetched using the kube_deployment_status_replicas metric.
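The corresponding PromQL query could be built like this; the label names follow kube-state-metrics, while the helper itself is illustrative:

```python
def replicas_query(deployment: str, namespace: str) -> str:
    """PromQL query for the replica count of a Deployment,
    as reported by kube-state-metrics."""
    return (
        f'kube_deployment_status_replicas'
        f'{{deployment="{deployment}",namespace="{namespace}"}}'
    )
```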
Right now the frontend doesn't show the refresh interval for metrics. We should show this to the user. Additionally we could add a dropdown menu to change the interval from the current default of 30s.
e.g., host for ES and JDBC
Source nodes should be handled the other way around
It is quite hard to identify connected components in the overall view because nodes are placed equidistantly.
This way, an alternative implementation can reuse the existing functionality and plug in alternative metric names.
Consumer group is always connect-$connectorName
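Since Kafka Connect derives a sink connector's consumer group from the connector name, the mapping is a one-liner:

```python
def connect_consumer_group(connector_name: str) -> str:
    """Kafka Connect sink connectors consume under the group 'connect-<name>'."""
    return f"connect-{connector_name}"
```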
To support helm upgrade, the standard Helm labels should be added to the ServiceAccount.
Topics are separated by "," and can contain leading and trailing spaces, e.g., "topics": "my-topic-1, my-topic-2". This only shows a connection to my-topic-1.
We have different streams app deployments that share the same consumer group. Both show the same active read rate, but only one has active replicas.
We use https://github.com/wakeful/kafka_connect_exporter for the metrics
Error topics are not affected
Steps to reproduce:
It is common for Kafka Streams apps to run as StatefulSets. However, streams-explorer currently seems limited to Deployment- and CronJob-based discovery.
I therefore propose adding support for StatefulSets, and ideally even ConfigMaps, for discovery.