
bakdata / streams-explorer

44 stars · 9 watchers · 4 forks · 3.97 MB

Explore Apache Kafka data pipelines in Kubernetes.

Home Page: https://medium.com/bakdata/exploring-data-pipelines-in-apache-kafka-with-streams-explorer-8337dd11fdad

License: MIT License

Dockerfile 0.31% Python 69.80% CSS 0.33% TypeScript 28.47% JavaScript 0.44% Shell 0.20% Mustache 0.46%
python data-pipelines hacktoberfest kubernetes react apache-kafka data-stream kafka-connect kafka-streams

streams-explorer's People

Contributors

bakdata-bot, dependabot[bot], disrupted, jakobedding, michaelkora, philipp94831, sujuka99, twiechert, yannick-roeder


streams-explorer's Issues

Display Kafka topic configs

Displaying Kafka topic configs such as cleanup.policy (compact/delete) in the context of whole pipelines can help to find errors and potential improvements.
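Such a display could build on a small helper that extracts the cleanup policies from a topic's config map (e.g. as returned by Kafka's AdminClient). This is a hypothetical sketch, not code from streams-explorer:

```python
# Hypothetical helper (not part of streams-explorer): given a topic's config
# map, e.g. as returned by Kafka's AdminClient, extract the cleanup policies.
def cleanup_policies(topic_config: dict) -> set:
    # Kafka's default cleanup.policy is "delete"; it may also be a
    # comma-separated combination such as "compact,delete".
    policy = topic_config.get("cleanup.policy", "delete")
    return {p.strip() for p in policy.split(",") if p.strip()}


config = {"cleanup.policy": "compact,delete", "retention.ms": "604800000"}
print(sorted(cleanup_policies(config)))  # ['compact', 'delete']
```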

Clean Docker image

Some issues I noticed with the current Docker image:

  • something is installing Python 2.7
  • uvicorn/fastapi from the base image are overwritten by installing from requirements.txt; is that necessary?
  • unneeded packages should be cleaned up after the build

Search and focus nodes

To make nodes easy to find, a search across all nodes that allows selecting and focusing a node could enhance usability.

Display graph on full window height

Currently, there is whitespace below the graph that is reserved for the details pane. It would be better to use the full height and display the details "above" the graph.

Move graph layout calculation to frontend

Possible relevant use cases:

  • Users could have full control over graph layout configurations
  • Grouping nodes into separate (sub-)pipelines could be done using combo groups of G6
  • Grouping/filtering nodes can be achieved while providing a useful graph layout
  • Updates (new nodes, e.g. connectors, streams apps, topics, ...) could be handled without a full graph reload
  • In the future, Kafka Streams app nodes could, for example, be expanded to display their internal topologies as part of the whole data pipeline

Error loading webpage: Failed to fetch: 500

2021-02-03 16:36:13.848 | DEBUG    | streams_explorer.core.services.metric_providers:refresh_data:100 - Pulling metrics from Prometheus
10.0.1.141:30050 - "GET /api/metrics HTTP/1.1" 500
[2021-02-03 16:36:21 +0000] [7] [ERROR] Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/uvicorn/protocols/http/httptools_impl.py", line 396, in run_asgi
    result = await app(self.scope, self.receive, self.send)
  File "/usr/local/lib/python3.8/site-packages/uvicorn/middleware/proxy_headers.py", line 45, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.8/site-packages/fastapi/applications.py", line 199, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.8/site-packages/starlette/applications.py", line 111, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.8/site-packages/starlette/middleware/errors.py", line 181, in __call__
    raise exc from None
  File "/usr/local/lib/python3.8/site-packages/starlette/middleware/errors.py", line 159, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.8/site-packages/starlette/exceptions.py", line 82, in __call__
    raise exc from None
  File "/usr/local/lib/python3.8/site-packages/starlette/exceptions.py", line 71, in __call__
    await self.app(scope, receive, sender)
  File "/usr/local/lib/python3.8/site-packages/starlette/routing.py", line 566, in __call__
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.8/site-packages/starlette/routing.py", line 227, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.8/site-packages/starlette/routing.py", line 41, in app
    response = await func(request)
  File "/usr/local/lib/python3.8/site-packages/fastapi/routing.py", line 201, in app
    raw_response = await run_endpoint_function(
  File "/usr/local/lib/python3.8/site-packages/fastapi/routing.py", line 148, in run_endpoint_function
    return await dependant.call(**values)
  File "/app/streams_explorer/api/routes/metrics.py", line 18, in metrics
    return streams_explorer.get_metrics()
  File "/app/streams_explorer/streams_explorer.py", line 62, in get_metrics
    return self.data_flow.get_metrics()
  File "/app/streams_explorer/core/services/dataflow_graph.py", line 92, in get_metrics
    return self.metrics_provider.get()
  File "/app/streams_explorer/core/services/metric_providers.py", line 80, in get
    self.update()
  File "/app/streams_explorer/core/services/metric_providers.py", line 61, in update
    self.metrics = [
  File "/app/streams_explorer/core/services/metric_providers.py", line 65, in <listcomp>
    self.get_consumer_group(node_id, node)
  File "/app/streams_explorer/core/services/metric_providers.py", line 54, in get_consumer_group
    node_type: NodeTypesEnum = node["node_type"]
KeyError: 'node_type'

I deleted a connector along with its dead letter topic and consumer group; maybe that is related.
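One possible guard (a sketch, not the actual fix in metric_providers.py) is to treat `node_type` as optional and skip nodes that lack it, e.g. leftovers of a deleted connector, instead of raising `KeyError` and failing the whole request:

```python
# Sketch of a defensive variant of the lookup that raised KeyError: 'node_type'.
# Nodes whose attributes lack node_type (e.g. stale entries after a connector
# was deleted) are skipped instead of crashing the /api/metrics endpoint.
def consumer_groups(nodes):
    groups = []
    for node_id, node in nodes:
        node_type = node.get("node_type")  # None instead of KeyError
        if node_type is None:
            continue  # incomplete or stale node, ignore it
        if node_type == "connector":
            groups.append(f"connect-{node_id}")  # hypothetical naming scheme
    return groups
```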

Helm chart - RBAC objects should iterate over SE_K8S__deployment__namespaces

The namespaced RBAC resources should be decoupled from the deployment namespace of the streams-explorer.
For example, in our cluster, we deploy the streams-explorer in infra and plan to discover streams apps in data and app.

I feel this is a valid use case and should be supported by the helm chart:

Proposal:

  • the Role becomes a ClusterRole
  • a RoleBinding from the ClusterRole to the ServiceAccount is created on a per-namespace basis
  • this effectively binds the ClusterRole in the target namespaces
  • this can be made optional by introducing:

rbac:
  enabled: true
  clusterScope:
    enabled: true
    namespaces: ["apps", "data"]
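A sketch of what the templating could look like, assuming helper names like "streams-explorer.fullname" and "streams-explorer.serviceAccountName" (hypothetical; the chart's actual helpers and file layout may differ):

```yaml
# Hypothetical Helm template sketch: one RoleBinding per target namespace,
# binding the ClusterRole to the streams-explorer ServiceAccount.
{{- if and .Values.rbac.enabled .Values.rbac.clusterScope.enabled }}
{{- range .Values.rbac.clusterScope.namespaces }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: {{ include "streams-explorer.fullname" $ }}
  namespace: {{ . }}
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: {{ include "streams-explorer.fullname" $ }}
subjects:
  - kind: ServiceAccount
    name: {{ include "streams-explorer.serviceAccountName" $ }}
    namespace: {{ $.Release.Namespace }}
{{- end }}
{{- end }}
```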

Update Readme

  • new features
  • link to announcement blog post
  • update screenshot (use a URL instead of a Markdown link this time so it works on PyPI)
  • add full image of the demo pipeline
  • new extractor plugins
  • update extractor gist

[Question] Request for clarification of some points

Thanks for the nice work! I have a couple of questions I would like to clarify, as we are considering a PoC of the streams-explorer.

Streams bootstrap chart/lib optional
It is not actually required to use the streams-bootstrap chart or the streams library, as long as our Deployments are labeled accordingly and expose the simplified topology via env variables, right?

Exporters needed to visualize simple pipeline
Is the presence of all metrics really needed to build a first visualization of a pipeline? In particular, the Kafka Connect exporter would not be justified if Kafka Connect is not used at all.

Dataflow between Kafka stream apps visualized
Consider two distinct Kafka Streams apps, where the second one consumes the output of the first. Is that dataflow visualized accordingly, or are the two shown as separate "apps" without a connecting link?

Replacing exporters
As we use Burrow for lag monitoring, I wonder if we could plug in our own extractor for the lag metrics and, if so, where the appropriate place in the code would be.

Other than that, I would like to understand which exact metrics are required so that we could potentially trick the streams explorer by rewriting our existing metrics to match the required ones.
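For illustration, mapping a Burrow v3 consumer-group status response to a (group, total lag) pair could look like the sketch below; the "status"/"totallag" field names follow Burrow's HTTP API but should be verified against your Burrow version:

```python
# Sketch: extract the consumer group name and total lag from a Burrow v3
# /v3/kafka/<cluster>/consumer/<group>/lag response. The field names ("status",
# "group", "totallag") are based on Burrow's HTTP API; verify before relying on them.
def burrow_lag(payload: dict) -> tuple:
    status = payload.get("status", {})
    return status.get("group", "unknown"), int(status.get("totallag", 0))


sample = {"error": False, "status": {"group": "my-streams-app", "totallag": 42}}
print(burrow_lag(sample))  # ('my-streams-app', 42)
```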

Use topology endpoint for topic extraction
We'd prefer to read the topology dynamically through a REST endpoint exposed by our streaming apps instead of relying on env variables set via Helm. Do you consider this a reasonable alternative, and could you give a hint where that logic would be implemented?
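A sketch of the parsing side, assuming the apps expose their topology as JSON with "sources" and "sinks" (this response shape is an assumption, not an existing streams-explorer or Kafka Streams interface):

```python
# Hypothetical sketch: derive input/output topics from a topology JSON exposed
# by a streaming app's REST endpoint. The "sources"/"sinks" shape is assumed,
# not an existing streams-explorer interface.
def topics_from_topology(topology: dict):
    inputs = {t for src in topology.get("sources", []) for t in src.get("topics", [])}
    outputs = {sink["topic"] for sink in topology.get("sinks", []) if "topic" in sink}
    return inputs, outputs


topology = {
    "sources": [{"name": "read-orders", "topics": ["orders"]}],
    "sinks": [{"name": "write-totals", "topic": "order-totals"}],
}
print(topics_from_topology(topology))  # ({'orders'}, {'order-totals'})
```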

Looking forward to your response!

Release 1.1.5 wrongly classifies apps as streaming apps

streams-explorer 1.1.5 seems to be classifying a few apps (e.g. Redis and Keycloak) as streaming apps, even though they carry neither topic env variables nor a pipeline selector or similar.

(screenshot)

These are the log messages we are observing: (screenshot)

On 1.1.4 this is not observed, so I suspect it is a side effect introduced with the StatefulSet (sts) support.

Show refresh interval for metrics in frontend

Right now the frontend doesn't show the refresh interval for metrics. We should display it to the user. Additionally, we could add a dropdown menu to change the interval from the current default of 30s.
