pinot-wiki's Introduction

Building a real-time analytics dashboard with Streamlit, Apache Pinot, and Apache Kafka

Clone repository

git clone git@github.com:mneedham/pinot-wiki.git && cd pinot-wiki

Spin up all components

docker-compose up

or on an M1 Mac:

docker-compose -f docker-compose-m1.yml up

Setup Python

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Create Kafka topic

docker exec -it kafka-wiki kafka-topics.sh \
  --bootstrap-server localhost:9092 \
  --partitions 5 \
  --topic wiki-events \
  --create

Ingest Wikipedia events

python wiki_to_kafka.py

Check Wikipedia events are ingesting

docker exec -it kafka-wiki kafka-run-class.sh kafka.tools.GetOffsetShell \
  --broker-list localhost:9092 \
  --topic wiki-events
kafkacat -C -b localhost:9092 -t wiki-events
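
One way to confirm that events are still arriving is to run the GetOffsetShell command twice and compare the totals. The sketch below assumes the tool prints one topic:partition:offset line per partition. The function name total_offset and the sample values are made up for illustration and are not part of this repo.

```python
def total_offset(output: str, topic: str = "wiki-events") -> int:
    """Sum the end offsets across partitions in GetOffsetShell output."""
    total = 0
    for line in output.splitlines():
        parts = line.strip().rsplit(":", 2)
        # Skip blank lines, log noise, and lines for other topics.
        if len(parts) != 3 or parts[0] != topic or not parts[2].isdigit():
            continue
        total += int(parts[2])
    return total


# Made-up sample output from two runs a few seconds apart.
first = "wiki-events:0:120\nwiki-events:1:98\nwiki-events:2:101\n"
second = "wiki-events:0:150\nwiki-events:1:110\nwiki-events:2:133\n"

new_events = total_offset(second) - total_offset(first)
print(f"{new_events} new events since the first check")
```

If the difference stays at zero across several runs, the ingestion script has probably stopped or is writing to a different topic.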

Add Pinot Table

docker exec -it pinot-controller-wiki bin/pinot-admin.sh AddTable \
  -tableConfigFile /config/table.json \
  -schemaFile /config/schema.json \
  -exec

Open the Pinot UI at http://localhost:9000/

Run Streamlit app

streamlit run streamlit/app.py

pinot-wiki's Issues

Conflicting JARs: SLF4J issue

PROBLEM: I received the SLF4J multiple bindings error and tried following this blog post (https://www.baeldung.com/slf4j-classpath-multiple-bindings), but the error still appears.

(python) (base) akram@ISHERIFF-M-RBNA pinot-wiki % docker exec -it pinot-controller-wiki bin/pinot-admin.sh AddTable \
  -tableConfigFile /config/table.json \
  -schemaFile /config/schema.json \
  -exec
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/pinot/lib/pinot-all-0.10.0-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/pinot/plugins/pinot-environment/pinot-azure/pinot-azure-0.10.0-shaded.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/pinot/plugins/pinot-metrics/pinot-yammer/pinot-yammer-0.10.0-shaded.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/pinot/plugins/pinot-metrics/pinot-dropwizard/pinot-dropwizard-0.10.0-shaded.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/pinot/plugins/pinot-input-format/pinot-parquet/pinot-parquet-0.10.0-shaded.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/pinot/plugins/pinot-file-system/pinot-s3/pinot-s3-0.10.0-shaded.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.codehaus.groovy.reflection.CachedClass (file:/opt/pinot/lib/pinot-all-0.10.0-jar-with-dependencies.jar) to method java.lang.Object.finalize()
WARNING: Please consider reporting this to the maintainers of org.codehaus.groovy.reflection.CachedClass

2022/09/08 19:36:33.469 INFO [AddTableCommand] [main] Executing command: AddTable -tableConfigFile /config/table.json -schemaFile /config/schema.json -controllerProtocol http -controllerHost 172.18.0.4 -controllerPort 9000 -user null -password [hidden] -exec

When I run the Streamlit app, I see this error:

(python) (base) akram@ISHERIFF-M-RBNA PINOT_KAFKA % cd pinot-wiki
(python) (base) akram@ISHERIFF-M-RBNA pinot-wiki %
(python) (base) akram@ISHERIFF-M-RBNA pinot-wiki % streamlit run streamlit/app.py

You can now view your Streamlit app in your browser.

Local URL: http://localhost:8501
Network URL: http://10.21.68.44:8501

For better performance, install the Watchdog module:

$ xcode-select --install
$ pip install watchdog

2022-09-08 12:11:27.809 Uncaught app exception
Traceback (most recent call last):
File "/Users/akram/AKRAM_CODE_FOLDER/PINOT_KAFKA/opt/anaconda3/envs/pycaret_akram/bin/python/lib/python3.9/site-packages/httpx/_transports/default.py", line 60, in map_httpcore_exceptions
yield
File "/Users/akram/AKRAM_CODE_FOLDER/PINOT_KAFKA/opt/anaconda3/envs/pycaret_akram/bin/python/lib/python3.9/site-packages/httpx/_transports/default.py", line 218, in handle_request
resp = self._pool.handle_request(req)
File "/Users/akram/AKRAM_CODE_FOLDER/PINOT_KAFKA/opt/anaconda3/envs/pycaret_akram/bin/python/lib/python3.9/site-packages/httpcore/_sync/connection_pool.py", line 253, in handle_request
raise exc
File "/Users/akram/AKRAM_CODE_FOLDER/PINOT_KAFKA/opt/anaconda3/envs/pycaret_akram/bin/python/lib/python3.9/site-packages/httpcore/_sync/connection_pool.py", line 237, in handle_request
response = connection.handle_request(request)
File "/Users/akram/AKRAM_CODE_FOLDER/PINOT_KAFKA/opt/anaconda3/envs/pycaret_akram/bin/python/lib/python3.9/site-packages/httpcore/_sync/connection.py", line 90, in handle_request
return self._connection.handle_request(request)
File "/Users/akram/AKRAM_CODE_FOLDER/PINOT_KAFKA/opt/anaconda3/envs/pycaret_akram/bin/python/lib/python3.9/site-packages/httpcore/_sync/http11.py", line 105, in handle_request
raise exc
File "/Users/akram/AKRAM_CODE_FOLDER/PINOT_KAFKA/opt/anaconda3/envs/pycaret_akram/bin/python/lib/python3.9/site-packages/httpcore/_sync/http11.py", line 84, in handle_request
) = self._receive_response_headers(**kwargs)
File "/Users/akram/AKRAM_CODE_FOLDER/PINOT_KAFKA/opt/anaconda3/envs/pycaret_akram/bin/python/lib/python3.9/site-packages/httpcore/_sync/http11.py", line 148, in _receive_response_headers
event = self._receive_event(timeout=timeout)
File "/Users/akram/AKRAM_CODE_FOLDER/PINOT_KAFKA/opt/anaconda3/envs/pycaret_akram/bin/python/lib/python3.9/site-packages/httpcore/_sync/http11.py", line 191, in _receive_event
raise RemoteProtocolError(msg)
httpcore.RemoteProtocolError: Server disconnected without sending a response.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/Users/akram/AKRAM_CODE_FOLDER/PINOT_KAFKA/opt/anaconda3/envs/pycaret_akram/bin/python/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 556, in _run_script
exec(code, module.__dict__)
File "/Users/akram/AKRAM_CODE_FOLDER/PINOT_KAFKA/pinot-wiki/streamlit/app.py", line 318, in <module>
page()
File "/Users/akram/AKRAM_CODE_FOLDER/PINOT_KAFKA/pinot-wiki/streamlit/app.py", line 28, in overview
curs.execute(query)
File "/Users/akram/AKRAM_CODE_FOLDER/PINOT_KAFKA/opt/anaconda3/envs/pycaret_akram/bin/python/lib/python3.9/site-packages/pinotdb/db.py", line 51, in g
return f(self, *args, **kwargs)
File "/Users/akram/AKRAM_CODE_FOLDER/PINOT_KAFKA/opt/anaconda3/envs/pycaret_akram/bin/python/lib/python3.9/site-packages/pinotdb/db.py", line 448, in execute
r = self.session.post(
File "/Users/akram/AKRAM_CODE_FOLDER/PINOT_KAFKA/opt/anaconda3/envs/pycaret_akram/bin/python/lib/python3.9/site-packages/httpx/_client.py", line 1130, in post
return self.request(
File "/Users/akram/AKRAM_CODE_FOLDER/PINOT_KAFKA/opt/anaconda3/envs/pycaret_akram/bin/python/lib/python3.9/site-packages/httpx/_client.py", line 815, in request
return self.send(request, auth=auth, follow_redirects=follow_redirects)
File "/Users/akram/AKRAM_CODE_FOLDER/PINOT_KAFKA/opt/anaconda3/envs/pycaret_akram/bin/python/lib/python3.9/site-packages/httpx/_client.py", line 902, in send
response = self._send_handling_auth(
File "/Users/akram/AKRAM_CODE_FOLDER/PINOT_KAFKA/opt/anaconda3/envs/pycaret_akram/bin/python/lib/python3.9/site-packages/httpx/_client.py", line 930, in _send_handling_auth
response = self._send_handling_redirects(
File "/Users/akram/AKRAM_CODE_FOLDER/PINOT_KAFKA/opt/anaconda3/envs/pycaret_akram/bin/python/lib/python3.9/site-packages/httpx/_client.py", line 967, in _send_handling_redirects
response = self._send_single_request(request)
File "/Users/akram/AKRAM_CODE_FOLDER/PINOT_KAFKA/opt/anaconda3/envs/pycaret_akram/bin/python/lib/python3.9/site-packages/httpx/_client.py", line 1003, in _send_single_request
response = transport.handle_request(request)
File "/Users/akram/AKRAM_CODE_FOLDER/PINOT_KAFKA/opt/anaconda3/envs/pycaret_akram/bin/python/lib/python3.9/site-packages/httpx/_transports/default.py", line 218, in handle_request
resp = self._pool.handle_request(req)
File "/usr/local/Cellar/python@3.9/3.9.13_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/contextlib.py", line 137, in __exit__
self.gen.throw(typ, value, traceback)
File "/Users/akram/AKRAM_CODE_FOLDER/PINOT_KAFKA/opt/anaconda3/envs/pycaret_akram/bin/python/lib/python3.9/site-packages/httpx/_transports/default.py", line 77, in map_httpcore_exceptions
raise mapped_exc(message) from exc
httpx.RemoteProtocolError: Server disconnected without sending a response.
2022-09-08 12:11:55.732 Uncaught app exception
Traceback (most recent call last):

Streamlit: maximum recursion depth exceeded while calling a Python object

First, congrats to this repo's contributors: it is very clear and concise. I'm getting recursion limit errors when running a modified Streamlit dashboard, e.g. "maximum recursion depth exceeded while calling a Python object". Raising the recursion limit to 1500 only slightly postpones the issue, because the stack grows at every st.experimental_rerun. Inside my Streamlit app I follow a similar workflow: connect to the Pinot DB, query it, and display the results. This looks like an st.experimental_rerun issue. I wonder whether you have run into something similar.
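
One possible reading of this symptom, offered as an assumption rather than a diagnosis: if each refresh is triggered from inside the previous refresh's call, every refresh adds a stack frame, so a higher recursion limit only delays the failure. The plain-Python sketch below does not use Streamlit at all. refresh_nested and refresh_loop are hypothetical names that contrast the two shapes.

```python
import sys


def refresh_nested(remaining: int) -> int:
    """Each refresh calls the next one, so the stack grows by one frame per refresh."""
    if remaining == 0:
        return 0
    return 1 + refresh_nested(remaining - 1)


def refresh_loop(count: int) -> int:
    """Each refresh finishes before the next starts, so the stack depth stays constant."""
    done = 0
    for _ in range(count):
        done += 1  # stand-in for "query Pinot and redraw"
    return done


# Ask for more refreshes than the current recursion limit allows.
refreshes = sys.getrecursionlimit() + 100

try:
    refresh_nested(refreshes)
    nested_ok = True
except RecursionError:
    nested_ok = False

print("nested refreshes survived:", nested_ok)
print("loop refreshes completed:", refresh_loop(refreshes))
```

Whatever limit is set, the nested version fails once the number of refreshes exceeds it, while the loop version is bounded only by time. Whether the modified dashboard actually has the nested shape would need to be checked against its code.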
