wiwdata / presto-chart
Highly configurable Helm Presto Chart
License: MIT License
Hello,
I want to deploy this chart to an on-prem Kubernetes cluster with a catalog that uses MinIO for both the Hive metastore and backend storage. For reference, see -
https://blog.minio.io/building-an-on-premise-ml-ecosystem-with-minio-powered-by-presto-weka-r-and-s3select-feature-fefbbaa87054
I have managed to use the prestosql/presto docker image to achieve this successfully on a dev machine (just docker, no k8s). This proof of concept used 1 minio container, 1 presto container, and 1 jupyter python container to connect to presto and push/pull data.
In order to get it working, I had to sort out networking and to create the right catalog file. The catalog file I'm using is -
# lake.catalog
connector.name=hive-hadoop2
hive.metastore=file
hive.metastore.catalog.dir=s3://presto/
hive.allow-drop-table=true
hive.s3.aws-access-key=<USER>
hive.s3.aws-secret-key=<PASSWORD>
hive.s3.endpoint=<URL>
hive.s3.path-style-access=true
hive.s3.ssl.enabled=false
hive.s3select-pushdown.enabled=true
hive.storage-format=parquet
wiwdata/presto-chart
I have a jupyter notebook up in the cluster. I have minio up in the cluster.
I cannot bring up a presto cluster with a working catalog configmap. This is what my configmap looks like -
---
# Source: presto/templates/configmap-catalog.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: presto-catalog
  labels:
    app: presto
    chart: presto-1
    release: presto-1579859901
data:
  hive.properties: |
    connector.name=hive-hadoop2
    hive.metastore=file
    hive.metastore.catalog.dir=s3://presto/
    hive.allow-drop-table=true
    hive.s3.aws-access-key=<USER>
    hive.s3.aws-secret-key=<PASSWORD>
    hive.s3.endpoint="minio-service.default.svc.cluster.local:9000"
    hive.s3.path-style-access=true
    hive.s3.ssl.enabled=false
    hive.s3select-pushdown.enabled=true
    hive.storage-format=parquet
---
**All other values in values.yaml are unchanged**
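One visible difference between the ConfigMap and the standalone catalog file is that the hive.s3.endpoint value is wrapped in double quotes. Java-style .properties files keep quotes as part of the value, so this may be worth ruling out, although the log further down does not establish that it contributes to the crash. A minimal sketch of a check, assuming the rendered hive.properties text is available as a string (the function name and sample text are illustrative, not part of the chart):

```python
def find_quoted_values(properties_text):
    """Return {key: value} for entries whose value is wrapped in quotes."""
    quoted = {}
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            quoted[key.strip()] = value
    return quoted


# Hypothetical excerpt of the rendered catalog, copied from the ConfigMap above.
HIVE_PROPERTIES = """\
connector.name=hive-hadoop2
hive.s3.endpoint="minio-service.default.svc.cluster.local:9000"
hive.s3.path-style-access=true
"""

if __name__ == "__main__":
    for key, value in find_quoted_values(HIVE_PROPERTIES).items():
        print(f"{key} has a quoted value: {value}")
```

If the check reports a key, removing the quotes from that value in values.yaml and re-rendering the chart is a cheap thing to try.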
Unfortunately, the cluster keeps entering a crashloop.
riaz@k3s-dev:~/presto-chart$ sudo kubectl get all
NAME                                                 READY   STATUS             RESTARTS   AGE
pod/kubernetes-cockpit-tlnsw                         1/1     Running            0          3h54m
pod/minio-69c5c44c7c-74dkh                           1/1     Running            0          136m
pod/presto-1579859901-worker-845cd7cb9c-2tkzz        0/1     CrashLoopBackOff   3          3m45s
pod/presto-1579859901-worker-845cd7cb9c-hh295        0/1     CrashLoopBackOff   3          3m45s
pod/presto-1579859901-coordinator-7df8fc5c45-m699k   0/1     CrashLoopBackOff   3          3m45s

NAME                                       DESIRED   CURRENT   READY   AGE
replicationcontroller/kubernetes-cockpit   1         1         1       3h54m

NAME                         TYPE           CLUSTER-IP      EXTERNAL-IP                                PORT(S)    AGE
service/kubernetes           ClusterIP      10.43.0.1       <none>                                     443/TCP    13h
service/workbench            ExternalName   <none>          proxy-public.workbench.svc.cluster.local   80/TCP     13h
service/kubernetes-cockpit   ClusterIP      10.43.32.92     <none>                                     443/TCP    3h54m
service/minio-service        ClusterIP      10.43.102.159   <none>                                     9000/TCP   136m
service/presto-1579859901    ClusterIP      10.43.214.17    <none>                                     80/TCP     3m45s

NAME                                            READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/minio                           1/1     1            1           136m
deployment.apps/presto-1579859901-worker        0/2     2            0           3m45s
deployment.apps/presto-1579859901-coordinator   0/1     1            0           3m45s

NAME                                                       DESIRED   CURRENT   READY   AGE
replicaset.apps/minio-69c5c44c7c                           1         1         1       136m
replicaset.apps/presto-1579859901-worker-845cd7cb9c        2         2         0       3m45s
replicaset.apps/presto-1579859901-coordinator-7df8fc5c45   1         1         0       3m45s
I will post the logs in the next message so that they don't clog up this one, but the salient error (I think) is this -
2020-01-24T11:13:07.350Z ERROR main com.facebook.presto.server.PrestoServer Unable to create injector, see the following errors:
1) Explicit bindings are required and com.facebook.presto.hive.authentication.HdfsAuthentication is not explicitly bound.
while locating com.facebook.presto.hive.authentication.HdfsAuthentication
for the 3rd parameter of com.facebook.presto.hive.HdfsEnvironment.<init>(HdfsEnvironment.java:50)
at com.facebook.presto.hive.HiveClientModule.configure(HiveClientModule.java:68)
2) Explicit bindings are required and com.facebook.presto.hive.s3.S3ConfigurationUpdater is not explicitly bound.
while locating com.facebook.presto.hive.s3.S3ConfigurationUpdater
for the 2nd parameter of com.facebook.presto.hive.HdfsConfigurationUpdater.<init>(HdfsConfigurationUpdater.java:77)
at com.facebook.presto.hive.HiveClientModule.configure(HiveClientModule.java:66)
3) Error: Could not coerce value 'parquet' to com.facebook.presto.hive.HiveStorageFormat (property 'hive.storage-format') in order to call [public com.facebook.presto.hive.HiveClientConfig com.facebook.presto.hive.HiveClientConfig.setHiveStorageFormat(com.facebook.presto.hive.HiveStorageFormat)]
4) Configuration property 'hive.s3.aws-access-key' was not used
at io.airlift.bootstrap.Bootstrap.lambda$initialize$2(Bootstrap.java:233)
5) Configuration property 'hive.s3.aws-secret-key' was not used
at io.airlift.bootstrap.Bootstrap.lambda$initialize$2(Bootstrap.java:233)
6) Configuration property 'hive.s3.endpoint' was not used
at io.airlift.bootstrap.Bootstrap.lambda$initialize$2(Bootstrap.java:233)
7) Configuration property 'hive.s3.path-style-access' was not used
at io.airlift.bootstrap.Bootstrap.lambda$initialize$2(Bootstrap.java:233)
8) Configuration property 'hive.s3.ssl.enabled' was not used
at io.airlift.bootstrap.Bootstrap.lambda$initialize$2(Bootstrap.java:233)
9) Configuration property 'hive.s3select-pushdown.enabled' was not used
at io.airlift.bootstrap.Bootstrap.lambda$initialize$2(Bootstrap.java:233)
10) Configuration property 'hive.storage-format' was not used
at io.airlift.bootstrap.Bootstrap.lambda$initialize$2(Bootstrap.java:233)
10 errors
This suggests to me that this docker container has been built with a version of presto that doesn't support the s3-backed hive metastore.
Is this correct? If so, could I build an updated one?
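To make the error list easier to read, the numbered entries can be split into "was not used" property warnings and everything else. This is only a reading aid under my own assumptions: the function name and the abbreviated sample log are illustrative, and treating the "was not used" entries as possibly secondary to the binding and coercion errors is a guess, not something the log proves. It is also worth noticing that the class names in the log start with com.facebook.presto, while the proof of concept used the prestosql/presto image, so the two setups may not be running the same Presto distribution or version.

```python
import re

# Matches numbered entries such as "4) Configuration property ... was not used".
ERROR_LINE = re.compile(r"^(\d+)\) (.*)$")
UNUSED_PROPERTY = re.compile(r"Configuration property '([^']+)' was not used")


def split_errors(log_text):
    """Return (other_errors, unused_properties) from a Presto startup log excerpt."""
    other_errors, unused_properties = [], []
    for line in log_text.splitlines():
        match = ERROR_LINE.match(line.strip())
        if not match:
            continue  # stack-trace lines and headers carry no entry number
        message = match.group(2)
        hit = UNUSED_PROPERTY.search(message)
        if hit:
            unused_properties.append(hit.group(1))
        else:
            other_errors.append(message)
    return other_errors, unused_properties


# Abbreviated sample taken from the log above.
SAMPLE_LOG = """\
1) Explicit bindings are required and com.facebook.presto.hive.authentication.HdfsAuthentication is not explicitly bound.
3) Error: Could not coerce value 'parquet' to com.facebook.presto.hive.HiveStorageFormat (property 'hive.storage-format')
4) Configuration property 'hive.s3.aws-access-key' was not used
at io.airlift.bootstrap.Bootstrap.lambda$initialize$2(Bootstrap.java:233)
6) Configuration property 'hive.s3.endpoint' was not used
"""

if __name__ == "__main__":
    other_errors, unused_properties = split_errors(SAMPLE_LOG)
    print("Errors to look at first:")
    for message in other_errors:
        print("  " + message)
    print("Properties reported as unused:", ", ".join(unused_properties))
```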