
presto-chart's Introduction

Presto Helm Chart

Highly configurable Helm Presto Chart based on the stable/presto chart but significantly altered for greater flexibility:

  • Specify connectors within the values.yaml file to easily manage them without modifying the image.
  • Separated resources and selectors/affinities for coordinator and worker deployments given the different natures of the two deployments.
  • Override and add configuration properties and JVM configuration within the values.yaml file.
  • Templated bootstrapping within the containers allows for additional runtime configuration and makes for more natural injection of environmental data. This is particularly useful for rendering secrets into configurations and connectors via container environment variables.
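
The chart's own bootstrap templates are not reproduced here. As a rough illustration of the idea in the last bullet only, the sketch below renders environment-style values into a catalog properties template with Python's standard `string.Template`. The placeholder names `S3_ACCESS_KEY` and `S3_SECRET_KEY`, the template contents, and the `render_catalog` helper are all assumptions made for this example, not the chart's actual mechanism.

```python
from string import Template

# Hypothetical catalog template. The ${...} placeholder names are
# assumptions for illustration; the chart may use different ones.
CATALOG_TEMPLATE = Template(
    "connector.name=hive-hadoop2\n"
    "hive.s3.aws-access-key=${S3_ACCESS_KEY}\n"
    "hive.s3.aws-secret-key=${S3_SECRET_KEY}\n"
)


def render_catalog(env):
    """Fill the template from a mapping such as os.environ.

    Template.substitute raises KeyError when a placeholder has no value,
    so a missing secret fails loudly instead of rendering an empty key.
    """
    return CATALOG_TEMPLATE.substitute(env)


# Example values only; in a container these would come from os.environ.
rendered = render_catalog(
    {"S3_ACCESS_KEY": "example-user", "S3_SECRET_KEY": "example-password"}
)
print(rendered)
```

Failing on a missing variable is a deliberate choice in this sketch: a half-rendered catalog file is harder to debug than a container that refuses to start.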

Check out the example values.yaml file for more detailed documentation and examples of how the above changes work.

Basic Chart Installation

This chart is packaged for easy installation. Any of the packaged versions stored in the charts directory can be installed via its download URL:

$ helm install \
  --name my-presto \
  --namespace my-presto-namespace \
  --values values.yaml \
  https://github.com/wiwdata/presto-chart/raw/master/charts/presto-1.tgz

where the values.yaml is one you've created locally. For more details about the chart see the chart README.

presto-chart's People

Contributors

sbrunk, sernst

presto-chart's Issues

Cannot create s3-backed hive metastore

Hello,

Objective

I want to deploy this chart to an on-prem Kubernetes cluster with a catalog that uses minio for both the hive metastore and the backend storage. For reference, see:
https://blog.minio.io/building-an-on-premise-ml-ecosystem-with-minio-powered-by-presto-weka-r-and-s3select-feature-fefbbaa87054

Reason to think it should work

I have managed to use the prestosql/presto docker image to achieve this successfully on a dev machine (just docker, no k8s). This proof of concept used 1 minio container, 1 presto container, and 1 jupyter python container to connect to presto and push/pull data.

In order to get it working, I had to sort out networking and to create the right catalog file. The catalog file I'm using is -

# lake.catalog
connector.name=hive-hadoop2
hive.metastore=file
hive.metastore.catalog.dir=s3://presto/
hive.allow-drop-table=true
hive.s3.aws-access-key=<USER>
hive.s3.aws-secret-key=<PASSWORD>
hive.s3.endpoint=<URL>
hive.s3.path-style-access=true
hive.s3.ssl.enabled=false
hive.s3select-pushdown.enabled=true
hive.storage-format=parquet

How far I've gotten with wiwdata/presto-chart

I have a jupyter notebook up in the cluster. I have minio up in the cluster.

  1. I am able to bring up a wiwdata presto cluster with defaults.
  2. I am able to interact with presto and with the minio server from the python notebook, so networking is not the problem.

Where I am stuck

I cannot bring up a presto cluster with a working catalog configmap. This is what my configmap looks like -

---
# Source: presto/templates/configmap-catalog.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: presto-catalog
  labels:
    app: presto
    chart: presto-1
    release: presto-1579859901
data:
  hive.properties: |
    connector.name=hive-hadoop2
    hive.metastore=file
    hive.metastore.catalog.dir=s3://presto/
    hive.allow-drop-table=true
    hive.s3.aws-access-key=<USER>
    hive.s3.aws-secret-key=<PASSWORD>
    hive.s3.endpoint="minio-service.default.svc.cluster.local:9000"
    hive.s3.path-style-access=true
    hive.s3.ssl.enabled=false
    hive.s3select-pushdown.enabled=true
    hive.storage-format=parquet
---

**All other values in values.yaml are unchanged.**
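
One difference between the two catalog definitions above is that the ConfigMap wraps the `hive.s3.endpoint` value in double quotes. Java-style .properties files keep quote characters as part of the value, so this is worth ruling out, although nothing in this report establishes that it causes the crash. A minimal sketch of a check for quoted values follows; the `find_quoted_values` helper is hypothetical and only handles simple `key=value` lines.

```python
def find_quoted_values(properties_text):
    """Return (key, value) pairs whose value is wrapped in matching quotes.

    Only plain key=value lines are handled; comments and blank lines
    are skipped. Line continuations are not supported in this sketch.
    """
    quoted = []
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            quoted.append((key.strip(), value))
    return quoted


# Two lines taken from the ConfigMap above.
sample = (
    'hive.s3.endpoint="minio-service.default.svc.cluster.local:9000"\n'
    "hive.s3.path-style-access=true\n"
)
for key, value in find_quoted_values(sample):
    print(f"{key} has a quoted value: {value}")
```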

Unfortunately, the cluster keeps entering a crashloop.

riaz@k3s-dev:~/presto-chart$ sudo kubectl get all
NAME                                                 READY   STATUS             RESTARTS   AGE
pod/kubernetes-cockpit-tlnsw                         1/1     Running            0          3h54m
pod/minio-69c5c44c7c-74dkh                           1/1     Running            0          136m
pod/presto-1579859901-worker-845cd7cb9c-2tkzz        0/1     CrashLoopBackOff   3          3m45s
pod/presto-1579859901-worker-845cd7cb9c-hh295        0/1     CrashLoopBackOff   3          3m45s
pod/presto-1579859901-coordinator-7df8fc5c45-m699k   0/1     CrashLoopBackOff   3          3m45s

NAME                                       DESIRED   CURRENT   READY   AGE
replicationcontroller/kubernetes-cockpit   1         1         1       3h54m

NAME                         TYPE           CLUSTER-IP      EXTERNAL-IP                                PORT(S)    AGE
service/kubernetes           ClusterIP      10.43.0.1       <none>                                     443/TCP    13h
service/workbench            ExternalName   <none>          proxy-public.workbench.svc.cluster.local   80/TCP     13h
service/kubernetes-cockpit   ClusterIP      10.43.32.92     <none>                                     443/TCP    3h54m
service/minio-service        ClusterIP      10.43.102.159   <none>                                     9000/TCP   136m
service/presto-1579859901    ClusterIP      10.43.214.17    <none>                                     80/TCP     3m45s

NAME                                            READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/minio                           1/1     1            1           136m
deployment.apps/presto-1579859901-worker        0/2     2            0           3m45s
deployment.apps/presto-1579859901-coordinator   0/1     1            0           3m45s

NAME                                                       DESIRED   CURRENT   READY   AGE
replicaset.apps/minio-69c5c44c7c                           1         1         1       136m
replicaset.apps/presto-1579859901-worker-845cd7cb9c        2         2         0       3m45s
replicaset.apps/presto-1579859901-coordinator-7df8fc5c45   1         1         0       3m45s

I will post the logs in the next message so that they don't clog up this one, but the salient error (I think) is this -

2020-01-24T11:13:07.350Z	ERROR	main	com.facebook.presto.server.PrestoServer	Unable to create injector, see the following errors:

1) Explicit bindings are required and com.facebook.presto.hive.authentication.HdfsAuthentication is not explicitly bound.
  while locating com.facebook.presto.hive.authentication.HdfsAuthentication
    for the 3rd parameter of com.facebook.presto.hive.HdfsEnvironment.<init>(HdfsEnvironment.java:50)
  at com.facebook.presto.hive.HiveClientModule.configure(HiveClientModule.java:68)

2) Explicit bindings are required and com.facebook.presto.hive.s3.S3ConfigurationUpdater is not explicitly bound.
  while locating com.facebook.presto.hive.s3.S3ConfigurationUpdater
    for the 2nd parameter of com.facebook.presto.hive.HdfsConfigurationUpdater.<init>(HdfsConfigurationUpdater.java:77)
  at com.facebook.presto.hive.HiveClientModule.configure(HiveClientModule.java:66)

3) Error: Could not coerce value 'parquet' to com.facebook.presto.hive.HiveStorageFormat (property 'hive.storage-format') in order to call [public com.facebook.presto.hive.HiveClientConfig com.facebook.presto.hive.HiveClientConfig.setHiveStorageFormat(com.facebook.presto.hive.HiveStorageFormat)]

4) Configuration property 'hive.s3.aws-access-key' was not used
  at io.airlift.bootstrap.Bootstrap.lambda$initialize$2(Bootstrap.java:233)

5) Configuration property 'hive.s3.aws-secret-key' was not used
  at io.airlift.bootstrap.Bootstrap.lambda$initialize$2(Bootstrap.java:233)

6) Configuration property 'hive.s3.endpoint' was not used
  at io.airlift.bootstrap.Bootstrap.lambda$initialize$2(Bootstrap.java:233)

7) Configuration property 'hive.s3.path-style-access' was not used
  at io.airlift.bootstrap.Bootstrap.lambda$initialize$2(Bootstrap.java:233)

8) Configuration property 'hive.s3.ssl.enabled' was not used
  at io.airlift.bootstrap.Bootstrap.lambda$initialize$2(Bootstrap.java:233)

9) Configuration property 'hive.s3select-pushdown.enabled' was not used
  at io.airlift.bootstrap.Bootstrap.lambda$initialize$2(Bootstrap.java:233)

10) Configuration property 'hive.storage-format' was not used
  at io.airlift.bootstrap.Bootstrap.lambda$initialize$2(Bootstrap.java:233)

10 errors
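
To make the error list easier to compare against the catalog file, the rejected properties can be pulled out of the log text. The sketch below is a hypothetical helper (`summarize_errors`) whose regular expressions are written against the message wording shown above; other Presto versions may word these messages differently. In this log, every unused property is either `hive.s3.*` or `hive.storage-format`, and the class names are under `com.facebook.presto`, whereas the proof of concept used the prestosql/presto image, so the two setups may not be running the same Presto distribution or version.

```python
import re

# Patterns written against the wording of the log above (an assumption
# about message format, not a documented Presto interface).
UNUSED = re.compile(r"Configuration property '([^']+)' was not used")
COERCE = re.compile(
    r"Could not coerce value '([^']+)' to (\S+) \(property '([^']+)'\)"
)


def summarize_errors(log_text):
    """Collect unused properties and failed value coercions from a log."""
    return {
        "unused": UNUSED.findall(log_text),
        "coercion": [
            (prop, value, target)
            for value, target, prop in COERCE.findall(log_text)
        ],
    }


# A few lines copied from the log above.
log_excerpt = """\
3) Error: Could not coerce value 'parquet' to com.facebook.presto.hive.HiveStorageFormat (property 'hive.storage-format') in order to call
4) Configuration property 'hive.s3.aws-access-key' was not used
6) Configuration property 'hive.s3.endpoint' was not used
"""
summary = summarize_errors(log_excerpt)
print(summary["unused"])
print(summary["coercion"])
```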


This suggests to me that this docker container has been built with a version of presto that doesn't support the s3-backed hive metastore.

Is this correct? If so, could I build an updated one?
