
lewuathe / docker-trino-cluster

Multi-node Presto cluster in Docker containers

License: Apache License 2.0

Makefile 40.06% Python 12.58% Dockerfile 24.00% Shell 21.32% HCL 2.03%
docker-presto-cluster presto cluster docker-container sql

docker-trino-cluster's Introduction

docker-trino-cluster

docker-trino-cluster is a simple tool for launching a multi-node Trino cluster in Docker containers. The images are kept in sync with the master branch of the Trino repository, so you can easily try the latest Trino for development purposes.

Features

  • Multi-node cluster in Docker containers with docker-compose
  • Distribution of pre-built Trino Docker images
  • Overriding the catalog properties with custom ones
  • Terraform module to launch an ECS-based cluster

Images

Role Image
coordinator lewuathe/trino-coordinator
worker lewuathe/trino-worker

ARM-based images are also provided. Images for ARM have the suffix -arm64v8 in the tag. For instance, version 336 is published as two images, one for each supported architecture. The following architectures are supported for now:

  • linux/amd64
  • linux/arm64/v8
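
The tag convention above can be sketched as a small shell helper. This is only an illustration: the function name image_tag is hypothetical, and it assumes that amd64 images use the plain version tag while ARM images append -arm64v8, as described above.

```shell
#!/bin/sh
# Hypothetical helper: build an image tag for a Trino version and CPU architecture.
# Assumption: amd64 images use the plain version tag; ARM images append "-arm64v8".
image_tag() {
    version="$1"
    # Default to the architecture reported by the local machine
    arch="${2:-$(uname -m)}"
    case "$arch" in
        arm64|aarch64|arm64v8) echo "${version}-arm64v8" ;;
        *) echo "${version}" ;;
    esac
}

# Print the worker image reference that matches this machine
echo "lewuathe/trino-worker:$(image_tag 336)"
```

Passing the architecture explicitly, for example image_tag 336 aarch64, avoids relying on uname when the tag is computed for a different host.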

Usage

Images are published on Docker Hub. Each image is built with the corresponding version of Trino; for example, the image tagged 306 runs Trino 306. Each Docker image takes two arguments:

Index Argument Description
1 discovery_uri Required parameter to specify the URI to coordinator host
2 node_id Optional parameter to specify the node identity. UUID will be generated if not given
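
The fallback for node_id can be pictured with the following sketch. It is not the image's actual entrypoint; the function name resolve_node_id is hypothetical, and the use of uuidgen or /proc/sys/kernel/random/uuid as the UUID source is an assumption.

```shell
#!/bin/sh
# Hypothetical sketch of the node_id fallback: use the given value,
# otherwise generate a UUID (the UUID source here is an assumption).
resolve_node_id() {
    if [ -n "$1" ]; then
        echo "$1"
    elif command -v uuidgen >/dev/null 2>&1; then
        uuidgen
    else
        # Linux kernel interface that returns a fresh random UUID on each read
        cat /proc/sys/kernel/random/uuid
    fi
}

# An explicit identity is passed through unchanged
resolve_node_id worker0
# Without an argument, a generated identifier is printed
resolve_node_id
```

Passing an explicit node_id keeps the identity stable across restarts, while the generated UUID changes on every launch.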

You can launch a multi-node Trino cluster on your local machine as follows.

# Create a custom network
$ docker network create trino_network

# Launch coordinator
$ docker run -p 8080:8080 -it \
    --net trino_network \
    --name coordinator \
    lewuathe/trino-coordinator:330-SNAPSHOT http://localhost:8080

# Launch two workers
$ docker run -it \
    --net trino_network \
    --name worker1 \
    lewuathe/trino-worker:330-SNAPSHOT http://coordinator:8080

$ docker run -it \
    --net trino_network \
    --name worker2 \
    lewuathe/trino-worker:330-SNAPSHOT http://coordinator:8080

docker-compose.yml

docker-compose makes it easier to coordinate multiple containers. You can launch a multi-node Trino cluster with the following YAML file. The command entry is required to pass the discovery URI and the node ID, which must be unique within a cluster. If the node ID is not passed, a UUID is generated automatically at launch time.

version: '3'

services:
  coordinator:
    image: "lewuathe/trino-coordinator:${trino_VERSION}"
    ports:
      - "8080:8080"
    container_name: "coordinator"
    command: http://coordinator:8080 coordinator
  worker0:
    image: "lewuathe/trino-worker:${trino_VERSION}"
    container_name: "worker0"
    ports:
      - "8081:8081"
    command: http://coordinator:8080 worker0
  worker1:
    image: "lewuathe/trino-worker:${trino_VERSION}"
    container_name: "worker1"
    ports:
      - "8082:8081"
    command: http://coordinator:8080 worker1

The version can be specified as an environment variable.

$ trino_VERSION=330-SNAPSHOT docker-compose up
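
To scale beyond two workers, the compose file can be generated instead of written by hand. The following is a minimal sketch, not part of the project: the function name generate_compose is hypothetical, and it assumes each additional worker follows the same pattern as worker0 and worker1 (a unique node ID and the next free host port mapped to container port 8081).

```shell
#!/bin/sh
# Hypothetical generator: print a docker-compose file with N workers.
# Assumption: every worker repeats the worker0/worker1 pattern shown above.
generate_compose() {
    workers="$1"
    # Quoted heredoc: ${trino_VERSION} is emitted literally for docker-compose to expand
    cat <<'EOF'
version: '3'

services:
  coordinator:
    image: "lewuathe/trino-coordinator:${trino_VERSION}"
    ports:
      - "8080:8080"
    container_name: "coordinator"
    command: http://coordinator:8080 coordinator
EOF
    i=0
    while [ "$i" -lt "$workers" ]; do
        # Host ports start at 8081 and increase by one per worker
        host_port=$((8081 + i))
        cat <<EOF
  worker${i}:
    image: "lewuathe/trino-worker:\${trino_VERSION}"
    container_name: "worker${i}"
    ports:
      - "${host_port}:8081"
    command: http://coordinator:8080 worker${i}
EOF
        i=$((i + 1))
    done
}

# Print a compose file with three workers
generate_compose 3
```

Redirecting the output to a file and passing it to docker-compose with -f would launch the generated cluster; the node IDs stay unique because each one is derived from the worker index.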

Custom Catalogs

While the image provides several default connectors (i.e. JMX, Memory, TPC-H and TPC-DS), you may want to override the catalog properties with your own. You can do that by mounting your catalog directory onto /usr/local/trino/etc/catalog. See the volumes configuration in the following docker-compose example.

services:
  coordinator:
    image: "lewuathe/trino-coordinator:${trino_VERSION}"
    ports:
      - "8080:8080"
    container_name: "coordinator"
    command: http://coordinator:8080 coordinator
    volumes:
      - ./example/etc/catalog:/usr/local/trino/etc/catalog
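
A catalog directory for this mount could be prepared as in the sketch below. The file names are illustrative; tpch and memory are connector names that ship with Trino. Because a bind mount hides whatever the image had at that path, it is assumed here that any default catalog you still want has to be declared again in the mounted directory.

```shell
#!/bin/sh
# Catalog directory matching the host side of the volume mount above
catalog_dir="./example/etc/catalog"
mkdir -p "$catalog_dir"

# Each catalog is one properties file; connector.name selects the connector.
# The file name (without .properties) becomes the catalog name.
printf 'connector.name=tpch\n' > "$catalog_dir/tpch.properties"
printf 'connector.name=memory\n' > "$catalog_dir/memory.properties"

# List the catalogs that will be visible inside the container
ls "$catalog_dir"
```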

Terraform

You can launch a Trino cluster on AWS Fargate by using the terraform-aws-trino module. The following Terraform configuration provisions a Trino cluster with 2 worker processes on Fargate.

module "trino" {
  source           = "github.com/Lewuathe/terraform-aws-trino"
  cluster_capacity = 2
}

output "alb_dns_name" {
  value = module.trino.alb_dns_name
}

Please see the terraform-aws-trino module for more detail.

Development

Build Image

$ make build

Snapshot Image

When developing Trino itself, you may want to build the image from your own Trino build package.

$ cp /path/to/trino/trino-server/target/trino-server-330-SNAPSHOT.tar.gz /path/to/docker-trino-cluster/trino-base/
$ make snapshot

LICENSE

Apache v2 License

docker-trino-cluster's People

Contributors

juancb, lewuathe

docker-trino-cluster's Issues

docker image needs to install 'less'

When I run a query from presto-cli, the error below pops up. It seems the "less" dependency is missing; after I manually installed the less package, the error was gone.

presto> SELECT sum(unblendedcost) AS cost
     -> FROM statistics.statistics.aws
     -> WHERE "user:project" = 'test';
ERROR: failed to open pager: Cannot run program "less": error=2, No such file or directory

So I think the "less" package should be added to the base Docker images.

330 docker compose java11 error

PRESTO_VERSION=330 docker-compose up

Recreating coordinator ... done
Recreating worker1 ... done
Recreating worker0 ... done
Attaching to worker1, coordinator, worker0
worker0 | Rendered etc/node.properties
worker0 | Rendered etc/config.properties
worker1 | Rendered etc/node.properties
worker1 | Rendered etc/config.properties
coordinator | Rendered etc/node.properties
coordinator | Rendered etc/config.properties
coordinator | ERROR: Future versions of Presto will require Java 11 after March 2020.
coordinator |
coordinator | You may temporarily continue running on Java 8 by adding the following
coordinator | JVM config option:
coordinator |
coordinator | -Dpresto-temporarily-allow-java8=true
coordinator |
worker0 | ERROR: Future versions of Presto will require Java 11 after March 2020.
worker0 |
worker0 | You may temporarily continue running on Java 8 by adding the following
worker0 | JVM config option:
worker0 |
worker0 | -Dpresto-temporarily-allow-java8=true
worker0 |
worker1 | ERROR: Future versions of Presto will require Java 11 after March 2020.
worker1 |
worker1 | You may temporarily continue running on Java 8 by adding the following
worker1 | JVM config option:
worker1 |
worker1 | -Dpresto-temporarily-allow-java8=true
worker1 |
coordinator exited with code 100
worker0 exited with code 100
worker1 exited with code 100

329 docker compose error nodeid na

PRESTO_VERSION=329 docker-compose up

worker0 |
worker0 | 1) Error: Invalid configuration property node.id: is malformed (for class io.airlift.node.NodeConfig.nodeId)
worker0 |
worker0 | 1 error
worker0 | com.google.inject.CreationException: Unable to create injector, see the following errors:
worker0 |
worker0 | 1) Error: Invalid configuration property node.id: is malformed (for class io.airlift.node.NodeConfig.nodeId)
worker0 |
worker0 | 1 error
worker0 | at com.google.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:543)
worker0 | at com.google.inject.internal.InternalInjectorCreator.initializeStatically(InternalInjectorCreator.java:159)
worker0 | at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:106)
worker0 | at com.google.inject.Guice.createInjector(Guice.java:87)
worker0 | at io.airlift.bootstrap.Bootstrap.initialize(Bootstrap.java:240)
worker0 | at io.prestosql.server.PrestoServer.run(PrestoServer.java:120)
worker0 | at io.prestosql.server.PrestoServer.main(PrestoServer.java:70)
worker0 |
worker0 |
worker1 | 2020-03-08T15:33:00.561Z ERROR main io.prestosql.server.PrestoServer Unable to create injector, see the following errors:
worker1 |
worker1 | 1) Error: Invalid configuration property node.id: is malformed (for class io.airlift.node.NodeConfig.nodeId)
worker1 |
worker1 | 1 error
worker1 | com.google.inject.CreationException: Unable to create injector, see the following errors:
worker1 |
worker1 | 1) Error: Invalid configuration property node.id: is malformed (for class io.airlift.node.NodeConfig.nodeId)
worker1 |
worker1 | 1 error
worker1 | at com.google.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:543)
worker1 | at com.google.inject.internal.InternalInjectorCreator.initializeStatically(InternalInjectorCreator.java:159)
worker1 | at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:106)
worker1 | at com.google.inject.Guice.createInjector(Guice.java:87)
worker1 | at io.airlift.bootstrap.Bootstrap.initialize(Bootstrap.java:240)
worker1 | at io.prestosql.server.PrestoServer.run(PrestoServer.java:120)
worker1 | at io.prestosql.server.PrestoServer.main(PrestoServer.java:70)
worker1 |
worker1 |
worker0 exited with code 1
worker1 exited with code 1
coordinator | 2020-03-08T15:33:01.622Z ERROR main io.prestosql.server.PrestoServer Unable to create injector, see the following errors:
coordinator |
coordinator | 1) Error: Invalid configuration property node.id: is malformed (for class io.airlift.node.NodeConfig.nodeId)
coordinator |
coordinator | 1 error
coordinator | com.google.inject.CreationException: Unable to create injector, see the following errors:
coordinator |
coordinator | 1) Error: Invalid configuration property node.id: is malformed (for class io.airlift.node.NodeConfig.nodeId)
coordinator |
coordinator | 1 error
coordinator | at com.google.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:543)
coordinator | at com.google.inject.internal.InternalInjectorCreator.initializeStatically(InternalInjectorCreator.java:159)
coordinator | at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:106)
coordinator | at com.google.inject.Guice.createInjector(Guice.java:87)
coordinator | at io.airlift.bootstrap.Bootstrap.initialize(Bootstrap.java:240)
coordinator | at io.prestosql.server.PrestoServer.run(PrestoServer.java:120)
coordinator | at io.prestosql.server.PrestoServer.main(PrestoServer.java:70)
coordinator |
coordinator |
coordinator exited with code 1

The log has:
worker0 | 2020-03-08T15:11:28.481Z INFO main Bootstrap node.id ---- -- n/a --

Accessing Azure Data Lake Gen 2 Storage

Hi. This may be more of a feature request than it is an issue.

I would like to find out how best to access data within ADLS Gen 2 storage. I know some Azure jar files are necessary, but where exactly would I need to place them? In the base Presto Dockerfile or within both the Coordinator and Worker images?

I believe I'd also require a Hive metastore, is there a configuration pattern I can follow to add this to the current stack?

Any help would be greatly appreciated.

Eddie
