epahomov / docker-spark
Docker image for a general Apache Spark client
License: Apache License 2.0
I tried to use docker-compose to create one master and one or more worker containers, but it didn't work. Any ideas? The spark_master container just dies, and by the time I look, the logs are already gone.
I have something like the following:
spark_master:
  image: epahomov/docker-spark
  container_name: spark_master
  command: /start-master.sh
  ports:
    - "7077:7077"
    - "8080:8080"
spark_slave:
  image: epahomov/docker-spark
  command: /start-worker.sh
  links:
    - spark_master
It would be nice to be able to run docker-compose scale spark_slave=2.
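A sketch of a Compose file that should allow scaling the worker service, under two assumptions: the image's /start-master.sh and /start-worker.sh scripts behave as above, and the scaled service carries no fixed container_name (Compose cannot scale a service whose container name is pinned):

```yaml
spark_master:
  image: epahomov/docker-spark
  container_name: spark_master
  command: /start-master.sh
  ports:
    - "7077:7077"
    - "8080:8080"

spark_slave:
  image: epahomov/docker-spark
  # no container_name here, so this service can be scaled
  command: /start-worker.sh
  links:
    - spark_master
```

With this saved as docker-compose.yml, `docker-compose up -d` followed by `docker-compose scale spark_slave=2` should start two workers. Also note that `docker-compose logs spark_master` still shows output after the container has died (logs persist until the container is removed), which may help diagnose why the master exits.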
This deploys Spark 1.3. Any chance this project becomes active and/or updated again? How hard do you think it would be to contribute and bring it up to date?
Hi,
Thanks for sharing your work. I can now spin up a Spark cluster on my laptop in minutes, cool!
But when I submit some code to the cluster, it fails because of the remove_alias workaround.
When I try to use InetAddress.getLocalHost.getHostAddress I face an exception:
java.net.UnknownHostException: 34a5ed171df7: Name or service not known
The code I am using is the spark-cassandra-connector. The exception is thrown here: https://github.com/datastax/spark-cassandra-connector/blob/v1.3.0-RC1/spark-cassandra-connector/src/main/scala/com/datastax/spark/connector/cql/CassandraConnectorConf.scala#L108
If I remove remove_alias from the spark-shell launcher, the spark-cassandra-connector works, but later, when Spark deploys the submitted code to its cluster of workers, there is an error because the nodes try to use their internal names to communicate.
This seems like a linking problem. I will try to remove the remove_alias workaround and link each node properly so they can communicate using their names.
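The failing call boils down to resolving the container's own hostname. A quick shell check (a sketch; it assumes `getent` is available, as it is in most Linux images) reproduces what InetAddress.getLocalHost does:

```shell
# Resolve this machine's hostname, the same lookup InetAddress.getLocalHost performs.
# In a container whose hostname is not listed in /etc/hosts, the lookup fails,
# which is the "34a5ed171df7: Name or service not known" case above.
name="$(hostname)"
if getent hosts "$name" >/dev/null 2>&1; then
  msg="hostname '$name' resolves"
else
  msg="hostname '$name' does NOT resolve"
fi
echo "$msg"
```

One possible workaround is to start each container with an explicit hostname, e.g. `docker run -h spark_master ...`; Docker then writes that name into the container's /etc/hosts, so the lookup succeeds without the remove_alias hack.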
Is it somehow possible to just run it in the background as a daemon? I would like to use it together with R, not via the command-line interface. It would be great to be able to connect from outside and deploy scripts through a connector (e.g. the sparklyr package).
I would like to connect to Spark SQL from outside via JDBC. Is this possible? What should I do?
I tried using -p 10000:10000, but it is not working.
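Publishing port 10000 alone is not enough; something has to be listening on it. Spark SQL speaks JDBC through its Thrift server (the HiveServer2 protocol), which is not started by the image's default scripts. A hedged sketch of an extra Compose service, where the path to start-thriftserver.sh inside epahomov/docker-spark is an assumption:

```yaml
spark_sql:
  image: epahomov/docker-spark
  # the /spark prefix is an assumption about where the image keeps Spark
  command: /spark/sbin/start-thriftserver.sh --master spark://spark_master:7077
  links:
    - spark_master
  ports:
    - "10000:10000"
```

You could then connect with any JDBC client using a URL like jdbc:hive2://<docker-host>:10000 (beeline is the usual test client). One caveat: start-thriftserver.sh daemonizes by default, so the container may exit immediately; keeping the server process in the foreground, or tailing its log after starting it, keeps the container alive.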