epahomov / docker-spark
Docker image for a general Apache Spark client
License: Apache License 2.0
I tried to use docker-compose to create one master and one or more worker containers, but it didn't work. Any ideas? The spark_master container just dies, and by the time I look, the logs are already gone.
I have something like the following:
spark_master:
  image: epahomov/docker-spark
  container_name: spark_master
  command: /start-master.sh
  ports:
    - "7077:7077"
    - "8080:8080"
spark_slave:
  image: epahomov/docker-spark
  command: /start-worker.sh
  links:
    - spark_master
It would be nice to be able to run docker-compose scale spark_slave=2.
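A sketch of a Compose file that should allow scaling the worker service, under two assumptions: the image's /start-master.sh and /start-worker.sh scripts behave as above, and the scaled service carries no fixed container_name (Compose cannot scale a service whose container name is pinned):

```yaml
spark_master:
  image: epahomov/docker-spark
  container_name: spark_master
  command: /start-master.sh
  ports:
    - "7077:7077"
    - "8080:8080"

spark_slave:
  image: epahomov/docker-spark
  # no container_name here, so this service can be scaled
  command: /start-worker.sh
  links:
    - spark_master
```

With this saved as docker-compose.yml, `docker-compose up -d` followed by `docker-compose scale spark_slave=2` should start two workers. Also note that `docker-compose logs spark_master` still shows output after the container has died (logs persist until the container is removed), which may help diagnose why the master exits.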
This deploys Spark 1.3. Any chance this project becomes active and/or updated again? How hard do you think it would be to contribute and bring it up to date?
Hi,
Thanks for sharing your work. I can now spin up a Spark cluster on my laptop in minutes, cool!
But when I submit some code to the cluster, it fails because of the remove_alias workaround.
When I try to use InetAddress.getLocalHost.getHostAddress I face an exception:
java.net.UnknownHostException: 34a5ed171df7: Name or service not known
The code I am using is the spark-cassandra-connector. The exception is thrown here: https://github.com/datastax/spark-cassandra-connector/blob/v1.3.0-RC1/spark-cassandra-connector/src/main/scala/com/datastax/spark/connector/cql/CassandraConnectorConf.scala#L108
If I remove remove_alias from the spark-shell launcher, the spark-cassandra-connector works, but later, when Spark deploys the submitted code to its cluster of workers, there is an error because the nodes try to use their internal names to communicate.
This seems like a linking problem. I will try to remove the remove_alias workaround and link each node properly so they can communicate using their names.
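The failing call boils down to resolving the container's own hostname. A quick shell check (a sketch; it assumes `getent` is available, as it is in most Linux images) reproduces what InetAddress.getLocalHost does:

```shell
# Resolve this machine's hostname, the same lookup InetAddress.getLocalHost performs.
# In a container whose hostname is not listed in /etc/hosts, the lookup fails,
# which is the "34a5ed171df7: Name or service not known" case above.
name="$(hostname)"
if getent hosts "$name" >/dev/null 2>&1; then
  msg="hostname '$name' resolves"
else
  msg="hostname '$name' does NOT resolve"
fi
echo "$msg"
```

One possible workaround is to start each container with an explicit hostname, e.g. `docker run -h spark_master ...`; Docker then writes that name into the container's /etc/hosts, so the lookup succeeds without the remove_alias hack.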
Is it somehow possible to just run it in the background as a daemon? I would like to use it together with R, not via the command-line interface. It would be great to be able to connect from outside and deploy scripts through a connector (e.g. the sparklyr package).
I would like to connect to Spark SQL from outside via JDBC. Is this possible? What should I do?
I tried using -p 10000:10000, but it is not working.
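Publishing port 10000 alone is not enough; something has to be listening on it. Spark SQL speaks JDBC through its Thrift server (the HiveServer2 protocol), which is not started by the image's default scripts. A hedged sketch of an extra Compose service, where the path to start-thriftserver.sh inside epahomov/docker-spark is an assumption:

```yaml
spark_sql:
  image: epahomov/docker-spark
  # the /spark prefix is an assumption about where the image keeps Spark
  command: /spark/sbin/start-thriftserver.sh --master spark://spark_master:7077
  links:
    - spark_master
  ports:
    - "10000:10000"
```

You could then connect with any JDBC client using a URL like jdbc:hive2://<docker-host>:10000 (beeline is the usual test client). One caveat: start-thriftserver.sh daemonizes by default, so the container may exit immediately; keeping the server process in the foreground, or tailing its log after starting it, keeps the container alive.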