Comments (8)
For anyone who wants a quick, hacky solution to this: create a machine inside the VPC and launch the cluster from that machine using private DNS names/IPs.
However, you need to do the following:
- Patch Flintrock (https://github.com/eshioji/flintrock/pull/1/files)
- Assign an additional security group that allows access to the necessary ports (the one Flintrock assigns didn't work for my VPC setup)
- Make sure the machines can resolve their own hostnames (the default AMI assigns hostnames like ip-xx-xx-xx-xx, and Spark fails to launch because it can't resolve them).
For the last problem, I did the following:
Create a script with the following content and run it at startup using rc.local. Not the nicest way, but it works.
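# Look up this instance's private IPv4 address from the EC2 metadata service
OWN_IP=$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)
# The hostname the AMI assigned (e.g. ip-xx-xx-xx-xx)
HOST_NAME=$(hostname)
# Map the hostname to the private IP so Spark can resolve it
echo "$OWN_IP $HOST_NAME" | sudo tee --append /etc/hosts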
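Since rc.local runs on every boot, the script above appends a new line to /etc/hosts each time. A minimal sketch of an idempotent variant follows. It assumes the script runs as root (as rc.local does), and `ensure_hosts_entry` is a hypothetical helper name, not something Flintrock provides:

```shell
# Hypothetical helper: append "IP HOSTNAME" to a hosts file only when the
# hostname is not already mapped there. Defaults to /etc/hosts.
ensure_hosts_entry() {
    local ip="$1"
    local name="$2"
    local hosts_file="${3:-/etc/hosts}"

    # Look for the hostname as an exact field (not the first one, which is
    # the IP) on any non-comment line.
    if awk -v n="$name" '
        $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }
    ' "$hosts_file"; then
        return 0   # already present, nothing to do
    fi

    echo "$ip $name" >> "$hosts_file"
}

# Possible use from rc.local (commented out so that sourcing this file
# has no side effects):
# ensure_hosts_entry "$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)" "$(hostname)"
```

Calling the helper twice with the same arguments leaves a single entry in the file, so repeated reboots do not grow /etc/hosts.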
Finally, thank you @nchammas. I wanted to use a custom AMI, and this project was a life-saver!
from flintrock.
Is something like the chaordic fork implemented here? If so, are there instructions? I'm trying to run Spark inside a VPC with no outside traffic.
from flintrock.
Thanks @nchammas. This use case is becoming more common in enterprise settings. Any thoughts on when this feature might become available?
from flintrock.
As with the typical open source project, there are no promises on when a feature might be delivered. The things that usually get worked on are critical bug fixes, fun stuff, and everything else, in that order. 😄
My current intent is to work next on adding support for resizing existing clusters and improving launch progress reporting. If I get some time at work to devote to Flintrock (e.g. for this issue), I may work on it after I get those 2 features out.
from flintrock.
This is exactly our case (and we don't even have hostname resolution, which would need some hack for Spark and Hadoop). I will work something out and submit it to you, @nchammas.
My use case will be:
- Download Spark from a custom location (thanks to #104)
- Use private DNS names instead of public ones
- Define several origins (multiple CIDR addresses allowed to access the cluster in the security groups)
from flintrock.
In the case of a private VPC, I'm struggling with SSH access during installation and configuration: I'm outside the security group, and it uses my public IP to access a private IP, so it doesn't work.
I'm thinking about 2 solutions:
- Opening a security hole for the install and configuration step
- Binding the access_origins to the VPC (meaning my private IP is within the range of the origins)
What do you think about it, @nchammas?
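The second option depends on the client's private IP falling inside the CIDR range used as the origin. A minimal sketch of that check in plain shell arithmetic is below. The function names `ip_to_int` and `ip_in_cidr` are hypothetical helpers, the addresses in the usage note are example values, and octets with leading zeros are not handled:

```shell
# Hypothetical helper: convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    local old_ifs="$IFS"
    IFS=.
    set -- $1          # split the address on dots into $1..$4
    IFS="$old_ifs"
    echo $(( $1 * 16777216 + $2 * 65536 + $3 * 256 + $4 ))
}

# Hypothetical helper: succeed (exit 0) when IP $1 lies inside CIDR block $2,
# for example: ip_in_cidr 10.0.1.25 10.0.0.0/16
ip_in_cidr() {
    local ip="$1"
    local net="${2%/*}"    # part before the slash
    local bits="${2#*/}"   # prefix length after the slash
    local ip_int net_int mask

    ip_int=$(ip_to_int "$ip")
    net_int=$(ip_to_int "$net")
    # Netmask with the top $bits bits set, truncated to 32 bits.
    mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))

    [ $(( $ip_int & $mask )) -eq $(( $net_int & $mask )) ]
}
```

For instance, `ip_in_cidr 10.0.1.25 10.0.0.0/16` succeeds, while a public address such as 203.0.113.7 fails against the same block, which is the situation described above when connecting from outside the VPC.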
from flintrock.
Hey @rmessner, I updated the description of the issue and commented on your PRs as well.
from flintrock.
Related PRs from others who have implemented their own fixes for this:
from flintrock.