Comments (8)
Are you seeing in the AWS Console that the instances (especially the master instance) are still online when you try to SSH back in?
If SSH is timing out, then it seems like either the instance is down, or the security group rules changed somehow, or the SSH keys were changed somehow. There could be other possibilities of course, but of those three I feel like the first is most likely so that's where I would start.
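One rough way to narrow down those three possibilities is to look at how the TCP connection to port 22 fails. The sketch below uses only Python's standard `socket` module; the function name `probe_tcp` and its return labels are made up for illustration and are not part of Flintrock.

```python
import socket


def probe_tcp(host: str, port: int = 22, timeout: float = 5.0) -> str:
    """Classify a TCP connection attempt (hypothetical helper, not a Flintrock API).

    Returns one of: "open", "refused", "timeout", or "error: <details>".
    """
    try:
        # create_connection performs the TCP handshake and nothing more.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except (socket.timeout, TimeoutError):
        # No answer at all: packets are probably being dropped somewhere
        # (security group, firewall, stopped instance, stale IP address).
        return "timeout"
    except ConnectionRefusedError:
        # The host answered but nothing is listening on that port.
        return "refused"
    except OSError as exc:
        # Anything else, e.g. DNS failure or no route to host.
        return f"error: {exc}"
```

Roughly speaking, "timeout" points toward the instance being unreachable or traffic being filtered, "refused" suggests the host is up but sshd is not listening, and "open" suggests the network path is fine and the problem is more likely on the key or authentication side.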
from flintrock.
Yes I can see in the aws console that the instances are still running.
I just rebooted my Mac and somehow I could log in again. Not sure whether something on the SSH side of my machine was blocking me. Hope it doesn't happen again.
from flintrock.
Perhaps it's something weird happening with your network? Anyway, feel free to reopen this issue if you have more resolution on where the problem might be coming from.
from flintrock.
Actually, it came back again the next day, and I can't fix it this time. I am at university on Eduroam and also tried a mobile hotspot. Same thing.
Even if I log in with plain ssh, it times out. My instances are running and pass 2/2 status checks.
When I run flintrock launch cluster8, it now gives me:
/opt/homebrew/lib/python3.10/site-packages/paramiko/transport.py:219: CryptographyDeprecationWarning: Blowfish has been deprecated
"class": algorithms.Blowfish,
2023-03-31 09:28:20,913 - flintrock.ec2 - INFO - Launching 4 instances...
2023-03-31 09:28:37,173 - flintrock.ec2 - DEBUG - 4 instances not in state 'running': 'i-09436562527109c97', 'i-09688b6b7e02ff6c0', 'i-0386eb57946f277df', ...
2023-03-31 09:28:42,171 - flintrock.ec2 - DEBUG - 4 instances not in state 'running': 'i-09436562527109c97', 'i-0386eb57946f277df', 'i-073c7c86259375c10', ...
2023-03-31 09:28:46,037 - flintrock.ec2 - DEBUG - 4 instances not in state 'running': 'i-09436562527109c97', 'i-0386eb57946f277df', 'i-073c7c86259375c10', ...
This repeats endlessly.
from flintrock.
What do you see in the console for these instances that don't seem to launch normally?
from flintrock.
Allow me a few days to see if it reappears. I ended up doing a full reset of the AWS environment (I was using a learner lab). Is there anything in particular I should be sure to check in the console?
from flintrock.
It just happened again.
flintrock login cluster
/opt/homebrew/lib/python3.10/site-packages/paramiko/transport.py:219: CryptographyDeprecationWarning: Blowfish has been deprecated
"class": algorithms.Blowfish,
ssh: connect to host 54.205.234.98 port 22: Operation timed out
My ~/.aws/credentials are up to date.
Looking at the dashboard I see nothing special. See screenshot here.
This is a fresh cluster. I only logged in once after launch, set up pydoop/yarn, then exited. A few days later I tried to log in and it keeps timing out.
Anything else I can check or reports I can generate?
I also tried the AWSSupport-TroubleshootSSH automation in Systems Manager, but that process got stuck and some steps failed:
Step execution ID | Step # | Step name | Action | Status | Start time | End time
-- | -- | -- | -- | -- | -- | --
cda7fc3a-cee3-4f29-9c4c-b04734b91d65 | 1 | assertInstanceIsManagedInstance | aws:assertAwsResourceProperty | Failed | Wed, 05 Apr 2023 01:38:00 GMT | Wed, 05 Apr 2023 01:38:00 GMT
3cb6b036-5b86-40ef-a54c-07dade48c315 | 2 | assertAllowOffline | aws:assertAwsResourceProperty | Success | Wed, 05 Apr 2023 01:38:01 GMT | Wed, 05 Apr 2023 01:38:01 GMT
f28f0344-e556-46bd-8634-00f96827e9c6 | 3 | assertActionIsFixAll | aws:assertAwsResourceProperty | Success | Wed, 05 Apr 2023 01:38:01 GMT | Wed, 05 Apr 2023 01:38:02 GMT
0dcc8230-715c-4514-84e3-7d81042ec3bf | 4 | assertSubnetId | aws:assertAwsResourceProperty | Success | Wed, 05 Apr 2023 01:38:02 GMT | Wed, 05 Apr 2023 01:38:02 GMT
b88be25f-d38f-446f-af2d-b6c308912c6d | 5 | describeSourceInstance | aws:executeAwsApi | Success | Wed, 05 Apr 2023 01:38:03 GMT | Wed, 05 Apr 2023 01:38:03 GMT
74f0b54d-c5ee-4ac0-9858-c82efd345bdc | 6 | troubleshootSSHOfflineWithSubnetId | aws:executeAutomation | Failed | Wed, 05 Apr 2023 01:38:03 GMT | Wed, 05 Apr 2023 01:38:12 GMT
f631870d-2d3b-40e7-b2df-0d7d4dc7419a | 7 | installEC2Rescue | aws:runCommand | Pending | - | -
28666d80-ecaa-437f-a668-f2cfece634d2 | 8 | troubleshootSSH | aws:runCommand | Pending | - | -
1822d358-a7e4-4208-a99f-ee1a6b109fbc | 9 | troubleshootSSHOffline | aws:executeAutomation | Pending | - | -
from flintrock.
The lockout after some days smells like a firewall or network issue that's independent of Flintrock. But I'm not sure how to debug this because it could be so many different things. Do you have an AWS admin who can help you investigate?
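One firewall-side cause that can be ruled out without an admin is an inbound rule on port 22 that no longer covers the client's current public address, for example after switching networks. The sketch below is only an illustration: `ssh_allowed_from` is a hypothetical helper, and it assumes its input has the shape of the `SecurityGroups` list returned by boto3's EC2 `describe_security_groups` call. It checks IPv4 CIDR rules only and ignores IPv6 ranges and rules that reference other security groups.

```python
import ipaddress
from typing import Any, Dict, List


def ssh_allowed_from(
    security_groups: List[Dict[str, Any]], client_ip: str, port: int = 22
) -> bool:
    """Return True if any IPv4 ingress rule admits client_ip on the given port.

    Hypothetical helper; `security_groups` is assumed to look like
    describe_security_groups()["SecurityGroups"] from boto3.
    """
    addr = ipaddress.ip_address(client_ip)
    for group in security_groups:
        for perm in group.get("IpPermissions", []):
            proto = perm.get("IpProtocol")
            if proto == "-1":
                # "-1" means all protocols and all ports.
                port_ok = True
            elif proto == "tcp":
                port_ok = perm.get("FromPort", 0) <= port <= perm.get("ToPort", 65535)
            else:
                continue
            if not port_ok:
                continue
            for rng in perm.get("IpRanges", []):
                # strict=False tolerates CIDRs with host bits set.
                if addr in ipaddress.ip_network(rng["CidrIp"], strict=False):
                    return True
    return False
```

If this returns False for the address the client currently has, the security group is a plausible culprit; if it returns True, the block is more likely somewhere else on the path, such as a campus or carrier firewall.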
from flintrock.