Comments (9)

t-8ch commented on May 29, 2024

@donoghuc For me it works too with the new bolt/net-ssh.

from bolt.

TheMeier commented on May 29, 2024

@MikaelSmith any update here? Is there a corresponding Jira issue?

donoghuc commented on May 29, 2024

Hi @TheMeier, I have filed https://tickets.puppetlabs.com/browse/BOLT-1423 . Are you running into a similar issue? Are you setting proxyjump in Bolt's ssh configuration options in addition to ProxyConfig in your OpenSSH configuration options?

TheMeier commented on May 29, 2024

@donoghuc we have a proprietary jump host which does not work with ProxyJump. We use a config like this on the agent:

jump.host.
  ControlMaster auto
  ControlPath ~/.ssh/connections/%r_%h_%p
  ControlPersist no
  ProxyCommand none

ProxyCommand /usr/bin/ssh  jump.host  %h

All tests show this behaviour: everything works fine, but the session never terminates, just as the OP described.

donoghuc commented on May 29, 2024

@t-8ch @TheMeier I am having trouble reproducing the hanging behavior using the latest version of bolt 1.24.0 which includes net-ssh 5.2.0 with system packages. I would like to share my observations and solicit some suggestions on how I might reproduce the described bug.

ssh config

ssh version

cas@cas-ThinkPad-T460p:~/working_dir/bolt$ ssh -V
OpenSSH_7.2p2 Ubuntu-4ubuntu2.8, OpenSSL 1.0.2g  1 Mar 2016

The following ssh config is saved in ~/.ssh/config

Host p71tjeuwlftc81l.delivery.puppetlabs.net
  IdentityFile ~/.ssh/id_rsa-acceptance
  User root
  StrictHostKeyChecking no
  HostName p71tjeuwlftc81l
  Port 22
  ProxyCommand ssh [email protected] nc -w 10 %h %p

In this case connections to p71tjeuwlftc81l are proxied through cazdnnex9p1rwk1. Most of the settings are self-explanatory, but the one of note is the ProxyCommand. I have set the nc(1) (https://linux.die.net/man/1/nc) timeout to 10 seconds to demonstrate how the connection will terminate, resulting in an IOError: closed stream in the net-ssh library: https://github.com/net-ssh/net-ssh/blob/8f38e64cabe44f2c5ca33c1b859446967162e04e/lib/net/ssh/transport/packet_stream.rb#L76

long running task

To demonstrate this I have a simple task which sleeps for 30 seconds, then writes 'hi' to a file called written_by_task on the target p71tjeuwlftc81l and echoes 'made it' to stdout.

cas@cas-ThinkPad-T460p:~/working_dir/bolt$ cat Boltdir/site/test/tasks/long_running_task.sh 
#!/bin/bash
sleep 30
echo 'hi' > /root/written_by_task
echo 'made it'

When I execute the task with the nc(1) timeout, bolt returns the following error in about 15 seconds:

cas@cas-ThinkPad-T460p:~/working_dir/bolt$ bolt task run test::long_running_task -t p71tjeuwlftc81l.delivery.puppetlabs.net
Started on p71tjeuwlftc81l.delivery.puppetlabs.net...
[email protected]'s password: 
Failed on p71tjeuwlftc81l.delivery.puppetlabs.net:
  closed stream
Failed on 1 node: p71tjeuwlftc81l.delivery.puppetlabs.net
Ran on 1 node in 14.92 seconds

Note how I was prompted for the password for the proxy server (I configured password auth for the proxy node instead of RSA key auth to prove I am executing via the proxy, which is also why the task time is about 15 seconds instead of closer to 10 seconds).
The most important bit is that the file is in fact written: as the OP pointed out, the task does get executed, we just don't wait for the output. To illustrate that, we can run a command (again through the proxy) to inspect that file:

cas@cas-ThinkPad-T460p:~/working_dir/bolt$ bolt command run 'cat /root/written_by_task' -t p71tjeuwlftc81l.delivery.puppetlabs.net
Started on p71tjeuwlftc81l.delivery.puppetlabs.net...
[email protected]'s password: 
Finished on p71tjeuwlftc81l.delivery.puppetlabs.net:
  STDOUT:
    hi
Successful on 1 node: p71tjeuwlftc81l.delivery.puppetlabs.net
Ran on 1 node in 4.05 seconds

Now, if I bump up the nc(1) wait time (ProxyCommand ssh [email protected] nc -w 40 %h %p), the task completes without error: it both writes to the file and returns the expected output.

cas@cas-ThinkPad-T460p:~/working_dir/bolt$ bolt task run test::long_running_task -t p71tjeuwlftc81l.delivery.puppetlabs.net
Started on p71tjeuwlftc81l.delivery.puppetlabs.net...
[email protected]'s password: 
Finished on p71tjeuwlftc81l.delivery.puppetlabs.net:
  made it
  {
  }
Successful on 1 node: p71tjeuwlftc81l.delivery.puppetlabs.net
Ran on 1 node in 34.96 seconds
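
The two runs above can be imitated locally without SSH or Bolt. This is only a rough sketch under stated assumptions: `timeout` (GNU coreutils) stands in for the proxy's time limit, `cat` stands in for the proxied connection, and the function names and the temporary marker file are hypothetical. Unlike `nc -w`, which is an idle timeout, `timeout` caps total run time, so treat this as an analogy rather than a reproduction.

```shell
#!/bin/bash
# Hypothetical stand-in for /root/written_by_task on the target.
marker="$(mktemp)"

# Stand-in for the task: sleep, write the file, then report on stdout.
long_running_task() {
  sleep "$1"
  echo 'hi' > "$marker"
  echo 'made it'
}

# Usage: run_through_proxy LIMIT_SECONDS TASK_SECONDS
# The task's output only reaches us while the "proxy" (cat) is still alive.
# The pipeline's exit status is that of `timeout` (124 when the limit is hit).
run_through_proxy() {
  if long_running_task "$2" | timeout "$1" cat; then
    return 0
  fi
  # The reader went away before the task finished.
  echo 'closed stream' >&2
  return 1
}
```

Calling `run_through_proxy 1 2` reports `closed stream` on stderr, yet the marker file still ends up containing `hi`, because the task keeps running after the reader is gone. Calling `run_through_proxy 3 1` prints `made it`.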

I would be curious whether experimenting with the timeout in your ProxyCommand helps solve the issue. I am also wondering whether you are using a non-default connect-timeout in your ssh config. Perhaps some combination of those settings is the culprit and there is something we could document or fix in Bolt.

TheMeier commented on May 29, 2024

Hm, I was testing with 1.18.0 since that is what the documented package installation provides.
Tried with the gem now, same result for me.
We are using a not-so-common setup with some fancy config on the jump host, so the netcat command is in the jump host config, not the client's ProxyCommand.
I will try to get the relevant parts of the jump host sshd config.

Concerning timeouts, I don't think this has anything to do with timeouts, because I am testing with a simple

bolt command run "ls .bashrc" --nodes test.server.lan --debug

Which immediately returns

Executing: ls .bashrc
stdout: .bashrc
Command returned successfully

Also, we use the same setup with Ansible and some homegrown tools using fabric.api, so at least with Python's paramiko our setup has worked without any problems for many years.
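
One way to tell a session that never terminates from one that is merely slow is to run the command under a deadline. The helper below is a minimal sketch, assuming GNU coreutils `timeout` is available; the name `check_for_hang` is hypothetical. It relies on `timeout` exiting with status 124 when the deadline expires.

```shell
#!/bin/bash
# Usage: check_for_hang DEADLINE_SECONDS COMMAND [ARGS...]
# Runs COMMAND under a deadline and returns its exit status;
# status 124 means the command was still running at the deadline.
check_for_hang() {
  local deadline="$1"
  shift
  local status=0
  timeout "$deadline" "$@" || status=$?
  if [ "$status" -eq 124 ]; then
    echo "hung: still running after ${deadline}s" >&2
  fi
  return "$status"
}
```

For the case described here that might look like `check_for_hang 60 bolt command run "ls .bashrc" --nodes test.server.lan --debug`. If the wrapped command has to read a password from the terminal, `timeout --foreground` would probably be needed instead.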

nicklewis commented on May 29, 2024

I'm closing this because it's several years old and looks like it was related to specific library versions. Please feel free to reopen if this is still a problem.

github-actions commented on May 29, 2024

This issue has not had activity for 60 days and will be marked as stale.
If this issue continues to have no activity for 7 days, it will be closed.

github-actions commented on May 29, 2024

This issue is stale and has been closed. If you believe this is in error,
or would like the Bolt team to reconsider it, please reopen the issue.
