massh's Introduction

Description

Go package for streaming Linux distributed shell commands via SSH.

What makes Massh special is its ability to stream and process output concurrently. See _examples/example_streaming for sample code.

Contribute

Have a question, idea, or something you think can be improved? Open an issue or PR and let's discuss it!

Example:

package main

import (
	"fmt"
	"github.com/discoriver/massh"
	"golang.org/x/crypto/ssh"
)

func main() {
	// Create pointers to config & job
	config := massh.NewConfig()

	job := &massh.Job{
		Command: "echo hello world",
	}

	config.SetHosts([]string{"192.168.1.118"})

	// Password auth
	config.SetPasswordAuth("u01", "password")

	// Key auth in same config. Auth will try all methods provided before failing.
	err := config.SetPrivateKeyAuth("~/.ssh/id_rsa", "")
	if err != nil {
		panic(err)
	}

	config.SetJob(job)
	config.SetWorkerPool(2)
	config.SetSSHHostKeyCallback(ssh.InsecureIgnoreHostKey())

	// Make sure config will run
	config.CheckSanity()

	res, err := config.Run()
	if err != nil {
		panic(err)
	}

	for i := range res {
		fmt.Printf("%s:\n \t OUT: %s \t ERR: %v\n", res[i].Host, res[i].Output, res[i].Error)
	}
}

More examples, including this one, are available in the _examples directory.

Usage:

Get the massh package:

go get github.com/discoriver/massh

Documentation

Other

Bastion Host

Specify a bastion host and config with BastionHost and BastionHostSSHConfig in your massh.Config. You may leave BastionHostSSHConfig as nil, in which case SSHConfig will be used instead. The process is automatic, and if BastionHost is not nil, it will be used.

Streaming output

There is an example of streaming output in the directory _examples/example_streaming, which shows one method of reading from the results channel and processing the output.

Running config.Stream() populates the provided channel with results. Each Result carries two channels, StdOutStream and StdErrStream, which deliver the host's stdout and stderr respectively. Read from these channels to get the host's output and errors.

When a host has completed its work and exited, Result.DoneChannel receives an empty struct. In my example, I use the following function to monitor this and report that a host has finished (see _examples/example_streaming for the full program):

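// readStream prints a host's stdout as it arrives and returns once the host
// signals completion on DoneChannel.
func readStream(res Result, wg *sync.WaitGroup) error {
	for {
		select {
		case d := <-res.StdOutStream:
			fmt.Printf("%s: %s", res.Host, d)
		case <-res.DoneChannel:
			fmt.Printf("%s: Finished\n", res.Host)
			wg.Done()
			return nil // stop looping once the host has finished
		}
	}
}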

Unlike with Config.Run(), which returns a slice of Results when all hosts have exited, Config.Stream() requires some additional values to monitor host completion. For each individual host we have Result.DoneChannel, as explained above, but to detect when all hosts have finished, we have the variable NumberOfStreamingHostsCompleted, which will equal the length of Config.Hosts once everything has completed. Here is an example of what I'm using in _examples/example_streaming;

if NumberOfStreamingHostsCompleted == len(cfg.Hosts) {
	// We want to wait for all goroutines to complete before we declare that the work is finished, as
	// it's possible for us to execute this code before we've finished reading/processing all host output
	wg.Wait()

	fmt.Println("Everything returned.")
	return
}

Right now, the concurrency model used to read from the results channel is the responsibility of those using this package. An example of how this might be achieved can be found in the https://github.com/DiscoRiver/omnivore/tree/main/internal/ossh package, which is currently in development.
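As a rough illustration of one possible reader-side model, the sketch below starts one goroutine per host, reads both stdout and stderr, and drains any buffered output after the done signal. It does not import massh: the Result type here is a hypothetical stand-in with only the fields described above, and the []byte channel types, the consume helper, and the fakeHost simulator are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// Result is a hypothetical stand-in for massh.Result, reduced to the fields
// described in this section. The real field types may differ.
type Result struct {
	Host         string
	StdOutStream chan []byte
	StdErrStream chan []byte
	DoneChannel  chan struct{}
}

// consume forwards every chunk from one host to handle until the host signals
// completion, then drains anything still buffered before returning.
func consume(res Result, handle func(host, stream string, data []byte)) {
	for {
		select {
		case d := <-res.StdOutStream:
			handle(res.Host, "OUT", d)
		case e := <-res.StdErrStream:
			handle(res.Host, "ERR", e)
		case <-res.DoneChannel:
			for {
				select {
				case d := <-res.StdOutStream:
					handle(res.Host, "OUT", d)
				case e := <-res.StdErrStream:
					handle(res.Host, "ERR", e)
				default:
					return // nothing left to read
				}
			}
		}
	}
}

// fakeHost simulates a host that writes one line to each stream and exits.
func fakeHost(name string) Result {
	res := Result{
		Host:         name,
		StdOutStream: make(chan []byte, 4),
		StdErrStream: make(chan []byte, 4),
		DoneChannel:  make(chan struct{}, 1),
	}
	res.StdOutStream <- []byte("hello\n")
	res.StdErrStream <- []byte("warning\n")
	res.DoneChannel <- struct{}{}
	return res
}

func main() {
	var (
		wg sync.WaitGroup
		mu sync.Mutex // serialises printing across host goroutines
	)
	for _, h := range []string{"host-a", "host-b"} {
		wg.Add(1)
		go func(res Result) {
			defer wg.Done()
			consume(res, func(host, stream string, data []byte) {
				mu.Lock()
				defer mu.Unlock()
				fmt.Printf("%s %s: %s", host, stream, data)
			})
		}(fakeHost(h))
	}
	wg.Wait()
	fmt.Println("Everything returned.")
}
```

The drain loop only helps if output is sent before the done signal, which is an assumption about the producer rather than something this document states.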

massh's People

Contributors

discoriver, ilyabrin, simonwaldherr

massh's Issues

Adjust how command output is returned

Currently, command output is only returned after every host has finished executing;

for r := 0; r < len(c.Hosts); r++ {
	res = append(res, <-results)
}

return res

After implementing this package into a production app, for an environment of around 200 servers, it became clear that this can cause an almost invisible bottleneck, especially if the worker pool is low compared to the number of hosts being touched. I wanted to refer to it as a perceived delay, but it can actually hang up the caller because it is waiting for every host to finish the work and return something.

There are two requirements I have to improve this;

  • Be able to track how many hosts haven't returned anything, and give some indication on what work is left.
  • Have the caller be able to process command output as a host finishes executing, rather than waiting.
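Both requirements can be sketched without touching SSH at all. The example below is a minimal illustration, not the package's implementation: hostResult, runAll, and the work callback are hypothetical names. A fixed worker pool sends each result on a channel as soon as its host finishes, and the caller keeps a simple count of hosts still outstanding.

```go
package main

import (
	"fmt"
	"sync"
)

// hostResult is a hypothetical result type used only in this sketch.
type hostResult struct {
	Host   string
	Output string
}

// runAll runs work for every host using a fixed number of workers and returns
// a channel that yields each result as soon as its host finishes. The channel
// is closed once every host has reported.
func runAll(hosts []string, workers int, work func(host string) string) <-chan hostResult {
	if workers < 1 {
		workers = 1 // avoid deadlocking with an empty pool
	}
	jobs := make(chan string)
	results := make(chan hostResult)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for h := range jobs {
				results <- hostResult{Host: h, Output: work(h)}
			}
		}()
	}

	// Feed the pool, then close results after the last worker exits.
	go func() {
		for _, h := range hosts {
			jobs <- h
		}
		close(jobs)
		wg.Wait()
		close(results)
	}()

	return results
}

func main() {
	hosts := []string{"a", "b", "c", "d"}
	pending := len(hosts)
	work := func(h string) string { return "hello from " + h }

	for r := range runAll(hosts, 2, work) {
		pending--
		fmt.Printf("%s: %s (%d hosts remaining)\n", r.Host, r.Output, pending)
	}
}
```

Because the caller ranges over the channel, a slow host delays only its own result rather than the whole batch.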

Explore: Pausing

While working on https://github.com/discoriver/omnivore, the idea of pausing an in-progress command came to mind. In the event that file output breaks, and we need to continue in-memory (or at least give the option to continue), I may want to pause hosts temporarily while we determine if we should continue in-memory, to stop any processing being performed if the user takes a while to respond to the message.

Current thoughts are that we can signal the process to be suspended and resume it, which is pretty trivial, but we should explore what is available to us.

I'm creating this to explore options at a later date. If anyone else has thoughts, please feel free to join the conversation.

Pre-processing

Issue

One of the use cases for Massh is querying log files on a distributed environment, to consolidate them into a single filtered log file.

The way I use this right now is to run a grep command on each remote server, which gathers output and sticks it all in a timestamp-sorted local file. The problem is that some historical logs are archived in tar.gz format, which severely limits simple/easy access to the files within.

Simplified, the problem is that we need a way to prepare files to be worked on.

Possible Solutions

  • The obvious solution is to declare that this isn't a responsibility of the Massh package, and should be handled by providing a shell script as the massh job.
  • Would it make sense to add a pre-processing script function that configures an environment on the target system, and then a separate command that is run on that directory? Having the actual work hidden within a script, rather than set as a massh job explicitly, may be undesirable and reduce the overall readability of the surrounding application. My current thoughts on this approach are:
    • Separate exit codes for pre-processing script and command, which may aid in debugging.
    • Improved readability of the application by offloading less work to a shell script.
    • Would increasing the complexity of the massh package be beneficial, over the previous solution of containing all the work within the shell script?
    • Having pre-processing scripts return expected values, such as workingDirectory, may improve interactivity within the application and improve overall readability.

Most of my hesitation is around how much we're offloading to shell scripts and how we maintain readability and efficiency within the Go program. Being able to debug and follow exactly what commands are doing is critical to building with massh. I'm also aware that we need to avoid the package scope getting out of control. I'm unsure what the best approach is here right now, but will continue to think.

Messaging from within Run and Stream methods

Issue

Currently, for package errors we're relying on Result.Error, which is rather inelegant, but it's necessary that we don't return an error from the Stream method, for easier use.

I'd like to re-explore logging from within the Stream method (this will also apply to Run, but it's trivial). Perhaps adding a Messages field to give the user the option of looking at the internals, so they can then log it. We can't write anything out ourselves via the Massh package, but giving the user a value they can do what they like with is well within the scope.

Executing instructions in a declarative manner

Amazing work on this project !

Addressing problem

If a user wants to execute a large set of shell commands, it would be annoying to write a Go program every single time and set the instructions in an array.

Solution

Taking inspiration from Ansible, we could use a simple YAML or JSON file that contains an array of strings, each one a shell command to be executed.
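To make the proposal concrete, here is a minimal sketch of loading such a file with the standard library's encoding/json. The file layout (a "hosts" array alongside the "commands" array), the playbook type, and parsePlaybook are all hypothetical; nothing like this exists in massh today, and how the parsed commands would be turned into massh jobs is left open.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// playbook is a hypothetical file layout: the hosts to touch and the shell
// commands to run on them, in order.
type playbook struct {
	Hosts    []string `json:"hosts"`
	Commands []string `json:"commands"`
}

// parsePlaybook decodes the JSON document and rejects files that could not
// produce any work.
func parsePlaybook(data []byte) (playbook, error) {
	var p playbook
	if err := json.Unmarshal(data, &p); err != nil {
		return p, fmt.Errorf("parse playbook: %w", err)
	}
	if len(p.Hosts) == 0 {
		return p, errors.New("playbook has no hosts")
	}
	if len(p.Commands) == 0 {
		return p, errors.New("playbook has no commands")
	}
	for i, c := range p.Commands {
		if strings.TrimSpace(c) == "" {
			return p, fmt.Errorf("command %d is empty", i)
		}
	}
	return p, nil
}

func main() {
	raw := []byte(`{
		"hosts": ["192.168.1.118"],
		"commands": ["uptime", "df -h /"]
	}`)

	p, err := parsePlaybook(raw)
	if err != nil {
		panic(err)
	}
	for _, c := range p.Commands {
		fmt.Printf("would run %q on %d host(s)\n", c, len(p.Hosts))
	}
}
```

A YAML variant would look the same apart from the decoder, at the cost of a third-party dependency.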

Reason behind this

I am working on a p2p network for executing custom scripts. Currently I am using Ansible as a plugin system, but it's extremely cluttered and requires external dependencies. If this issue is implemented, I can use your project as the plugin system for my project.
Project link: https://github.com/Akilan1999/p2p-rendering-computation

When Stream() returns many lines, sometimes chunks of lines are not captured.

Running Stream() to capture a reasonable amount of text intermittently misses chunks of lines.

You can see an example of this here in "Go Test": https://github.com/DiscoRiver/massh/runs/3832011162?check_suite_focus=true

I was using the command cat /var/log/auth.log in this test, and you'll see the following on line 48;

do: pam_unix(sudo:session): session opened for user root by (uid=0)

This should be something like what is on line 44;

Oct  7 20:45:48 fv-az224-778 sudo: pam_unix(sudo:session): session opened for user root by (uid=0)

There doesn't immediately seem to be a pattern, but more investigation is necessary into the concurrency model.

Run() does not return JobStack results correctly.

The unexported run function does not account for Config.JobStack, and therefore only returns a single job result from each host. The function needs to be updated so that the results channel has the correct length when JobStack is present, similar to how runStream functions.
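The sizing issue can be illustrated in isolation. In this sketch, which uses hypothetical names (result, run) and fakes the remote work, the collector expects one result per host per job rather than one per host. It is only meant to show the arithmetic, not how massh structures its internals.

```go
package main

import "fmt"

// result is a hypothetical per-host, per-job result.
type result struct {
	Host   string
	Job    string
	Output string
}

// run pretends to execute every job in jobStack on every host and collects
// len(hosts) * len(jobStack) results, instead of assuming one per host.
func run(hosts, jobStack []string) []result {
	want := len(hosts) * len(jobStack)
	results := make(chan result, want)

	for _, h := range hosts {
		go func(h string) {
			for _, j := range jobStack {
				results <- result{Host: h, Job: j, Output: "ok"}
			}
		}(h)
	}

	res := make([]result, 0, want)
	for i := 0; i < want; i++ {
		res = append(res, <-results)
	}
	return res
}

func main() {
	res := run([]string{"a", "b"}, []string{"uptime", "hostname", "date"})
	fmt.Println(len(res)) // 2 hosts x 3 jobs, prints 6
}
```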


Providing nil value to config.Stream() results in unexpected behaviour

Issue

When providing a nil value to config.Stream(), instead of running sshCommandStream, sshCommand is run, and no error is reported.

Cause

This is a failure in the worker function logic in session.go;

// worker invokes sshCommand for each host in the channel
func worker(hosts <-chan string, results chan<- Result, job *Job, sshConf *ssh.ClientConfig, resChan chan Result) {
	if resChan == nil {
		for host := range hosts {
			results <- sshCommand(host, job, sshConf)
		}
	} else {
		for host := range hosts {
			sshCommandStream(host, job, sshConf, resChan)
		}
	}
}

This was a lazy way to detect if we should be streaming.
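One possible direction, sketched here under assumptions, is to pass the intended mode explicitly and validate the channel before any work starts, so a nil channel can no longer select the non-streaming path by accident. The mode type, checkMode, and the simplified worker below are hypothetical, and the SSH calls are replaced with placeholder results. Where the error should surface (a return value, Result.Error, or elsewhere) is left open.

```go
package main

import (
	"errors"
	"fmt"
)

// Result is a reduced, hypothetical stand-in for the package's result type.
type Result struct {
	Host   string
	Output string
}

// mode states explicitly whether the caller asked for Run or Stream.
type mode int

const (
	modeRun mode = iota
	modeStream
)

var errNilResultChannel = errors.New("streaming requested with a nil result channel")

// checkMode rejects the one combination that previously fell through to the
// non-streaming path without any error.
func checkMode(m mode, resChan chan Result) error {
	if m == modeStream && resChan == nil {
		return errNilResultChannel
	}
	return nil
}

// worker dispatches on the explicit mode. The two branches stand in for
// sshCommandStream and sshCommand respectively.
func worker(m mode, hosts <-chan string, results chan<- Result, resChan chan Result) {
	for host := range hosts {
		switch m {
		case modeStream:
			resChan <- Result{Host: host, Output: "streamed"}
		default:
			results <- Result{Host: host, Output: "collected"}
		}
	}
}

func main() {
	if err := checkMode(modeStream, nil); err != nil {
		fmt.Println("rejected:", err)
	}

	hosts := make(chan string, 1)
	hosts <- "host-a"
	close(hosts)

	resChan := make(chan Result, 1)
	if err := checkMode(modeStream, resChan); err != nil {
		panic(err)
	}
	worker(modeStream, hosts, nil, resChan)
	fmt.Println(<-resChan)
}
```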
