lordmauve / chopsticks
Chopsticks is an orchestration library: it lets you execute Python code on remote hosts over SSH.
Home Page: https://chopsticks.readthedocs.io/
License: Apache License 2.0
If the controller host disconnects while an operation is in progress, it is conceivable that the agent process could be left running.
A particular worry is accidental hangs or infinite loops on the agent side, leaving processes that cannot be terminated.
Possibly the remote agent should signal itself with SIGINT, causing a KeyboardInterrupt that should allow the stack to unwind, calling appropriate exception and finally handlers.
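A minimal sketch of this idea (the names are illustrative, not the chopsticks API): a watchdog thread reads the controller's stream until EOF, then signals the agent's own process with SIGINT so the main thread unwinds through its except/finally handlers.

```python
import os
import signal
import threading

def watch_controller(stream):
    """Start a daemon thread that reads the controller stream until EOF.

    When the controller disconnects (read() returns b''), send SIGINT
    to our own process; Python turns that into a KeyboardInterrupt in
    the main thread, which unwinds the stack and runs except/finally
    handlers instead of leaving the agent running forever.
    """
    def _watch():
        while stream.read(4096):
            pass  # discard data; we only care about EOF
        os.kill(os.getpid(), signal.SIGINT)

    t = threading.Thread(target=_watch)
    t.daemon = True
    t.start()
    return t
```

This relies only on the default SIGINT disposition, so any cleanup code the remote callable has registered in finally blocks gets a chance to run.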
@lordmauve
Hi Daniel, are there any details on the chopsticks sprint at EuroPython 2017?
Like exact day and time perhaps?
Thanks.
In case of connection failure, the error message raised by Chopsticks is usually just "Unexpected EOF on stream".
Until a connection is established, Chopsticks could collect stderr lines and report them as part of this error message. This would avoid having to piece together the cause of failure from both the stack trace and the stderr output.
This would be very useful for debugging, especially with recursive tunnels.
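A sketch of how the collected stderr might be folded into the exception message (the function name and shape are illustrative, not the chopsticks API):

```python
def format_connect_error(base_msg, stderr_lines, limit=10):
    """Build a connection-failure message including recent stderr output.

    base_msg     -- the original error, e.g. "Unexpected EOF on stream"
    stderr_lines -- lines collected from the remote process's stderr
                    before the connection was established
    limit        -- how many trailing lines to include
    """
    if not stderr_lines:
        return base_msg
    tail = stderr_lines[-limit:]
    return base_msg + '\nstderr before failure:\n' + '\n'.join(
        '    ' + line for line in tail
    )
```

A tunnel's connect() could then raise the formatted message instead of the bare "Unexpected EOF on stream", so SSH's own diagnostics appear in the traceback.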
Chopsticks could call logging.basicConfig() on the remote process, so that logs are sent over stderr.
It could also pickle the current root formatter so that the format of logs printed from the remote side matches that from the local side. (However, this might simply add extra problems.)
I was wondering if it's possible to use a custom SSH port with a chopsticks tunnel.
Thanks
Consider creating some easy way to interactively script and explore remote systems in parallel.
The first example with calling time works perfectly. But when I try to do other simple things such as:
print(ssh.call(sys.version_info))
then the following exception is raised:
_pickle.PicklingError: Can't pickle <class 'sys.version_info'>: it's not the same object as sys.version_info
Also when getting a bit more complex:
print(ssh.call(subprocess.check_output, "apt-get update", shell=True).raise_failures())
Traceback (most recent call last):
File "sticks.py", line 23, in <module>
print(ssh.call(subprocess.check_output, "apt-get update", shell=True).raise_failures())
File "/Users/konstruktor/.local/share/virtualenvs/shipit-WRCZ5US3/lib/python3.6/site-packages/chopsticks/tunnel.py", line 291, in call
raise RemoteException(ret.msg)
I'm tunneling into a Ubuntu 18.04 machine. With SSHTunnel.
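For what it's worth, the first failure is standard pickle behaviour: sys.version_info (a structseq whose type can't be pickled) is being sent as the callable. A module-level wrapper function pickles by reference and builds a plain tuple remotely instead. A sketch of the workaround:

```python
import sys

def get_version_info():
    # Runs on the remote host. A tuple of ints pickles fine, while
    # the sys.version_info structseq type itself does not.
    return tuple(sys.version_info)

# Instead of ssh.call(sys.version_info), one would call:
#     print(ssh.call(get_version_info))
```

The same pattern applies to the subprocess example: wrap the call in a small named function rather than passing library objects directly.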
When an operation is performed on a closed tunnel, Chopsticks gets quite far into attempting to perform the operation. This should fail fast with a message like "Operation on closed tunnel". The current error message is this:
Traceback (most recent call last)
<ipython-input-4-43d4acc4cee4> in <module>()
----> 1 tun.call(ip)
/home/mauve/dev/chopsticks/chopsticks/tunnel.py in call(self, callable, *args, **kwargs)
217 """
218 self._call_async(loop.stop, callable, *args, **kwargs)
--> 219 ret = loop.run()
220 if isinstance(ret, ErrorResult):
221 raise RemoteException(ret.msg)
/home/mauve/dev/chopsticks/chopsticks/ioloop.py in run(self)
275 self.running = True
276 while self.running and (self.read or self.write):
--> 277 self.step()
278 return self.result
/home/mauve/dev/chopsticks/chopsticks/ioloop.py in step(self)
251 rfds = list(self.read) + [self.breakr]
252 wfds = list(self.write)
--> 253 rs, ws, xs = select(rfds, wfds, rfds + wfds)
254 if self.breakr in rs:
255 rs.remove(self.breakr)
OSError: [Errno 9] Bad file descriptor
Currently when a .put() operation fails early, the controller continues sending chunks.
Additionally, the remote side will send a traceback for each chunk it receives after the first failure. This causes a KeyError on the controller.
To recover gracefully from this, the remote side should disregard uploaded chunks until it receives an acknowledgement from the controller that the upload has been aborted.
Groups and tunnels should connect lazily. This would allow groups and tunnels to be specified in a single "inventory" file.
Additionally, Groups should share tunnels to the same hosts, as hosts would typically be in a number of groups.
It should be possible to compare ErrorResult instances to each other. They should also be hashable based on their message value.
This would be useful in testing.
If they are hashable they could be used as dict keys and in sets. It might be useful to be able to count the different error messages using a collections.Counter, for example.
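A sketch of the proposed semantics (a stand-in class, not the real chopsticks ErrorResult):

```python
from collections import Counter

class ErrorResult(object):
    """Stand-in illustrating equality and hashing by message value."""

    def __init__(self, msg):
        self.msg = msg

    def __eq__(self, other):
        return isinstance(other, ErrorResult) and self.msg == other.msg

    def __ne__(self, other):  # needed for Python 2 compatibility
        return not self == other

    def __hash__(self):
        return hash(self.msg)

    def __repr__(self):
        return 'ErrorResult(%r)' % (self.msg,)

# Counting distinct error messages across a group of hosts:
results = [ErrorResult('Unexpected EOF on stream'),
           ErrorResult('Unexpected EOF on stream'),
           ErrorResult('Connection refused')]
counts = Counter(results)
```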
Groups are effectively sets of Tunnels. Now that Tunnels connect lazily, one could consider paradigms like
webservers = Group(...)
load_balancers = Group(...)
databases = Group(...)
with webservers + load_balancers as group:
    group.call(install_http_monitoring)
It seems to be the case that imported modules are compiled with the from __future__ import print_function option. This should not be the case; it should work like any other import.
When importing a package on the remote side that uses pbr, this error is raised:
Exception: Versioning for this project requires either an sdist tarball, or access to an upstream
git repository. It's also possible that there is a mismatch between the package name in setup.cfg and the argument given to pbr.version.VersionInfo. Project name mock was given, but was not able to be found.
Hi @lordmauve do you have some examples to share about how to use chopsticks?
Currently I am using Paramiko to create some remote dirs and execute a remote command, but it's not clear whether I can do the same with chopsticks.
As discussed in #26, we should be able to treat Groups like sets, in order to be able to union them, etc.
We should also add an operation group.filter(callable), which executes the callable on all hosts in the group and returns a new group containing only those hosts where the callable returns True.
This would enable building groups based on dynamic information sourced from the hosts themselves.
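A toy model of the proposal (real chopsticks would run the callable remotely and in parallel; this sketch runs it locally and passes the hostname, just for illustration):

```python
class Group(object):
    """Minimal stand-in for chopsticks.group.Group."""

    def __init__(self, hosts):
        self.hosts = list(hosts)

    def call(self, callable):
        # Real chopsticks executes `callable` on each remote host in
        # parallel; this sketch just calls it locally per hostname.
        return {host: callable(host) for host in self.hosts}

    def filter(self, callable):
        """Return a new Group of hosts where `callable` returned True."""
        results = self.call(callable)
        return Group(h for h in self.hosts if results[h])

hosts = Group(['web1', 'web2', 'db1'])
webservers = hosts.filter(lambda host: host.startswith('web'))
```

With a real predicate, e.g. one that checks /etc/debian_version on the remote host, this would build OS-specific groups dynamically.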
I've been working on https://github.com/SupercomputerInABriefcase/SuperComputerInABriefcase today, with the aim of running tasks on a cluster of Raspberry Pis as an educational exercise in distributed computing.
After a good while trying to understand/install OpenMPI (on laptop and Pis), I decided to try chopsticks and had it working in about 5 minutes.
But this sends the same task to each host in the group. Obviously we want to send different tasks to each host, and send another task when one finishes.
Could this be done by either: a) passing in a list, or even better an iterator, of functions to group.call and having the work spread across the hosts; or b) adding a call_async method to tunnel that takes a callback?
A little bit excited about this project :)
In the process of playing around, I have been getting some odd results with the following program:
# main.py
from chopsticks.tunnel import Tunnel
import tasks

def main(host):
    host.call(tasks.get_time)

if __name__ == '__main__':
    main(Tunnel('[email protected]'))

# tasks.py
import subprocess

def get_time():
    return subprocess.run('time', stdout=subprocess.PIPE, shell=True).stdout.decode()
[[email protected]] Usage: time [-apvV] [-f format] [-o file] [--append] [--verbose]
[[email protected]] [--portability] [--format=format] [--output=file] [--version]
[[email protected]] [--quiet] [--help] command [arg...]
[[email protected]] Usage: time [-apvV] [-f format] [-o file] [--append] [--verbose]
Fatal Python error: could not acquire lock for <_io.BufferedWriter name='<stderr>'>
at interpreter shutdown, possibly due to daemon threads
Thread 0x00007f86dde4c700 (most recent call first):
File ".../lib/python3.6/site-packages/chopsticks/ioloop.py", line 155 in println
File ".../lib/python3.6/site-packages/chopsticks/ioloop.py", line 162 in _check
File ".../lib/python3.6/site-packages/chopsticks/ioloop.py", line 145 in on_data
File ".../lib/python3.6/site-packages/chopsticks/ioloop.py", line 223 in step
File ".../lib/python3.6/site-packages/chopsticks/ioloop.py", line 242 in run
File "/usr/lib64/python3.6/threading.py", line 864 in run
File "/usr/lib64/python3.6/threading.py", line 916 in _bootstrap_inner
File "/usr/lib64/python3.6/threading.py", line 884 in _bootstrap
Current thread 0x00007f86e259c4c0 (most recent call first):
[1] 6661 abort (core dumped) ./main.py
The number of lines that successfully print before failure varies.
I'm running Python 3.6.1 locally, and 3.5.2 on the server. I don't know if it helps, but the result of running ssh -V is OpenSSH_7.5p1, OpenSSL 1.1.0f 25 May 2017.
Please let me know if there's any further information I could provide to help shed some light on this.
The following code actually doesn't work...
from chopsticks.tunnel import SSHTunnel

def deco(fn):
    return fn

@deco
def do_it():
    return 'done'

tunnel = SSHTunnel('[email protected]')
print(tunnel.call(do_it))
... because when building the source code to send, serialise_func() appends sub-dependency functions below the original callable; hence when the code is remotely unpickled and executed, the decorator function is not yet declared.
The README says:
One might also draw a comparison with Python's built-in multiprocessing library, but instead of calling code in subprocesses on the same host, the code may be run on remote hosts.
What about chopsticks compared with the remote manager support in multiprocessing? That allows running the code on remote hosts through the multiprocessing interface. https://docs.python.org/3/library/multiprocessing.html#using-a-remote-manager
Currently, lines from stderr are simply echoed, prefixed by the hostname. This behaviour should be configurable.
One idea would be to register callbacks - or simply one global callback - to handle this output.
Hi, I have an SSH username and password. Is there currently any way to authenticate an SSH tunnel with a password? I don't know whether that means overriding something, creating some objects myself, etc. I really need to pass the password to connect. Any hints would be useful.
Thanks
In order to check for Chopsticks' IO performance, and catch regressions, we should create a suite of realistic tasks that can act as a stable benchmark.
This would allow us to tune for performance. There is plenty of scope for this - consider approaches like serialising messages only once across all hosts, or pipelining requests.
Hi @lordmauve ,
I've encountered a bug which took me quite a while to pinpoint: when using chopsticks with Python 3 over a slow network connection, the reading of the bootstrap code through stdin fails.
I originally encountered the bug when patching bubble.py: its file size became bigger, and the connection to a Google Cloud Engine server was slow, triggering the bug described below. Needless to say, I thought for days that some of my code in bubble.py was the culprit before finding out it was just due to the file size :)
Consider the following file:
# test.py
import sys
from chopsticks.tunnel import SSHTunnel

def test():
    return 'done'

tunnel = SSHTunnel(sys.argv[1])
print(tunnel.call(test))
I'm using a VM on my local computer for testing, and when I want to throttle the connection I use this in my ssh config:
Host foobar
    ProxyCommand pv -q -L 50k | nc %h 22
Now consider the following test script executions:
$ python2 test.py server
done
$ python2 test.py server.slow # with throttling
done
$ python3 test.py server
done
$ python3 test.py server.slow # with throttling
Traceback (most recent call last):
File "test.py", line 10, in <module>
print(tunnel.call(test))
File "/home/amigrave/.local/lib/python3.5/site-packages/chopsticks/tunnel.py", line 296, in call
self.connect()
File "/home/amigrave/.local/lib/python3.5/site-packages/chopsticks/tunnel.py", line 142, in connect
raise RemoteException(res.msg)
chopsticks.tunnel.RemoteException: Unexpected EOF on stream
The reason for this behaviour lies in the bootstrap inline code executed with the Python interpreter command line:
# tunnel.py#SubprocessTunnel
PYTHON_ARGS = [
'-usS',
'-c',
'import sys, os; sys.stdin = os.fdopen(0, \'rb\', 0); ' +
'__bubble = sys.stdin.read(%d); ' % len(bubble) +
'exec(compile(__bubble, \'bubble.py\', \'exec\'))'
]
Note the way we read stdin: 'import sys, os; sys.stdin = os.fdopen(0, \'rb\', 0); '.
In Python 3, os.fdopen is an alias for open, and for a reason I don't yet fully understand (maybe unbuffered file descriptors are set to os.O_NONBLOCK?) the sys.stdin.read(size) call does not return the full number of bytes requested.
That said, the Python documentation states:
To read a file’s contents, call f.read(size), which reads some quantity of data and returns it as a string or bytes object. size is an optional numeric argument. When size is omitted or negative, the entire contents of the file will be read and returned; it’s your problem if the file is twice as large as your machine’s memory. Otherwise, at most size bytes are read and returned. If the end of the file has been reached, f.read() will return an empty string ('').
So it seems to be normal behaviour!?
Here's a line you can add to the bootstrap code in order to check this (on my box it actually reads 16 KB instead of 17030 bytes):
'import sys, os; sys.stdin = os.fdopen(0, \'rb\', 0); ' +
'__bubble = sys.stdin.read(%d); ' % len(bubble) +
'sys.stderr.write(\'Read %d - Got %%d\' %% len(__bubble)); ' % len(bubble) +
'exec(compile(__bubble, \'bubble.py\', \'exec\'))'
And here's the result on my box with ssh throttling:
$ python3 test.py 192.168.56.103
Traceback (most recent call last):
File "test.py", line 10, in <module>
print(tunnel.call(test))
File "/home/amigrave/.local/lib/python3.5/site-packages/chopsticks/tunnel.py", line 296, in call
self.connect()
File "/home/amigrave/.local/lib/python3.5/site-packages/chopsticks/tunnel.py", line 142, in connect
raise RemoteException(res.msg)
chopsticks.tunnel.RemoteException: Unexpected EOF on stream
[[email protected]] Read 17030 - Got 16384
I have always worked with such behaviour for socket.recv, but I didn't know it could happen to file objects (although I hope this only happens in special cases, otherwise I won't sleep this week thinking about all the code I've written so far).
So, I tried to fix it with a while loop in order to ensure we get all bytes before compiling the bootstrap code, but you cannot syntactically use a while statement in a one-liner containing other statements.
I tried other approaches, such as eval-ing a multiline string with \n replaced by \\n, but all of them failed somewhere, mainly because I lack the time to do it properly and also because I felt bad during my attempts, which always looked like a pile of hacks on top of hacks.
This is why I think this issue needs to be fixed by you. I tried to compile as much information as possible in this issue so you can think about what would be the best option.
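For reference, the fix boils down to a read-exactly loop; the hard part the reporter describes is squeezing it into the -c one-liner. A standalone sketch of the loop itself, with a throttled stream standing in for the slow connection:

```python
import io

def read_exactly(stream, n):
    """Read exactly n bytes from a binary stream.

    A single stream.read(n) may legally return fewer than n bytes
    (which is exactly what happens over a slow connection), so keep
    reading until everything has arrived or the stream hits EOF.
    """
    chunks = []
    got = 0
    while got < n:
        chunk = stream.read(n - got)
        if not chunk:
            raise EOFError('stream closed after %d of %d bytes' % (got, n))
        chunks.append(chunk)
        got += len(chunk)
    return b''.join(chunks)

class TrickleStream(object):
    """Test double returning at most 7 bytes per read, like a slow pipe."""
    def __init__(self, data):
        self._buf = io.BytesIO(data)
    def read(self, n):
        return self._buf.read(min(n, 7))
```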
Would be interesting to make Chopsticks available in Jupyter notebooks - ie. the ability to create a function within a notebook cell and call it on a remote host.
This would require serialising the function code, which might require a new serialisation method.
It should be possible to create a kernel provider for IPython that runs code in a Docker container. This would make it extremely easy to launch a kernel using an alternative version of Python (on Linux).
In 21722ba a binary encoding was added, which is used for sending structured data from the host to the client. The motivation was to avoid costly base64-in-JSON encoding and decoding for binary data, which is amplified when tunnelling because it would otherwise be performed at each hop.
However, our own encoding gives us the opportunity to safely support all Python primitive types, and not just be limited to JSON. The current encoding can already distinguish between lists and tuples, and between bytes and strings. However, lots of interesting data structures are precluded by being limited to JSON - frozenset-keyed dicts, for example! Meanwhile, we do not get the human-readable benefit of JSON, as it is already very hard to inspect the messages being passed.
One problem with this proposal is that the encoding code will need to be present in both the orchestration host and the bubble. We ought to achieve this without copying and pasting. If we put it in a separate file, we may be able to simply prepend it to the bubble.py code.
We should also profile the encoding in comparison to JSON to avoid a possible performance regression.
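To make the discussion concrete, here is a minimal tagged, length-prefixed encoding in the same spirit (an illustration only, not the real pencode format): one tag byte plus a big-endian count, with separate tags for list vs tuple and bytes vs str.

```python
import struct

def enc(obj):
    """Encode a value as tag + big-endian count + payload."""
    if isinstance(obj, bytes):
        return b'b' + struct.pack('>I', len(obj)) + obj
    if isinstance(obj, str):
        data = obj.encode('utf-8')
        return b's' + struct.pack('>I', len(data)) + data
    if isinstance(obj, (list, tuple)):
        # Distinct tags let the decoder reconstruct the exact type,
        # which plain JSON cannot do.
        tag = b'l' if isinstance(obj, list) else b't'
        payload = b''.join(enc(item) for item in obj)
        return tag + struct.pack('>I', len(obj)) + payload
    raise TypeError('unsupported type: %r' % type(obj))

def dec(buf, pos=0):
    """Decode one value; returns (value, next_position)."""
    tag = buf[pos:pos + 1]
    n, = struct.unpack('>I', buf[pos + 1:pos + 5])
    pos += 5
    if tag == b'b':
        return buf[pos:pos + n], pos + n
    if tag == b's':
        return buf[pos:pos + n].decode('utf-8'), pos + n
    items = []
    for _ in range(n):  # for containers, n is the item count
        item, pos = dec(buf, pos)
        items.append(item)
    return (items if tag == b'l' else tuple(items)), pos
```

The real format would also need tags for ints, dicts, sets and the singletons, but the round-tripping of type information is the point being argued for above.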
Executing on a macOS box with CPython 2.7:
#!/usr/bin/env python
from chopsticks.tunnel import Tunnel
tun = Tunnel('localhost')
import time
print('Time on %s:' % tun.host, tun.call(time.time))
Results in:
Traceback (most recent call last):
File "./py0.py", line 11, in <module>
print('Time on %s:' % tun.host, tun.call(time.time))
File "/Users/username/example/.chopsticks/lib/python2.7/site-packages/chopsticks/tunnel.py", line 179, in call
raise RemoteException(ret.msg)
chopsticks.tunnel.RemoteException: Unexpected EOF on stream
And the same if not executing from a virtualenv
Traceback (most recent call last):
File "./py0.py", line 11, in <module>
print('Time on %s:' % tun.host, tun.call(time.time))
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/chopsticks/tunnel.py", line 179, in call
raise RemoteException(ret.msg)
chopsticks.tunnel.RemoteException: Unexpected EOF on stream
The same script works when executed against a Linux box:
('Time on localhost:', 1476624529.645448)
When ps is run on a host with an active tunnel, the output isn't pretty:
mauve 12457 0.3 0.5 194128 12124 ? Ssl 22:05 0:00 /usr/bin/python3 -usS -c import sys, os; sys.stdin = os.fdopen(0, 'rb', 0); __bubble = sys.stdin.read(11522); exec(compile(__bubble, 'bubble.py', 'exec'
While it appears there is no single API to set the process arguments, it is possible on most POSIX systems in some way. Chopsticks should take advantage of this to keep ps output clean.
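On Linux, for instance, prctl(PR_SET_NAME, ...) renames the process as shown by ps -o comm and /proc (the third-party setproctitle package wraps this and other platforms' equivalents). A sketch, assuming Linux and glibc:

```python
import ctypes
import ctypes.util

PR_SET_NAME = 15  # from <linux/prctl.h>

def set_process_name(name):
    """Rename the calling process as seen in /proc/<pid>/comm and ps.

    Linux truncates the name to 15 bytes, and other platforms need
    entirely different calls - which is why no single cross-platform
    API exists.
    """
    libc = ctypes.CDLL(ctypes.util.find_library('c') or None,
                       use_errno=True)
    libc.prctl(PR_SET_NAME, name[:15], 0, 0, 0)
```

Overwriting argv (so the full command line in ps changes, not just the comm name) needs more work and is platform-specific, which setproctitle also handles.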
When Tunnel.close() is called, the ioloop writer and reader instances are closed, and oddly this makes subsequent stderr writes not intercepted by the orchestrator.
Eg:
# -*- coding: utf-8 -*-
import time
from chopsticks.tunnel import Local

def func():
    import __bubble__
    __bubble__.debug("Hi there!")
    time.sleep(1)  # Wait for stderr to be flushed

for i in range(3):
    print("Call #%s" % i)
    m = Local()  # triggers __del__ in cpython starting 2nd iteration
    m.call(func)
outputs the following:
$ python3 test.py server
Call #0
[localhost] Hi there!
Call #1
Call #2
...no second output...
My Env:
Sample program
from chopsticks.tunnel import Local
import git

def do_it():
    repo = git.Repo('/my_git_repo')
    return repo.working_dir

local_tun = Local('localhost')
print(local_tun.call(do_it))
On Python 2.7 the above code fails with:
Traceback (most recent call last):
File "test.py", line 11, in <module>
print(local_tun.call(do_it))
File "/usr/local/lib/python2.7/dist-packages/chopsticks-1.0-py2.7.egg/chopsticks/tunnel.py", line 299, in call
raise RemoteException(ret.msg)
chopsticks.tunnel.RemoteException: Host 'localhost' raised exception; traceback follows
Traceback (most recent call last):
File "bubble.py", line 172, in wrapper
File "bubble.py", line 236, in do_call
File "chopsticks://chopsticks/serialise_main.py", line 143, in execute_func
f = deserialise_func(*func_data)
File "chopsticks://chopsticks/serialise_main.py", line 132, in deserialise_func
__import__(mod)
ImportError: No module named glob
On Python 3.5, the code works fine.
When I tried to debug using rpdb, I found the below list of imports from deserialise_func:
(Pdb) p imports
set(['git.index.glob', 'git.index.stat', 'git.contextlib', 'git.logging', 'git.objects.mimetypes', 'git.platform', 'git.repo.git', 'git.index.sys', 'git.sys', 'git.objects.git', 'git.objects.logging', 'git.subprocess', 'git.objects.time', 'git.collections', 'git.os', 'git.config', 'git.functools', 'git.objects.calendar', 'git.refs.remote', 'git.objects.submodule.io', 'git.refs', 'git.locale', 'git.refs.git', 'git.diff', 'git.repo.fun', 'git.odict', 'git.objects.io', 'git.index', 'git.objects.os', 'git.objects', 'git.refs.symbolic', 'git.refs.log', 'git.index.typ', 'git.objects.submodule.util', 'git.repo.collections', 'git.objects.collections', 'git.repo.logging', 'git.objects.submodule.unittest', 'git.refs.tag', 'git.objects.fun', 'git.objects.submodule.base', 'git.objects.string', 'git.objects.util', 'git.db', 'git.objects.base', 'git.index.functools', 'git.util', 'git.objects.commit', 'git.getpass', 'git.refs.os', 'git.stat', 'git.ConfigParser', 'git.codecs', 'git.objects.blob', 'git.objects.tree', 'git.inspect', 'git.objects.submodule', 'git.objects.submodule.logging', 'git.objects.datetime', 'git.repo.string', 'git', 'git.refs.time', 'git.objects.gitdb', 'git.re', 'git.abc', 'git.index.os', 'git.index.fun', 'git.shutil', 'git.objects.stat', 'git.cmd', 'git.refs.reference', 'git.threading', 'git.index.subprocess', 'git.signal', 'git.refs.re', 'git.unittest', 'git.compat', 'git.objects.tag', 'git.index.git', 'git.time', 'git.index.struct', 'git.objects.submodule.git', 'git.repo', 'git.refs.head', 'git.repo.gitdb', 'git.gitdb', 'git.objects.submodule.weakref', 'git.index.base', 'git.index.util', 'git.io', 'git.index.gitdb', 'git.exc', 'git.objects.submodule.root', 'git.repo.re', 'git.remote', 'git.objects.re', 'git.repo.os', 'git.repo.gc', 'git.index.io', 'git.git', 'git.repo.sys', 'git.objects.submodule.os', 'git.objects.submodule.stat', 'git.refs.gitdb', 'git.index.tempfile', 'git.objects.submodule.uuid', 'git.index.binascii', 'git.repo.base'])
Using loop.stop() to pass a result back to the calling thread can result in losing sync if the loop has previously crashed somehow.
I was able to get the following code:
print('num_users', tun.call(num_users))
print('getpass', tun.call(getpass.getuser))
to produce this output:
shadow_users root
getpass 40
This can be fixed by clearing the Tunnel's callbacks if the loop crashes. However, better would be to actually tie back the request ID to the response being waited for, and only terminate the loop if it matches.
I wrote this code in a Jupyter Notebook cell:
import os
from chopsticks.tunnel import SSHTunnel, Docker, Local

tun = Docker('docker')

class DockerLocal(Local):
    """A Python subprocess on a docker container"""
    python2 = python3 = 'python'

def num_procs():
    return sum(fname.isdigit() for fname in os.listdir('/proc'))

def local_tun():
    with DockerLocal() as tun:
        tun.call(num_procs)

tun.call(local_tun)
This crashes with this exception:
...snip...
/home/mauve/dev/chopsticks/chopsticks/tunnel.py in handle_imp(self, mod)
162 # Special-case main to find real main module
163 main = sys.modules['__main__']
--> 164 path = main.__file__
165 self.write_msg(
166 OP_IMP,
AttributeError: module '__main__' has no attribute '__file__'
While the importer maintains a cache on each client, the deployment host traverses sys.path every time an import request is received. This is wasteful when we expect that in most cases imports will be needed by multiple connected tunnels.
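A sketch of host-side memoisation (the function name is assumed; a real version would also need invalidation if sys.path changes):

```python
import functools
import importlib.util

@functools.lru_cache(maxsize=None)
def find_module_source(modname):
    """Resolve a module name to (origin, source bytes), once.

    Every connected tunnel asking for the same module after the first
    gets the cached answer instead of a fresh sys.path traversal.
    """
    spec = importlib.util.find_spec(modname)
    if spec is None or not spec.has_location or spec.origin is None:
        return None
    with open(spec.origin, 'rb') as f:
        return spec.origin, f.read()
```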
The variation in Python interpreter paths means that it is always going to be unreliable to assume it's at a fixed location, such as /usr/bin/python{2,3}, as in the current implementation. Indeed, we already work around this for the Docker Python images.
Instead, the bootstrap script could identify and exec an appropriate Python interpreter. By default we could just try python - which is likely to exist on the majority of systems - and switch if this is not correct and we can identify a more likely candidate.
Making this more difficult, there are several desirable properties of the current implementation to preserve.
The SSHTunnel doesn't appear to detect connection interruption in some cases.
Steps to reproduce:
Mark Shannon ran his tool, lgtm, over Chopsticks, and it found errors (and Chopsticks found false positives in lgtm!)
https://lgtm.com/projects/g/lordmauve/chopsticks
We should triage the issues identified here.
The import handler on the host looks through sys.path for Python code to send to the client. However it only looks in physical paths. Many Python modules may be installed as zipped eggs. Code from these should be importable.
Python supports arbitrary import hooks (we even make use of these), so there should be an API to allow this lookup to be extended by users. Alternatively, perhaps we can make use of Python's import hooks themselves to load code - though this may involve importing it on the host, which we perhaps do not want to do.
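As a starting point, importlib's standard path finders already see inside zip archives on sys.path, so the lookup could lean on find_spec()/get_source() rather than walking physical directories. A quick demonstration (the demo.egg name and its contents are made up for the example):

```python
import importlib.util
import os
import sys
import tempfile
import zipfile

# Build a fake zipped egg containing one module.
tmpdir = tempfile.mkdtemp()
egg_path = os.path.join(tmpdir, 'demo.egg')
with zipfile.ZipFile(egg_path, 'w') as egg:
    egg.writestr('zipped_mod.py', 'VALUE = 42\n')

sys.path.insert(0, egg_path)

# The zipimport-based finder resolves it with no physical .py file,
# and the loader can hand back the source for sending to the agent.
spec = importlib.util.find_spec('zipped_mod')
source = spec.loader.get_source('zipped_mod')
```

This covers the common zipped-egg case; truly exotic import hooks would still need the user-extensible API suggested above.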
Are you still using chopsticks, and if so would you like a PR to improve pencode performance?
I've been experimenting to improve the speed of dumping and loading. I've arrived at pencode_read5, a variant that uses dictionary dispatch in pdecode(), removes the opcodes for the singletons (None, True, and False become references), and replaces obuf with BytesIO.read(). The result:

Benchmark | CPython 2.7 | CPython 3.6
---|---|---
cpickle,proto=2,dumps | 21.1 ms +- 0.5 ms | 9.09 ms +- 0.14 ms
cpickle,proto=2,loads | 19.0 ms +- 0.9 ms | 9.71 ms +- 0.22 ms
pencode,proto=None,dumps | 49.7 ms +- 0.5 ms | 64.0 ms +- 0.7 ms
pencode,proto=None,loads | 141 ms +- 7 ms | 194 ms +- 13 ms
pencode_read5,proto=None,dumps | 49.0 ms +- 0.8 ms | 64.1 ms +- 0.8 ms
pencode_read5,proto=None,loads | 59.9 ms +- 0.9 ms | 78.0 ms +- 8.1 ms
In 4e6028b a REPL was added that allows running Python code in a number of Docker containers at once.
This is an interesting proof of concept, but to be useful in the general case it needs documentation and command-line parameters, such as the hosts to connect to.
If Chopsticks is to be usable for configuration management, it needs to be capable of acquiring root permissions.
I'm having a problem running this simple function on a remote host:
import os
def print_env():
print(os.environ)
The following is a copy of my terminal session:
Python 3.5.1 (default, Apr 18 2016, 11:46:32)
Type "copyright", "credits" or "license" for more information.
IPython 5.1.0 -- An enhanced Interactive Python.
? -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help -> Python's own help system.
object? -> Details about 'object', use 'object??' for extra details.
In [1]: from chopsticks.tunnel import Tunnel
...: tun = Tunnel(host='my.remote.host', user='my_user_name')
...:
In [2]: import time
...: print('Time on %s:' % tun.host, tun.call(time.time))
...:
Time on my.remote.host: 1499942192.4012492
In [3]:
In [3]: import os
...: def print_env():
...: print(os.environ)
...:
In [4]: print_env()
...:
environ({'LC_CTYPE': 'UTF-8', 'COMMAND_MODE': 'unix2003', 'Apple_PubSub_Socket_Render': '/private/tmp/com.apple.launchd.P6DZWedA1U/Render', 'LANG': 'en_US.utf-8', ... 'TERM_PROGRAM': 'iTerm.app'})
In [5]: tun.call(print_env)
...:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-5-5ef9d5c918df> in <module>()
----> 1 tun.call(print_env)
/Users/my_user_name/.virtualenvs/default/lib/python3.5/site-packages/chopsticks/tunnel.py in call(self, callable, *args, **kwargs)
286 """
287 self.connect()
--> 288 self._call_async(loop.stop, callable, *args, **kwargs)
289 ret = self._run_loop()
290 if isinstance(ret, ErrorResult):
/Users/my_user_name/.virtualenvs/default/lib/python3.5/site-packages/chopsticks/tunnel.py in _call_async(self, on_result, callable, *args, **kwargs)
295 id = self._next_id()
296 self.callbacks[id] = on_result
--> 297 params = prepare_callable(callable, args, kwargs)
298 self.reader.start()
299 self.write_msg(
/Users/my_user_name/.virtualenvs/default/lib/python3.5/site-packages/chopsticks/serialise_main.py in prepare_callable(func, args, kwargs)
124 """Prepare a callable to be called even if it is defined in __main__."""
125 if isinstance(func, types.FunctionType) and func.__module__ == '__main__':
--> 126 func_data = serialise_func(func)
127 return execute_func, (func_data,) + args, kwargs
128 return func, args, kwargs
/Users/my_user_name/.virtualenvs/default/lib/python3.5/site-packages/chopsticks/serialise_main.py in serialise_func(f, seen)
60 # expressions
61 code = compile(source, '<main>', 'exec')
---> 62 names = trace_globals(code)
63
64 imported_names = {}
/Users/my_user_name/.virtualenvs/default/lib/python3.5/site-packages/chopsticks/serialise_main.py in trace_globals(code)
14 global_ops = (LOAD_GLOBAL, LOAD_NAME)
15 loads = set()
---> 16 for op, arg in iter_opcodes(code.co_code):
17 if op in global_ops:
18 loads.add(code.co_names[arg])
/Users/my_user_name/.virtualenvs/default/lib/python3.5/site-packages/chopsticks/serialise_main.py in iter_opcodes(code)
31 if sys.version_info >= (3, 4):
32 # Py3 has a function for this
---> 33 for _, op, arg in dis._unpack_opargs(code):
34 yield (op, arg)
35 return
AttributeError: module 'dis' has no attribute '_unpack_opargs'
A user reports that some versions of Python lack dis._unpack_opargs(), causing a crash.
Currently we use a version check, which is unreliable. We should catch the AttributeError and only use the fallback in that case.
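A sketch of the feature-detection approach (the fallback hand-decodes classic byte-oriented bytecode, matching what the current version-gated path does; it assumes Python 3 bytes indexing):

```python
import dis

def iter_opcodes(code):
    """Yield (op, arg) pairs from a code object's co_code bytes.

    Prefer dis._unpack_opargs where it exists; it is a private helper
    that some Python builds lack, so catch AttributeError instead of
    guessing from the version number.
    """
    try:
        unpack = dis._unpack_opargs
    except AttributeError:
        pass
    else:
        for _, op, arg in unpack(code):
            yield op, arg
        return
    # Fallback: decode pre-3.6 style bytecode by hand (2-byte little-
    # endian argument after any opcode >= HAVE_ARGUMENT).
    i = 0
    while i < len(code):
        op = code[i]
        i += 1
        if op >= dis.HAVE_ARGUMENT:
            yield op, code[i] | (code[i + 1] << 8)
            i += 2
        else:
            yield op, None
```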
Trying to pass a function in __main__ to the remote agent in Python 2 causes:
ErrorResult(u'Host \'worker-1\' raised exception; traceback follows\n\n Traceback (most recent call last):\n File "bubble.py", line 144, in wrapper\n File "bubble.py", line 160, in handle_call_thread\n ImportError: Cannot re-init internal module __main__')
Remote processes should be able to import Chopsticks and construct their own tunnels. This enables several things - such as tunneling to a remote host and then using the Sudo() tunnel for escalated privileges.
There are a couple of issues blocking this; one is that remote processes can not currently find the bubble code. Also, the imp handler doesn't know how to consult the bubble's importer to resolve imports.
It is preferable that this cannot result in infinite recursion. To avoid this, it might be possible to set a (global) depth limit like setrecursionlimit().
We need an API for sending and receiving arbitrarily large files.
For sending, we need the ability to pass arbitrarily large files as parameters to remote hosts; for receiving, we may just need a single API to retrieve files by path.
If we pass the wrong arguments to a tunnel (in this case, a Docker tunnel was constructed without passing a name), then a traceback is printed (although the program does not crash):
Exception ignored in: <object repr() failed>
Traceback (most recent call last):
File "/home/mauve/dev/chopsticks/chopsticks/tunnel.py", line 533, in __del__
self.close()
File "/home/mauve/dev/chopsticks/chopsticks/tunnel.py", line 511, in close
if not self.connected:
AttributeError: 'Docker' object has no attribute 'connected'
We can get around this by setting a class variable connected = False in the relevant base class.
There is currently no synchronisation between stdout (ie. call-return) and stderr streams. In particular, once we get a response on the stdout stream, the tunnel may be closed - before the stderr data can be received.
To avoid this we could try to sync up the stderr and ensure it is flushed before connection close. This could be quite tricky to achieve. The remote process could have spawned a long-running subprocess that writes to stderr, for example. We could perhaps insert a marker into stderr and read until we see it before shutting the tunnel down.
Alternatively, perhaps there is a way of inserting some grace time to finish reading stderr. It would be important that this doesn't involve blocking the main thread. For example, closing tunnels could be handed off to the stderr thread to shut down fully once stdin and stdout are closed.
The importer hook in the remote bubble does not support the get_data() method, meaning that packages that include package data cannot be imported correctly.
As Chopsticks' own bubble.py is loaded as package data, this precludes importing chopsticks itself on the agent.
Chopsticks cannot currently deal with interactive password authentication.
We should ensure that this is well-documented; we could also consider adding documentation on how to configure SSH to ensure that this is the case (if it is not).
We might also want to look at how Chopsticks works with commandline ssh-askpass prompts when keys are passphrase-encrypted.