xlab-uiuc / acto-cloudlab
Profile and startup scripts for running Acto on CloudLab
We suspect it's an issue with CloudLab.
An easy workaround for the time being is to rerun the startup manually:
sudo su - geniuser
bash /local/repository/scripts/cloudlab_startup_run_by_geniuser.sh
exit
The startup occasionally fails. The "Startup" column will eventually show Exited (2) instead of Finished.
One of our captured logs says:
TASK [Install python packages using pip] ***************************************
fatal: [127.0.0.1]: FAILED! => {"changed": false, "cmd": ["/usr/bin/python3", "-m", "pip.__main__", "install", "-r", "/users/alice/workdir/acto/requirements.txt"], "msg": "stdout: Collecting deepdiff~=6.3.0\n\n:stderr: WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f0ca2a1d130>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution')': /packages/fe/b3/81bb598d24f1a48eaceb32243a91016385c0599196a59eaff6cd29299334/deepdiff-6.3.1-py3-none-any.whl\n WARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f0ca2a1d0d0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution')': /packages/fe/b3/81bb598d24f1a48eaceb32243a91016385c0599196a59eaff6cd29299334/deepdiff-6.3.1-py3-none-any.whl\n WARNING: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f0ca2a1d1f0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution')': /packages/fe/b3/81bb598d24f1a48eaceb32243a91016385c0599196a59eaff6cd29299334/deepdiff-6.3.1-py3-none-any.whl\n WARNING: Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f0ca2a1d3a0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution')': /packages/fe/b3/81bb598d24f1a48eaceb32243a91016385c0599196a59eaff6cd29299334/deepdiff-6.3.1-py3-none-any.whl\n WARNING: Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 
'NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f0ca2a1d5e0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution')': /packages/fe/b3/81bb598d24f1a48eaceb32243a91016385c0599196a59eaff6cd29299334/deepdiff-6.3.1-py3-none-any.whl\nERROR: Could not install packages due to an EnvironmentError: HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Max retries exceeded with url: /packages/fe/b3/81bb598d24f1a48eaceb32243a91016385c0599196a59eaff6cd29299334/deepdiff-6.3.1-py3-none-any.whl (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f0ca2a1d7c0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))\n\n"}
This suggests a possible DNS problem: every retry fails with "Temporary failure in name resolution".
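Since the failure looks transient, one way to soften it would be to retry the failing command a few times with a pause in between. The following is only a sketch of that idea in shell, not what the startup scripts do today; the helper name `retry`, the attempt count, the delay, and the requirements path in the example are all assumptions.

```shell
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds or ATTEMPTS runs are used up.
# Hypothetical helper; not part of the acto-cloudlab scripts.
retry() {
    retry_attempts=$1
    retry_delay=$2
    shift 2
    retry_i=1
    while [ "$retry_i" -le "$retry_attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $retry_i/$retry_attempts failed: $*" >&2
        # Pause between attempts, but not after the last one.
        if [ "$retry_i" -lt "$retry_attempts" ]; then
            sleep "$retry_delay"
        fi
        retry_i=$((retry_i + 1))
    done
    return 1
}

# Example (commented out so sourcing this has no side effects;
# the requirements path is a placeholder):
#   retry 5 30 python3 -m pip install -r /path/to/acto/requirements.txt
```

This only hides the symptom: if name resolution stays broken for longer than the total retry window, the install still fails.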
The default resolv.conf is
nameserver 130.127.132.51
search clemson.cloudlab.us
nameserver 155.98.60.2
The entries are within the Clemson cluster.
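To narrow down which resolver misbehaves when the startup fails, each nameserver from resolv.conf could be queried on its own. A minimal sketch, assuming `dig` is installed on the node; the function names are made up for illustration, and the check relies on `dig` exiting non-zero when a server does not reply.

```shell
# Print the nameserver addresses listed in a resolv.conf-style file, one per line.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Query each configured nameserver directly and report whether it replied.
# A reply of any kind (even NXDOMAIN) counts as "ok"; only a missing reply
# or another dig error counts as "FAILED".
probe_nameservers() {
    probe_host=$1
    probe_conf=${2:-/etc/resolv.conf}
    for probe_ns in $(list_nameservers "$probe_conf"); do
        if dig +short +time=2 +tries=1 "@$probe_ns" "$probe_host" >/dev/null 2>&1; then
            echo "$probe_ns: ok"
        else
            echo "$probe_ns: FAILED"
        fi
    done
}

# Example (commented out; needs network access):
#   probe_nameservers files.pythonhosted.org
```

Running the probe right after a failed startup, before rerunning it manually, would show whether one specific resolver was unreachable at that moment.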
...
Things yet to understand:
c8220?
What makes this hard is that the problem is occasional and unpredictable, and thus hard to reproduce.
Long-term solutions:
resolv.conf.
(Continued from here: xlab-uiuc/acto#247)
Currently we always check out the sosp-ae branch in the Ansible script:
acto-cloudlab/scripts/ansible/acto.yaml
Line 18 in 9c61d55
But this repo (acto-cloudlab) should be usable for more than AE, and the hard-coded branch becomes a small problem when we want to run code on main or other branches. Possible solutions:
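As one illustration of not hard-coding the branch, it could be read from a variable that defaults to sosp-ae. This is sketched in shell rather than Ansible, and the variable name `ACTO_BRANCH` and the helper `acto_branch` are hypothetical, not something the repo defines.

```shell
# Print the branch to check out: $ACTO_BRANCH if set and non-empty, else sosp-ae.
# ACTO_BRANCH is a hypothetical variable name used only for this sketch.
acto_branch() {
    printf '%s\n' "${ACTO_BRANCH:-sosp-ae}"
}

# Example (commented out; the repository URL and target directory are placeholders):
#   git clone --branch "$(acto_branch)" <acto-repo-url> <workdir>/acto
```

Keeping sosp-ae as the default would leave the AE workflow unchanged while letting other users pick main or another branch.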
acto-cloudlab/scripts/ansible/go.yaml
Line 30 in 61f1bfd
E.g. certain software (other than Acto and its dependencies) may modify $PATH before this line, or even before .bashrc, and this line would invalidate all their changes.
Not sure what the intention was here, so let me open an issue first instead of changing it directly.
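Assuming the line in question assigns $PATH a fixed value, a gentler alternative would be to prepend the needed directory and keep whatever is already there. A minimal sketch; the helper name `prepend_path_entry` is made up, and /usr/local/go/bin in the example is an assumed Go install location, not taken from go.yaml.

```shell
# Print DIR prepended to a colon-separated path list, unless DIR is already in it.
# Usage: prepend_path_entry DIR CURRENT_LIST
prepend_path_entry() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;          # already present: leave the list unchanged
        *) printf '%s\n' "$1${2:+:$2}" ;;        # prepend, adding ':' only if the list is non-empty
    esac
}

# Example (commented out so sourcing this does not touch the real PATH):
#   PATH=$(prepend_path_entry /usr/local/go/bin "$PATH")
#   export PATH
```

Because the helper is idempotent, sourcing the file that calls it several times does not keep growing $PATH.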