
Comments (3)

BoPeng commented on August 30, 2024

It is possible to modify sos so that it enters newlines automatically, but overall we are hitting a dead end. sos uses this ssh channel quite often: for example, it calls something like ssh server sos status at a regular interval (configurable, 30s by default) to retrieve the status of the tasks, and it is simply not possible for you to enter that six-digit code that often. You can disable the remote task status query, but that will disrupt the execution of the entire workflow if anything has to run after the remote tasks complete.
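To put a rough number on this, here is a minimal sketch that counts how many status queries would happen during one workflow run. It assumes, purely for illustration, that every query opens a new ssh connection and therefore triggers a new code prompt; the function name status_polls and the two-hour duration are hypothetical and not part of sos.

```python
import math


def status_polls(workflow_seconds: float, interval_seconds: float = 30.0) -> int:
    """Count periodic status queries issued while a workflow is running.

    Hypothetical helper, not part of sos. Assumes one query every
    `interval_seconds` and one authentication prompt per query.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return math.floor(workflow_seconds / interval_seconds)


if __name__ == "__main__":
    two_hours = 2 * 60 * 60
    # Default 30 s interval: 7200 / 30 = 240 queries, so 240 code prompts
    print(status_polls(two_hours))
```

Under that assumption a two-hour workflow would ask for a code 240 times, and lengthening the interval only reduces the count without removing the prompts.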

There are several options:

  1. Log in to the server and submit the jobs from there. The problem is that the master sos process stays active on the head node, where it will be killed, at least on our clusters.
  2. Use a daemon process on the head node to communicate with outside sos instances, bypassing the ssh channel. I explored this option a while ago but did not finish it. The biggest problem is still keeping a process running on the head node.
  3. Submit the entire workflow to the worker nodes, namely using one worker node as the master node and multiple other nodes as slave nodes. sos supports this running mode in theory, but we have never seriously tested it.

from sos.

pgcudahy avatar pgcudahy commented on August 30, 2024

Thanks Bo, that's what I suspected. I have actually gotten option 3 working very well on our old cluster, but still with the jupyter notebook running on my own computer. On the new cluster I've tried to port my hosts.yml config and run jobs from a jupyter notebook running on a cluster instance, but it has been very brittle for unclear reasons. I'll keep looking into it.

BoPeng commented on August 30, 2024

Thanks. Our Jupyter instance runs outside the cluster, but next time I will try to start a Jupyter instance from within the cluster and submit jobs from there.
