apsis's People

Contributors

alexhsamuel

apsis's Issues

archive runs

Add a mechanism to archive runs so they aren't visible anymore.

handle exceptions from config/user code

Handle exceptions carefully when:

  • performing any template expansion
  • binding arguments
  • calling programs, actions, or any other user-extensible types

Produce useful diagnostics and put the run into the error state.
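
One way to approach this is a small guard around every call into user-supplied code. The sketch below is illustrative only: `Run`, `guarded`, the `"error"` state string, and the `diagnostics` list are hypothetical names, not the Apsis classes.

```python
import traceback
from dataclasses import dataclass, field


@dataclass
class Run:
    """Hypothetical stand-in for a run; not the Apsis class."""
    run_id: str
    state: str = "new"
    diagnostics: list = field(default_factory=list)


def guarded(run, step, fn, *args, **kwargs):
    """Call user-extensible code; on failure, record diagnostics and
    move the run to the error state instead of propagating."""
    try:
        return fn(*args, **kwargs)
    except Exception as exc:
        run.state = "error"
        run.diagnostics.append({
            "step": step,
            "error": f"{type(exc).__name__}: {exc}",
            "traceback": traceback.format_exc(),
        })
        return None


run = Run("r1")
# str.format raises KeyError here because 'date' is not supplied.
result = guarded(run, "template expansion", "{date}".format, name="x")
```

Recording the step name alongside the traceback lets the diagnostic say which stage failed (expansion, binding, or the program call).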

ssh options

When invoking ssh:

BatchMode: yes
CheckHostIP: no
StrictHostKeyChecking: no

Also disable X11 forwarding and anything else we don't need.
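
A minimal sketch of building such a command line. The helper `ssh_argv` is hypothetical; the `-o` options and `-l` are standard OpenSSH client options, and `ForwardX11=no` is one way to disable X11 forwarding.

```python
def ssh_argv(host, command, user=None):
    """Build an ssh argv with non-interactive options (hypothetical helper)."""
    options = {
        "BatchMode": "yes",              # never prompt for a password
        "CheckHostIP": "no",
        "StrictHostKeyChecking": "no",
        "ForwardX11": "no",              # disable X11 forwarding
    }
    argv = ["ssh"]
    for name, value in options.items():
        argv += ["-o", f"{name}={value}"]
    if user is not None:
        argv += ["-l", user]
    argv += [host, command]
    return argv
```

Keeping the options in one dict makes it easy to add or remove settings later without touching the call sites.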

better rerun indication

On a run view, show:

  • the original rerun
  • all reruns in the set
  • the latest rerun

Hide the rerun button unless this is the latest rerun.

ScheduleAction reruns

If a ScheduleAction is about to schedule an existing run, should it rerun instead?

normalize arg order

In jobs, order the arguments in the same order they're specified in the job.
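
As an illustration, assuming a job exposes its parameter names as an ordered list and a run's arguments are a dict (both assumptions; `order_args` is a hypothetical helper):

```python
def order_args(params, args):
    """Return args as a dict whose keys follow the job's declared order."""
    missing = [p for p in params if p not in args]
    extra = [a for a in args if a not in params]
    if missing or extra:
        raise ValueError(f"missing args: {missing}; unexpected args: {extra}")
    # dicts preserve insertion order (Python 3.7+), so this fixes the order.
    return {p: args[p] for p in params}
```

For example, `order_args(["date", "region"], {"region": "us", "date": "2020-01-01"})` yields a dict whose keys iterate as `date`, then `region`.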

add license

(This repo was originally derived from github.com/alexhsamuel/ora.)

link to load output in run view is broken

When a run output is large, its output isn't loaded automatically. The link to load it is broken.

Uncaught TypeError: Cannot read property 'output_url' of undefined

from RunView.vue:120

update history in run view

When watching a run, the history does not live-update.
Add this to the run websocket, or poll?

This is partly because log history is immediately added after run transitions.

persist state in run view

Put it in the store, so that when we come back to the view the search/filter state is still there.

agent ssh timeout

If ssh for starting the agent fails slowly, it blocks Scheduled. Add a timeout.
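
A sketch of one possible timeout using `asyncio.wait_for` around a subprocess. `run_with_timeout` is a hypothetical helper, not the Apsis agent code, and the timeout value is left to the caller.

```python
import asyncio


async def run_with_timeout(argv, timeout):
    """Run argv; kill it and re-raise asyncio.TimeoutError if it is too slow."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        out, err = await asyncio.wait_for(proc.communicate(), timeout)
    except asyncio.TimeoutError:
        # Don't leave a hung ssh process behind.
        proc.kill()
        await proc.wait()
        raise
    return proc.returncode, out, err
```

The caller would pass the ssh argv used to start the agent and treat `asyncio.TimeoutError` as a failed start.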

"apsis runs" fails

$ apsis runs
Error: unknown error [API status: 500]
18:11:41.836 root                     [E] Traceback (most recent call last):
  File "/home/alex/sw/conda/envs/apsis/lib/python3.7/site-packages/sanic/app.py", line 603, in handle_request
    response = await response
  File "/home/alex/dev/apsis/python/apsis/service/api.py", line 416, in runs
    until   =until,
TypeError: query() got an unexpected keyword argument 'until'

start runs concurrently

The scheduler should start all ready runs concurrently, rather than awaiting each start in turn.
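
A minimal sketch with `asyncio.gather`, assuming each start is a coroutine function that takes a run. `start_all` and the `start` callback are hypothetical names.

```python
import asyncio


async def start_all(runs, start):
    """Start every ready run concurrently.

    Returns a list of (run, result) pairs; a failed start yields the
    exception as its result rather than aborting the other starts.
    """
    results = await asyncio.gather(
        *(start(run) for run in runs),
        return_exceptions=True,
    )
    return list(zip(runs, results))
```

With `return_exceptions=True`, one slow or failing start does not prevent the others from being started or reported.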

multiple agents started in race

If two jobs are started simultaneously on the same host as the same user, a race can cause multiple agent processes to start. Only the last one is live, as it overwrites the pid file.

Improve atomicity in apsis.agent.main.
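
One common technique is to create the pid file with `O_CREAT | O_EXCL`, so that only one of the racing processes can succeed. This is a sketch under that assumption; `claim_pid_file` is a hypothetical helper and not the code in `apsis.agent.main`.

```python
import os


def claim_pid_file(path, pid=None):
    """Atomically create the pid file; return True if this process owns it.

    O_CREAT | O_EXCL makes creation fail if the file already exists, so
    at most one of several racing processes gets True.
    """
    pid = os.getpid() if pid is None else pid
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as file:
        file.write(f"{pid}\n")
    return True
```

A process that gets `False` would connect to the existing agent instead of starting another. Stale pid files left by a crashed agent need separate handling, which this sketch does not cover.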

hierarchical job names

We want to support hierarchical job names, e.g. folder / job.
Currently jobs.load_yaml_files() reads subdirectories recursively and creates job ids containing /.
This breaks the REST API as the job id goes into URL paths and / is not allowed.

  • Do we want to separate job id from job name now?
  • For job names, how do we specify paths? Using /?
  • How does this get mapped to URLs?
  • How is this displayed in the web UI?
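
As one possible answer to the URL question, the job id could be percent-encoded when it is placed in a path. This is only a sketch: the `/api/v1/jobs/` prefix and the helper names are assumptions, and some servers or proxies decode `%2F` before routing, which would defeat this approach.

```python
from urllib.parse import quote, unquote


def job_url(job_id, prefix="/api/v1/jobs/"):
    """Build a URL path for a job id, escaping '/' as %2F (prefix is assumed)."""
    return prefix + quote(job_id, safe="")


def job_id_from_segment(segment):
    """Recover the job id from a single encoded path segment."""
    return unquote(segment)
```

The alternative is a route that captures the rest of the path as the job id, which keeps URLs readable but constrains what can follow the id.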

scheduled run shows no program

For expected runs, we should populate the program (i.e. bind the job program) so that API requests show it. However, don't persist the program. Likewise for precos.

  • Bind program, precos when run is created.
  • Don't persist program, precos for expected runs in run DB.
  • When recreating expected runs, re-bind them.
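
The three steps above can be sketched as follows. All names here (`Job`, `Run`, `bind`, `to_row`, `from_row`) and the use of `string.Template` for binding are hypothetical; preconditions would follow the same pattern as the program.

```python
from dataclasses import dataclass
from string import Template
from typing import Optional


@dataclass
class Job:
    """Hypothetical job: the program is a template over the run's args."""
    job_id: str
    program: str


@dataclass
class Run:
    """Hypothetical run record."""
    run_id: str
    job_id: str
    args: dict
    expected: bool
    program: Optional[str] = None


def bind(job, run):
    """Bind the job's program template to the run's args."""
    run.program = Template(job.program).substitute(run.args)
    return run


def to_row(run):
    """Serialize for the run DB; omit the bound program for expected runs."""
    row = {"run_id": run.run_id, "job_id": run.job_id,
           "args": dict(run.args), "expected": run.expected}
    if not run.expected:
        row["program"] = run.program
    return row


def from_row(row, jobs):
    """Recreate a run; expected runs are re-bound from the current job."""
    run = Run(row["run_id"], row["job_id"], row["args"], row["expected"],
              row.get("program"))
    if run.expected:
        bind(jobs[run.job_id], run)
    return run
```

Re-binding on load means an expected run always reflects the current job definition, while runs that have actually started keep the program they ran with.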

better debug logging of agent interaction

In apsis log, show agent process number and other stuff.
In agent log, show agent process number on POST.
In command line agent, show proc and port of connected agent.
Better debugging of agents going up and down, to figure out why they are left behind.

websocket close after event loop ends

17:30:06.408 asyncio                  [E] Task was destroyed but it is pending!
task: <Task pending coro=<websocket_log() running at /Users/alex/dev/apsis/python/apsis/service/main.py:82> wait_for=<Future finished result=None> cb=[<TaskWakeupMethWrapper object at 0x11c1d1948>()]>
17:30:06.408 asyncio                  [E] Task was destroyed but it is pending!
task: <Task pending coro=<Sanic.handle_request() running at /Users/alex/sw/conda/envs/apsis/lib/python3.6/site-packages/sanic/app.py:556> wait_for=<Task pending coro=<websocket_log() done, defined at /Users/alex/dev/apsis/python/apsis/service/main.py:77> wait_for=<Future finished result=None> cb=[<TaskWakeupMethWrapper object at 0x11c1d1948>()]>>
17:30:06.410 asyncio                  [E] Task was destroyed but it is pending!
task: <Task pending coro=<WebSocketCommonProtocol.transfer_data() running at /Users/alex/sw/conda/envs/apsis/lib/python3.6/site-packages/websockets/protocol.py:530> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x11c1d1fd8>()]> cb=[<TaskWakeupMethWrapper object at 0x11c1d1bb8>()]>
17:30:06.410 asyncio                  [E] Task was destroyed but it is pending!
task: <Task pending coro=<WebSocketCommonProtocol.close_connection() running at /Users/alex/sw/conda/envs/apsis/lib/python3.6/site-packages/websockets/protocol.py:807> wait_for=<Task pending coro=<WebSocketCommonProtocol.transfer_data() running at /Users/alex/sw/conda/envs/apsis/lib/python3.6/site-packages/websockets/protocol.py:530> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x11c1d1fd8>()]> cb=[<TaskWakeupMethWrapper object at 0x11c1d1bb8>()]>>
Exception ignored in: <generator object WebSocketCommonProtocol.close_connection at 0x11b276db0>
Traceback (most recent call last):
  File "/Users/alex/sw/conda/envs/apsis/lib/python3.6/site-packages/websockets/protocol.py", line 855, in close_connection
  File "/Users/alex/sw/conda/envs/apsis/lib/python3.6/asyncio/streams.py", line 312, in close
  File "/Users/alex/sw/conda/envs/apsis/lib/python3.6/asyncio/selector_events.py", line 621, in close
  File "/Users/alex/sw/conda/envs/apsis/lib/python3.6/asyncio/base_events.py", line 575, in call_soon
  File "/Users/alex/sw/conda/envs/apsis/lib/python3.6/asyncio/base_events.py", line 358, in _check_closed
RuntimeError: Event loop is closed
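
Messages like these typically appear when the event loop is closed while tasks are still pending. A general asyncio pattern, sketched here and not taken from the Apsis or Sanic code, is to cancel the remaining tasks and wait for them before the loop shuts down. It assumes Python 3.7+ for `asyncio.all_tasks` and `asyncio.current_task`; `cancel_pending` is a hypothetical helper.

```python
import asyncio


async def cancel_pending():
    """Cancel all other pending tasks and wait for them to unwind.

    Returns the number of tasks that were cancelled.
    """
    current = asyncio.current_task()
    pending = [
        task for task in asyncio.all_tasks()
        if task is not current and not task.done()
    ]
    for task in pending:
        task.cancel()
    # Wait so that cancellation handlers run while the loop is still open.
    await asyncio.gather(*pending, return_exceptions=True)
    return len(pending)
```

This would be awaited as the last step of shutdown, before the loop stops.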

run_ids are reused

Some expected runs produce entries in run_history but their run_ids are reused for other runs if Apsis is restarted. That causes the run_history for a run_id to mix history from multiple runs.

Either store the largest run_id in the DB, or make "expected" runs real and store them in the database. Or both.
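
A sketch of the first option, keeping the next run ID in the database so it survives restarts. The table name `next_run_id`, the helper names, and the `r<N>` ID format are assumptions for illustration, not the Apsis schema.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the DB and make sure the single-row counter table exists."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS next_run_id (value INTEGER NOT NULL)")
    if db.execute("SELECT COUNT(*) FROM next_run_id").fetchone()[0] == 0:
        db.execute("INSERT INTO next_run_id VALUES (1)")
    db.commit()
    return db


def allocate_run_id(db):
    """Return a fresh run ID and persist the incremented counter."""
    with db:  # commits on success, rolls back on error
        (value,) = db.execute("SELECT value FROM next_run_id").fetchone()
        db.execute("UPDATE next_run_id SET value = ?", (value + 1,))
    return f"r{value}"
```

Because the counter is committed before the ID is handed out, an ID given to an expected run is never reissued after a restart, even if that run itself was not stored.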
