jobproxy's Introduction

Bibiserv

Web application framework, mainly for bioinformatics developers, to publish their tools with a user-friendly web interface.

Development-Guidelines

https://github.com/BiBiServ/Development-Guidelines

How to build the project?

Dependency: Maven >= 3.3.9

Run

mvn package

How to version the project?

We decided that all modules should have the same version as the parent module. Running the command below in the project root updates all child modules at once.

mvn versions:set -DnewVersion=<version>

where

  • version = 2.1.0.alpha.2

jobproxy's People

Contributors

jkrue, mwiens, pbelmann

Forkers

tlaufkoetter

jobproxy's Issues

Change JavaDockerJobProxy to LocalJobProxy

As mentioned by @jkrue in #29 , JavaDocker should be just a part of a Framework that executes tasks on the local machine.

  • Docker based tasks should be handled by JavaDocker
  • local processes by e.g. ProcessBuilder.
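A minimal sketch of the local-process half of this idea, using only the JDK's ProcessBuilder. The class name LocalProcessRunner and its methods are hypothetical and not part of the JobProxy code base; the sketch assumes stdout and stderr are redirected to files.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;

// Hypothetical helper, not part of JobProxy: runs a task as a local process.
final class LocalProcessRunner {

    // Prepare a ProcessBuilder that writes stdout and stderr to the given files.
    static ProcessBuilder prepare(List<String> command, File stdout, File stderr) {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectOutput(stdout);
        builder.redirectError(stderr);
        return builder;
    }

    // Start the process, block until it terminates, and return its exit code.
    static int run(ProcessBuilder builder) throws IOException, InterruptedException {
        Process process = builder.start();
        return process.waitFor();
    }
}
```

Docker based tasks would go through a separate code path; only the plain local case is shown here.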

[S|O|U]GE / DRMAA1 JobProxy

Batch grid systems like [S|O|U] Grid Engine or Torque are still popular and often available/used on non-"cloud" compute clusters to distribute job requests. Most batch grid systems support the DRMAA API as a general, framework-independent way of job control. Since the latest DRMAA specification (version 2) is only supported by the Univa Grid Engine (as of June 2016), this plugin should use the widespread version 1.

JobProxy endpoints documentation

We need documentation for the client endpoints. Such documentation would allow us to produce client APIs in multiple languages.

Optional: One way could be to document them using swagger.io .

JobProxyServer API

JobProxyServer should expose an API layer. This API could be used by other systems/platforms and by an internal commandline handler.

Add test for daemon mode

The current behave tests are not testing the daemon mode of jobproxy.
To make the current list of tests complete we should add a test for it.

Update Pom Artifact versions

  • Implementing Interfaces should state the current JobProxy version number
  • Update Readme, so that on every release version numbers of maven child modules state the correct version number.

revise stateGet rest endpoint

The following points must be revised:

  • decide and document if stderr and stdout are paths to a file or should contain the output of the corresponding files.
  • code should be an int
  • since each framework reports the exit code of the process to a different degree, we should clarify that at least the successful exit of a job should be 0. Any other exit code determined by the framework is up to the jobproxy implementation
  • the stateGet rest endpoint returns an object called States that contains a list of state objects. In my opinion we can skip this object and simply return a list of state objects
  • How about adding a status field? It should return running or finished.
  • Rename rest method statePost to specificState because statePost is not using post.
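One possible shape for the revised state object, as a sketch only. The class name JobState, the id field and the isSuccessful helper are assumptions for illustration; the int code and the running/finished status come from the points above.

```java
// Hypothetical model class illustrating the proposed changes; not the actual JobProxy model.
final class JobState {

    // Proposed status field: a job is either still running or finished.
    enum Status { RUNNING, FINISHED }

    final String id;      // assumed identifier of the task
    final Status status;
    final int code;       // exit code as an int; 0 means successful exit

    JobState(String id, Status status, int code) {
        this.id = id;
        this.status = status;
        this.code = code;
    }

    // A job counts as successful only if it has finished with exit code 0.
    boolean isSuccessful() {
        return status == Status.FINISHED && code == 0;
    }
}
```

With such a class the endpoint could return a plain list of JobState objects instead of a wrapping States object.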

Framework URL and Port

The idea is to determine in a generic way the url of a framework and the port it listens to.
I figured out two different approaches:

  1. Ask the Mesos master via its REST API: http://mesos.apache.org/documentation/latest/endpoints/
    Problem: The only available tag here is webui_url. It is useful for Chronos but may not be reliable for other frameworks that distinguish between webui_url and a REST API listening on a different endpoint.
  2. Use ZooKeeper to find the framework URL and port.
    Problem: This approach demands knowledge of the path structure that ZooKeeper uses, which may not be consistent across frameworks.

For now I will use the second approach.

Security / Authentication

Up to now security and authentication weren't important, because the JobProxy is only used in "safe" environments (VMs, Docker, local machine with NO external access). Nevertheless, we have to think about this in the near future.

However, the integration of security and authentication is quite easy:

  • https instead of http
  • basic http authentication
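For the basic authentication part, a client would send an Authorization header built from user name and password. A small sketch using only the JDK; the class name BasicAuth is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper: builds the value of an HTTP "Authorization" header for basic auth.
final class BasicAuth {

    // Basic auth is "Basic " followed by base64("user:password").
    static String header(String user, String password) {
        String credentials = user + ":" + password;
        byte[] bytes = credentials.getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(bytes);
    }
}
```

Since base64 is only an encoding and not encryption, this is only reasonable in combination with https.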

more detailed definition of task constraints (memory, cputime)

We have to make the task specification for memory and CPU time more precise. For now it is unclear whether the given constraints limit the task across all requested cores or just a single one. My "natural" understanding (the limit applies to the whole task) could be completely different from another user's (e.g. the limit applies to a single core).

example :

{
  "user" : "bibiserv",
  "cores" : 32,
  "memory" : 120,
  "cputime" : 10,
  "stdout" : "/vol/.../bibiserv2_2016-07-13_122031_Ts5W6",
  "stderr" : "/vol/.../bibiserv2_2016-07-13_122031_Ts5W6",
  "cmd" : "/vol/.../bibiserv2_2016-07-13_122031_Ts5W6/BatchFile_751909506511939005.sh",
  "container" : null
}

In my understanding, the whole task needs 32 cores to run and is limited to 120 GB memory and 10 hours of CPU time. A typical grid engine user understands the above definition as a memory limit per core (3840 GB = 32 * 120 GB) and a CPU time limit for the whole task.
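The two readings can be made explicit in code. In this sketch the class TaskLimits and the Scope enum are hypothetical; they only illustrate how an explicit scope would remove the ambiguity.

```java
// Hypothetical illustration of the two interpretations of the "memory" field.
final class TaskLimits {

    // Whether a limit applies to the whole task or to each requested core.
    enum Scope { PER_TASK, PER_CORE }

    // Total memory in GB the task may use under the given interpretation.
    static int totalMemoryGb(int cores, int memoryGb, Scope scope) {
        return scope == Scope.PER_CORE ? cores * memoryGb : memoryGb;
    }
}
```

For the example above, PER_TASK yields 120 GB while PER_CORE yields 32 * 120 = 3840 GB.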

List available Frameworks

JobProxyServer should be able to list all available frameworks.
We need a REST Endpoint and a JobProxyServer call.

Querying a single task's state

This is currently done via a POST request, which should be reserved for submitting/creating content.

Suggestion:

  • using GET request, submit ID as a request or URL parameter

Endpoint Version

We have to prepend a version to our endpoints, e.g.:

v1/jobproxy/submit
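One way to keep the version prefix in a single place, sketched with a hypothetical Endpoints class. Only the submit path is taken from the example above.

```java
// Hypothetical helper that prefixes every endpoint path with the API version.
final class Endpoints {

    // Assumed constant; bumping it would move all endpoints to a new version.
    static final String API_VERSION = "v1";

    // Build the versioned path for an action such as "submit".
    static String path(String action) {
        return API_VERSION + "/jobproxy/" + action;
    }
}
```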

JobProxy Documentation

We have to document our project with a User and Developer Guide. We should also provide a download for the fat jar (see #5 )

User Guide:

  • Where to download JobProxy
  • How to start the JobProxy
  • Dependencies (Java Version?)
  • How to configure JobProxy
  • Currently supported frameworks

Developer Guide

  • How to build JobProxy
  • How to package JobProxy
  • Branch naming schema(?)
  • Where to find the endpoint documentation for building your own client

revise submit endpoint

While writing tests for #36 I noticed a few possible improvements regarding the submit endpoint:

Ports

  • The container object has a list of ports objects. The ports object contains a port object. I suggest that the container has a list of port objects, so we do not need the ports object.
  • The fields host and container should be marked as NotEmpty and NotNull

Mounts

  • The container object has a list of mounts objects. The mounts object contains a mount object. I suggest that the container has a list of mount objects, so we do not need the mounts object.

Mount

  • Modi: At the moment the rest endpoint accepts any string. We should restrict it to the allowed values (rw and ro)
  • all fields except the host path should be marked as NotNull
  • the host path is allowed to be null, since a volume can be specified by using container alone
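Restricting the mode to the allowed values could look like the following sketch. The enum name MountMode and its parse method are hypothetical; only the values rw and ro come from the list above.

```java
// Hypothetical enum restricting the mount mode to the two allowed values.
enum MountMode {
    RW, RO;

    // Parse the string sent by the client; anything but "rw" or "ro" is rejected.
    static MountMode parse(String value) {
        if (value == null) {
            throw new IllegalArgumentException("mount mode must not be null");
        }
        switch (value) {
            case "rw": return RW;
            case "ro": return RO;
            default: throw new IllegalArgumentException("unsupported mount mode: " + value);
        }
    }
}
```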

Additional Points:

  • add container cmd field as mentioned by @jkrue in #56 .
  • clarify whether stderr and stdout are path or not as mentioned in #58

Add daemon mode for JobProxyServerCli implementation !

The current JobProxyServerCli implementation only has a console mode (it stops after a key is pressed). We should offer a run_for_ever option in JobProxyServerCli and also add an init script to run it as a daemon.

Maven Parent Module

We should split the existing project in different subprojects:

  • JobProxyInterfaces : Model classes that must be implemented.
  • JobProxyChronos : Chronos JobProxy implementation
  • JobProxy : Main class that adds dependencies on the existing implementations (JobProxyChronos, ....)

Deleting tasks

Submitting the task ID via the request body of the delete request is not common practice and is not supported by some request APIs.

Suggestion:

  • ID as request parameter
  • ID as URL parameter

status code specification / state enhancement

To be really independent of the framework used by the JobProxy server, we must specify general status codes (and descriptions) used by all frameworks. The BiBiServ defines an exhaustive list of status codes/descriptions we could use for this purpose, e.g.:

Code  Description
600   task finished/completed
601   task submitted
602   task preprocessing
603   task pending
604   task running
700   general error / description is not a static string

See for complete list. For JobProxy we just need a few of them.

The state object could/should then provide two additional properties like internal_statuscode and internal_statusdescription. However, this might only be useful for debugging ...
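The codes listed above map naturally onto an enum. This is a sketch under the assumption that only these six codes are needed; the enum name TaskStatus and the fromCode helper are hypothetical.

```java
// Hypothetical enum for the framework-independent status codes listed above.
enum TaskStatus {
    FINISHED(600, "task finished/completed"),
    SUBMITTED(601, "task submitted"),
    PREPROCESSING(602, "task preprocessing"),
    PENDING(603, "task pending"),
    RUNNING(604, "task running"),
    ERROR(700, "general error");

    final int code;
    final String description;

    TaskStatus(int code, String description) {
        this.code = code;
        this.description = description;
    }

    // Look up a status by its numeric code; unknown codes are rejected.
    static TaskStatus fromCode(int code) {
        for (TaskStatus status : values()) {
            if (status.code == code) {
                return status;
            }
        }
        throw new IllegalArgumentException("unknown status code: " + code);
    }
}
```

Each JobProxy implementation would translate its framework-specific state into one of these values.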

Error handling

  • errors should at least result in the appropriate HTTP response status code
    • suggestions:
      • 400 - bad request (in case of input error)
      • 502 - bad gateway (in case of a missing mesos scheduler)
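A sketch of how errors could be mapped to these status codes. The class HttpErrors and the exception SchedulerUnavailableException are hypothetical, and the fallback to 500 for everything else is an assumption not stated above.

```java
// Hypothetical mapping from internal errors to HTTP response status codes.
final class HttpErrors {

    // Hypothetical exception for a missing or unreachable Mesos scheduler.
    static final class SchedulerUnavailableException extends RuntimeException {
        SchedulerUnavailableException(String message) {
            super(message);
        }
    }

    // Pick the HTTP status code for an error raised while handling a request.
    static int statusFor(Throwable error) {
        if (error instanceof IllegalArgumentException) {
            return 400; // bad request: input error
        }
        if (error instanceof SchedulerUnavailableException) {
            return 502; // bad gateway: missing Mesos scheduler
        }
        return 500; // assumption: anything else is an internal server error
    }
}
```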

register CI service for develop and master branch

With the first tests created by issue #36 we should register a continuous integration service that checks each push to this repo against current tests.

This also demands that we continue our work by using Pull Requests.
This means that each feature or fix we are working on must be a separate branch that is merged into the develop branch by using a Pull Request. If the pull request passes all existing tests we can merge it into the develop branch.

I suggest using CircleCI because it allows us to use Docker containers if we want to.

Prepare beta.1.release

This issue is for collecting minor updates before releasing beta.1.release:

  • check if the shaded jar is still built by following the current documentation

mvn 3.0 vs. mvn 3.3 // resolve dependency problems

Currently only Maven 3.0.x creates a fully functional shaded jar. Building a shaded jar with mvn 3.3.x leads to a nonfunctional jar. During packaging Maven reports a lot of overlapping classes, which all come from a com.spotify#docker-client dependency (it depends on an outdated jackson.core#databind and maybe other libs). A fast solution is to remove any dependency on the JobProxyJavaDocker module. However, we have to find a good solution to resolve such dependency problems ...

Verbose option for CLI / Replace Logger Framework

The CLI should have at least one verbose switch to increase the log level. However, the current Logger framework (slf4j) has (as far as I know) no option to change the logger properties at runtime, so we have to replace it with a more flexible Logger.

Release

We should make an alpha release and provide it on a maven based artifactory.
(I suggest jitpack)

Unit Tests for JobProxy

We need unit tests that test the cli server, model and various implementations of the JobProxy Interface.

I suggest

  • model based tests for testing the rest interface (using jerseytests)
  • simple BDD tests for testing the cli (maybe using jbehave)
  • server tests that check starting/stopping the server
  • framework tests for testing the implementations
  • Basic tests that directly operate on the released jar.

Framework Configuration

Each framework needs a different kind of Configuration.

Example:

  • Mesos based frameworks need zookeeper url
  • Java Docker needs the path to docker.sock or the url

We need a configuration interface that each framework implements.

This interface should for example demand

  • that the configuration is exposed with a to string option.
  • that the name of the framework is reported.
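Such an interface could be sketched as follows. The names FrameworkConfiguration, frameworkName, describe and ChronosConfiguration are hypothetical, and the returned framework name and output format are assumptions for illustration.

```java
// Hypothetical configuration interface that every framework would implement.
interface FrameworkConfiguration {

    // Name of the framework this configuration belongs to.
    String frameworkName();

    // Human-readable summary of the configured values.
    String describe();
}

// Hypothetical example for a Mesos based framework that needs a zookeeper url.
final class ChronosConfiguration implements FrameworkConfiguration {

    private final String zookeeperUrl;

    ChronosConfiguration(String zookeeperUrl) {
        this.zookeeperUrl = zookeeperUrl;
    }

    @Override
    public String frameworkName() {
        return "chronos";
    }

    @Override
    public String describe() {
        return "zookeeper=" + zookeeperUrl;
    }

    // Expose the configuration via toString, as demanded above.
    @Override
    public String toString() {
        return frameworkName() + "[" + describe() + "]";
    }
}
```

A Java Docker configuration would implement the same interface with the path to docker.sock or the url instead.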
