bibiserv / jobproxy
License: Apache License 2.0
This issue is for collecting minor updates before releasing beta.1.release:
ChronosJobProxy should be able to delete tasks.
According to issue #76, the JobProxyDRMAA framework must translate the meaning of the limitation task constraints from the specification to the underlying DRMAA grid scheduler.
Solved in attached patch.zip
The current JobProxyServerCli implementation only has a console mode (it stops after a key is pressed). We should offer a run_for_ever option for JobProxyServerCli and also add an init script to run it as a daemon.
Add Framework exception for every response status that is not successful
Each framework needs a different kind of Configuration.
Example:
We need a configuration interface that each framework implements.
This interface should for example demand
We need documentation for the client endpoints. Such documentation would allow us to produce client APIs in multiple languages.
Optional: One way could be to document them using swagger.io .
volumes is not initialized in JobConfig.Container
JobProxy should be packaged to one jar containing all dependencies.
Currently only Maven 3.0.x creates a fully functional shaded jar. Building a shaded jar with Maven 3.3.x leads to a nonfunctional jar. During packaging Maven reports a lot of overlapping classes, which all come from the com.spotify#docker-client dependency (it depends on an outdated jackson.core#databind and maybe other libs). A quick fix is to remove any dependency on the JobProxyJavaDocker module. However, we have to find a good solution to resolve such dependency problems ...
With the first tests created by issue #36 we should register a continuous integration service that checks each push to this repo against current tests.
This also demands that we continue our work by using Pull Requests.
This means that each feature or fix we are working on must be a separate branch that is merged into the develop branch by using a Pull Request. If the pull request passes all existing tests, we can merge it into the develop branch.
I suggest using CircleCI because it allows us to use Docker containers if we want to.
Submitting the task ID via the request body of the delete request isn't common practice and is not supported by some request APIs.
Suggestion:
We should implement java-docker jobproxy.
To be really independent from the framework used by the JobProxy server, we must specify general status codes (and descriptions) used by all frameworks. The BiBiServ defines an exhaustive list of status codes/descriptions we could use for this purpose, e.g.:
| Code | Description |
|---|---|
| 600 | task finished/completed |
| 601 | task submitted |
| 602 | task preprocessing |
| 603 | task pending |
| 604 | task running |
| 700 | general error / description is not a static string |
See for complete list. For JobProxy we just need a few of them.
The state object could/should then provide two additional properties like internal_statuscode and internal_statusdescription. However, these might only be useful for debugging ...
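The codes listed in the table could be modelled roughly as follows. This is only a sketch: the enum name `TaskStatus`, its constant names and the `fromCode` lookup are hypothetical and not part of the existing JobProxy API. Only the codes and descriptions come from the table.

```java
import java.util.Optional;

// Hypothetical enum; only the numeric codes and descriptions are taken from the table above.
enum TaskStatus {
    FINISHED(600, "task finished/completed"),
    SUBMITTED(601, "task submitted"),
    PREPROCESSING(602, "task preprocessing"),
    PENDING(603, "task pending"),
    RUNNING(604, "task running"),
    GENERAL_ERROR(700, "general error");

    private final int code;
    private final String description;

    TaskStatus(int code, String description) {
        this.code = code;
        this.description = description;
    }

    int code() {
        return code;
    }

    String description() {
        return description;
    }

    // Looks up a status by its numeric code; empty if the code is not one of the listed ones.
    static Optional<TaskStatus> fromCode(int code) {
        for (TaskStatus status : values()) {
            if (status.code == code) {
                return Optional.of(status);
            }
        }
        return Optional.empty();
    }
}
```

Each framework implementation would then map its own internal states onto these values, and could keep the framework-specific code and description in separate fields for debugging.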
Batch grid systems like [S|O|U] Grid Engine or Torque are still popular and often available/used on non-"cloud" compute clusters to distribute job requests. Most batch grid systems support the DRMAA API as a general, framework-independent way of job control. Since the latest DRMAA specification (version 2) is only supported by the Univa Grid Engine (June 2016), this plugin should use the widespread version 1.
Since jobproxy is just a REST layer, we have to prepare a client which could produce human-readable responses anyway.
That's why I suggest removing human-readable endpoints like this one:
We should split the existing project in different subprojects:
In order to test jobproxy with each implementing framework already installed, I suggest preparing Vagrant config files for each framework. Maybe this can be combined directly with Maven (e.g. http://nicoulaj.github.io/vagrant-maven-plugin/examples/running-a-vm-during-integration-tests.html)
The user should be able to decide on which port/URL the JobProxyServer runs.
At the moment it is hardcoded to
http://localhost:9999
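One possible shape for this, as a minimal sketch: the class `ServerAddress` and its `resolve` method are hypothetical names, and the assumption is that the user passes the address as a single URL string (for example from a command-line option). Only the default `http://localhost:9999` comes from the current behaviour.

```java
import java.net.URI;

// Hypothetical helper; not part of the existing JobProxyServer code.
final class ServerAddress {
    // The currently hardcoded address, kept as the fallback.
    static final URI DEFAULT = URI.create("http://localhost:9999");

    private ServerAddress() {
    }

    // Returns the user-supplied address, or the default if none was given.
    static URI resolve(String url) {
        if (url == null || url.trim().isEmpty()) {
            return DEFAULT;
        }
        URI uri = URI.create(url.trim());
        // Require an explicit host and port so the server never binds to a guessed value.
        if (uri.getHost() == null || uri.getPort() == -1) {
            throw new IllegalArgumentException("Expected scheme://host:port but got: " + url);
        }
        return uri;
    }
}
```

The server start-up code could then pass the resolved URI to whatever HTTP server factory it already uses instead of the constant.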
We have to prepend a version in our endpoints:
e.g.:
v1/jobproxy/submit
I think we should include the Maven Enforcer Plugin. It will help us to avoid issues like #66 .
Especially the rules:
But also other rules could be useful:
We have to document our project with a User and Developer Guide. We should also provide a download for the fat jar (see #5 )
User Guide:
Developer Guide
A README.md file is missing. The framework's built-in documentation should be a good starting point.
The following points must be revised:
Since the delete REST method accepts just a path parameter, the @Consumes annotation is not necessary.
Update guide with the newest swagger2markup jar.
Implement and document the state endpoint.
We should make an alpha release and provide it on a maven based artifactory.
(I suggest jitpack)
Is currently done via POST request, which should be used for submitting/creating content.
Suggestion:
While writing tests for #36 I noticed a few possible improvements regarding the submit endpoint:
Ports
Mounts
Mount
Additional Points:
Up to now security and authentication weren't important, because the JobProxy is only used in "safe" environments (VMs, Docker, local machine with NO external access). Nevertheless, we have to think about this in the near future.
However, the integration of security and authentication is quite easy:
The idea is to determine in a generic way the url of a framework and the port it listens to.
I figured out two different approaches:
Problem: This approach demands knowledge of the path structure that zookeeper uses which may not be consistent for each framework.
For now I will use the second approach.
JobProxyServer should expose an API layer. This API could be used by other systems/platforms and by an internal commandline handler.
JobProxy should accept url and port of the zookeeper instance.
When starting JobProxy an error is thrown that a specific logging class can not be found.
Replace swing Window with logging output.
We have to refine the task specification for memory and CPU time. For now it is unclear whether the given constraints limit the task across all requested cores or just a single one. My "natural" understanding (the limit applies to the whole task) could be completely different from another user's (e.g. the limit applies to a single core).
example :
{
"user" : "bibiserv",
"cores" : 32,
"memory" : 120,
"cputime" : 10,
"stdout" : "/vol/.../bibiserv2_2016-07-13_122031_Ts5W6",
"stderr" : "/vol/.../bibiserv2_2016-07-13_122031_Ts5W6",
"cmd" : "/vol/.../bibiserv2_2016-07-13_122031_Ts5W6/BatchFile_751909506511939005.sh",
"container" : null
}
In my understanding the whole task needs 32 cores to run and is limited to 120 GB memory and 10 hours of CPU time. A typical grid engine user understands the above definition as a memory limit per core (3840 GB = 32 * 120 GB) and a CPU time limit for the whole task.
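The two readings can be made explicit in code. The following is a sketch only: `MemoryConstraint`, `Scope` and `totalMemoryGb` are hypothetical names and not part of the current task specification. It simply computes the total memory a scheduler would have to reserve under each interpretation.

```java
// Hypothetical helper illustrating the two interpretations of the "memory" field.
final class MemoryConstraint {
    enum Scope {
        PER_TASK, // the limit applies to the whole task
        PER_CORE  // the limit applies to each requested core
    }

    private MemoryConstraint() {
    }

    // Total memory in GB that the task may use under the given interpretation.
    static long totalMemoryGb(Scope scope, int cores, long memoryGb) {
        if (cores < 1 || memoryGb < 0) {
            throw new IllegalArgumentException("cores must be >= 1 and memoryGb >= 0");
        }
        return scope == Scope.PER_CORE
                ? Math.multiplyExact((long) cores, memoryGb)
                : memoryGb;
    }
}
```

With the values from the example (32 cores, 120 GB), `PER_TASK` yields 120 GB and `PER_CORE` yields 3840 GB, which is the gap the specification needs to close.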
API should provide readonly and writeable Docker Volumes.
We should create a command-line jobproxy client. This is also a good opportunity to test the swagger code generation tool (https://github.com/jkrue/jobproxy#how-to-build-your-own-jobproxy-client)
Chronos response has the status 500.
We need unit tests that test the cli server, model and various implementations of the JobProxy Interface.
I suggest
The CLI should have at least one verbose switch to increase the log level. However, the current Logger framework (slf4j) has (as far as I know) no option to change the logger properties at runtime, so we have to replace it with a more flexible Logger.
The current behave tests are not testing the daemon mode of jobproxy.
To make the current list of tests complete we should add a test for it.
We need a second reference implementation.
I think Aurora is a good candidate: https://github.com/apache/aurora .
JobProxyServer should be able to list all available frameworks.
We need a REST Endpoint and a JobProxyServer call.
The cmd attribute of a task is currently annotated as required. When running a Docker container with its default command, it isn't necessary to set a cmd here.