
codewars-docker's Issues

Error messaging

Currently, when something like a syntax error is thrown, a pretty confusing stack trace is returned. For example, given the following user code:

function a(){

the runner returns:

[stdin]:2
ESS__, output: $STDOUT};
    console._log(JSON.stringify(json));

} catch(ex){
                                                                    ^^^^^
SyntaxError: Unexpected token catch
    at Object.<anonymous> ([stdin]-wrapper:6:22)
    at Module._compile (module.js:456:26)
    at evalScript (node.js:532:25)
    at Socket.<anonymous> (node.js:154:11)
    at Socket.EventEmitter.emit (events.js:117:20)
    at _stream_readable.js:920:16
    at process._tickCallback (node.js:415:13)

I'm currently using a regexp to pull out the basic error message. So we get:
SyntaxError: Unexpected token catch

User thinks to himself: Catch? WTF? There is no catch in my code...
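That regexp extraction might look something like the following. This is an illustrative sketch, not the actual pattern used by the runner:

```javascript
// Pull the first "SomeError: message" line out of a Node stack trace,
// falling back to the raw output if no such line is found.
function extractErrorMessage(stderr) {
  var match = stderr.match(/^(\w*Error): (.*)$/m);
  return match ? match[0] : stderr.trim();
}

var trace = '[stdin]:2\n} catch(ex){\n' +
            'SyntaxError: Unexpected token catch\n' +
            '    at Module._compile (module.js:456:26)';

console.log(extractErrorMessage(trace)); // SyntaxError: Unexpected token catch
```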

The problem stems from the fact that we are combining kata framework code along with the user submitted code. We need to wrap a try/catch around the user's solution so that if there are any errors, we can catch them and send them out as a JSON response.

So the above code gets turned into something along these lines (note how the user's unclosed brace makes the injected catch the first unexpected token):

try{
    function a(){ 

    console.log(JSON.stringify({success: true, output: []}))
}catch(ex){
  console.log(JSON.stringify({success: false, error: ex.message}))
}

This isn't an issue with the current system because there is no need to wrap the code, since the node runner uses a shovel script along with the built-in node sandbox. The shovel/shim does the wrapping for us, but it happens in a separate file.

In the case of Ruby code, we actually pre-check the code before attempting to execute it using a Ruby parsing engine.

Ideally we can remedy this for all languages by using a format that is more natural to typical programming: separate files. In the case of JavaScript we could create a shim that does this:

try{
    var result = require('./kata.js')
    console.log(JSON.stringify(result))
}catch(ex){
  console.log(JSON.stringify({success: false, error: ex.message}))
}

The ruby version would look much the same.
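Under that scheme, kata.js itself would just export the result object that the shim serializes. The shape below is an assumption for illustration:

```javascript
// kata.js -- the user's solution plus the test harness, as a normal module.
// The shim require()s this file and JSON-serializes whatever it exports,
// so a syntax error here surfaces as a require() failure the shim can catch.
function a() {
  return [579, 8842];
}

var result = { success: true, output: a() };
module.exports = result;
```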

Container timeouts

Each docker container should only run for an established maximum time. Docker provides memory and CPU limits but not duration limits OOTB. Related issues raised to Docker indicate that this is considered out of scope. Existing solutions such as:

cont=$(docker run -d "$@")                         # start detached, capture container ID
code=$(timeout "$to" docker wait "$cont" || true)  # $to = maximum duration in seconds
docker kill $cont &> /dev/null

will not work with the Remote API. The timeout must be present in the bootstrap script, the success of which can be determined by the exit code of the container.

Is container run-time a sufficient metric?

Using the time received from inspect (real) will give us a rough idea of the efficiency of the solution and a control metric for load testing servers. However, there exist many notions of time. For instance:

time node solution.js
real    0m0.294s
user    0m0.247s
sys 0m0.008s

This says that the real (wall-clock) time of the process was greater than the CPU time it actually consumed. User+sys time is a more accurate measure of actual CPU usage. The difference becomes obvious if you try:

time sleep 2
real    0m2.003s
user    0m0.001s
sys 0m0.002s

In this case the CPU is barely used, and real time is the more accurate reflection of what the end-user experienced.

There are two points to consider. The first is that communicating anything but the real time of the docker container would have to be measured from inside the container and communicated to the outside world. Since we are trying to avoid disk-IO and superfluous networking we are currently restricted to stdout/stderr. We would need to somehow restructure our sandboxes to separate the stdout/stderr of the kata solution and the bootstrap script. We have no concrete plan for this yet but it may be a good idea.
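If the streams were separated, the in-container measurement itself is cheap. A sketch using modern Node APIs (`process.cpuUsage` arrived well after this issue was written), with a made-up `METRIC:::` prefix echoing the `PASSED:::`/`FAILED:::` convention already in the output:

```javascript
// Measure wall-clock time and CPU time around the solution, then report
// both on stdout with a prefix the server can filter out of user output.
var startWall = process.hrtime();
var startCpu = process.cpuUsage(); // {user, system} in microseconds

// ... the kata solution would run here; a stand-in workload for now:
for (var i = 0; i < 1e6; i++) Math.sqrt(i);

var wall = process.hrtime(startWall);
var cpu = process.cpuUsage(startCpu);
var realMs = wall[0] * 1e3 + wall[1] / 1e6;
var cpuMs = (cpu.user + cpu.system) / 1e3;

console.log('METRIC:::' + JSON.stringify({ realMs: realMs, cpuMs: cpuMs }));
```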

The second point is that due to CPU sharing the duration of the docker container may be affected by server/cluster load. This needs to be confirmed by load tests.

Is this notion of time sufficient in the short-term?

STDOUT response is being truncated

There seems to be a limit on the string buffer size that can be returned by stdout. Currently some responses are being truncated. Ryan, I sent you an email with more details.

Add timing metrics to containers

Containers should provide metrics about run duration for multiple reasons. The first is to establish how cluster load affects execution, and the second is to provide a measure of the efficiency of a given kata solution.

Use "forever" to keep server in operation

Last night around midnight, an error that I'm not catching caused a full derail. It was related to a container that was behaving and responding perfectly well. This is the last known output of that container:

{"exitCode": 8, "stderr": "\n[stdin]:70\n                        throw error;\n                              ^\nTest:Error: sumStrings('712569312664357328695151392', '8100824045303269669937') - Expected: 712577413488402631964821329, instead got: 7.125774134884027e+26\n", "stdout": "PASSED:::Test Passed: Value == 579\nPASSED:::Test Passed: Value == 8842\nPASSED:::Test Passed: Value == 10367\nPASSED:::Test Passed: Value == 100\nPASSED:::Test Passed: Value == 8670\nPASSED:::Test Passed: Value == 5\nFAILED:::sumStrings('712569312664357328695151392', '8100824045303269669937') - Expected: 712577413488402631964821329, instead got: 7.125774134884027e+26\n"}

This output is valid JSON, however my logs indicate there was a problem communicating with that container, and somehow node threw an error:

throw er; // Unhandled 'error' event
              ^
Error: connect ETIMEDOUT

These errors are handled in my code. Although I am using a third party library for some of the docker communication, looking at the logs indicated that docker had a strange hiccup. Namely 34 jobs started for this container, but 35 closed. The very last job closed twice.

Instead of hunting down and trying to defend against any docker bugs encountered in our external libs, I think it's probably best if I move to using "forever" for our node process, and have some sort of defensive "before cleanup" to ensure that the pool was properly closed the last time around.

Perhaps a configuration option like

{
  cleanOnStartup: true
}

would be in order to safeguard developers contributing to the project.
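A sketch of the startup guard this option would drive. Everything here is hypothetical: the option name comes from this issue, the name filter is a guess at how the pool would tag its containers, and `exec` is injected so the logic can be exercised without a docker daemon:

```javascript
// Remove containers left over from a previous crash before building the
// pool, so a forever-restart starts from a clean slate. Returns the
// number of containers removed.
function cleanStaleContainers(exec) {
  // -a includes stopped containers, -q prints only the IDs;
  // the "name=codewars" filter is an assumed pool naming convention.
  var ids = exec('docker ps -aq --filter "name=codewars"').trim();
  if (!ids) return 0;
  exec('docker rm -f ' + ids.split('\n').join(' '));
  return ids.split('\n').length;
}

// Wired up for real, driven by the config option:
// var execSync = require('child_process').execSync;
// if (config.cleanOnStartup)
//   cleanStaleContainers(function (cmd) { return execSync(cmd).toString(); });
```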

Docker rm not working correctly

Since the last docker release I've noticed that the docker remove via the remote API fails 100% of the time. For informational purposes I've left it as is, so that I can see plainly how many containers are killed due to inactivity based on the last patch. The intention was to use a cron job to manually clean these up once or twice a day as a workaround.

Last night I noticed that docker kill was also failing. Over 330 containers were running on the system.

TODO: improve error handling for this cleanup method AND add a cron job to handle cleanup, OR change the strategy for the inactivity timeout.
