
sge-gpuprolog's Introduction

Gridengine GPU prolog

Scripts to manage NVIDIA GPU devices in SGE 6.2u5.

The Sun Grid Engine release packaged in Ubuntu 14.04 LTS lacks the RSMAP functionality found in recent Univa Grid Engine releases. The ad-hoc scripts in this package implement resource allocation for NVIDIA devices in its place.

Installation

First, define a consumable complex named gpu.

qconf -mc

#name               shortcut   type        relop   requestable consumable default  urgency
#----------------------------------------------------------------------------------------------
gpu                 gpu        INT         <=      YES         JOB        0        0

On each exec host, add the gpu complex with the number of devices available on that host. For example,

qconf -aattr exechost complex_values gpu=1 node01

Set up prolog and epilog in the queue.

qconf -mq gpu.q

prolog                sgeadmin@/path/to/sge-gpuprolog/prolog.sh
epilog                sgeadmin@/path/to/sge-gpuprolog/epilog.sh

Alternatively, you may set up a parallel environment for GPU and set start_proc_args and stop_proc_args to the packaged scripts.

Usage

Request the gpu resource when submitting to the designated queue.

qsub -q gpu.q -l gpu=1 gpujob.sh

The job script can then read the CUDA_VISIBLE_DEVICES variable.

#!/bin/sh
echo $CUDA_VISIBLE_DEVICES

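The variable contains a comma-delimited list of device IDs, such as 0 or 0,1,2, depending on the number of gpu resources requested. The CUDA runtime exposes only these devices and numbers them from 0, so pass an index into this list (not the physical device ID) to cudaSetDevice().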
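As a minimal sketch of how a job script might consume the variable, the snippet below splits the comma-delimited value into one ID per line. The helper name list_gpu_ids is hypothetical and not part of sge-gpuprolog.

```shell
#!/bin/sh
# Sketch only: list_gpu_ids is a hypothetical helper, not shipped with sge-gpuprolog.
# It prints each assigned device ID on its own line.
list_gpu_ids() {
    # Split on commas; drop empty lines so an unset or empty variable prints nothing.
    printf '%s\n' "${CUDA_VISIBLE_DEVICES:-}" | tr ',' '\n' | sed '/^$/d'
}

# Report every device assigned to this job.
for id in $(list_gpu_ids); do
    echo "assigned GPU device: $id"
done
```

Run outside a GPU job, where the variable is unset, the loop prints nothing.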

sge-gpuprolog's People

Contributors

chawater, kyamagu, petronny


sge-gpuprolog's Issues

rmdir does not have -f

The rmdir shipped with Ubuntu 16.04 LTS does not appear to support the -f option:

rmdir: invalid option -- 'f'
Try 'rmdir --help' for more information.

From the man page:

DESCRIPTION
       Remove the DIRECTORY(ies), if they are empty.

       --ignore-fail-on-non-empty
              ignore each failure that is solely because a directory is non-empty

       -p, --parents
              remove DIRECTORY and its ancestors; e.g., 'rmdir -p a/b/c' is similar to 'rmdir a/b/c a/b a'

       -v, --verbose
              output a diagnostic for every directory processed

       --help display this help and exit

       --version
              output version information and exit
...
GNU coreutils 8.25                February 2016                RMDIR(1)
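One possible workaround, assuming the -f was only meant to suppress errors (the issue does not state the intent), is to discard the error output and ignore the exit status. The helper name remove_dir_quietly is hypothetical.

```shell
#!/bin/sh
# Sketch only: remove_dir_quietly is a hypothetical helper.
# It removes an empty directory and stays silent if the directory
# is missing or not empty, without relying on a -f option.
remove_dir_quietly() {
    rmdir "$1" 2>/dev/null || true
}
```

A non-empty directory is left in place, which matches the behavior of plain rmdir.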

Race condition

The lock acquisition is open to a race condition. It needs an atomic primitive such as mkdir or the shell's noclobber option.
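A minimal sketch of the mkdir approach follows. It is not the package's actual code: the function names, the LOCK_DIR path, and the MAX_TRIES limit are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: names and paths here are hypothetical, not taken from sge-gpuprolog.
LOCK_DIR="${LOCK_DIR:-/tmp/gpu-lock-example}"

acquire_lock() {
    # mkdir creates the directory or fails in a single atomic step,
    # so two processes can never both succeed.
    tries=0
    while ! mkdir "$LOCK_DIR" 2>/dev/null; do
        tries=$((tries + 1))
        # Give up after MAX_TRIES failed attempts (default 50).
        [ "$tries" -ge "${MAX_TRIES:-50}" ] && return 1
        sleep 1
    done
    return 0
}

release_lock() {
    # Remove the lock directory; ignore the error if it is already gone.
    rmdir "$LOCK_DIR" 2>/dev/null || true
}
```

A prolog could call acquire_lock before choosing a device and release_lock afterwards. The check-then-create pattern (test -e followed by touch) is what leaves the window for a race.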
