Language and Voice Laboratory Computing Resources

Introduction

The Language and Voice Laboratory (LVL) runs a tiny computing “cluster” called Terra. The cluster consists of a few physical nodes: terra, torpaq and gaia.

Access is granted on request by an LVL sysadmin. Once you have a user account, you can log into the main node.

Scheduler

The LVL cluster uses Slurm to handle compute job scheduling and resource allocation. All resource-intensive tasks must go through the scheduling system.

The command sbatch is used to submit batch jobs to the scheduler. This is the most common way to run tasks on the cluster. A batch job is described by a batch script and the command-line arguments to sbatch.

A batch script is a bash script with special #SBATCH directives, written as comments near the top of the file, as seen in the example below.

#!/bin/bash
# Request two Titan X GPUs and 12 GB of memory; log stdout/stderr to a file.
#SBATCH --gres=gpu:titanx:2
#SBATCH --mem=12G
#SBATCH --output=test-sbatch.log
echo "I have these GPUs: $CUDA_VISIBLE_DEVICES"
echo "On this machine $(hostname)"
exit 0

Assuming the script is saved as example-job.sbatch, we send this job to the scheduler with:

sbatch example-job.sbatch

This defines a job that requests two NVIDIA Titan X GPUs and 12 GB of memory, and writes stdout and stderr to the file test-sbatch.log in the current directory. Once the scheduler can allocate the necessary resources, it executes the job, which writes the IDs of the allocated GPUs and the hostname of the allocated node to test-sbatch.log.
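When a script needs the job ID, for example to look the job up later, sbatch's --parsable flag prints only the ID (optionally followed by a semicolon and a cluster name). The following is a minimal sketch, not part of the LVL setup; the helper names parse_job_id and submit_job are hypothetical and introduced only for illustration.

```shell
#!/bin/bash
# parse_job_id (hypothetical helper): extract the job ID from the output of
# `sbatch --parsable`, which has the form "jobid" or "jobid;cluster".
parse_job_id() {
    local raw="$1"
    echo "${raw%%;*}"
}

# submit_job (hypothetical helper): submit a batch script and print its job ID.
submit_job() {
    local script="$1" raw
    raw="$(sbatch --parsable "$script")" || return 1
    parse_job_id "$raw"
}

# Usage on the cluster (requires Slurm):
#   job_id="$(submit_job example-job.sbatch)"
#   echo "Submitted job $job_id"
```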

We can use sacct to see the job history and squeue to see queued and running jobs.
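For day-to-day monitoring, both commands can be restricted to your own jobs. The sketch below wraps them in a function; the name show_my_jobs is hypothetical, and the sacct columns are only one reasonable selection.

```shell
#!/bin/bash
# show_my_jobs (hypothetical helper): list the current user's queued and
# running jobs, then their accounting history.
show_my_jobs() {
    local user="${USER:-$(id -un)}"
    if ! command -v squeue >/dev/null 2>&1; then
        echo "Slurm client tools not found; run this on the cluster" >&2
        return 1
    fi
    # Jobs that are currently pending or running.
    squeue --user="$user"
    # Job history with state, elapsed time and exit code.
    sacct --user="$user" --format=JobID,JobName,State,Elapsed,ExitCode
}

# Usage on the cluster:
#   show_my_jobs
```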

Storage

Several file systems are available on Terra. None of them is backed up. All except /scratch use RAID for fault tolerance.

| Mount path | Purpose | Size | Speed | Local node |
|---|---|---|---|---|
| /data | Shared datasets and archives. Read-only for users. | 1.8 TiB | Fast reads & slow writes | terra |
| /scratch | “Unimportant” temporary files with many writes and reads. | 2 TiB | Fastest | terra |
| /mnt/scratch | Links to /scratch for legacy reasons | | | |
| /work | More important temporary files | 3.4 TiB | Fastest reads & fast writes | torpaq |
| /home | Code, configuration files, etc | 5.4T | Slow | terra |
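One way to use these file systems from a job is to keep throwaway intermediate files under /scratch and copy anything worth keeping to /work at the end. The sketch below is only an illustration under assumptions the source does not confirm: that users may create directories directly under /scratch and write to /work. The helper make_job_dir, the program my-program and the file results.txt are hypothetical.

```shell
#!/bin/bash
# make_job_dir (hypothetical helper): create a private temporary directory
# under the given root (e.g. /scratch) and print its path.
# Assumption: users may create directories directly under that root.
make_job_dir() {
    local root="$1"
    mktemp -d "$root/${USER:-job}-XXXXXX"
}

# Usage inside a batch job on the cluster:
#   tmp="$(make_job_dir /scratch)" || exit 1
#   trap 'rm -rf "$tmp"' EXIT            # clean up scratch when the job ends
#   ./my-program > "$tmp/results.txt"    # hypothetical program and file name
#   cp "$tmp/results.txt" /work/         # assumes write access to /work
```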

Containers

Singularity is a container solution for scientific computing that allows unprivileged use of containers. It supports both building its own images from scratch and using ready-made Docker images.

Users can build their own containerized application or project and run it in a Slurm batch job.
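As a rough sketch of how the two fit together, the batch script below runs a command inside a Singularity image with singularity exec. The image name my-project.sif, the Docker image it is pulled from, the resource requests and the helper run_in_container are all hypothetical; the script only runs the command when Slurm has set SLURM_JOB_ID, so it is harmless to source elsewhere.

```shell
#!/bin/bash
#SBATCH --mem=4G
#SBATCH --output=container-job.log

# Hypothetical image name; pull or build it before submitting, e.g.
#   singularity pull my-project.sif docker://python:3.11-slim
IMAGE="${IMAGE:-my-project.sif}"

# run_in_container (hypothetical helper): run a command inside the image.
run_in_container() {
    if ! command -v singularity >/dev/null 2>&1; then
        echo "singularity not found on $(hostname)" >&2
        return 127
    fi
    singularity exec "$IMAGE" "$@"
}

# Only run the workload when executing as a Slurm job.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    run_in_container python3 --version
fi
```

Jobs that request GPUs would typically also pass the --nv flag to singularity exec so that the NVIDIA devices are visible inside the container.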

Examples

Coming soon

Contributors

rkjaran, judyfong
