Neusomatic on Azure BatchAI

In this example, the 'preprocess.py' script is executed on premises, the data are transferred to the cloud with 'upload_data.sh', and the training phase is executed with Azure BatchAI.

Directory structure:

  • The dataout/models blob container contains the pretrained models
  • The 'data' file share contains the input files for the training (artificial case)
  • The 'test' file share contains the input files for the training (AJT case)

The job setup is described in 1stgpujob.json.

Prior to the training phase, the install_stable.sh script is executed on every node. install.sh contains possible optimizations.

Setup variables

rgname=hpc-batchai
wsname=neusomatic_workspace
storaccname=neusomaticstorage
expname=pytorch_experiment
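
Azure storage account names must be 3 to 24 characters long and may contain only lowercase letters and digits, so it can help to check the name before creating anything. The following is a minimal sketch; the helper check_storage_name is hypothetical and not part of the original scripts.

```shell
# Hypothetical helper: returns 0 if the argument is a valid storage account
# name (3-24 characters, lowercase letters and digits only), 1 otherwise.
check_storage_name() {
    case "$1" in
        # Reject any character outside the explicitly listed set.
        *[!abcdefghijklmnopqrstuvwxyz0123456789]*) return 1 ;;
    esac
    len=${#1}
    [ "$len" -ge 3 ] && [ "$len" -le 24 ]
}

storaccname=neusomaticstorage
if check_storage_name "$storaccname"; then
    echo "storage account name OK: $storaccname"
else
    echo "invalid storage account name: $storaccname" >&2
fi
```

The character set is spelled out rather than written as a range so that the check does not depend on the shell's locale.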

Create BatchAI workspace

az group create -n $rgname -l westeurope
az batchai workspace create -g $rgname -n $wsname -l westeurope

Create BatchAI experiment

az batchai experiment create -g $rgname -w $wsname -n $expname

Create a computing cluster

clustername=nc6
az batchai cluster create -n $clustername -g $rgname -w $wsname -s Standard_NC6 -t 2 --generate-ssh-keys

Setup the storage account

az storage account create -n $storaccname --sku Standard_LRS -g $rgname
az storage share create -n logs --account-name $storaccname
az storage share create -n scripts --account-name $storaccname
az storage share create -n data --account-name $storaccname
az storage share create -n test --account-name $storaccname
az storage directory create -n dataout -s data --account-name $storaccname
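
Before uploading, it may be worth confirming that the four file shares were actually created. The sketch below wraps az storage share exists; the helper names share_exists and check_shares are hypothetical, and the check accepts both "true" and "True" because the exact casing of the CLI output is an assumption.

```shell
# Hypothetical helper: succeeds if the named file share exists in the account.
# $1 = share name, $2 = storage account name.
share_exists() {
    result=$(az storage share exists -n "$1" --account-name "$2" --query exists -o tsv)
    case "$result" in
        [Tt]rue) return 0 ;;
        *) return 1 ;;
    esac
}

# Report every share from the setup step that is missing.
# $1 = storage account name. Returns non-zero if any share is missing.
check_shares() {
    missing=0
    for share in logs scripts data test; do
        if ! share_exists "$share" "$1"; then
            echo "missing file share: $share" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

With the variables from the setup step, this would be called as check_shares "$storaccname".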

Upload files

az storage file upload -s scripts --source install.sh --path prep --account-name $storaccname
az storage file upload-batch -s /mnt/bigdata/output_dir/standalone/dataset --pattern '*/candidates*.tsv*' --destination data --account-name $storaccname
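
Because the upload is driven by a glob pattern, a local preview of the matching files can catch mistakes before any data is transferred. This is a rough sketch using find; the helper preview_upload is hypothetical, and the assumption that find's -path matching selects the same files as the CLI's --pattern should be verified on a small directory first.

```shell
# Hypothetical helper: list files under a source directory whose path,
# relative to that directory, matches a glob pattern.
# $1 = source directory, $2 = glob pattern (quote it to avoid shell expansion).
preview_upload() {
    # Run in a subshell so the caller's working directory is unchanged.
    (cd "$1" && find . -type f -path "./$2" | sed 's|^\./||' | sort)
}
```

For the upload above, the call would be preview_upload /mnt/bigdata/output_dir/standalone/dataset '*/candidates*.tsv*'.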

Create a job

jobname=n1
az batchai job create -c $clustername -n $jobname -g $rgname -w $wsname -e $expname -f 1stgpujob.json --storage-account-name $storaccname 

Monitor the execution

az batchai job file stream -j $jobname -g $rgname -w $wsname -e $expname -f stdout-0.txt
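
Streaming stdout shows the training log, but a script usually also needs to know when the job has finished. The sketch below polls the executionState property with az batchai job show; the helper wait_for_job and the default 30-second interval are assumptions, not part of the original workflow.

```shell
# Hypothetical helper: poll the job until it succeeds or fails.
# $1 = job name, $2 = polling interval in seconds (defaults to 30).
# Relies on rgname, wsname and expname being set as in "Setup variables".
# Returns 0 on success, 1 if the job failed, 2 if the CLI call itself failed.
wait_for_job() {
    while :; do
        state=$(az batchai job show -n "$1" -g "$rgname" -w "$wsname" -e "$expname" \
            --query executionState -o tsv) || return 2
        echo "job $1: $state"
        case "$state" in
            succeeded) return 0 ;;
            failed) return 1 ;;
        esac
        sleep "${2:-30}"
    done
}
```

With the job created above, this would be called as wait_for_job "$jobname".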

Job output

az batchai job show -n $jobname -g $rgname -w $wsname -e $expname --query jobOutputDirectoryPathSegment

Download the job results

Results can be viewed with Azure Storage Explorer.
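
The results can also be fetched from the command line with az storage file download-batch. The sketch below assumes the job writes its output under the dataout directory of the data share, which follows from the storage setup but is not confirmed by 1stgpujob.json; the helper download_results is hypothetical.

```shell
# Hypothetical helper: download everything under dataout/ in the data share.
# $1 = storage account name, $2 = local destination directory.
# The dataout/ location is an assumption; adjust it to match the job setup.
download_results() {
    # The destination directory has to exist before the download starts.
    mkdir -p "$2" || return 1
    az storage file download-batch -s data -d "$2" \
        --pattern 'dataout/*' --account-name "$1"
}
```

A call such as download_results "$storaccname" ./results would place the files in a local results directory.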

Delete the cluster

az batchai cluster delete -n $clustername -g $rgname -w $wsname
