Coder Social home page Coder Social logo

sync-scripts's Introduction

Useful, but somewhat crude spripts for syncing the datasets across the various nodes in our DGX cluster.

WARNING these scripts make quite a few assumptions... and need to be used with some caution.

use scp to copy files... all these scripts will overwirte whatever is in the target

a-scp copies all the files in all the sub-diretories under /raid/parquet_sf3000 form the local node to the target node specified (by node number)

scp-all a more parametized version a-scp (accpet dataset parameter). has not been tested very much - could be buggy

p-scp similar to a-scp, but copies the sub-directories in parallel and binds to the high-speed nic... this was buggy early on (issues with network setup) so it was not used (much)... may work now but an additional word of caution.

these use rysyn instead of scp ... generally slower but if if you have partial copies they are much more friendly (the will not overwrite what is already copied and in that case can be faster).

rsync-small just does the smaller datasets ... this one does not take long at all

rsync-medium similar but does the medium sized datasets... can easily be done in parallel with rsync-small

rsync-big does on big dastset at a time (you pass it store_sales, catalog_sales, or web_sales)

Verification

verfy used to collect up "contents_dataset_node" files that can be diffed to veify the contents of on all nodes

sync-scripts's People

Contributors

badscooter23 avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.