de_novo-identification

Finding de novo repeats within a genome assembly and filtering out simple repeats and genes.

Amarel

This pipeline runs on Amarel, the Rutgers University high-performance computing cluster. The Amarel hardware at the time of the most recent pipeline run was as follows.

  • 52 CPU-only nodes, each with 28 Xeon e5-2680v4 (Broadwell) cores + 128 GB RAM
  • 20 CPU-only nodes, each with 28 Xeon e5-2680v4 (Broadwell) cores + 256 GB RAM
  • 4 28-core e5-2680v4 nodes each with 2 x Nvidia Pascal P100 GPUs onboard
  • 2 high-memory nodes, each with 56 e7-4830v4 (Broadwell) cores + 1.5 TB RAM
  • 53 CPU-only nodes, each with 16 Intel Xeon e5-2670 (Sandy Bridge) cores + 128 GB RAM
  • 5 CPU-only nodes, each with 20 Intel Xeon e5-2670 (Ivy Bridge) cores + 128 GB RAM
  • 26 CPU-only nodes, each with 24 Intel Xeon e5-2670 (Haswell) cores + 128 GB RAM
  • 4 CPU-only nodes, each with 16 Intel Xeon e5-2680 (Broadwell) cores + 128 GB RAM
  • 3 12-core e5-2670 nodes with 8 Nvidia Tesla M2070 GPUs onboard
  • 2 28-core e5-2680 nodes with 4 Quadro M6000 GPUs onboard
  • 1 16-core e5-2670 node with 8 Xeon Phi 5110P accelerators onboard

For more information, contact [email protected] or see the Amarel user guide at https://rutgers-oarc.github.io/amarel/.

You may also contact the Amarel director, Kristen Klepping, at [email protected] to request sudo permissions on your node.

Run

The pipeline is submitted to the Slurm scheduler with the following command.

sbatch de_novo_repeat_shell.sh

de_novo_repeat_shell.sh is a shell script containing the commands that run the individual programs and tools.
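
The script itself is not reproduced here. As a rough sketch of how a Slurm batch script of this kind is commonly laid out, the example below uses assumed resource values, an assumed partition name, a hypothetical run_step helper, and placeholder commands in place of the real tools. None of these are taken from the actual de_novo_repeat_shell.sh.

```shell
#!/bin/bash
# Hypothetical job name.
#SBATCH --job-name=de_novo_repeats
# Assumed partition name; check the cluster's own partition list.
#SBATCH --partition=main
#SBATCH --nodes=1
#SBATCH --ntasks=1
# Assumed core count, chosen to match one 28-core node.
#SBATCH --cpus-per-task=28
# Assumed memory request and wall-time limit.
#SBATCH --mem=100G
#SBATCH --time=24:00:00
# %j expands to the Slurm job ID.
#SBATCH --output=slurm.%j.out

# Stop at the first failing step so later steps never see partial output.
set -euo pipefail

# run_step NAME CMD [ARGS...]: announce a pipeline step, then run its command.
run_step() {
    local name="$1"
    shift
    echo "[step] ${name}"
    "$@"
}

# Placeholder steps: replace each echo with the real tool invocation.
run_step "find de novo repeats"  echo "repeat finder goes here"
run_step "filter simple repeats" echo "simple-repeat filter goes here"
run_step "filter genes"          echo "gene filter goes here"
```

A script laid out this way would be submitted with sbatch in the same manner. Because the #SBATCH lines are ordinary comments to the shell, it can also be run with plain bash to check the step order locally before queuing a job.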

Absent files

Some files are absent from this repository because large files cannot be stored on the free version of GitHub. The missing files are the nucleotide sequences of the fly species and their corresponding peptide sequences, all of which can be found on the FTP site of flybase.org.

Program Version information

All programs and pipelines used in this master branch were the latest versions available as of the last run date.

All of them have been kept up to date using Anaconda.

Author Information

For more information and data, please contact me here, or at [email protected].
