SRApy

A Python library and scripts that make working with NCBI's Sequence Read Archive (SRA) less arcane.

Installation

pip install srapy

is all anyone on a modern system with pip should need. Dependencies are:

  • six
  • BioPython
  • lxml
  • docopt
  • progressbar2

lxml depends on the libxml2 and libxslt C libraries, so you may need to install their development packages first, e.g. with sudo aptitude install libxml2-dev libxslt1-dev on Debian or Ubuntu, or the equivalent for your system.
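
If an install seems incomplete, a quick check of which dependencies can be imported may help. The snippet below is a minimal sketch and not part of SRApy; the mapping from package names to import names is an assumption based on how these packages are usually imported (BioPython as Bio, progressbar2 as progressbar), and missing_dependencies is a hypothetical helper.

```python
import importlib.util

# PyPI package name -> module name used at import time.
# This mapping is an assumption, not something SRApy itself defines.
DEPENDENCIES = {
    "six": "six",
    "BioPython": "Bio",
    "lxml": "lxml",
    "docopt": "docopt",
    "progressbar2": "progressbar",
}


def missing_dependencies(deps=DEPENDENCIES):
    """Return the package names whose modules cannot be located."""
    return [
        package
        for package, module in deps.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_dependencies()
    print("Missing:", ", ".join(missing) if missing else "none")
```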

License

SRApy is licensed under the GNU General Public License, version 3 or any later version. See ./LICENSE or https://www.gnu.org/licenses/gpl-3.0.en.html

Tools

get-project-sras.py

Downloads all SRA run files for a given BioProject.

Usage:

USAGE:
    get-project-sras.py [-e EMAIL -d OUTDIR -F FMT] -p PROJECT_ID

OPTIONS:
    -e EMAIL        Your email, to provide to Bio.Entrez
                    [default: ] # defaults to empty string
    -d OUTDIR       Output directory, must exist. [default: .]
    -F FMT          Filename format. Fields 'name', 'id', and 'acc' are
                    recognised. Use python string formatting syntax.
                    [default: {acc}~{name}.sra]
    -p PROJECT_ID   BioProject ID

Example:

Search for a project with the BioProject search engine (http://www.ncbi.nlm.nih.gov/bioproject), e.g. the 1001 Genomes Project.

After checking the search results by hand, copy the ID of the BioProject of interest, e.g. 30811 for Joe Ecker's contribution to the 1001 Genomes Project.

To download all SRA files, run:

get-project-sras.py -d /path/to/sras -e you@example.com -p 30811

This fetches the project record, searches SRA for the associated metadata and run accessions, and downloads the SRA files. With the default filename format, each file is named by its SRA accession and the submitter's sample label.
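
The -F option takes a Python format string with the fields 'acc', 'name' and 'id'. The following sketch shows how such a string expands; format_filename is a hypothetical helper and the field values are made up for illustration, so this is not necessarily how SRApy builds its filenames internally.

```python
DEFAULT_FMT = "{acc}~{name}.sra"  # the documented default for -F


def format_filename(fmt, acc, name, run_id):
    """Fill the 'acc', 'name' and 'id' fields of a filename format string."""
    return fmt.format(acc=acc, name=name, id=run_id)


# Hypothetical accession, sample label and ID, for illustration only.
default_name = format_filename(DEFAULT_FMT, "ERR000000", "sample_1", "12345")
custom_name = format_filename("{id}_{acc}.sra", "ERR000000", "sample_1", "12345")

print(default_name)  # ERR000000~sample_1.sra
print(custom_name)   # 12345_ERR000000.sra
```

A format such as {id}_{acc}.sra avoids putting the submitter's label in the filename, which may be useful when labels contain characters that are awkward on the command line.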

get-run.py

Downloads the run file for a given ID or accession, or for each entry in a list of them.

Usage:

USAGE:
    get-run.py [-e EMAIL -d OUTDIR -F FMT -a] (-i SRA_ID | -f FILE)

OPTIONS:
    -e EMAIL        Your email, to provide to Bio.Entrez
                    [default: ] # defaults to empty string
    -d OUTDIR       Output directory, must exist. [default: .]
    -a              The IDs are accessions. Unless '-a' is specified, the
                    identifier is assumed to be an SRA ID if it is numeric,
                    otherwise is interpreted as an accession. This option
                    forces the ID to be interpreted as an accession.
                    [default: False]
    -F FMT          Filename format. Fields 'name', 'id', and 'acc' are
                    recognised. Use python string formatting syntax.
                    [default: {acc}~{name}.sra]
    -i SRA_ID       A single identifier to download. See above for
                    interpretation
    -f FILE         New-line delimited list of identifiers
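
The rule described under -a can be sketched as follows. This only restates the documented behaviour in code; is_accession is a hypothetical name and the real script may implement the check differently.

```python
def is_accession(identifier, force_accession=False):
    """Decide whether an identifier should be treated as an accession.

    Numeric identifiers are taken to be SRA IDs unless force_accession
    (the equivalent of '-a') is set; anything else is an accession.
    """
    if force_accession:
        return True
    return not identifier.strip().isdigit()


print(is_accession("ERR605369"))                      # True
print(is_accession("1018875"))                        # False
print(is_accession("1018875", force_accession=True))  # True
```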

Example:

echo ERR605369 >sra_accessions.txt
echo ERR612613 >>sra_accessions.txt
echo ERR612614 >>sra_accessions.txt

# Download a single accession
get-run.py -e you@example.com -i ERR605369

# Download a single accession (ERR612613), by run ID
get-run.py -e you@example.com -i 1018875

# Download all accessions
get-run.py -e you@example.com -f sra_accessions.txt
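
For reference, a newline-delimited identifier list like sra_accessions.txt can be read in Python as sketched below. read_identifiers is a hypothetical helper, and skipping blank lines is an assumption; get-run.py may treat them differently.

```python
import os
import tempfile


def read_identifiers(path):
    """Read one identifier per line, ignoring surrounding whitespace.

    Blank lines are skipped (an assumption, not documented behaviour).
    """
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


# Build the same list as the shell example above, in a temporary directory.
with tempfile.TemporaryDirectory() as tmpdir:
    list_path = os.path.join(tmpdir, "sra_accessions.txt")
    with open(list_path, "w") as handle:
        handle.write("ERR605369\nERR612613\n\nERR612614\n")
    identifiers = read_identifiers(list_path)

print(identifiers)
```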

Contributors

kdm9
