moka-guys / dx_api_bridge

DNAnexus API bridge. Retrieve data object URIs. Manage archival processes and perform cost audits.

Languages: Python 96.41%, Shell 2.24%, R 0.59%, Dockerfile 0.57%, Makefile 0.20%
Topics: api, archive, archiving, bridge, igv, url, vcf, sanger

dx_api_bridge's Introduction

DNAnexus file bridge and tools

This is a simple webservice that allows fast retrieval of data object URIs for use in IGV etc. It also provides helpers to manage archival processes and perform cost audits.

run.py

Starts the DNAnexus file service, which provides a simple HTTP interface to get files by sample and project. This is also the default application run by the Docker image.

Available API routes

All API routes support GET requests only. A token must be provided in the authentication header (Bearer XXXXXXXXX).

/whoami Returns the user's identity based on the supplied authentication token.

/project Returns the project list.

/project/<string:dx_project> Returns the samples in a given project.

/url/<string:dx_project>/<string:dx_file> Returns the ephemeral URL for a given file in a project/sample.
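
As a rough illustration of calling these routes from Python, the sketch below builds an authenticated GET request with the standard library. The base URL (http://localhost:5000), the use of the Authorization header to carry the Bearer token, and the helper names (build_request, url_route, fetch) are assumptions rather than details confirmed by this README. The response format is not documented here either, so the body is returned as raw text.

```python
import urllib.request
from urllib.parse import quote

# Assumed address of a locally running service; adjust to your deployment.
BASE_URL = "http://localhost:5000"


def build_request(route, token, base_url=BASE_URL):
    """Build a GET request for a route such as '/whoami' or '/project'."""
    request = urllib.request.Request(base_url + route, method="GET")
    # Assumption: the token is sent as 'Authorization: Bearer <token>'.
    request.add_header("Authorization", "Bearer " + token)
    return request


def url_route(dx_project, dx_file):
    """Compose the /url/<dx_project>/<dx_file> route, escaping each part."""
    return "/url/{}/{}".format(quote(dx_project, safe=""), quote(dx_file, safe=""))


def fetch(route, token, base_url=BASE_URL):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_request(route, token, base_url)) as response:
        return response.read().decode("utf-8")
```

With the service running, fetch("/whoami", "XXXXXXXXX") would return whatever the service reports for that token, and fetch(url_route(project, file_name), token) would request the ephemeral URL for a file.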

dxarc.py

This script provides API-native helper functions to manage file archival. When performing archiving and/or renaming operations, first check that the script will have the expected effect by supplying the --dryrun option.

Example use cases

Archive all production TSO500 runs older than 3 months, excluding files that are also in 001_ToolsReferenceData

python dxarc.py --token XXXXXXX -f --project "^002(_.+TSO.*)$" --before 12w --visibility visible --notin "^001_Tool" --archive --rename "802\1"

  • --token XXXX Provide a DNAnexus access token
  • -f --project "^002(_.+TSO.*)$" Find projects whose names match the pattern (here, names starting with 002 that contain TSO). The part in round brackets is captured for use by --rename
  • --before 12w Only return projects created more than 12 weeks ago and files that have not been modified for 12 weeks
  • --visibility visible Only return files that are visible
  • --notin "^001_Tool" Excludes any files that are also in any project matching the search regular expression
  • --archive Archive found objects
  • --rename "802\1" Renames projects with this pattern (used in conjunction with --project); \1 stands for the bracketed part of the --project pattern.
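
The selection and renaming logic described above can be sketched in plain Python. This is only an illustration under stated assumptions: that dxarc.py treats --project as a Python regular expression, that --before 12w means "at least 12 weeks old", and that --rename is applied as an re.sub replacement. The project names, dates and helper names (parse_age, select_and_rename) are hypothetical and not taken from the script.

```python
import re
from datetime import datetime, timedelta

UNITS = {"d": "days", "w": "weeks"}  # hypothetical subset of age units


def parse_age(text):
    """Turn an age string such as '12w' into a timedelta."""
    match = re.fullmatch(r"(\d+)([dw])", text)
    if match is None:
        raise ValueError("unsupported age: " + text)
    return timedelta(**{UNITS[match.group(2)]: int(match.group(1))})


def select_and_rename(projects, pattern, before, rename, now):
    """Return (old_name, new_name) for projects matching pattern and age."""
    cutoff = now - parse_age(before)
    regex = re.compile(pattern)
    renamed = []
    for name, created in projects:
        if regex.search(name) and created < cutoff:
            # \1 in the replacement refers to the bracketed group in pattern.
            renamed.append((name, regex.sub(rename, name)))
    return renamed


now = datetime(2024, 6, 1)
projects = [
    ("002_230101_TSO500", datetime(2023, 1, 1)),   # old production run
    ("002_240520_TSO500", datetime(2024, 5, 20)),  # too recent
    ("003_230101_TSO500", datetime(2023, 1, 1)),   # development project
]
result = select_and_rename(projects, r"^002(_.+TSO.*)$", "12w", r"802\1", now)
print(result)
```

Only the first hypothetical project is both old enough and matched by the pattern, so it is the only one paired with a new 802-prefixed name.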

Unarchive all hidden files in 001_ToolsReferenceData, including copies of them in other projects

This can be used to ensure that any shared resources in 001 are live if they are also in any other project.

python dxarc.py --token XXXXXXX -f --object ".*" --type file --project "^001_Tool" --visibility hidden --follow --output reference_data.tsv --unarchive

  • --token XXXX Provide a DNAnexus access token
  • -f --objects '.*' --type file --project "^001_Tool" Find files of any name in projects whose names start with 001_Tool
  • --visibility hidden Only return files that are hidden
  • --follow Also return the same files in other projects
  • --output reference_data.tsv Writes a summary of the objects to file (before any updates)
  • --unarchive Unarchive found objects

Show storage and compute costs for all development projects and write analysis level compute cost audit

python dxarc.py --token XXXXXXX -f --project "^003_" --compute compute_audit.tsv

  • --token XXXX Provide a DNAnexus access token
  • -f --project "^003_" Find projects whose names start with 003_ (development projects)
  • --compute compute_audit.tsv Writes the analysis-level compute cost audit to compute_audit.tsv

The output from the compute cost audit can be visualised with the included R script compute_plot.R.

e.g. Rscript compute_plot.R compute_audit.tsv compute_audit.pdf

extract_vcf.py

*** DEVELOPMENT ONLY *** Extracts VCF calls from DNAnexus.

extract_sanger.py

*** DEVELOPMENT ONLY *** Extracts Sanger results from a directory containing results in Excel format.

dx_api_bridge's People

Contributors

preciserobot, sqvdusers

dx_api_bridge's Issues

dxarc.py: Specify in the README that any part of the project name that is not the identifier (e.g. 002 or 003) must be in round brackets

For example, must be provided as:

python dxarc.py --token XXXXXXX -f --project "^003(_BAM_DOPS_RD)$" --before 12w --visibility visible --notin "^001_Tool" --archive --rename "803\1" --dryrun

As opposed to:

python dxarc.py --token XXXXXXX -f --project "^003_BAM_DOPS_RD$" --before 12w --visibility visible --notin "^001_Tool" --archive --rename "803\1" --dryrun

The latter will fail to rename the project.
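
One plausible explanation, assuming dxarc.py passes --project and --rename to Python's re.sub (this README does not confirm that), is that \1 refers to the first bracketed group in the pattern. Without brackets there is no group to reference. The sketch below shows the difference using only the standard re module.

```python
import re

name = "003_BAM_DOPS_RD"

# With brackets, \1 captures everything after the identifier.
with_group = re.sub(r"^003(_BAM_DOPS_RD)$", r"803\1", name)
print(with_group)

# Without brackets the pattern has no group 1, so re.sub rejects the
# replacement template instead of renaming.
try:
    re.sub(r"^003_BAM_DOPS_RD$", r"803\1", name)
    failed = False
except re.error as error:
    failed = True
    print("rename failed:", error)
```

Running with --dryrun first, as recommended above, is a cheap way to catch a pattern that lacks the bracketed group before anything is changed.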
