barski-lab / biowardrobe-airflow-analysis

BioWardrobe's backend rewritten for Apache Airflow.

Home Page: https://datirium.com/

License: Apache License 2.0

Python 40.40% TSQL 59.60%
airflow cwl cwltool python biowardrobe

BioWardrobe backend (airflow+cwl)

About

A Python package that replaces BioWardrobe's Python/cron scripts. It builds on Apache Airflow and runs workflows described in CWL v1.0.

Install

  1. Add the biowardrobe MySQL connection to Airflow's connection table:
    select * from airflow.connection;
    insert into airflow.connection values(NULL,'biowardrobe','mysql','localhost','ems','wardrobe','',null,'{"cursor":"dictcursor"}',0,0);
  2. Install the package from the repository root:
    sudo pip3 install .
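
The positional INSERT in step 1 depends on the column order of the connection table. As a hedged illustration only, the sketch below spells the same values out by column name and returns a parameterized statement. The column names are an assumption based on the Airflow 1.x schema, and `connection_insert_sql` is a hypothetical helper, not part of this package; check the names against `describe airflow.connection;` before relying on them.

```python
# Column names are an ASSUMPTION taken from the Airflow 1.x `connection`
# table; verify them against your own database before use.
COLUMNS = (
    "conn_id", "conn_type", "host", "schema", "login",
    "password", "port", "extra", "is_encrypted", "is_extra_encrypted",
)


def connection_insert_sql(values):
    """Return a parameterized INSERT statement and its parameter tuple."""
    missing = [name for name in COLUMNS if name not in values]
    if missing:
        raise ValueError("missing connection fields: " + ", ".join(missing))
    # Backticks are needed because `schema` is a reserved word in MySQL.
    column_list = ", ".join("`{}`".format(name) for name in COLUMNS)
    placeholders = ", ".join(["%s"] * len(COLUMNS))
    sql = "INSERT INTO airflow.connection ({}) VALUES ({})".format(
        column_list, placeholders
    )
    return sql, tuple(values[name] for name in COLUMNS)


# Same values as the positional statement in step 1 (id is left to auto-increment).
biowardrobe = {
    "conn_id": "biowardrobe",
    "conn_type": "mysql",
    "host": "localhost",
    "schema": "ems",
    "login": "wardrobe",
    "password": "",
    "port": None,
    "extra": '{"cursor":"dictcursor"}',
    "is_encrypted": 0,
    "is_extra_encrypted": 0,
}

sql, params = connection_insert_sql(biowardrobe)
print(sql)
print(params)
```

The `%s` placeholders follow the parameter style of common MySQL drivers for Python, so the pair can be passed to a cursor's `execute(sql, params)` call.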

Requirements

  1. Make sure your system satisfies the following criteria:

    • Ubuntu 16.04.3
      • python3.6
        sudo add-apt-repository ppa:jonathonf/python-3.6
        sudo apt-get update
        sudo apt-get install python3.6
      • pip3
        curl https://bootstrap.pypa.io/get-pip.py | sudo python3.6
        pip3 install --upgrade pip
      • setuptools
        pip3 install setuptools
      • docker
        sudo apt-get update
        sudo apt-get install apt-transport-https ca-certificates curl software-properties-common
        curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
        sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
        sudo apt-get update
        sudo apt-get install docker-ce
        sudo groupadd docker
        sudo usermod -aG docker $USER
        Log out and log back in so that your group membership is re-evaluated.
      • libmysqlclient-dev
        sudo apt-get install libmysqlclient-dev
      • nodejs
        sudo apt-get install nodejs
  2. Get the latest version of cwl-airflow-parser. If Apache Airflow or cwltool is not installed, it will be installed automatically at the recommended version. Set the AIRFLOW_HOME environment variable to the Airflow config directory (default: ~/airflow/).

    git clone https://github.com/datirium/cwl-airflow-parser.git
    cd cwl-airflow-parser
    sudo pip3 install .
  3. If required, install extra Airflow packages to extend its functionality. For instance, to add MySQL support, run pip3 install apache-airflow[mysql].
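
The requirements above can be checked before installing. The following is a minimal sketch, not part of this package: `preflight` and `airflow_home` are hypothetical helper names, and treating any Python at or above 3.6 as acceptable is an assumption (the list itself only names python3.6). It looks for the listed commands on PATH and resolves AIRFLOW_HOME with the documented default.

```python
import os
import shutil
import sys

# Commands installed by the Requirements list. Ubuntu's package ships the
# binary as `nodejs`; accepting `node` as well is an assumption for other setups.
REQUIRED_COMMANDS = {
    "pip3": ("pip3",),
    "docker": ("docker",),
    "nodejs": ("nodejs", "node"),
}


def airflow_home(environ=None):
    """Return AIRFLOW_HOME, falling back to the documented default ~/airflow."""
    environ = os.environ if environ is None else environ
    return environ.get("AIRFLOW_HOME") or os.path.expanduser("~/airflow")


def preflight(version_info=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means OK."""
    version_info = sys.version_info if version_info is None else version_info
    problems = []
    if tuple(version_info[:2]) < (3, 6):
        problems.append("python3.6 or newer is required")
    for label, candidates in REQUIRED_COMMANDS.items():
        # A requirement is met if any of its candidate binaries is on PATH.
        if not any(which(name) for name in candidates):
            problems.append("{} not found on PATH".format(label))
    return problems


if __name__ == "__main__":
    print("AIRFLOW_HOME:", airflow_home())
    for problem in preflight():
        print("problem:", problem)
```

The check does not cover libmysqlclient-dev or docker group membership, which need to be verified separately.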

Running

  1. To create BioWardrobe's DAGs, run biowardrobe-init in Airflow's dags directory:

    cd ~/airflow/dags
    ./biowardrobe-init 
    
  2. Run Airflow scheduler:

    airflow scheduler
  3. Trigger a run with airflow trigger_dag, passing --conf "JSON", where JSON is either a job definition or a biowardrobe_uid, and explicitly specify the dag_id of the CWL descriptor:

    airflow trigger_dag --conf "{\"job\":$(cat ./hg19.job)}" "bowtie-index"

    where hg19.job is:

    {
      "fasta_input_file": {
        "class": "File", 
        "location": "file:///wardrobe/indices/bowtie/hg19/chrM.fa", 
        "format":"http://edamontology.org/format_1929",
        "size": 16909,
        "basename": "chrM.fa",
        "nameroot": "chrM",
        "nameext": ".fa"
      },
      "output_folder": "/wardrobe/indices/bowtie/hg19/",
      "threads": 6,
      "genome": "hg19"
    }
  4. All output is moved from the temporary directory into the directory given by the job's output_folder parameter.
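
The job file and the trigger_dag call in step 3 can also be assembled programmatically. The sketch below is illustrative only: `cwl_file` and `trigger_dag_argv` are hypothetical helpers, not part of this package. It fills in the File fields shown in hg19.job from a local path and builds the argument list for the command, without running it.

```python
import json
import os
import pathlib
import shlex
import sys


def cwl_file(path, fmt=None):
    """Describe a local file with the File fields used in hg19.job."""
    path = os.path.abspath(path)
    basename = os.path.basename(path)
    # splitext splits at the last period, so nameroot + nameext == basename.
    nameroot, nameext = os.path.splitext(basename)
    entry = {
        "class": "File",
        "location": pathlib.Path(path).as_uri(),  # file:///... URI
        "size": os.path.getsize(path),
        "basename": basename,
        "nameroot": nameroot,
        "nameext": nameext,
    }
    if fmt is not None:
        entry["format"] = fmt
    return entry


def trigger_dag_argv(dag_id, job):
    """Build the argument list for `airflow trigger_dag --conf ... dag_id`."""
    conf = json.dumps({"job": job})
    return ["airflow", "trigger_dag", "--conf", conf, dag_id]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python this_script.py /wardrobe/indices/bowtie/hg19/chrM.fa
    job = {
        "fasta_input_file": cwl_file(
            sys.argv[1], fmt="http://edamontology.org/format_1929"
        ),
        "output_folder": "/wardrobe/indices/bowtie/hg19/",
        "threads": 6,
        "genome": "hg19",
    }
    # Print a shell-safe version of the command instead of executing it.
    print(" ".join(shlex.quote(arg) for arg in trigger_dag_argv("bowtie-index", job)))
```

Passing the returned list to `subprocess.run(argv, check=True)` avoids the manual quote escaping needed when the JSON is embedded in a shell string.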
