
cmo's Introduction

For users of the luna cluster at MSKCC CMO

Add this to your ~/.profile to get access to the cmo_* and cmoflow_* tools:

# Set PATH to include MSKCC's bin of tools, if found
if [ -d "/opt/common/CentOS_6-dev/bin/current" ]; then
    PATH="/opt/common/CentOS_6-dev/bin/current:$PATH"
fi

# Set PATH to include MSKCC's bin of python tools, if found
if [ -d "/opt/common/CentOS_6-dev/python/python-2.7.10/bin" ]; then
    PATH="/opt/common/CentOS_6-dev/python/python-2.7.10/bin:$PATH"
fi

Documentation lives here, and running workflows can be tracked here.

For external users

Here is how to install these tools without sudo rights:

curl -LO https://github.com/mskcc/cmo/archive/master.zip
unzip master.zip
cd cmo-master
python setup.py install --user

Add this to your ~/.profile to get access to the cmo_* and cmoflow_* tools:

# Set PATH to include local python bin if found
if [ -d "$HOME/.local/bin" ]; then
    PATH="$HOME/.local/bin:$PATH"
fi

For all other users

Data Yes!

cmo's People

Contributors

alexpenson, allanbolipata, cband, ckandoth, hisplan, ionox0, kpjonsson, lemetrec, lordzappo, md09, nikhil, rhshah, timosong


cmo's Issues

cmo_bwa_mem returns incorrect exit code

When cmo_bwa_mem is called with sam=False (the default), it pipes bwa mem into samtools:

bwa mem fasta fastq | samtools view -bh - > out.bam

The problem is that if bwa fails, the exit code is still 0, when we actually expect 1 (or an exception). Because of this, LSF reports the job as successful.

This behavior can be tested with this simple example:

import subprocess
exit_code = subprocess.check_call("false | true", shell=True)

Again, this returns 0 when expecting 1.

One workaround is to use set -o pipefail, which makes bash report a failure if any command in the pipeline fails.

set -o pipefail; bwa mem fasta fastq | samtools view -bh - > out.bam

Another is spawning two subprocesses and chaining them later.
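A minimal Python sketch of the two workarounds described above, not the actual cmo_bwa_mem code. The helper names run_with_pipefail and run_chained are made up for illustration. Note that shell=True normally runs /bin/sh, which on some systems does not understand pipefail, so this sketch invokes bash explicitly.

```python
import subprocess


def run_with_pipefail(pipeline):
    """Run a shell pipeline under bash with pipefail; return its exit code.

    bash is called explicitly because the default shell used by
    shell=True (/bin/sh) may not support the pipefail option.
    """
    return subprocess.call(["bash", "-o", "pipefail", "-c", pipeline])


def run_chained(first_cmd, second_cmd):
    """Run `first_cmd | second_cmd` as two processes.

    Returns (first_exit_code, second_exit_code) so the caller can
    check each stage separately.
    """
    first = subprocess.Popen(first_cmd, stdout=subprocess.PIPE)
    second = subprocess.Popen(second_cmd, stdin=first.stdout)
    # Close our copy of the pipe so `first` gets SIGPIPE if `second` exits early.
    first.stdout.close()
    second_rc = second.wait()
    first_rc = first.wait()
    return first_rc, second_rc
```

With either helper, the caller can raise an exception or exit non-zero when any stage fails, so that LSF sees the failure.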

StopIteration

Hi !

I am trying to run facets with the following command (which has already been used successfully on very similar samples):

export PATH=$PATH:/opt/common/CentOS_6-dev/python/python-2.7.10/bin/
cmoflow_facets \
    --tumor-bam pathToBam \
    --normal-bam pathToBam \
    --workflow-mode LSF \
    --output-dir `pwd` \
    --purity_cval 300 \
    --cval 100 \
    --force

and I am getting this error :

Found one sample key for this bam: W0000436F
Found one sample key for this bam: W0000438F
Traceback (most recent call last):
  File "/opt/common/CentOS_6-dev/python/python-2.7.10/bin/cmoflow_facets", line 4, in <module>
    __import__('pkg_resources').run_script('cmo===e766ccf-dirty', 'cmoflow_facets')
  File "/opt/common/CentOS_6-dev/python/python-2.7.10/lib/python2.7/site-packages/pkg_resources/__init__.py", line 719, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/opt/common/CentOS_6-dev/python/python-2.7.10/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1483, in run_script
    exec(code, namespace, namespace)
  File "/opt/common/CentOS_6-dev/python/python-2.7.10/lib/python2.7/site-packages/cmo-e766ccf_dirty-py2.7.egg/EGG-INFO/scripts/cmoflow_facets", line 126, in <module>
    (jobs, dependencies, name, terminal_job, count_jobs) = construct_workflow(args.tumor_bam, args.normal_bam, args.tag, facets_args, args.output_dir, snps=args.vcf, tumor_sample = args.tumor_name, normal_sample=args.normal_name, force=args.force)
  File "/opt/common/CentOS_6-dev/python/python-2.7.10/lib/python2.7/site-packages/cmo-e766ccf_dirty-py2.7.egg/EGG-INFO/scripts/cmoflow_facets", line 77, in construct_workflow
    value = next(it)
StopIteration

Not sure what's happening...

Thank you very much for your help!

Marc

ValueError: Could not get next FW id!

I'm trying to run cmoflow_facets:

/opt/common/CentOS_6-dev/python/python-2.7.10/bin/cmoflow_facets

However, it seems that I missed some configuration, and I'm getting this error:

ValueError: Could not get next FW id! If you have not yet initialized the database, please do so by performing a database reset (e.g., lpad reset)

As suggested, I ran:

/opt/common/CentOS_6-dev/python/python-2.7.10/bin/lpad reset

But I kept getting the same error.

Would appreciate any help, this is my first time running a cmo workflow!

Clone this repo at luna.

Hello,

We would like to include some of your pipelines in our workflows, however my team is concerned with the fact that we can't control how the repo is updated. We would like to have some sort of a "sealed" copy that will work always the same way. Wondering if I clone the repo in luna, using my own python 2 installation, would the pipelines still work? I'm not sure if the mongodb settings will still work.

What are your thoughts?

Best!

Unable to install

An attempt to install cmo on a fresh Linux box fails with the following message:

File "/usr/lib/python2.7/site-packages/setuptools/command/easy_install.py", line 230, in finalize_options
'dist_version': self.distribution.get_version(),
File "/tmp/easy_install-DatdNK/python-daemon-2.1.2/version.py", line 656, in get_version
File "/tmp/easy_install-DatdNK/python-daemon-2.1.2/version.py", line 651, in get_version_info
File "/tmp/easy_install-DatdNK/python-daemon-2.1.2/version.py", line 552, in get_changelog_path
File "/usr/lib64/python2.7/posixpath.py", line 129, in dirname
i = p.rfind('/') + 1
AttributeError: 'NoneType' object has no attribute 'rfind'

This seems to have something to do with python-daemon.

Setuptools warning

An attempt to install cmo on a fresh Linux box gives me these warning messages regarding setuptools, pip, and PyPI:

==> default: Cloning into 'cmo'...
==> default: /usr/lib/python2.7/distutils/dist.py:251: UserWarning: 'licence' distribution option is deprecated; use 'license'
==> default: warnings.warn(msg)
==> default: /usr/lib/python2.7/dist-packages/setuptools/dist.py:294: UserWarning: The version specified ('4515eb7') is an invalid version, this may not work as expected with newer versions of setuptools, pip, and PyPI. Please see PEP 440 for more details
Workaround

Minor fixes needed for maf2maf/vcf2maf wrappers

  • They make temporary folders in /scratch even if the user just wants to run --help
  • cmo_vcf2maf --vcf-tumor-id defaults to TUMOR instead of the value passed into --tumor-id
  • Separate the vep path from the vep cache to make it easier to containerize the script while keeping the cache on a network drive
  • 'tmp_dir' is not defined error shows up after a successful run of cmo_maf2maf.

argparse does not allow multicharacter short args that overlap a shorter argument, e.g. "-R" and "-RMQT" conflict

When one argument's name is a prefix of another's (for example -R and -RMQF), cmo_gatk throws an error.

Example:
When running a command like:
cmo_gatk -T SplitNCigarReads -R GRCh37 -I dedupped.bam -o split.bam -rf ReassignOneMappingQuality -RMQF 255 -RMQT 60 -U ALLOW_N_CIGAR_READS

I get an error:
cmo_gatk: error: argument -R/--reference_sequence: invalid choice: 'MQF' (choose from u'GRCm38', u'ncbi36', u'mm9', u'GRCh37', u'GRCh38', u'hg18', u'hg19', u'mm10')
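The following sketch shows one way argparse can produce this kind of message. It assumes, without having checked the cmo_gatk source, that -RMQF and -RMQT are not registered as options of their own. In that case argparse reads "-RMQF" as "-R" with the attached value "MQF". The parser below is a stand-in with a shortened choices list, and RaisingParser is a hypothetical helper that turns the usual exit into an exception.

```python
import argparse


class RaisingParser(argparse.ArgumentParser):
    """ArgumentParser that raises instead of exiting, for demonstration."""

    def error(self, message):
        raise ValueError(message)


def build_parser(register_rmq_flags):
    parser = RaisingParser(prog="demo_gatk")
    parser.add_argument("-R", "--reference_sequence",
                        choices=["GRCh37", "GRCh38"])
    if register_rmq_flags:
        # An exact match on a registered option string takes precedence
        # over treating "-RMQF" as "-R" plus the value "MQF".
        parser.add_argument("-RMQF", type=int)
        parser.add_argument("-RMQT", type=int)
    return parser


argv = ["-R", "GRCh37", "-RMQF", "255", "-RMQT", "60"]

# Without the extra flags registered, "-RMQF" is parsed as -R=MQF.
try:
    build_parser(False).parse_known_args(argv)
    failure = None
except ValueError as exc:
    failure = str(exc)

# With the flags registered explicitly, parsing succeeds.
known, extra = build_parser(True).parse_known_args(argv)
```

If the wrapper forwards unknown arguments to GATK unchanged, registering the multi-character short flags explicitly is one possible way to avoid the clash.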

cmo_gdc tool bug in creating log folder

Here's a sample command for Luna that tries to download a small 100MB miRNA-Seq BAM:

cmo_gdc --experimental_strategy miRNA-Seq --data_type AlignedReads --output_file test.tsv --token_file /ifs/tcga/gdc/token/gdc-user-token.taylorb.txt --participant_id TCGA-44-2656

The gdc-client script works fine on its own for downloading this file. The issue is this exception:

Exception caught: <type 'exceptions.IOError'>, line number: 820
Check logs in: /ifs/tcga/gdc/jobs/job.14301562.20170503-144945/a233689d-00c6-4eeb-ad82-5122317f42e6
Continuing ...

For some reason, the mkdir_p function in cmo_gdc is unable to create the subfolder named a233689d-00c6-4eeb-ad82-5122317f42e6 (based on the GDC file ID), and errors out when it tries to open it for reading (to parse for a line that says the job was "successful").
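As an illustration of the failure mode described above, here is a sketch of a defensive mkdir_p together with a log check that does not assume the folder exists. This is not the cmo_gdc implementation. The function job_succeeded, the log file name download.log, and the marker string are all assumptions made for this example.

```python
import errno
import os
import shutil
import tempfile


def mkdir_p(path):
    """Create path and any missing parents; do nothing if it already exists."""
    try:
        os.makedirs(path)
    except OSError as exc:
        # Re-raise anything other than "directory already exists".
        if exc.errno != errno.EEXIST or not os.path.isdir(path):
            raise


def job_succeeded(log_dir, log_name="download.log", marker="successful"):
    """Return True if the (hypothetical) log file contains the marker."""
    log_path = os.path.join(log_dir, log_name)
    if not os.path.isfile(log_path):
        return False  # missing folder or log is reported, not raised
    with open(log_path) as handle:
        return any(marker in line for line in handle)


# Small demonstration in a throwaway directory.
base = tempfile.mkdtemp()
job_dir = os.path.join(base, "job.example", "file-id-example")
before = job_succeeded(job_dir)   # folder does not exist yet
mkdir_p(job_dir)
mkdir_p(job_dir)                  # second call is a no-op
with open(os.path.join(job_dir, "download.log"), "w") as handle:
    handle.write("download successful\n")
after = job_succeeded(job_dir)
shutil.rmtree(base)
```

Letting mkdir_p raise on real errors such as permission problems, instead of swallowing them, would make the underlying cause visible rather than surfacing later as an IOError.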

Outdated cmo_resources.json

There are some discrepancies between the cmo_resources.json file in the git repo vs. the one being used in the cluster.

  • Git Repo: https://github.com/mskcc/cmo/blob/master/cmo/data/cmo_resources.json
  • Cluster: /opt/common/CentOS_6-dev/cmo/cmo_resources.json

For instance, the one in the git repo has no entry for Trim Galore and is missing versions of some other tools.

As a result, running cmo_trimgalore on a fresh Linux box fails with a KeyError:

Traceback (most recent call last):
  File "/usr/local/bin/cmo_trimgalore", line 4, in <module>
    __import__('pkg_resources').run_script('cmo===4515eb7', 'cmo_trimgalore')
  File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 719, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 1504, in run_script
    exec(code, namespace, namespace)
  File "/usr/local/lib/python2.7/dist-packages/cmo-4515eb7-py2.7.egg/EGG-INFO/scripts/cmo_trimgalore", line 35, in <module>
    verparser.add_argument("--version", choices=cmo.util.programs['trimgalore'].keys(), default="default")
KeyError: 'trimgalore'
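A small sketch of how the two resource files could be compared to list such gaps. It assumes a tool -> version -> path layout, which is only inferred from the programs['trimgalore'].keys() call in the traceback. The real layout of cmo_resources.json may differ. The sample data, version label, and paths are made up.

```python
import json


def missing_entries(reference, candidate):
    """Return {tool: [versions]} found in reference but absent from candidate."""
    missing = {}
    for tool, versions in reference.items():
        absent = sorted(set(versions) - set(candidate.get(tool, {})))
        if absent:
            missing[tool] = absent
    return missing


# Made-up stand-ins for the cluster copy and the git repo copy.
cluster = json.loads(
    '{"trimgalore": {"default": "/path/a", "v1": "/path/b"},'
    ' "bwa": {"default": "/path/c"}}'
)
repo = json.loads('{"bwa": {"default": "/path/c"}}')

gaps = missing_entries(cluster, repo)
```

In practice the two dictionaries would come from json.load on the files listed above, after selecting whichever section of the JSON holds the program table.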

cmo_maf2maf spits out core dumps

Just ran cmo_maf2maf --input-maf big.maf --output-maf big.vep.maf --custom-enst /opt/common/CentOS_6-dev/vcf2maf/develop/data/isoform_overrides_at_mskcc --vep-forks 20. I got an output file that looks all right, but my directory was also filled up with 2222 files along the lines of core.30920.
