
ecogenomics / checkm

Assess the quality of microbial genomes recovered from isolates, single cells, and metagenomes

Home Page: https://ecogenomics.github.io/CheckM/

License: GNU General Public License v3.0

Python 100.00%

checkm's People

Contributors

alienzj, bernt-matthias, ctb, ctskennerton, donovan-h-parks, finesim97, hunter-cameron, jjacobson95, maxibor, misialq, sjaenick, wwood

checkm's Issues

Prodigal output parsing problem

Hi, the prodigal output format seems to have changed between versions 2.60 and 2.61/2.62. The new output can't be parsed.

Example of old output line:

# Model Data: version=Prodigal.v2.60;run_type=Single;model="Ab initio";gc_cont=35.88;transl_table=11;uses_sd=1

New output line:

# Model Data: gc_cont=35.88;transl_table=4;uses_sd=1

The error is:

  File "/apps/python/lib/python2.7/site-packages/checkm/prodigal.py", line 178, in __parseGFF
    self.translationTable = line.split(';')[4]
IndexError: list index out of range
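The IndexError comes from reading the translation table by position (`split(';')[4]`), which only works while the header has at least five fields. A minimal sketch of parsing the same line by key instead, so that both header layouts quoted above are handled; `parse_model_data` is a hypothetical helper name, not part of CheckM:

```python
def parse_model_data(line):
    """Parse a Prodigal '# Model Data:' header line into a dict of strings."""
    prefix = "# Model Data:"
    if not line.startswith(prefix):
        raise ValueError("not a Model Data line: %r" % line)
    fields = {}
    for item in line[len(prefix):].strip().split(";"):
        if "=" in item:
            # Split on the first '=' only, and drop optional double quotes.
            key, value = item.split("=", 1)
            fields[key.strip()] = value.strip().strip('"')
    return fields


# The two header lines quoted in the report above.
old_line = ('# Model Data: version=Prodigal.v2.60;run_type=Single;'
            'model="Ab initio";gc_cont=35.88;transl_table=11;uses_sd=1')
new_line = "# Model Data: gc_cont=35.88;transl_table=4;uses_sd=1"

print(parse_model_data(old_line)["transl_table"])  # 11
print(parse_model_data(new_line)["transl_table"])  # 4
```

Looking the value up by its `transl_table` key does not depend on how many fields precede it, so it should survive further reordering of the header.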

Also, another small request: In the file "checkm", please change the line

versionFile = open(os.path.join(binDir, '..', 'checkm', 'VERSION'))

to

import checkm; versionFile = open(os.path.join(checkm.__path__[0], 'VERSION'))

or something similar, this would be more stable.
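One way to express the same idea is to resolve the file relative to the imported package rather than relative to the script's bin directory. A small sketch, using the standard-library `json` package as a stand-in for `checkm` so that it runs anywhere; `package_file` is a hypothetical helper name:

```python
import json  # stand-in for `import checkm` in this sketch
import os


def package_file(module, filename):
    """Return the path of `filename` inside the directory of an imported module."""
    package_dir = os.path.dirname(os.path.abspath(module.__file__))
    return os.path.join(package_dir, filename)


# With CheckM installed, the call would presumably be package_file(checkm, 'VERSION').
print(package_file(json, "VERSION"))
```

This only builds the path; whether a VERSION file actually exists at that location depends on how the package was installed.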

"checkm tetra" fails on lowercase sequence data

Depending on the assembler and primary sequence data, contigs can contain lowercase letters. This leads to 'nan' values in the output of the tetra command. The workaround is to convert the sequence data to upper case beforehand. However, this should be very easy to handle in Python.

missing_duplicate_genes_50.tsv

Hey Donovan, Mike,

it was good meeting both of you at ISME a few weeks ago.
I've tried installing and running CheckM, but ran into an issue when running 'lineage_set'. The treeParser.py script seems to be looking for a file called 'missing_duplicate_genes_50.tsv' (line 385) in the genome_tree/ folder of the data directory.

I can't find that file, neither locally nor on https://data.ace.uq.au/public/CheckM_databases/

hope you can fix that (and that I'm not overlooking something silly here)
Thanks!
Daan Speth

unknown command data

Following the installation instructions here: https://github.com/Ecogenomics/CheckM/wiki/Installation

After installing version 0.4 using pip, I ran 'checkm data update', and got the following error:

unknown command: data

The checkm help also does not list a 'data' command:

checkm
checkm [options] command filenames

Commands:

checkm write [checkm filename (default:checkm.txt) [filepath (default='.')]]
- writes a checkm manifest file to disc for the files in the given filepath.
Use -r to include all files under a given path in a single manifest.

checkm print [filepath (default='.')]
- As for 'write', but will print the manifest to the screen.

checkm multi [checkm filename (default:checkm.txt) [filepath (default='.')]]
- writes a checkm manifest file to disc for the files in the given filepath, recursively creating a manifest file within each subdirectory and using the '@' designation in the parent checkm files above it.

checkm check [checkm filename (default:checkm.txt)]
- checks the given checkm manifest against the files on disc.
Use -m to recursively scan through any multilevel checkm files it finds in this manifest as well.

checkm remove_multi [checkm filename (default:checkm.txt)]
- scans through the checkm file, recursively gathering a list of all included checkm manifests, returning the list of files.
Use the option '-f' or '--force' to cause the tool to try to delete these checkm files.

checkm tree_qa throws AttributeError

Hi,

I am using CheckM version 1.0.3. I installed everything yesterday and ran "checkm test"; no issues came up.

Now, I got an error when analyzing my own genome bins, as explained below:

command 1: checkm tree -t 20 ./ ./checkm_tree -x .fa

completed with no issues

command 2: checkm tree_qa -f ./checkm_tree.txt ./checkm_tree

error


[CheckM - tree_qa] Assessing phylogenetic markers found in each bin.


Reading HMM info from file.
Parsing HMM hits to marker genes:
Finished parsing hits for 3600 of 3600 (100.00%) bins.

Unexpected error: <type 'exceptions.AttributeError'>
Traceback (most recent call last):
File "/srv/sw/checkm/1.0.3/bin/checkm", line 709, in <module>
checkmParser.parseOptions(args)
File "/srv/sw/checkm/1.0.3/lib/python2.7/site-packages/checkm/main.py", line 1197, in parseOptions
self.treeQA(options)
File "/srv/sw/checkm/1.0.3/lib/python2.7/site-packages/checkm/main.py", line 177, in treeQA
treeParser.printSummary(options.out_format, options.tree_folder, RP, options.bTabTable, options.file, binStats)
File "/srv/sw/checkm/1.0.3/lib/python2.7/site-packages/checkm/treeParser.py", line 45, in printSummary
self.reportBinTaxonomy(outDir, resultsParser, bTabTable, outFile, binStats, bLineageStatistics=False)
File "/srv/sw/checkm/1.0.3/lib/python2.7/site-packages/checkm/treeParser.py", line 633, in reportBinTaxonomy
binIdToTaxonomy = self.getBinTaxonomy(outDir, binIds)
File "/srv/sw/checkm/1.0.3/lib/python2.7/site-packages/checkm/treeParser.py", line 216, in getBinTaxonomy
domainNode = self.__findDomainNode(node)
File "/srv/sw/checkm/1.0.3/lib/python2.7/site-packages/checkm/treeParser.py", line 237, in __findDomainNode
curNode = curNode.parent_node

AttributeError: 'NoneType' object has no attribute 'parent_node'

Would it be possible to clarify? Maybe I am running the commands incorrectly or my installation is faulty ...

Thank you very much in advance for your help.

Dieter.
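The traceback ends at `curNode = curNode.parent_node` with `curNode` being None, which suggests the upward walk ran past the root of the tree without finding a domain node. A sketch of a guarded version of such a walk follows; the `Node` class, the function name and the domain labels are all hypothetical stand-ins, not CheckM's or its tree library's actual API.

```python
class Node(object):
    """Minimal stand-in for a tree node with a label and a parent pointer."""

    def __init__(self, label=None, parent_node=None):
        self.label = label
        self.parent_node = parent_node


def find_domain_node(node, domain_labels=("d__Bacteria", "d__Archaea")):
    """Walk towards the root; return the first node labelled as a domain, or None."""
    cur = node
    while cur is not None:
        if cur.label in domain_labels:
            return cur
        cur = cur.parent_node
    # Reached the root without finding a domain label.
    return None


root = Node("d__Bacteria")
leaf = Node("bin_1", Node("internal", root))
orphan = Node("bin_2")

print(find_domain_node(leaf).label)  # d__Bacteria
print(find_domain_node(orphan))      # None
```

Returning None leaves it to the caller to report which bin could not be assigned to a domain, instead of failing with an AttributeError.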

CheckM installation check error

Hi,
I don't have admin rights on our server, so our IT administrator installed CheckM for us. The command options show up fine when I type checkm, but when I ran the installation check (the test command) I got the following error. It doesn't make much sense to me; my guess is that Python 2.6 is one issue, and I have already requested an upgrade. Could you have a look at the error message and let me know whether anything else is missing too (e.g. Error: Unrecognized format, trying to open hmm file /tmp/be078517-7f20-4792-9116-a243633e5715 for reading), so that I can ask our admin to install it?

Command used : checkm test ~/checkm_test_results


[CheckM - Test] Processing E.coli K12-W3310 to verify operation of CheckM.


[Step 1]: Verifying tree command.


[CheckM - tree] Placing bins in reference genome tree.


Identifying marker genes in 1 bins with 1 threads:
Finished processing 0 of 1 (0.00%) bins.
Error: Unrecognized format, trying to open hmm file /tmp/be078517-7f20-4792-9116-a243633e5715 for reading.

Finished processing 1 of 1 (100.00%) bins.

Saving HMM info to file.

Unexpected error: <type 'exceptions.AttributeError'>
Traceback (most recent call last):
File "/usr/bin/checkm", line 717, in <module>
checkmParser.parseOptions(args)
File "/usr/lib/python2.6/site-packages/checkm/main.py", line 1267, in parseOptions
self.test(options)
File "/usr/lib/python2.6/site-packages/checkm/main.py", line 1156, in test
verifyEcoli.run(self, options.output_dir)
File "/usr/lib/python2.6/site-packages/checkm/test/test_ecoli.py", line 64, in run
parser.tree(options)
File "/usr/lib/python2.6/site-packages/checkm/main.py", line 129, in tree
markerSetParser.writeBinModels(binIdToModels, hmmModelInfoFile)
File "/usr/lib/python2.6/site-packages/checkm/markerSets.py", line 471, in writeBinModels
with gzip.open(filename, 'wb') as output:
AttributeError: GzipFile instance has no attribute '__exit__'

Thanks,
Neha
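The last frame of the traceback is a `with gzip.open(...)` statement running under Python 2.6. GzipFile only gained support for the `with` statement in Python 2.7, so on 2.6 the context-manager method is missing. Upgrading Python, as already requested, addresses this; for code that has to run on both, a sketch of the usual workaround with `contextlib.closing` (file name and payload are made up for the example):

```python
import contextlib
import gzip
import os
import tempfile


def write_gzip(filename, data):
    """Write bytes to a gzip file without relying on GzipFile being a context manager."""
    with contextlib.closing(gzip.open(filename, "wb")) as output:
        output.write(data)


def read_gzip(filename):
    """Read all bytes back from a gzip file, again via contextlib.closing."""
    with contextlib.closing(gzip.open(filename, "rb")) as handle:
        return handle.read()


tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "example.gz")
write_gzip(path, b"example payload")
print(read_gzip(path))
```

Note that this concerns only the final AttributeError. The earlier "Unrecognized format, trying to open hmm file" message is printed before it and may have a separate cause.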

Is there any way to remove contamination?

Hi ~

Since CheckM detects contamination as multiple copies of marker genes, is there any way to locate and remove these copies in the contig/scaffold files?

Best
Yifan

AttributeError when using checkm coverage

When I run "checkm coverage" I keep getting AttributeErrors. An output file is produced, but the coverage of every contig in every bin is given as zero.
I am using sorted and indexed BAM files (mappings done with bowtie, then converted to BAM format, sorted and indexed using samtools).

This is the error I run into:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python2.7/dist-packages/checkm/coverage.py", line 208, in __workerThread
    elif read.is_secondary or read.is_supplementary:
AttributeError: 'pysam.calignmentfile.AlignedSegment' object has no attribute 'is_supplementary'

I am using CheckM v1.0.4
the command I ran is:
checkm coverage -x fa -t 8 ./ checkm_coverage.out input/myinput_reads.BAM

Edit: my pysam version is 0.8.2.1 (in case that may be part of the problem)
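The failing line tests `read.is_supplementary`, which the installed pysam object evidently does not provide. Apart from changing the pysam version, the same information can be read from the SAM flag field, where bit 0x100 marks secondary and bit 0x800 marks supplementary alignments. A sketch under the assumption that the read object exposes an integer `flag` attribute; `FakeRead` is a made-up stand-in so the example runs without pysam:

```python
SECONDARY = 0x100      # SAM flag bit: secondary alignment
SUPPLEMENTARY = 0x800  # SAM flag bit: supplementary alignment


def is_secondary_or_supplementary(read):
    """Check the raw SAM flag instead of version-dependent convenience properties."""
    return bool(read.flag & (SECONDARY | SUPPLEMENTARY))


class FakeRead(object):
    """Stand-in for an alignment record; only the `flag` attribute is modelled."""

    def __init__(self, flag):
        self.flag = flag


print(is_secondary_or_supplementary(FakeRead(0)))      # False
print(is_secondary_or_supplementary(FakeRead(0x800)))  # True
```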

checkm lineage_wf memory usage

Hi,
I recently started using CheckM to evaluate the bins from my metagenomic data, but I'm getting stuck in the first step of lineage_wf because of the amount of memory the program uses. I even tried the -r option, but the job still exceeds the 80 GB of memory I requested instead of staying within the ~14 GB stated on the Quick Start page of the CheckM wiki.

Any solution for this? Thank you

My bins folder contains 507 bins

My command:
checkm lineage_wf -r -t 8 -x fa ./bins ./outdir

Log:


[CheckM - tree] Placing bins in reference genome tree.


Identifying marker genes in 507 bins with 8 threads:
Finished processing 0 of 507 (0.00%) bins.^M [.....] Finished processing 507 of 507 (100.00%) bins.^M
Saving HMM info to file.

Calculating genome statistics for 507 bins with 8 threads:
Finished processing 0 of 507 (0.00%) bins.^M [.....] Finished processing 507 of 507 (100.00%) bins.^M

Extracting marker genes to align.
Parsing HMM hits to marker genes:
Finished parsing hits for 1 of 507 (0.20%) bins.^M [.....] Finished processing 507 of 507 (100.00%) bins.^M$
Extracting 43 HMMs with 8 threads:
Finished extracting 0 of 43 (0.00%) HMMs.^M [.....] Finished processing 507 of 507 (100.00%) bins.^M
Aligning 43 marker genes with 8 threads:
Finished aligning 0 of 43 (0.00%) marker genes.^M [.....] Finished processing 507 of 507 (100.00%) bins.^M

Reading marker alignment files.
Concatenating alignments.
Placing 507 bins into the genome tree with pplacer (be patient).
=>> PBS: job killed: vmem 135223554048 exceeded limit 85899345920
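For reference, the two byte counts in the PBS message can be converted to GiB with a few lines of arithmetic; the helper name is arbitrary:

```python
GIB = 1024 ** 3  # bytes per gibibyte


def bytes_to_gib(n_bytes):
    """Convert a byte count to GiB as a float."""
    return n_bytes / float(GIB)


limit = 85899345920   # "exceeded limit" value from the PBS message
used = 135223554048   # "vmem" value from the PBS message

print("limit: %.1f GiB" % bytes_to_gib(limit))  # limit: 80.0 GiB
print("used:  %.1f GiB" % bytes_to_gib(used))   # used:  125.9 GiB
```

So the job was killed at roughly 126 GiB of virtual memory against an 80 GiB limit, during the pplacer step.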

problem with pfam.hmm formatting?

I think I'm having an issue with the Pfam format of the "phylo.hmm" file that was created by

checkm data update
However, it could be something else. As a test, I ran the lineage_wf command on the E. coli genome you supply in test_data. Please see the errors below. Thanks.

what I did to install and test checkm lineage_wf

installing with pip

pip install numpy
pip install checkm-genome
./bin/checkm data update
I typed "DATA" to save data in

I added prodigal, hmmer, and pplacer to PATH

test installation with test data (ecoli)

./bin/checkm lineage_wf DATA/test_data/ Results

The output was


[CheckM - tree] Placing bins in reference genome tree.


Identifying marker genes in 1 bins with 1 threads:
Finished processing 0 of 1 (0.00%) bins.

Error: Unrecognized format, trying to open hmm file /global/projectb/scratch/jfroula/965c9772-6a8a-48d3-9b67-a5a541d84085 for reading.

Finished processing 1 of 1 (100.00%) bins.

Saving HMM info to file.

Calculating genome statistics for 1 bins with 1 threads:
Finished processing 0 of 1 (0.00%) bins.
Process Process-4:
Traceback (most recent call last):
File "/usr/common/usg/languages/python/2.7.4/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/common/usg/languages/python/2.7.4/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/global/scratch2/sd/jfroula/CheckM/git/CheckM/checkm/binStatistics.py", line 113, in __processBin
codingDensity, translationTable, numORFs = self.calculateCodingDensity(binDir, genomeSize, scaffoldStats)
File "/global/scratch2/sd/jfroula/CheckM/git/CheckM/checkm/binStatistics.py", line 229, in calculateCodingDensity
codingBasePairs = self.__calculateCodingBases(aaGenes, seqStats)
File "/global/scratch2/sd/jfroula/CheckM/git/CheckM/checkm/binStatistics.py", line 240, in __calculateCodingBases
seqStats[scaffoldId]['# ORFs'] = seqStats[scaffoldId].get('# ORFs', 0) + 1
KeyError: 'AC_000091_1'

Extracting marker genes to align.
Parsing HMM hits to marker genes:
Finished parsing hits for 1 of 1 (100.00%) bins.
[Errno 2] No such file or directory: 'Results/bins/637000110/hmmer.tree.txt'

Extracting 43 HMMs with 1 threads:
Finished extracting 0 of 43 (0.00%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 1 of 43 (2.33%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 2 of 43 (4.65%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 3 of 43 (6.98%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 4 of 43 (9.30%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 5 of 43 (11.63%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 6 of 43 (13.95%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 7 of 43 (16.28%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 8 of 43 (18.60%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 9 of 43 (20.93%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 10 of 43 (23.26%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 11 of 43 (25.58%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 12 of 43 (27.91%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 13 of 43 (30.23%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 14 of 43 (32.56%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 15 of 43 (34.88%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 16 of 43 (37.21%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 17 of 43 (39.53%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 18 of 43 (41.86%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 19 of 43 (44.19%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 20 of 43 (46.51%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 21 of 43 (48.84%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 22 of 43 (51.16%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 23 of 43 (53.49%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 24 of 43 (55.81%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 25 of 43 (58.14%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 26 of 43 (60.47%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 27 of 43 (62.79%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 28 of 43 (65.12%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 29 of 43 (67.44%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 30 of 43 (69.77%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 31 of 43 (72.09%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 32 of 43 (74.42%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 33 of 43 (76.74%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 34 of 43 (79.07%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 35 of 43 (81.40%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 36 of 43 (83.72%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 37 of 43 (86.05%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 38 of 43 (88.37%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 39 of 43 (90.70%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 40 of 43 (93.02%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 41 of 43 (95.35%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 42 of 43 (97.67%) HMMs.

Error: File /global/scratch2/sd/jfroula/CheckM/git/CheckM/DATA/hmms/phylo.hmm does not appear to be in a recognized HMM format.

Finished extracting 43 of 43 (100.00%) HMMs.

Aligning 43 marker genes with 1 threads:
Finished aligning 0 of 43 (0.00%) marker genes.
Finished aligning 1 of 43 (2.33%) marker genes.
Finished aligning 2 of 43 (4.65%) marker genes.
Finished aligning 3 of 43 (6.98%) marker genes.
Finished aligning 4 of 43 (9.30%) marker genes.
Finished aligning 5 of 43 (11.63%) marker genes.
Finished aligning 6 of 43 (13.95%) marker genes.
Finished aligning 7 of 43 (16.28%) marker genes.
Finished aligning 8 of 43 (18.60%) marker genes.
Finished aligning 9 of 43 (20.93%) marker genes.
Finished aligning 10 of 43 (23.26%) marker genes.
Finished aligning 11 of 43 (25.58%) marker genes.
Finished aligning 12 of 43 (27.91%) marker genes.
Finished aligning 13 of 43 (30.23%) marker genes.
Finished aligning 14 of 43 (32.56%) marker genes.
Finished aligning 15 of 43 (34.88%) marker genes.
Finished aligning 16 of 43 (37.21%) marker genes.
Finished aligning 17 of 43 (39.53%) marker genes.
Finished aligning 18 of 43 (41.86%) marker genes.
Finished aligning 19 of 43 (44.19%) marker genes.
Finished aligning 20 of 43 (46.51%) marker genes.
Finished aligning 21 of 43 (48.84%) marker genes.
Finished aligning 22 of 43 (51.16%) marker genes.
Finished aligning 23 of 43 (53.49%) marker genes.
Finished aligning 24 of 43 (55.81%) marker genes.
Finished aligning 25 of 43 (58.14%) marker genes.
Finished aligning 26 of 43 (60.47%) marker genes.
Finished aligning 27 of 43 (62.79%) marker genes.
Finished aligning 28 of 43 (65.12%) marker genes.
Finished aligning 29 of 43 (67.44%) marker genes.
Finished aligning 30 of 43 (69.77%) marker genes.
Finished aligning 31 of 43 (72.09%) marker genes.
Finished aligning 32 of 43 (74.42%) marker genes.
Finished aligning 33 of 43 (76.74%) marker genes.
Finished aligning 34 of 43 (79.07%) marker genes.
Finished aligning 35 of 43 (81.40%) marker genes.
Finished aligning 36 of 43 (83.72%) marker genes.
Finished aligning 37 of 43 (86.05%) marker genes.
Finished aligning 38 of 43 (88.37%) marker genes.
Finished aligning 39 of 43 (90.70%) marker genes.
Finished aligning 40 of 43 (93.02%) marker genes.
Finished aligning 41 of 43 (95.35%) marker genes.
Finished aligning 42 of 43 (97.67%) marker genes.
Finished aligning 43 of 43 (100.00%) marker genes.

Reading marker alignment files.
Concatenating alignments.
Placing 1 bins into the genome tree with pplacer (be patient).
Uncaught exception: Sys_error("Results/storage/tree/concatenated.pplacer.json: No such file or directory")
Fatal error: exception Sys_error("Results/storage/tree/concatenated.pplacer.json: No such file or directory")

{ Current stage: 0:00:28.930 || Total: 0:00:28.930 }


[CheckM - lineage_set] Inferring lineage-specific marker sets.


Reading HMM info from file.
Parsing HMM hits to marker genes:
Finished parsing hits for 1 of 1 (100.00%) bins.

Unexpected error: <type 'exceptions.KeyError'>
Traceback (most recent call last):
File "./bin/checkm", line 646, in <module>
checkmParser.parseOptions(args)
File "/global/scratch2/sd/jfroula/CheckM/git/CheckM/checkm/main.py", line 1159, in parseOptions
self.lineageSet(options)
File "/global/scratch2/sd/jfroula/CheckM/git/CheckM/checkm/main.py", line 209, in lineageSet
DefaultValues.HMMER_TABLE_PHYLO_OUT)
File "/global/scratch2/sd/jfroula/CheckM/git/CheckM/checkm/resultsParser.py", line 64, in analyseResults
self.parseBinHits(outDir, hmmTableFile, bSkipOrfCorrection, bIgnoreThresholds, evalueThreshold, lengthThreshold, binStats, seqStats)
File "/global/scratch2/sd/jfroula/CheckM/git/CheckM/checkm/resultsParser.py", line 97, in parseBinHits
resultsManager = ResultsManager(binId, self.models[binId], bIgnoreThresholds, evalueThreshold, lengthThreshold, binStats[binId], seqStats[binId])
KeyError: '637000110'

CheckM crashes while doing "[CheckM - analyze] Identifying marker genes in bins" and "[CheckM - qa] Tabulating genome statistics"

I analyzed a few genomes without any problem.
I then moved to a bigger dataset (48 genomes) and the program crashed at the above-mentioned steps. I guess there is some sort of problem with the IDs of my sequences, but I cannot figure out what.

checkm taxonomy_wf family Bacillaceae -x faa comparative/ comparative_checkm

*******************************************************************************
 [CheckM - taxon_set] Generate taxonomic-specific marker set.
*******************************************************************************

  Marker set for Bacillaceae contains 418 marker genes arranged in 155 sets.
    Marker set inferred from 162 reference genomes.
  Marker set for Bacillales contains 319 marker genes arranged in 134 sets.
    Marker set inferred from 331 reference genomes.
  Marker set for Bacilli contains 250 marker genes arranged in 136 sets.
    Marker set inferred from 821 reference genomes.
  Marker set for Firmicutes contains 172 marker genes arranged in 99 sets.
    Marker set inferred from 1349 reference genomes.
  Marker set for Bacteria contains 104 marker genes arranged in 58 sets.
    Marker set inferred from 5449 reference genomes.

  Marker set written to: comparative_checkm/Bacillaceae.ms

  { Current stage: 0:00:04.462 || Total: 0:00:04.462 }

*******************************************************************************
 [CheckM - analyze] Identifying marker genes in bins.
*******************************************************************************

  Identifying marker genes in 48 bins with 1 threads:
    Finished processing 48 of 48 (100.00%) bins.
  Saving HMM info to file.

  { Current stage: 0:19:46.139 || Total: 0:19:50.602 }

  Parsing HMM hits to marker genes:
    Finished parsing hits for 48 of 48 (100.00%) bins.
  Aligning marker genes with multiple hits in a single bin:
    Finished processing 48 of 48 (100.00%) bins.

  { Current stage: 0:00:02.475 || Total: 0:19:53.077 }

  Calculating genome statistics for 48 bins with 1 threads:
    Finished processing 0 of 48 (0.00%) bins.
Process Process-6:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python2.7/dist-packages/checkm/binStatistics.py", line 114, in __processBin
    codingDensity, translationTable, numORFs = self.calculateCodingDensity(binDir, genomeSize, scaffoldStats)
  File "/usr/local/lib/python2.7/dist-packages/checkm/binStatistics.py", line 232, in calculateCodingDensity
    codingBasePairs = self.__calculateCodingBases(aaGenes, seqStats)
  File "/usr/local/lib/python2.7/dist-packages/checkm/binStatistics.py", line 247, in __calculateCodingBases
    seqStats[scaffoldId]['# ORFs'] = seqStats[scaffoldId].get('# ORFs', 0) + 1
KeyError: 'Prodigal_Seq_432'


  { Current stage: 0:00:00.263 || Total: 0:19:53.341 }

*******************************************************************************
 [CheckM - qa] Tabulating genome statistics.
*******************************************************************************

  Calculating AAI between multi-copy marker genes.

  Reading HMM info from file.
  Parsing HMM hits to marker genes:
    Finished parsing hits for 1 of 48 (2.08%) bins.
Unexpected error: <type 'exceptions.KeyError'>
Traceback (most recent call last):
  File "/usr/local/bin/checkm", line 717, in <module>
    checkmParser.parseOptions(args)
  File "/usr/local/lib/python2.7/dist-packages/checkm/main.py", line 1214, in parseOptions
    self.qa(options)
  File "/usr/local/lib/python2.7/dist-packages/checkm/main.py", line 393, in qa
    bSkipOrfCorrection=options.bSkipOrfCorrection
  File "/usr/local/lib/python2.7/dist-packages/checkm/resultsParser.py", line 65, in analyseResults
    self.parseBinHits(outDir, hmmTableFile, bSkipOrfCorrection, bIgnoreThresholds, evalueThreshold, lengthThreshold, binStats, seqStats)
  File "/usr/local/lib/python2.7/dist-packages/checkm/resultsParser.py", line 98, in parseBinHits
    resultsManager = ResultsManager(binId, self.models[binId], bIgnoreThresholds, evalueThreshold, lengthThreshold, binStats[binId], seqStats[binId])
KeyError: 'NC017196

Improve detection of nucleotide and protein input sequences

Currently, CheckM will run to completion on both nucleotide and protein sequences with or without the --genes flag. This is confusing, as a nucleotide sequence erroneously run with the --genes flag will simply report that no marker genes were found. It would be far better to verify the input type first and warn users.
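One possible heuristic for such a check is to sample the residues in the input and test what fraction of them are nucleotide letters. The sketch below is an illustration only: the function names, the 0.9 threshold and the sample size are assumptions, not anything CheckM is known to use.

```python
def read_fasta_sequences(lines):
    """Yield sequence strings from an iterable of FASTA lines."""
    chunks = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if chunks:
                yield "".join(chunks)
            chunks = []
        elif line:
            chunks.append(line)
    if chunks:
        yield "".join(chunks)


def looks_like_nucleotide(lines, threshold=0.9, max_residues=10000):
    """Guess the input type from the share of A/C/G/T/U/N among sampled letters."""
    sample = []
    for seq in read_fasta_sequences(lines):
        letters = [c for c in seq.upper() if c.isalpha()]
        sample.extend(letters[:max_residues - len(sample)])
        if len(sample) >= max_residues:
            break
    if not sample:
        raise ValueError("no sequence data found")
    nucleotide = sum(1 for c in sample if c in "ACGTUN")
    return nucleotide / float(len(sample)) >= threshold


dna = [">contig_1", "ACGTACGTNNacgt"]
protein = [">gene_1", "MKVLAAGIVGLLLAQPSFA"]
print(looks_like_nucleotide(dna))      # True
print(looks_like_nucleotide(protein))  # False
```

A caller could compare this guess with the presence of the --genes flag and print a warning when the two disagree. Short or low-complexity protein sequences can still fool a composition test, so a warning is safer than a hard error.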

Feature request: taxonomy of individual sequences

Hi,
I'm using CheckM to get the taxonomy of individual contigs; at the moment I'm rigging this up with GNU parallel.

Currently, tree and tree_qa are set up to work on bins in individual folders, but what would be helpful is if I could just give CheckM a multi-FASTA file and get the taxonomy of each contig (basically the output of tree_qa). For now I'm using parallel to make a pseudo bin from each contig, but that won't really work once the number of contigs gets higher than a few hundred thousand.

Thanks a bunch
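For readers who want to reproduce the pseudo-bin workaround described above without GNU parallel, here is a sketch that splits a multi-FASTA input into one file per contig. The function name and the ".fa" extension are assumptions, and it inherits the scaling problem the request is about, since it still creates one file per contig.

```python
import os
import re
import tempfile


def split_multifasta(lines, out_dir, extension=".fa"):
    """Write each FASTA record to its own file in out_dir; return the paths."""
    paths = []
    handle = None
    try:
        for line in lines:
            if line.startswith(">"):
                if handle is not None:
                    handle.close()
                fields = line[1:].split()
                seq_id = fields[0] if fields else "record_%d" % (len(paths) + 1)
                # Replace characters that are awkward in file names.
                safe_id = re.sub(r"[^A-Za-z0-9._-]", "_", seq_id)
                path = os.path.join(out_dir, safe_id + extension)
                handle = open(path, "w")
                paths.append(path)
            if handle is not None:
                handle.write(line if line.endswith("\n") else line + "\n")
    finally:
        if handle is not None:
            handle.close()
    return paths


records = [">c1 first contig\n", "ACGT\n", ">c2\n", "GGCC\n", "TTAA\n"]
out_dir = tempfile.mkdtemp()
paths = split_multifasta(records, out_dir)
print([os.path.basename(p) for p in paths])  # ['c1.fa', 'c2.fa']
```

Records with identical IDs would overwrite each other in this sketch, so real input should be checked for duplicates first.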

CheckM and pplacer correctly installed and in PATH but checkm not running: [Errno 2] No such file or directory: 'ffn_checkM/.../hmmer.tree.txt'/[Error] Make sure pplacer is on your system path.

I installed CheckM and pplacer as described in the instructions.
I updated the CheckM database manually and changed the data root successfully with checkm data setRoot.

However, when I run the test I get the following errors:

sudo checkm test ~/checkm_test_result # I have to run it with sudo, otherwise it does not work

[CheckM - Test] Processing E.coli K12-W3310 to verify operation of CheckM.
*******************************************************************************
 [Step 1]: Verifying tree command.
*******************************************************************************
 [CheckM - tree] Placing bins in reference genome tree.
*******************************************************************************
Identifying marker genes in 1 bins with 1 threads:
Finished processing 0 of 1 (0.00%) bins.
FATAL: No such option "--domtblout".
Usage: hmmsearch [-options] <hmmfile> <sequence file or database>
Available options are:
   -h        : help; print brief help on version and usage
   -A <n>    : sets alignment output limit to <n> best domain alignments
   -E <x>    : sets E value cutoff (globE) to <= x
   -T <x>    : sets T bit threshold (globT) to >= x
   -Z <n>    : sets Z (# seqs) for E-value calculation
 Finished processing 1 of 1 (100.00%) bins.
  Saving HMM info to file.
  Calculating genome statistics for 1 bins with 1 threads:
  Finished processing 1 of 1 (100.00%) bins.
 Extracting marker genes to align.
 Parsing HMM hits to marker genes:
[Errno 2] No such file or directory: u'/home/xxx/checkm_test_result/results/bins/637000110/hmmer.tree.txt'

Extracting 43 HMMs with 1 threads:
Aborted
*** Error in `hmmfetch': double free or corruption (out): 0x00000000025ad370 ***
Aborted
*** Error in `hmmfetch': double free or corruption (out): 0x0000000000e5c370 ***
Aborted
*** Error in `hmmfetch': double free or corruption (out): 0x00000000024e0370 ***
Aborted
*** Error in `hmmfetch': double free or corruption (out): 0x0000000000cd0370 ***
Aborted
*** Error in `hmmfetch': double free or corruption (out): 0x0000000002270370 ***
Aborted
*** Error in `hmmfetch': double free or corruption (out): 0x0000000001acb370 ***
Aborted
*** Error in `hmmfetch': double free or corruption (out): 0x0000000001629370 ***
Aborted
*** Error in `hmmfetch': double free or corruption (out): 0x0000000000d62370 ***
Aborted
*** Error in `hmmfetch': double free or corruption (out): 0x000000000216b370 ***
 Aborted
 *** Error in `hmmfetch': double free or corruption (out): 0x0000000001a44370 ***
Aborted
*** Error in `hmmfetch': double free or corruption (out): 0x0000000001fa5370 ***
Aborted
*** Error in `hmmfetch': double free or corruption (out): 0x0000000000c03370 ***
Aborted
*** Error in `hmmfetch': double free or corruption (out): 0x0000000000c63370 ***
Aborted
*** Error in `hmmfetch': double free or corruption (out): 0x000000000224f370 ***
Aborted
*** Error in `hmmfetch': double free or corruption (out): 0x0000000002545370 ***
Aborted
*** Error in `hmmfetch': double free or corruption (out): 0x000000000201f370 ***
Aborted
*** Error in `hmmfetch': double free or corruption (out): 0x00000000014c9370 ***
Aborted
*** Error in `hmmfetch': double free or corruption (out): 0x0000000000a0d370 ***
Aborted
*** Error in `hmmfetch': double free or corruption (out): 0x00000000021f9370 ***
Aborted
*** Error in `hmmfetch': double free or corruption (out): 0x00000000024be370 ***
Aborted
*** Error in `hmmfetch': double free or corruption (out): 0x0000000000f31370 ***
Aborted
*** Error in `hmmfetch': double free or corruption (out): 0x0000000001f9f370 ***
....... # it repeats this message few times more
Aborted
*** Error in `hmmfetch': double free or corruption (out): 0x0000000001e4b370 ***
Aborted
*** Error in `hmmfetch': double free or corruption (out): 0x00000000025dc370 ***
Aborted
Finished extracting 43 of 43 (100.00%) HMMs.
Aligning 43 marker genes with 1 threads:
Finished aligning 43 of 43 (100.00%) marker genes.
[Error] Make sure pplacer is on your system path.
Controlled exit resulting from an unrecoverable error or warning

echo $PATH
/usr/local/lib/partitionfinder-master/programs:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/local/lib/Prodigal-2.6.1:/usr/local/lib/prokka-1.10/bin:/usr/local/lib:/usr/local/lib/get_homologues-x86_64-20141112:/usr/local/lib/rnammer-1.2:/usr/local/lib/pplacer-v1.1:/usr/local/lib/pplacer-v1.1/scripts:/usr/local/lib/CheckM-master

Do you know what I am doing wrong?

Is bin_compare's binned_bases estimate off when contigs have not been uniquely binned?

Small thing, but annoying. The best kind of bug, not.

Some binning methods do not uniquely assign contigs to bins - a single contig can be assigned to multiple bins. I haven't looked properly at the code, but I believe that bin_compare's # binned bases is simply adding up the number of bases in each fasta file independently, rather than checking for duplication? It is therefore overstating the number of binned bases.
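A minimal sketch of the difference between the two counts, assuming contigs are identified by the first token of their FASTA header and that a contig has the same sequence wherever it appears. The function names are hypothetical and this is not how bin_compare is implemented:

```python
def read_contig_lengths(fasta_path):
    """Return {contig_id: length} for one FASTA file."""
    lengths = {}
    contig_id = None
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # Assumption: the first whitespace-delimited token is the contig ID.
                tokens = line[1:].split()
                contig_id = tokens[0] if tokens else ""
                lengths.setdefault(contig_id, 0)
            elif contig_id is not None:
                lengths[contig_id] += len(line)
    return lengths


def binned_bases(bin_fasta_paths):
    """Return (naive_total, unique_total) of binned bases across all bins."""
    naive_total = 0
    unique = {}
    for path in bin_fasta_paths:
        lengths = read_contig_lengths(path)
        naive_total += sum(lengths.values())
        # A contig assigned to several bins is only kept once here.
        unique.update(lengths)
    return naive_total, sum(unique.values())
```

If the two numbers differ, the gap is the number of bases that were counted more than once.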

More output formats for qa

Two new output formats that may be useful: one that reports contigs carrying two copies of a single-copy gene, and a second, tabular output with the columns contig id, bin id, and marker id.

Too many bins to plot?

Hi,

I get the following error:

*******************************************************************************
 [CheckM - bin_qa_plot] Creating bar plot of bin quality.
*******************************************************************************

  Calculating AAI between multi-copy marker genes.
  Plotting bin completeness, contamination, and strain heterogeneity.

Unexpected error: <type 'exceptions.ValueError'>
Traceback (most recent call last):
  File "/data/tools/CheckM/0.9.7/bin/checkm", line 717, in <module>
    checkmParser.parseOptions(args)
  File "/data/tools/CheckM/0.9.7/lib/python2.7/site-packages/checkm/main.py", line 1240, in parseOptions
    self.binQAPlot(options)
  File "/data/tools/CheckM/0.9.7/lib/python2.7/site-packages/checkm/main.py", line 820, in binQAPlot
    plot.plot(binFiles, binStatsExt, options.bIgnoreHetero, aai.aaiHetero)
  File "/data/tools/CheckM/0.9.7/lib/python2.7/site-packages/checkm/plot/binQAPlot.py", line 66, in plot
    xLabelBounds = self.xLabelExtents(sortedBinIds, self.options.font_size)
  File "/data/tools/CheckM/0.9.7/lib/python2.7/site-packages/checkm/plot/AbstractPlot.py", line 105, in xLabelExtents
    bbox = label.get_window_extent(self.get_renderer())
  File "/mnt/data/home/NIOO/mattiash/.virtualenvs/groopm/local/lib/python2.7/site-packages/matplotlib/backends/backend_agg.py", line 481, in get_renderer
    self.renderer = RendererAgg(w, h, self.figure.dpi)
  File "/mnt/data/home/NIOO/mattiash/.virtualenvs/groopm/local/lib/python2.7/site-packages/matplotlib/backends/backend_agg.py", line 94, in __init__
    self._renderer = _RendererAgg(int(width), int(height), dpi, debug=False)
ValueError: width and height must each be below 32768

I assume there are too many bins to plot, because the resulting width or height is too large for matplotlib. I have 202 bins. Is this indeed too many bins to plot with CheckM? I did not find any parameter to limit or filter the number of bins to be plotted.
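The limit in the ValueError applies to the canvas size in pixels (figure size in inches multiplied by dpi), so whether a given number of bins fits depends on how much the figure grows per bin. A small sketch of that arithmetic; the function names are hypothetical and the sizes in the usage note are made-up values, not CheckM's actual plot settings:

```python
AGG_MAX_PIXELS = 32768  # limit quoted in the ValueError above


def canvas_pixels(width_in, height_in, dpi):
    """Return the (width, height) of the canvas in whole pixels."""
    return int(width_in * dpi), int(height_in * dpi)


def fits_agg_canvas(width_in, height_in, dpi):
    """True if both pixel dimensions are below the Agg limit."""
    width_px, height_px = canvas_pixels(width_in, height_in, dpi)
    return width_px < AGG_MAX_PIXELS and height_px < AGG_MAX_PIXELS
```

For example, a 60 inch wide figure at 600 dpi needs 36000 pixels and would be rejected, while the same figure at 300 dpi (18000 pixels) would fit.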

Filtering of bins in QA plot

Some users have had problems with the bin_qa_plot command as it can't handle large numbers of bins. In the next version of CheckM, we should put in some options to only plot a specific number of the "best" bins.
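One way such a filter could look, as a sketch only. "Best" is not defined above, so ranking by completeness minus contamination is an assumption made here for illustration, and the function name and input layout are hypothetical:

```python
def select_best_bins(bin_stats, max_bins, min_completeness=0.0):
    """Pick up to max_bins bin IDs to plot.

    bin_stats maps bin_id -> (completeness, contamination), both in percent.
    Assumption: bins are ranked by completeness minus contamination.
    """
    eligible = [
        (bin_id, completeness - contamination)
        for bin_id, (completeness, contamination) in bin_stats.items()
        if completeness >= min_completeness
    ]
    # Highest score first; ties are broken by bin ID for a stable order.
    eligible.sort(key=lambda item: (-item[1], item[0]))
    return [bin_id for bin_id, _score in eligible[:max_bins]]
```

The plotting code would then only be handed the returned bin IDs.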

Allow exclusion of specific marker genes

Expert review can often be used to identify marker genes that are legitimately absent. For example, this is common in novel phyla undergoing genome reduction. CheckM should accept a set of marker genes to be excluded from use when calculating the lineage-specific or taxonomic-specific marker sets.
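A rough sketch of the exclusion step, assuming a marker set is represented as a list of sets of marker IDs. That representation, the function name, and the marker IDs in the usage note are illustrative assumptions, not CheckM's internal data structures:

```python
def exclude_markers(marker_sets, excluded):
    """Remove excluded marker IDs from each set and drop sets left empty."""
    excluded = set(excluded)
    filtered = []
    for marker_set in marker_sets:
        remaining = set(marker_set) - excluded
        if remaining:
            filtered.append(remaining)
    return filtered
```

For instance, exclude_markers([{"M1", "M2"}, {"M3"}], ["M2", "M3"]) leaves a single set containing only "M1", so a legitimately absent gene no longer lowers the completeness estimate.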

Broken genome_tree package

I just downloaded the CheckM data from ACE (checkm data update), and the tree task is failing with the error message:

*******************************************************************************
 [CheckM - tree] Placing bins in reference genome tree.
*******************************************************************************

  Identifying marker genes in 1 bins with 12 threads:
    Finished processing 1 of 1 (100.00%) bins.
  Saving HMM info to file.

  Calculating genome statistics for 1 bins with 12 threads:
    Finished processing 1 of 1 (100.00%) bins.

  Extracting marker genes to align.
  Parsing HMM hits to marker genes:
    Finished parsing hits for 1 of 1 (100.00%) bins.
  Extracting 43 HMMs with 12 threads:
    Finished extracting 43 of 43 (100.00%) HMMs.
  Aligning 43 marker genes with 12 threads:
    Finished aligning 43 of 43 (100.00%) marker genes.

  Reading marker alignment files.
  Concatenating alignments.
  Placing 1 bins into the genome tree with pplacer (be patient).
Uncaught exception: syntax error lexing between line 1 character 0 and line 1 character 1 of /usr/local/opt/checkm/genome_tree/genome_tree_full.refpkg/phylo_modelytLSd6.json
Fatal error: exception Ppatteries.Sparse.Parse_error("syntax error lexing", _, _, _)
Uncaught exception: Sys_error("07.annotation/03.bins/checkm/storage/tree/concatenated.pplacer.json: No such file or directory")
Fatal error: exception Sys_error("07.annotation/03.bins/checkm/storage/tree/concatenated.pplacer.json: No such file or directory")

  { Current stage: 0:00:04.322 || Total: 0:00:04.322 }


This seems to be the result of a broken package in genome_tree/genome_tree_full.refpkg. The CONTENTS.json file has:

{
    "files": {
        "aln_fasta": "genome_tree.fasta",
        "phylo_model": "phylo_modelytLSd6.json",
        "tree": "genome_tree.tre",
        "tree_stats": "genome_tree.log"
    },
    "rollback": null,
    "log": [
        "Stripped refpkg (removed 0 files)",
        "Loaded initial files into empty refpkg"
    ],
    "metadata": {
        "create_date": "2015-01-14 09:54:02",
        "format_version": "1.1",
        "locus": "genome_tree_full"
    },
    "rollforward": null,
    "md5": {
        "aln_fasta": "69dada0c0070c7be4983e98ae0142e85",
        "phylo_model": "87c6c85d22b24bd1100d360cd843f75b",
        "tree": "f1b6e14d911cfdfea6af5f9bc0fc5dd7",
        "tree_stats": "8f2fcb7bca5c45632285394aab329908"
    }
}

However, the phylo_modelytLSd6.json file in the package does not contain the expected JSON data; it contains an HTML error page instead:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL /public/CheckM_databases//genome_tree/genome_tree_full.refpkg/phylo_modelytLSd6.json was not found on this server.</p>
<hr>
<address>Apache/2.2.15 (CentOS) Server at data.ace.uq.edu.au Port 443</address>
</body></html>

Is it possible that the data manifest in the ACE server is broken? Or can this be a bug in ScreamingBackpack?
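A quick way to check a downloaded package for this kind of damage is to compare each file against the checksums in CONTENTS.json. This is a sketch that only assumes the "files" and "md5" keys shown above; verify_refpkg is a hypothetical helper, not part of CheckM, pplacer, or ScreamingBackpack:

```python
import hashlib
import json
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_refpkg(refpkg_dir):
    """Return a list of (key, filename, problem) for files that fail the check."""
    with open(os.path.join(refpkg_dir, "CONTENTS.json")) as handle:
        contents = json.load(handle)
    expected_md5 = contents.get("md5", {})
    problems = []
    for key, filename in sorted(contents["files"].items()):
        path = os.path.join(refpkg_dir, filename)
        if not os.path.isfile(path):
            problems.append((key, filename, "missing"))
        elif key in expected_md5 and md5_of_file(path) != expected_md5[key]:
            problems.append((key, filename, "md5 mismatch"))
    return problems
```

A file that was replaced by a 404 page during download would show up here as an md5 mismatch.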

Parsable output please

So the most interesting statistics output by CheckM are the following:

-----------------------------------------------------------------------------------------------------------------------------------------------------------
  Bin Id       Marker lineage      # genomes   # markers   # marker sets   0    1    2   3   4   5+   Completeness   Contamination   Strain heterogeneity
-----------------------------------------------------------------------------------------------------------------------------------------------------------
  contigs   k__Bacteria (UID203)      5449        104            58        70   33   1   0   0   0       40.17            1.72              100.00
-----------------------------------------------------------------------------------------------------------------------------------------------------------

The first thing is, why is this not included in a file somewhere? I had to run it a couple of times before figuring out that I should save the stdout to a file if I wanted to keep these statistics.

Then, it would make sense to have this easily parsable so that any pipeline can use these values for making graphs or anything else. But you have space-delimited fields with spaces in them! The text (UID203) in this example is part of the Marker lineage but is itself separated by a space, which makes a simple approach fail:

from collections import OrderedDict

def checkm_stats(self):
    """The various statistics produced by checkm in a dictionary"""
    keys = OrderedDict((
        ("bin_id", str),
        ("lineage", str),
        ("genomes", int),
        ("markers", int),
        ("marker_sets", int),
        ("0", int), ("1", int), ("2", int), ("3", int), ("4", int), ("5+", int),
        ("completeness", float),
        ("contamination", float),
        ("heterogeneity", float)))
    # Naive whitespace split: "k__Bacteria (UID203)" becomes two values,
    # so every later column is shifted by one and the conversions fail.
    values = list(self.checkm.p.stdout)[3].split()
    return {k: keys[k](values[i]) for i, k in enumerate(keys)}

Could you please think about making some more sound decisions when designing the main output of your program?
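As a stopgap for the table layout shown above, the row can be split on runs of two or more spaces instead of on any whitespace. This is a sketch that assumes columns are always separated by at least two spaces and that no field contains two consecutive spaces; the column keys are the ones from the snippet above and parse_qa_row is a hypothetical name:

```python
import re
from collections import OrderedDict

COLUMNS = OrderedDict((
    ("bin_id", str),
    ("lineage", str),
    ("genomes", int),
    ("markers", int),
    ("marker_sets", int),
    ("0", int), ("1", int), ("2", int), ("3", int), ("4", int), ("5+", int),
    ("completeness", float),
    ("contamination", float),
    ("heterogeneity", float),
))


def parse_qa_row(line):
    """Parse one data row of the table into an ordered dict of typed values."""
    # Split on two or more spaces so "k__Bacteria (UID203)" stays in one piece.
    values = re.split(r"\s{2,}", line.strip())
    if len(values) != len(COLUMNS):
        raise ValueError("expected %d fields, got %d" % (len(COLUMNS), len(values)))
    return OrderedDict(
        (key, convert(value))
        for (key, convert), value in zip(COLUMNS.items(), values)
    )
```

This still depends on the on-screen layout, so a proper tab-separated output would remain the more robust option.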

Placing bins in reference genome tree

Hi, I am having some issues with "Placing bins in reference genome tree" the error message received is:


[CheckM - tree] Placing bins in reference genome tree.


Identifying marker genes in 118 bins with 2 threads:
Finished processing 118 of 118 (100.00%) bins.
Saving HMM info to file.

Calculating genome statistics for 118 bins with 2 threads:
Finished processing 118 of 118 (100.00%) bins.

Extracting marker genes to align.
Parsing HMM hits to marker genes:
Finished parsing hits for 118 of 118 (100.00%) bins.
Extracting 43 HMMs with 2 threads:
Finished extracting 43 of 43 (100.00%) HMMs.
Aligning 43 marker genes with 2 threads:
Finished aligning 43 of 43 (100.00%) marker genes.

Reading marker alignment files.
Concatenating alignments.
Placing 118 bins into the genome tree with pplacer (be patient).
Uncaught exception: Sys_error("/home/silas/checkdata/genome_tree/genome_tree_full.refpkg: No such file or directory")
Fatal error: exception Sys_error("/home/silas/checkdata/genome_tree/genome_tree_full.refpkg: No such file or directory")
Uncaught exception: Sys_error("tree_output/storage/tree/concatenated.pplacer.json: No such file or directory")
Fatal error: exception Sys_error("tree_output/storage/tree/concatenated.pplacer.json: No such file or directory")

{ Current stage: 0:11:08.877 || Total: 0:11:08.877 }

when I list the contents of the /home/silas/checkdata/genome_tree/ directory I see:

silas@silas-VirtualBox:~/Documents$ ls /home/silas/checkdata/genome_tree/
genome_tree.derep.txt genome_tree_prok.refpkg missing_duplicate_genes_50.tsv
genome_tree.metadata.tsv genome_tree.taxonomy.tsv missing_duplicate_genes_97.tsv

Is this an issue with my installation, or is there meant to be a genome_tree_full.refpkg directory here?

Regards,
Silas.

Guppy

I downloaded and installed CheckM using pip as directed on this wiki, but when I run the 'checkm tree' command or execute 'checkm test' I get the following error:

[Error] Make sure guppy is on your system path.

Controlled exit resulting from an unrecoverable error or warning.

I don't know what guppy is or why it wasn't installed when I ran the pip installer. Can you please direct me?

Thanks!
Dan

checkm coverage determines no mapped reads

Hi, after upgrading pysam to the most recent version I was able to run "checkm coverage".

However, "checkm coverage" keeps reporting zero percent properly mapped reads.
I am using read-mapping files generated from unpaired (single) reads by bowtie2 and subsequently converted to sorted and indexed BAM files using samtools.
My pipeline for mapping and BAM conversion:
bowtie2 -x index -U reads1.fastq.gz;reads2.fastq.gz -S out.SAM
samtools view -bSh out.SAM > out.BAM
samtools sort out.BAM out.sorted
samtools index out.sorted.bam
I would prefer using the bowtie2 files, because these are created anyway during my standard assembly and binning pipeline, and I would have to create new read mappings (taking additional calculation time and clogging up my disk space) if I had to use another mapper for this.

The default alignment setting for bowtie2 is "--end-to-end", so each alignment should cover >98% of the corresponding read.
My checkm command (alignment cutoff reduced to 90% to make totally sure that at least some hits should be reported):
checkm coverage -x fa -r -a 90 -t 8 bins checkm_coverage.out input/*.sorted.bam

From the bowtie2 summary output and from running the "get_abund.pl" script from MaxBin, I know that at least a portion of these reads should have mapped properly.

Maybe I'm just missing something rather basic here. Could you help me figure out what?
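One thing that may be worth checking, offered as a diagnostic rather than a confirmed cause: in the SAM format the "properly aligned" flag (0x2) describes paired templates, so alignments of single reads would not normally carry it. The sketch below tallies the relevant flag bits from SAM text lines; tally_flags is a hypothetical helper, and whether checkm coverage relies on this flag is an assumption:

```python
from collections import Counter

# Flag bits as defined in the SAM specification.
PAIRED = 0x1
PROPER_PAIR = 0x2
UNMAPPED = 0x4


def tally_flags(sam_lines):
    """Count total, mapped, paired, and proper-pair records in SAM text."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        flag = int(line.split("\t")[1])
        counts["total"] += 1
        if not flag & UNMAPPED:
            counts["mapped"] += 1
        if flag & PAIRED:
            counts["paired"] += 1
        if flag & PROPER_PAIR:
            counts["proper_pair"] += 1
    return counts
```

The function can be fed the text output of samtools view out.sorted.bam. A file with many mapped records but zero proper-pair records would be consistent with the behaviour described above.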

AttributeError in Unit tests step 1

Hi ~

I just installed CheckM successfully using 'pip', but an error occurs when I run the unit tests.

I typed:
checkm test ~/checkm_test_results
The error looks like this:

yifan@yifan-VirtualBox:~$ checkm test ~/checkm_test_results

*******************************************************************************
[CheckM - Test] Processing E.coli K12-W3310 to verify operation of CheckM.
*******************************************************************************

[Step 1]: Verifying tree command.

*******************************************************************************
 [CheckM - tree] Placing bins in reference genome tree.
*******************************************************************************

 Identifying marker genes in 1 bins with 1 threads:
 Finished processing 1 of 1 (100.00%) bins.
 Saving HMM info to file.

Calculating genome statistics for 1 bins with 1 threads:
Finished processing 1 of 1 (100.00%) bins.

Extracting marker genes to align.
Parsing HMM hits to marker genes:
Finished parsing hits for 1 of 1 (100.00%) bins.
Extracting 43 HMMs with 1 threads:
Finished extracting 43 of 43 (100.00%) HMMs.
Aligning 43 marker genes with 1 threads:
Finished aligning 43 of 43 (100.00%) marker genes.


Unexpected error: <type 'exceptions.AttributeError'>
Traceback (most recent call last):
File "/usr/local/bin/checkm", line 712, in <module>
checkmParser.parseOptions(args)
 File "/usr/local/lib/python2.7/dist-packages/checkm/main.py", line 1306, in parseOptions
self.test(options)
File "/usr/local/lib/python2.7/dist-packages/checkm/main.py", line 1194, in test
verifyEcoli.run(self, options.output_dir)
File "/usr/local/lib/python2.7/dist-packages/checkm/test/test_ecoli.py", line 65, in run
parser.tree(options)
File "/usr/local/lib/python2.7/dist-packages/checkm/main.py", line 157, in tree
pplacer = PplacerRunner(threads=options.pplacer_threads)  # fix at one thread to keep memory requirements reasonable
AttributeError: Options instance has no attribute 'pplacer_threads'

Looking for any help, I will appreciate it!
Yifan

The prefix option is broken

If a user specifies the -p option in build, the qa step breaks because it does not take the prefix into account.

Specify path for tmp directory

The /tmp directory on my server is on a separate partition that is limited to 5GB. When running checkM lineage_wf I end up with a whole bunch of errors from hmmer as the tmp directory fills up fast. It would be fantastic if I could specify the directory for all of those tmp files so I can put them in another partition with more space
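Until such an option exists, one possible workaround is the TMPDIR environment variable, which Python's tempfile module consults and which child processes inherit. This is a sketch under the assumption that CheckM and the tools it launches create their temporary files through mechanisms that honour TMPDIR; that has not been verified here, and use_tmp_dir is a hypothetical helper:

```python
import os
import tempfile


def use_tmp_dir(path):
    """Point Python's tempfile module (and child processes) at another directory."""
    os.makedirs(path, exist_ok=True)
    os.environ["TMPDIR"] = path
    # tempfile caches its choice, so clear the cache to force a fresh lookup.
    tempfile.tempdir = None
    return tempfile.gettempdir()
```

From the shell, the equivalent would be to export TMPDIR to a directory on the larger partition before starting the run.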

Check-M test fails at step two.

hi

I have just installed CheckM and get the following response at step two of the installation test:

IOError: [Errno 2] No such file or directory: '/home/ashmita/checkm_test_results/results/storage/tree/concatenated.tre

Any advice as to how to fix it will be appreciated.

thank you
ashmita

Question regarding to taxonomy classification

This is one of the typical examples I have:

tree_qa output:
TH03311 43 0 k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;f__Porphyromonadaceae

qa output:
TH03311 k__Bacteria (UID2569) 434 278 186 5 270 3 0 0 0 97.85 1.12 0.00

So my question is: why, in many cases, does the qa output predict a higher taxonomic rank than tree_qa? Should I ignore the tree_qa output? Where does the discrepancy come from?

It would be great to have some good explanation of it. Thanks!

missing dependencies on custom install?

Hi,

I was forced to do a custom install which seemed to complete without error. Subsequently tried to run checkm --help and traceback indicated ScreamingBackpack was not found. I installed this into my local python lib directory then retried. Next traceback indicated dendropy was not found. After installing dendropy into my custom python lib directory, checkm runs fine. I see the ScreamingBackpack requirement in setup.py so I am not sure how/why this happened.

Also, I found that hmmer/3.1b1 is needed since hmmer/3.0 does not read the checkm hmm files correctly, or so checkm reports.

customHelpFormatter.py error

The file customHelpFormatter.py is missing from the checkm directory and is needed to run checkm:

# Based on the following SO discussion:
# http://stackoverflow.com/questions/15499955/argparse-subcommand-order-in-help-output
import argparse
class CustomHelpFormatter(argparse.HelpFormatter):
    def _iter_indented_subactions(self, action):
        try:
            get_subactions = action._get_subactions
        except AttributeError:
            pass
        else:
            self._indent()
            if isinstance(action, argparse._SubParsersAction):
                for subaction in sorted(get_subactions(), key=lambda x: x.dest):
                    yield subaction
            else:
                for subaction in get_subactions():
                    yield subaction
            self._dedent()

undefined symbol: gzopen64

After installing checkM using pip per instruction, when I issue the sudo checkm data command, I got the following error:

Traceback (most recent call last):
File "/usr/local/bin/checkm", line 36, in <module>
from checkm import main
File "/usr/local/lib/python2.7/site-packages/checkm/main.py", line 28, in <module>
from checkm.resultsParser import ResultsParser
File "/usr/local/lib/python2.7/site-packages/checkm/resultsParser.py", line 34, in <module>
from checkm.coverage import Coverage
File "/usr/local/lib/python2.7/site-packages/checkm/coverage.py", line 30, in <module>
import pysam
File "/usr/local/lib/python2.7/site-packages/pysam/__init__.py", line 1, in <module>
from pysam.libchtslib import *
ImportError: /usr/local/lib/python2.7/site-packages/pysam/libchtslib.so: undefined symbol: gzopen64

We are running RHEL 6.2. Any suggestions would be great.

Thanks.

checkM build usage?

Hi folks,

Looks like a great tool that would hook in nicely with some of the hmm marker gene sets being developed in the Eisen lab. I'm having a bit of trouble getting things going here. I was trying to test if everything was working using a reference genome and a single one of the phylosift markers but must be messing up file formats or the usage.

Any pointers here? In what format should files/folders passed via -H be? Any ideas why build is throwing the error below and then hanging?

Thanks!
Lizzy

$ checkM build outdirTest DesulfotaleaPsychrophilaLSv54_177439.fasta -H phyeco/bacteria_and_archaea_dir/BA00001.hmm
Processing file DesulfotaleaPsychrophilaLSv54_177439.fasta (1 of 1)
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
self.run()
File "/usr/lib/python2.7/threading.py", line 763, in run
self.__target(*self.__args, **self.__kwargs)
File "/usr/local/lib/python2.7/dist-packages/metachecka2000/dataConstructor.py", line 177, in processFasta
HR.search(hmm, out_file.name, out_dir, '--cpu 1')
TypeError: search() takes exactly 4 arguments (5 given)

Continue/Update Option

I think it would be useful to have a feature (probably as a command line option) to allow users to add new genomes to the genome bins directory of past CheckM analyses and then rerun CheckM in "update" mode where only the new files are processed and all output files are updated to reflect the additions.

We have a continually growing database of several hundred bacterial genomes that we like to keep up-to-date CheckM analyses for and it consumes a fair bit of unnecessary computation time to run the full analysis over when we add a few new genomes.
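The core of such an "update" mode would be working out which bins have not been processed yet. A minimal sketch, assuming bins are files with a common extension in one directory and that the IDs of already processed bins can be recovered from the earlier output; how to obtain those IDs from CheckM's output files is left open, and find_new_bins is a hypothetical name:

```python
import os


def find_new_bins(bin_dir, processed_ids, extension="fna"):
    """Return the IDs of bins in bin_dir that are not in processed_ids."""
    suffix = "." + extension
    new_bins = []
    for name in sorted(os.listdir(bin_dir)):
        if not name.endswith(suffix):
            continue
        # Strip only the trailing extension to get the bin ID.
        bin_id = name[:-len(suffix)]
        if bin_id not in processed_ids:
            new_bins.append(bin_id)
    return new_bins
```

Only the returned bins would then need to go through marker gene identification, with the summary tables regenerated from old and new results together.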

running error

Hi

I installed checkm and tried to test using this command line.

checkm test ~/checkm_test_results

But it printed this error message,

Traceback (most recent call last):
File "/usr/local/bin/checkm", line 36, in <module>
from checkm import main
File "/usr/local/lib/python2.7/dist-packages/checkm/main.py", line 49, in <module>
from checkm.plot.gcPlots import GcPlots
File "/usr/local/lib/python2.7/dist-packages/checkm/plot/gcPlots.py", line 24, in <module>
from AbstractPlot import AbstractPlot
File "/usr/local/lib/python2.7/dist-packages/checkm/plot/AbstractPlot.py", line 22, in <module>
from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas
File "/usr/lib/pymodules/python2.7/matplotlib/backends/backend_agg.py", line 27, in <module>
from matplotlib.backend_bases import RendererBase,
File "/usr/lib/pymodules/python2.7/matplotlib/backend_bases.py", line 50, in <module>
import matplotlib.textpath as textpath
File "/usr/lib/pymodules/python2.7/matplotlib/textpath.py", line 11, in <module>
import matplotlib.font_manager as font_manager
File "/usr/lib/pymodules/python2.7/matplotlib/font_manager.py", line 1356, in <module>
_rebuild()
File "/usr/lib/pymodules/python2.7/matplotlib/font_manager.py", line 1343, in _rebuild
pickle_dump(fontManager, _fmcache)
File "/usr/lib/pymodules/python2.7/matplotlib/font_manager.py", line 939, in pickle_dump
with open(filename, 'wb') as fh:

IOError: [Errno 2] No such file or directory: '/tmp/matplotlib-Jinu/fontList.cache'

And I tried,

sudo checkm test ~/checkm_test_results

It printed this message:



[CheckM - Test] Processing E.coli K12-W3310 to verify operation of CheckM.


[Step 1]: Verifying tree command.


[CheckM - tree] Placing bins in reference genome tree.


[Error] Make sure prodigal is on your system path.

Controlled exit resulting from an unrecoverable error or warning.

Because I cannot run prodigal with sudo.

It's confusing, sorry.

I have all the python packages in the right version.
I am using ubuntu 14.04 on google cloud.

Thanks,

Jinu

Newick output format problems

Hey,

Can you give me an explanation of the following snippet of newick file that checkM produces. In the definitions of newick I've found there is nothing about pipe separated lists. I'm currently trying to modify a parser so that I can successfully read this type of file.

||0.716:0.08878)UID3298||1.000:0.03202)UID3214|p__Proteobacteria|0.772:0.03881)UID3193||1.000:0.02044)
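Judging only from the snippet above, each internal node label appears to pack three pipe-separated fields in front of the branch length: a UID, an optional taxonomy string, and what looks like a support value. Under that reading, which is an assumption rather than documented behaviour, a label can be unpacked once a Newick parser has handed it over as a string (parse_node_label is a hypothetical helper):

```python
def parse_node_label(label):
    """Split 'UID|taxonomy|support' into (uid, taxonomy, support).

    Empty fields are returned as None. The meaning of the three fields
    is inferred from the example labels, not from a format specification.
    """
    parts = label.split("|")
    if len(parts) != 3:
        raise ValueError("expected three pipe-separated fields: %r" % label)
    uid, taxonomy, support = parts
    return (
        uid or None,
        taxonomy or None,
        float(support) if support else None,
    )
```

For the labels in the snippet, "UID3214|p__Proteobacteria|0.772" would give a UID, a taxon, and 0.772, while "||0.716" would give only the numeric value.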

Placing bins in reference genome tree problem

Hi,

I am having issues with "checkm tree". I installed checkm earlier this week and the "test" ran just fine. However, when I tried to analyze my bins, I got the error below. I tried running the "tree" command on the test data (which worked for "test") and still got the same error.


[CheckM - tree] Placing bins in reference genome tree.


Identifying marker genes in 1 bins with 4 threads:
Finished processing 1 of 1 (100.00%) bins.
Saving HMM info to file.

Calculating genome statistics for 1 bins with 4 threads:
Finished processing 1 of 1 (100.00%) bins.

Extracting marker genes to align.
Parsing HMM hits to marker genes:
Finished parsing hits for 1 of 1 (100.00%) bins.
Extracting 43 HMMs with 4 threads:
Finished extracting 43 of 43 (100.00%) HMMs.
Aligning 43 marker genes with 4 threads:
Finished aligning 43 of 43 (100.00%) marker genes.

Reading marker alignment files.
Concatenating alignments.
Placing 1 bins into the genome tree with pplacer (be patient).
Killed
Uncaught exception: Sys_error("./testOUT/storage/tree/concatenated.pplacer.json: No such file or directory")
Fatal error: exception Sys_error("./testOUT/storage/tree/concatenated.pplacer.json: No such file or directory")

{ Current stage: 0:21:44.524 || Total: 0:21:44.524 }

I tried updating the CheckM dependency data with "sudo checkm data update", which didn't help, so then I downloaded the CheckM data (v1.0.3) and used "sudo checkm data setRoot" to point to the new directory. I still got the same error.

I am not sure if the issue is in my installation or if something is missing.

Best,

Elena

Retrieve marker genes as flat fasta file

It would be a good additional feature to provide a simple way to recover the marker genes identified in each bin. These could be used to verify the marker gene assignment, be blasted to identify potential sources of contamination, and potentially used to build phylogenetic trees.

Bins get assigned a taxonomy when they have no markers

I've noticed that some of my bins get assigned a taxonomy, like Deltaproteobacteria, even when they have 0 markers. How is this possible? Shouldn't they all get assigned to root?

|  Bin Id                             | Marker lineage                   | # genomes | # markers | # marker sets | Completeness | Contamination | Strain heterogeneity | Genome size
|-------------------------------------+----------------------------------+-----------+-----------+---------------+--------------+---------------+----------------------+-----------
|  final.contigs.fa.metabat-bins-.53  | c__Deltaproteobacteria (UID3216) | 83        | 248       | 156           | 0.0          | 0.0           | 0.0                  | 347526    
|  final.contigs.fa.metabat-bins-.54  | k__Bacteria (UID1452)            | 924       | 163       | 110           | 0.0          | 0.0           | 0.0                  | 1689298   
|  final.contigs.fa.metabat-bins-.59  | c__Deltaproteobacteria (UID3216) | 83        | 248       | 156           | 0.0          | 0.0           | 0.0                  | 2135342   
|  final.contigs.fa.metabat-bins-.6   | p__Euryarchaeota (UID49)         | 95        | 229       | 154           | 0.0          | 0.0           | 0.0                  | 815132    
|  final.contigs.fa.metabat-bins-.60  | k__Bacteria (UID203)             | 5449      | 104       | 58            | 0.0          | 0.0           | 0.0                  | 1485401 

checkm tree_qa: meaning of "Taxonomy (contained)" and "Taxonomy (sister lineage)"

Hello,
Running "checkm tree_qa" gives me information on the taxonomic placement of the bins. These include the values "Taxonomy (contained)" and "Taxonomy (sister lineage)". What exactly do these mean and what is the difference?

I assumed that "Taxonomy (contained)" indicates the most dominant lineage (the one to which the majority of the detected marker genes are assigned) and that "Taxonomy (sister lineage)" could indicate the second most dominant lineage. Therefore, closely related taxa for "Taxonomy (contained)" and "Taxonomy (sister lineage)" would be a further indication of a relatively "clean" bin, while more unrelated taxa (e.g. "contained" = alphaproteobacteria and "sister lineage" = deltaproteobacteria) would indicate contamination. Is that correct?

However, for some bins I have closely related taxa for these values [e.g. "Taxonomy (contained)" = "f_Methylococcaceae" and "Taxonomy (sister lineage)" = "g_methylomonas"] but the selected marker lineage was "k__Bacteria". Shouldn't the selected marker lineage have been [at least] Gammaproteobacteria in these cases [as happened for some other bins with the same "contained" and "sister lineage" taxonomies]?

Plotting commands fail for complicated bin names

It appears that some of the plotting functions (gc_plot, tetra_plot) fail when bins have complicated names such as cck10.scaffolds_final.fa.metabat-bins---p1_94--p2_93_--minProb_85_--minBinned_20_-t_16.50.fa. I suspect this is due to 'fa' appearing twice.
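If the repeated 'fa' is indeed the cause, which is only a suspicion at this point, the difference would come down to how the bin ID is derived from the file name. A small sketch contrasting the two approaches; bin_id_from_filename is a hypothetical helper and not taken from CheckM's code:

```python
import os


def bin_id_from_filename(path, extension):
    """Derive the bin ID by stripping only the trailing '.<extension>'."""
    name = os.path.basename(path)
    suffix = "." + extension
    if not name.endswith(suffix):
        raise ValueError("%r does not end with %r" % (name, suffix))
    return name[:-len(suffix)]


name = ("cck10.scaffolds_final.fa.metabat-bins---p1_94--p2_93_"
        "--minProb_85_--minBinned_20_-t_16.50.fa")

# Stripping the suffix keeps the inner '.fa' intact.
safe_id = bin_id_from_filename(name, "fa")

# Replacing every '.fa' also removes the one in the middle of the name,
# so the resulting ID no longer matches the file on disk.
fragile_id = name.replace(".fa", "")
```

Any lookup keyed on the fragile ID would then fail for names like the one above.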

Error running checkm test

Hi, I receive the following error when trying to run the checkm test after installation:

silas@silas-VirtualBox:~$ checkm test ~/checkm_test_results


[CheckM - Test] Processing E.coli K12-W3310 to verify operation of CheckM.


[Step 1]: Verifying tree command.


[CheckM - tree] Placing bins in reference genome tree.


Unexpected error: <type 'exceptions.AttributeError'>
Traceback (most recent call last):
File "/usr/local/bin/checkm", line 710, in <module>
checkmParser.parseOptions(args)
File "/usr/local/lib/python2.7/dist-packages/checkm/main.py", line 1266, in parseOptions
self.test(options)
File "/usr/local/lib/python2.7/dist-packages/checkm/main.py", line 1156, in test
verifyEcoli.run(self, options.output_dir)
File "/usr/local/lib/python2.7/dist-packages/checkm/test/test_ecoli.py", line 60, in run
parser.tree(options)
File "/usr/local/lib/python2.7/dist-packages/checkm/main.py", line 123, in tree
options.bCalledGenes)
AttributeError: Options instance has no attribute 'bCalledGenes'

Error when updating data when changing data directory

I wanted to change the data directory for my checkM installation (version 0.9.4), which resulted in checkM wanting to update the data. It fails with an OSError

checkm data setRoot

*******************************************************************************
 [CheckM - data] Check for database updates. [setRoot]
*******************************************************************************

Where should CheckM store it's data?
Please specify a location or type 'abort' to stop trying: 
/export/data1/db/checkm/
Path [/export/data1/db/checkm] exists and you have permission to write to this folder
(re) creating manifest file (please be patient)
****************************************************************
10 new file(s) to be downloaded from source
2 existing file(s) to be updated
138.02 MB will need to be downloaded
Confirm you want to download this data
Changes *WILL* be permanent
Continue? (y,n) : y
****************************************************************
****************************************************************
The following 6 file(s) are scheduled to be removed
genome_tree/genome_tree_prok.refpkg
genome_tree/genome_tree_prok.refpkg/genome_tree.derep.log
genome_tree/genome_tree_prok.refpkg/genome_tree.final.tre
genome_tree/genome_tree_prok.refpkg/phylo_modelytLSd6.json
genome_tree/genome_tree_prok.refpkg/CONTENTS.json
genome_tree/genome_tree_prok.refpkg/genome_tree.concatenated.derep.fasta
Confirm you want to delete these files
Changes *WILL* be permanent
Delete files? (y,n) : y
****************************************************************
/export/data1/db/checkm/genome_tree/genome_tree_prok.refpkg/genome_tree.derep.log

Unexpected error: <type 'exceptions.OSError'>
Traceback (most recent call last):
  File "/export/data1/sw/checkm/0.9.4/bin/checkm", line 657, in <module>
    checkmParser.parseOptions(args)
  File "/export/data1/sw/checkm/0.9.4/lib/python2.7/site-packages/checkm/main.py", line 1134, in parseOptions
    self.updateCheckM_DB(options)
  File "/export/data1/sw/checkm/0.9.4/lib/python2.7/site-packages/checkm/main.py", line 84, in updateCheckM_DB
    self.DBM.runAction(options.action)
  File "/export/data1/sw/checkm/0.9.4/lib/python2.7/site-packages/checkm/checkmData.py", line 141, in runAction
    path = self.setRoot()
  File "/export/data1/sw/checkm/0.9.4/lib/python2.7/site-packages/checkm/checkmData.py", line 184, in setRoot
    self.update()
  File "/export/data1/sw/checkm/0.9.4/lib/python2.7/site-packages/checkm/checkmData.py", line 164, in update
    prompt=True)
  File "/export/data1/sw/screamingbackpack/0.2.2/lib/python2.7/site-packages/screamingbackpack/manifestManager.py", line 278, in updateManifest
    os.remove(delete)
OSError: [Errno 2] No such file or directory: '/export/data1/db/checkm/genome_tree/genome_tree_prok.refpkg/genome_tree.derep.log'
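The failure is os.remove being called on a path that no longer exists. One unconfirmed reading is that the parent directory listed first had already been removed. As a sketch only, a removal helper that tolerates a missing file could look like this. The function name is hypothetical and this is not the screamingbackpack implementation.

```python
import errno
import os


def remove_if_present(path):
    """Delete path; return True if it was removed, False if it was already gone."""
    try:
        os.remove(path)
        return True
    except OSError as e:
        # Only swallow "No such file or directory"; re-raise anything else
        # (permissions, path is a directory, ...).
        if e.errno != errno.ENOENT:
            raise
        return False
```

Other errors are deliberately re-raised so that a permissions problem is not silently hidden.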

CheckM statistics of called gene files

CheckM calculates a number of auxiliary statistics (GC, genome size, ...). Many of these cannot be calculated when CheckM is supplied with called genes rather than scaffolds in nucleotide space. However, CheckM currently still reports these statistics in that case, which is misleading.

tetra_plot: no files generated

Hi,
I've been trying to use tetra_plot. Although the program exits without error, it does not generate any plots or files. Any idea why this might happen?

Command:
checkm tetra_plot -x fa ./tetraout ./bins ./tetra/plots ./tetra_profile.tsv 95

Log file (some repetitive lines omitted):


[CheckM - tetra_plot] Creating tetra-distance histogram and delta-TD plot.


Plotting tetranuclotide distance plots for /srv/scratch/z3382651/meta_analysis/megahit_good/bins/water.1.fa (1 of 507)
Plot written to: /srv/scratch/z3382651/meta_analysis/megahit_good/megahitout/checkmout/tetra/plots/water.1.tetra_dist_plots.png
Plotting tetranuclotide distance plots for /srv/scratch/z3382651/meta_analysis/megahit_good/bins/water.10.fa (2 of 507)
Plot written to: /srv/scratch/z3382651/meta_analysis/megahit_good/megahitout/checkmout/tetra/plots/water.10.tetra_dist_plots.png
[............................]
Plotting tetranuclotide distance plots for /srv/scratch/z3382651/meta_analysis/megahit_good/bins/water.98.fa (506 of 507)
Plot written to: /srv/scratch/z3382651/meta_analysis/megahit_good/megahitout/checkmout/tetra/plots/water.98.tetra_dist_plots.png
Plotting tetranuclotide distance plots for /srv/scratch/z3382651/meta_analysis/megahit_good/bins/water.99.fa (507 of 507)
Plot written to: /srv/scratch/z3382651/meta_analysis/megahit_good/megahitout/checkmout/tetra/plots/water.99.tetra_dist_plots.png

{ Current stage: 0:27:41.679 || Total: 0:27:41.679 }
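Since the log reports a full path for each plot, one quick check is whether those files exist at the reported locations. The sketch below assumes only the "Plot written to: " prefix visible in the log above. Both function names are hypothetical.

```python
import os

PREFIX = 'Plot written to: '


def reported_plots(log_lines):
    # Collect the paths the log claims were written.
    paths = []
    for line in log_lines:
        line = line.strip()
        if line.startswith(PREFIX):
            paths.append(line[len(PREFIX):])
    return paths


def missing_plots(log_lines):
    # Return the reported paths that are not present on disk.
    return [p for p in reported_plots(log_lines) if not os.path.isfile(p)]
```

Calling missing_plots on the lines of the saved log file would show whether the plots are absent or simply in a different directory than expected.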

VERSION file missing

Hi,

Installation fails because the VERSION file is missing from the checkm/ directory.

~/git_repos/CheckM $ sudo python setup.py install
Traceback (most recent call last):
  File "setup.py", line 13, in <module>
    version=version(),
  File "setup.py", line 8, in version
    versionFile = open(os.path.join(setupDir, 'checkm', 'VERSION'))
IOError: [Errno 2] No such file or directory: '/home/senthil/git_repos/CheckM/checkm/VERSION'

However, if I first run ~/git_repos/CheckM $ echo "1.1" > checkm/VERSION, the installation succeeds:

~/git_repos/CheckM $ sudo python setup.py install
/usr/local/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'install_requires'
  warnings.warn(msg)
running install
running build
running build_py
creating build
creating build/lib
creating build/lib/checkm
copying checkm/profile.py -> build/lib/checkm
copying checkm/defaultValues.py -> build/lib/checkm
copying checkm/markerSets.py -> build/lib/checkm
copying checkm/common.py -> build/lib/checkm
copying checkm/hmmerAligner.py -> build/lib/checkm
copying checkm/coverage.py -> build/lib/checkm
copying checkm/binUnion.py -> build/lib/checkm
copying checkm/binStatistics.py -> build/lib/checkm
copying checkm/genomicSignatures.py -> build/lib/checkm
copying checkm/resultsParser.py -> build/lib/checkm
copying checkm/taxonParser.py -> build/lib/checkm
copying checkm/main.py -> build/lib/checkm
copying checkm/unbinned.py -> build/lib/checkm
copying checkm/prettytable.py -> build/lib/checkm
copying checkm/binComparer.py -> build/lib/checkm
copying checkm/treeParser.py -> build/lib/checkm
copying checkm/markerGeneFinder.py -> build/lib/checkm
copying checkm/binTools.py -> build/lib/checkm
copying checkm/uniqueMarkers.py -> build/lib/checkm
copying checkm/coverageWindows.py -> build/lib/checkm
copying checkm/hmmerModelParser.py -> build/lib/checkm
copying checkm/pplacer.py -> build/lib/checkm
copying checkm/checkmData.py -> build/lib/checkm
copying checkm/PCA.py -> build/lib/checkm
copying checkm/merger.py -> build/lib/checkm
copying checkm/aminoAcidIdentity.py -> build/lib/checkm
copying checkm/__init__.py -> build/lib/checkm
copying checkm/timeKeeper.py -> build/lib/checkm
copying checkm/prodigal.py -> build/lib/checkm
copying checkm/ssuFinder.py -> build/lib/checkm
copying checkm/hmmer.py -> build/lib/checkm
creating build/lib/checkm/plot
copying checkm/plot/binQAPlot.py -> build/lib/checkm/plot
copying checkm/plot/cumulativeLengthPlot.py -> build/lib/checkm/plot
copying checkm/plot/tetraDistPlots.py -> build/lib/checkm/plot
copying checkm/plot/markerGenePosPlot.py -> build/lib/checkm/plot
copying checkm/plot/lengthHistogram.py -> build/lib/checkm/plot
copying checkm/plot/gcBiasPlots.py -> build/lib/checkm/plot
copying checkm/plot/gcPlots.py -> build/lib/checkm/plot
copying checkm/plot/distributionPlots.py -> build/lib/checkm/plot
copying checkm/plot/parallelCoordPlot.py -> build/lib/checkm/plot
copying checkm/plot/pcaPlot.py -> build/lib/checkm/plot
copying checkm/plot/codingDensityPlots.py -> build/lib/checkm/plot
copying checkm/plot/__init__.py -> build/lib/checkm/plot
copying checkm/plot/nxPlot.py -> build/lib/checkm/plot
copying checkm/plot/AbstractPlot.py -> build/lib/checkm/plot
creating build/lib/checkm/test
copying checkm/test/test_seqUtils.py -> build/lib/checkm/test
copying checkm/test/test_markerSets.py -> build/lib/checkm/test
copying checkm/test/test_ecoli.py -> build/lib/checkm/test
copying checkm/test/test_taxonomyUtils.py -> build/lib/checkm/test
copying checkm/test/test_aminoAcidIdentity.py -> build/lib/checkm/test
copying checkm/test/test_genomicSignatures.py -> build/lib/checkm/test
copying checkm/test/test_binStatistics.py -> build/lib/checkm/test
copying checkm/test/__init__.py -> build/lib/checkm/test
creating build/lib/checkm/util
copying checkm/util/img.py -> build/lib/checkm/util
copying checkm/util/taxonomyUtils.py -> build/lib/checkm/util
copying checkm/util/seqUtils.py -> build/lib/checkm/util
copying checkm/util/pfam.py -> build/lib/checkm/util
copying checkm/util/__init__.py -> build/lib/checkm/util
copying checkm/VERSION -> build/lib/checkm
copying checkm/DATA_CONFIG -> build/lib/checkm
running build_scripts
creating build/scripts-2.7
copying and adjusting bin/checkm -> build/scripts-2.7
changing mode of build/scripts-2.7/checkm from 644 to 755
running install_lib
creating /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/profile.py -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/defaultValues.py -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/markerSets.py -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/common.py -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/hmmerAligner.py -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/coverage.py -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/binUnion.py -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/binStatistics.py -> /usr/local/lib/python2.7/site-packages/checkm
creating /usr/local/lib/python2.7/site-packages/checkm/test
copying build/lib/checkm/test/test_seqUtils.py -> /usr/local/lib/python2.7/site-packages/checkm/test
copying build/lib/checkm/test/test_markerSets.py -> /usr/local/lib/python2.7/site-packages/checkm/test
copying build/lib/checkm/test/test_ecoli.py -> /usr/local/lib/python2.7/site-packages/checkm/test
copying build/lib/checkm/test/test_taxonomyUtils.py -> /usr/local/lib/python2.7/site-packages/checkm/test
copying build/lib/checkm/test/test_aminoAcidIdentity.py -> /usr/local/lib/python2.7/site-packages/checkm/test
copying build/lib/checkm/test/test_genomicSignatures.py -> /usr/local/lib/python2.7/site-packages/checkm/test
copying build/lib/checkm/test/test_binStatistics.py -> /usr/local/lib/python2.7/site-packages/checkm/test
copying build/lib/checkm/test/__init__.py -> /usr/local/lib/python2.7/site-packages/checkm/test
copying build/lib/checkm/genomicSignatures.py -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/resultsParser.py -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/taxonParser.py -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/main.py -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/unbinned.py -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/prettytable.py -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/binComparer.py -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/treeParser.py -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/DATA_CONFIG -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/markerGeneFinder.py -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/binTools.py -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/uniqueMarkers.py -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/coverageWindows.py -> /usr/local/lib/python2.7/site-packages/checkm
creating /usr/local/lib/python2.7/site-packages/checkm/plot
copying build/lib/checkm/plot/binQAPlot.py -> /usr/local/lib/python2.7/site-packages/checkm/plot
copying build/lib/checkm/plot/cumulativeLengthPlot.py -> /usr/local/lib/python2.7/site-packages/checkm/plot
copying build/lib/checkm/plot/tetraDistPlots.py -> /usr/local/lib/python2.7/site-packages/checkm/plot
copying build/lib/checkm/plot/markerGenePosPlot.py -> /usr/local/lib/python2.7/site-packages/checkm/plot
copying build/lib/checkm/plot/lengthHistogram.py -> /usr/local/lib/python2.7/site-packages/checkm/plot
copying build/lib/checkm/plot/gcBiasPlots.py -> /usr/local/lib/python2.7/site-packages/checkm/plot
copying build/lib/checkm/plot/gcPlots.py -> /usr/local/lib/python2.7/site-packages/checkm/plot
copying build/lib/checkm/plot/distributionPlots.py -> /usr/local/lib/python2.7/site-packages/checkm/plot
copying build/lib/checkm/plot/parallelCoordPlot.py -> /usr/local/lib/python2.7/site-packages/checkm/plot
copying build/lib/checkm/plot/pcaPlot.py -> /usr/local/lib/python2.7/site-packages/checkm/plot
copying build/lib/checkm/plot/codingDensityPlots.py -> /usr/local/lib/python2.7/site-packages/checkm/plot
copying build/lib/checkm/plot/__init__.py -> /usr/local/lib/python2.7/site-packages/checkm/plot
copying build/lib/checkm/plot/nxPlot.py -> /usr/local/lib/python2.7/site-packages/checkm/plot
copying build/lib/checkm/plot/AbstractPlot.py -> /usr/local/lib/python2.7/site-packages/checkm/plot
copying build/lib/checkm/hmmerModelParser.py -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/VERSION -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/pplacer.py -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/checkmData.py -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/PCA.py -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/merger.py -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/aminoAcidIdentity.py -> /usr/local/lib/python2.7/site-packages/checkm
creating /usr/local/lib/python2.7/site-packages/checkm/util
copying build/lib/checkm/util/img.py -> /usr/local/lib/python2.7/site-packages/checkm/util
copying build/lib/checkm/util/taxonomyUtils.py -> /usr/local/lib/python2.7/site-packages/checkm/util
copying build/lib/checkm/util/seqUtils.py -> /usr/local/lib/python2.7/site-packages/checkm/util
copying build/lib/checkm/util/pfam.py -> /usr/local/lib/python2.7/site-packages/checkm/util
copying build/lib/checkm/util/__init__.py -> /usr/local/lib/python2.7/site-packages/checkm/util
copying build/lib/checkm/__init__.py -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/timeKeeper.py -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/prodigal.py -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/ssuFinder.py -> /usr/local/lib/python2.7/site-packages/checkm
copying build/lib/checkm/hmmer.py -> /usr/local/lib/python2.7/site-packages/checkm
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/profile.py to profile.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/defaultValues.py to defaultValues.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/markerSets.py to markerSets.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/common.py to common.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/hmmerAligner.py to hmmerAligner.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/coverage.py to coverage.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/binUnion.py to binUnion.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/binStatistics.py to binStatistics.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/test/test_seqUtils.py to test_seqUtils.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/test/test_markerSets.py to test_markerSets.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/test/test_ecoli.py to test_ecoli.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/test/test_taxonomyUtils.py to test_taxonomyUtils.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/test/test_aminoAcidIdentity.py to test_aminoAcidIdentity.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/test/test_genomicSignatures.py to test_genomicSignatures.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/test/test_binStatistics.py to test_binStatistics.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/test/__init__.py to __init__.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/genomicSignatures.py to genomicSignatures.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/resultsParser.py to resultsParser.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/taxonParser.py to taxonParser.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/main.py to main.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/unbinned.py to unbinned.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/prettytable.py to prettytable.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/binComparer.py to binComparer.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/treeParser.py to treeParser.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/markerGeneFinder.py to markerGeneFinder.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/binTools.py to binTools.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/uniqueMarkers.py to uniqueMarkers.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/coverageWindows.py to coverageWindows.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/plot/binQAPlot.py to binQAPlot.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/plot/cumulativeLengthPlot.py to cumulativeLengthPlot.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/plot/tetraDistPlots.py to tetraDistPlots.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/plot/markerGenePosPlot.py to markerGenePosPlot.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/plot/lengthHistogram.py to lengthHistogram.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/plot/gcBiasPlots.py to gcBiasPlots.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/plot/gcPlots.py to gcPlots.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/plot/distributionPlots.py to distributionPlots.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/plot/parallelCoordPlot.py to parallelCoordPlot.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/plot/pcaPlot.py to pcaPlot.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/plot/codingDensityPlots.py to codingDensityPlots.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/plot/__init__.py to __init__.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/plot/nxPlot.py to nxPlot.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/plot/AbstractPlot.py to AbstractPlot.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/hmmerModelParser.py to hmmerModelParser.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/pplacer.py to pplacer.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/checkmData.py to checkmData.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/PCA.py to PCA.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/merger.py to merger.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/aminoAcidIdentity.py to aminoAcidIdentity.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/util/img.py to img.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/util/taxonomyUtils.py to taxonomyUtils.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/util/seqUtils.py to seqUtils.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/util/pfam.py to pfam.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/util/__init__.py to __init__.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/__init__.py to __init__.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/timeKeeper.py to timeKeeper.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/prodigal.py to prodigal.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/ssuFinder.py to ssuFinder.pyc
byte-compiling /usr/local/lib/python2.7/site-packages/checkm/hmmer.py to hmmer.pyc
running install_scripts
copying build/scripts-2.7/checkm -> /usr/local/bin
changing mode of /usr/local/bin/checkm to 755
running install_egg_info
Writing /usr/local/lib/python2.7/site-packages/checkm_genome-1.1-py2.7.egg-info

Sen
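For the missing VERSION file reported above, a minimal sketch of a version reader that accepts an optional fallback is shown here. The function name and the default parameter are hypothetical and this is not the project's actual setup.py. Only the checkm/VERSION location is taken from the traceback.

```python
import os


def read_version(setup_dir, default=None):
    """Read the first line of checkm/VERSION under setup_dir.

    If the file cannot be opened, return default when one is given,
    otherwise re-raise the original error.
    """
    path = os.path.join(setup_dir, 'checkm', 'VERSION')
    try:
        with open(path) as handle:
            return handle.readline().strip()
    except IOError:
        if default is None:
            raise
        return default
```

Whether a fallback is desirable is a design choice. Failing loudly, as the original does, at least makes a missing file obvious.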
