sanger-pathogens / iva

de novo virus assembler of Illumina paired reads

Home Page: http://sanger-pathogens.github.io/iva/

License: Other

Python 64.46% Java 9.93% Perl 20.61% Shell 3.84% Roff 0.50% Dockerfile 0.66%

genomics sequencing next-generation-sequencing research bioinformatics bioinformatics-pipeline global-health infectious-diseases pathogen

iva's Introduction

IVA

Iterative Virus Assembler - de novo virus assembler of Illumina paired reads.

PLEASE NOTE: we currently do not have the resources to provide support for IVA, so please do not expect a reply if you flag any issue.

Introduction
Installation
Running the tests
Usage
License
Feedback/Issues
Citation

Introduction

IVA is a de novo assembler designed to assemble virus genomes that have no repeat sequences, using Illumina read pairs sequenced from mixed populations at extremely high and variable depth.

For more information, please read the IVA publication.

Installation

For installation instructions, please refer to the IVA website.

Running the tests

The tests can be run from the top-level directory:

python setup.py test

Usage

usage: iva [options] {-f reads_fwd -r reads_rev | --fr reads} <output directory>

positional arguments:
  Output directory      Name of output directory (must not already exist)

optional arguments:
  -h, --help            show this help message and exit

Input and output:
  -f filename[.gz], --reads_fwd filename[.gz]
                        Name of forward reads fasta/q file. Must be used in
                        conjunction with --reads_rev
  -r filename[.gz], --reads_rev filename[.gz]
                        Name of reverse reads fasta/q file. Must be used in
                        conjunction with --reads_fwd
  --fr filename[.gz]    Name of interleaved fasta/q file
  --keep_files          Keep intermediate files (could be many!). Default is
                        to delete all unnecessary files
  --contigs filename[.gz]
                        Fasta file of contigs to be extended. Incompatible
                        with --reference
  --reference filename[.gz]
                        EXPERIMENTAL! This option is EXPERIMENTAL, not
                        recommended, and has not been tested! Fasta file of
                        reference genome, or parts thereof. IVA will try to
                        assemble one contig per sequence in this file.
                        Incompatible with --contigs
  -v, --verbose         Be verbose by printing messages to stdout. Use up to
                        three times for increasing verbosity.

SMALT mapping options:
  -k INT, --smalt_k INT
                        kmer hash length in SMALT (the -k option in smalt
                        index) [19]
  -s INT, --smalt_s INT
                        kmer hash step size in SMALT (the -s option in smalt
                        index) [11]
  -y FLOAT, --smalt_id FLOAT
                        Minimum identity threshold for mapping to be reported
                        (the -y option in smalt map) [0.5]

Contig options:
  --ctg_first_trim INT  Number of bases to trim off the end of every contig
                        before extending for the first time [25]
  --ctg_iter_trim INT   During iterative extension, number of bases to trim
                        off the end of a contig when extension fails (then try
                        extending again) [10]
  --ext_min_cov INT     Minimum kmer depth needed to use that kmer to extend a
                        contig [10]
  --ext_min_ratio FLOAT
                        Sets N, where kmer for extension must be at least N
                        times more abundant than next most common kmer [4]
  --ext_max_bases INT   Maximum number of bases to try to extend on each
                        iteration [100]
  --ext_min_clip INT    Set minimum number of bases soft clipped off a read
                        for those bases to be used for extension [3]
  --max_contigs INT     Maximum number of contigs allowed in the assembly. No
                        more seeds generated if the cutoff is reached [50]

Seed generation options:
  --make_new_seeds      When no more contigs can be extended, generate a new
                        seed. This is forced to be true when --contigs is not
                        used
  --seed_start_length INT
                        When making a seed sequence, use the most common kmer
                        of this length. Default is to use the minimum of
                        (median read length, 95). Warning: it is not
                        recommended to set this higher than 95
  --seed_stop_length INT
                        Stop extending seed using perfect matches from reads
                        when this length is reached. Future extensions are
                        then made by treating the seed as a contig
                        [0.9*max_insert]
  --seed_min_kmer_cov INT
                        Minimum kmer coverage of initial seed [25]
  --seed_max_kmer_cov INT
                        Maximum kmer coverage of initial seed [1000000]
  --seed_ext_max_bases INT
                        Maximum number of bases to try to extend on each
                        iteration [50]
  --seed_overlap_length INT
                        Number of overlapping bases needed between read and
                        seed to use that read to extend [seed_start_length]
  --seed_ext_min_cov INT
                        Minimum kmer depth needed to use that kmer to extend a
                        contig [10]
  --seed_ext_min_ratio FLOAT
                        Sets N, where kmer for extension must be at least N
                        times more abundant than next most common kmer [4]

Read trimming options:
  --trimmomatic FILENAME
                        Provide location of trimmomatic.jar file to enable
                        read trimming. Required if --adapters used
  --trimmo_qual STRING  Trimmomatic options used to quality trim reads
                        [LEADING:10 TRAILING:10 SLIDINGWINDOW:4:20]
  --adapters FILENAME   Fasta file of adapter sequences to be trimmed off
                        reads. If used, must also use --trimmomatic. Default
                        is file of adapters supplied with IVA
  --min_trimmed_length INT
                        Minimum length of read after trimming [50]
  --pcr_primers FILENAME
                        FASTA file of primers. The first perfect match found
                        to a sequence in the primers file will be trimmed off
                        the start of each read. This is run after trimmomatic
                        (if --trimmomatic used)

Other options:
  -i INT, --max_insert INT
                        Maximum insert size (includes read length). Reads with
                        inferred insert size more than the maximum will not be
                        used to extend contigs [800]
  -t INT, --threads INT
                        Number of threads to use [1]
  --kmc_onethread       Force kmc to use one thread. By default the value of
                        -t/--threads is used when running kmc
  --strand_bias FLOAT in [0,0.5]
                        Set strand bias cutoff of mapped reads when trimming
                        contig ends, in the interval [0,0.5]. A value of x
                        means that a base needs min(fwd_depth, rev_depth) /
                        total_depth <= x. The only time this should be used is
                        with libraries with overlapping reads (ie fragment
                        length < 2*read length), and even then, it can make
                        results worse. If used, try a low value like 0.1 first
                        [0]
  --test                Run using built in test data. All other options will
                        be ignored, except the mandatory output directory, and
                        --trimmomatic and --threads can be also be used
  --version             show program's version number and exit

For usage help and examples, see the IVA wiki page.
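
The --ext_min_cov and --ext_min_ratio descriptions above amount to a simple acceptance rule for a candidate extension kmer. The following is a minimal sketch of that rule as the help text describes it, not IVA's actual implementation. The function name pick_extension_kmer and the handling of a lone candidate (accepted if it meets the depth cutoff) are assumptions made for illustration.

```python
from collections import Counter


def pick_extension_kmer(kmer_counts, min_cov=10, min_ratio=4.0):
    """Return the kmer to extend with, or None if no kmer qualifies.

    Hypothetical helper illustrating the help text for --ext_min_cov
    and --ext_min_ratio; it is not taken from the IVA source code.
    """
    ranked = Counter(kmer_counts).most_common(2)
    if not ranked:
        return None
    best_kmer, best_count = ranked[0]
    # The most common kmer must reach the minimum depth.
    if best_count < min_cov:
        return None
    # It must also be at least min_ratio times as abundant as the runner-up.
    # Assumption: with no runner-up, the ratio test passes trivially.
    if len(ranked) == 2 and best_count < min_ratio * ranked[1][1]:
        return None
    return best_kmer
```

With the default cutoffs, counts of 40 and 10 pass the ratio test (40 >= 4 * 10), whereas counts of 39 and 10 do not, so no extension would be made in the second case.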

License

IVA is free software, licensed under GPLv3.

Feedback/Issues

Please report any issues to the issues page.

PLEASE NOTE: we currently do not have the resources to provide support for IVA, so please do not expect a reply if you flag any issue.

Citation

If you use this software please cite:

IVA: accurate de novo assembly of RNA virus genomes.
Hunt M, Gall A, Ong SH, Brener J, Ferns B, Goulder P, Nastouli E, Keane JA, Kellam P, Otto TD.
Bioinformatics. 2015 Jul 15;31(14):2374-6. doi: 10.1093/bioinformatics/btv120. Epub 2015 Feb 28.

Adapter sequences:
Optimal enzymes for amplifying sequencing libraries.
Quail, M. A. et al. Nat. Methods 9, 10-1 (2012).

GAGE:
GAGE: A critical evaluation of genome assemblies and assembly algorithms.
Salzberg, S. L. et al. Genome Res. 22, 557-67 (2012).

KMC:
Disk-based k-mer counting on a PC.
Deorowicz, S., Debudaj-Grabysz, A. & Grabowski, S. BMC Bioinformatics 14, 160 (2013).

Kraken:
Kraken: ultrafast metagenomic sequence classification using exact alignments.
Wood, D. E. & Salzberg, S. L. Genome Biol. 15, R46 (2014).

MUMmer:
Versatile and open software for comparing large genomes.
Kurtz, S. et al. Genome Biol. 5, R12 (2004).

R:
R: A language and environment for statistical computing.
R Core Team (2013). R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/.

RATT:
RATT: Rapid Annotation Transfer Tool.
Otto, T. D., Dillon, G. P., Degrave, W. S. & Berriman, M. Nucleic Acids Res. 39, e57 (2011).

SAMtools:
The Sequence Alignment/Map format and SAMtools.
Li, H. et al. Bioinformatics 25, 2078-9 (2009).

Trimmomatic:
Trimmomatic: A flexible trimmer for Illumina Sequence Data.
Bolger, A. M., Lohse, M. & Usadel, B. Bioinformatics 1-7 (2014).

iva's Issues

multithreading smalt error

When running IVA on full genome influenza samples and enabling multiple threads, I get the following smalt error:

smalt.c:807 ERROR: The two FASTA/FASTQ input file have different numbers of reads
[W::sam_read1] parse error at line 812880
[main_samview] truncated file.

I do not get this error if I run the same sample with a single thread.

IVA sensitivity threshold

Hi,
I am new to IVA. I am analyzing paired-end Illumina HIV data with reads of ~250-350 bp covering regions of up to ~2 kb.
With my local pipeline, I cleaned and filtered the reads and obtained the allele frequencies along these regions. I would like to use IVA to reconstruct representative variants for these regions.
With the standard command, it generated only one contig. Could I adjust some of the options to recover less predominant variants as well, rather than a single contig?
I tried various settings (e.g. the smalt_id threshold) but still got only one contig, although I expect some diversity.
Can you suggest any parameters to adjust in order to obtain more representative, less frequent haplotypes?
Is it possible to get the relative frequencies of the contigs along with the contigs themselves?

thanks!!!!

example:
iva --smalt_id 0.01 --fr Data_Interleaved.fastq Output_dir

samtools sort command fails for samtools 1.3.1

$ iva -f new_1.fastq -r new_2.fastq IVAout
The following command failed with exit code 1
samtools sort -@1 -m 500M IVAout/tmp.trim_strand_biased_ends.iqmm5fcr/out.unsorted.bam IVAout/tmp.trim_strand_biased_ends.iqmm5fcr/out

The output was:

[bam_sort] Use -T PREFIX / -o FILE to specify temporary and final output files
Usage: samtools sort [options...] [in.bam]
Options:
-l INT Set compression level, from 0 (uncompressed) to 9 (best)
-m INT Set maximum memory per thread; suffix K/M/G recognized [768M]
-n Sort by read name
-o FILE Write final output to FILE rather than standard output
-T PREFIX Write temporary files to PREFIX.nnnn.bam
-@, --threads INT
Set number of sorting and compression threads [1]
--input-fmt-option OPT[=VAL]
Specify a single input file format option in the form
of OPTION or OPTION=VALUE
-O, --output-fmt FORMAT[,OPT[=VAL]]...
Specify output format (SAM, BAM, CRAM)
--output-fmt-option OPT[=VAL]
Specify a single output file format option in the form
of OPTION or OPTION=VALUE
--reference FILE
Reference sequence FASTA FILE [null]

Tests fail with KMC 2.3 from Homebrew on OS X

The test suite (python3 setup.py test) fails when using KMC 2.3 (from Homebrew) on OS X 10.11:

[...]
Test run_trimmomatic ... ok
test_process_seeds (seed_processor_test.TestSeedProcessor)
Test process_seeds ... The following command failed with exit code 1
bash run_kmc.sh

The output was:

*
Error: Cannot open temporary file /Users/satta/foss/iva/tmp.run_kmc.4rxfrvg2/kmc_00253.bin

The following command failed with exit code 1
bash run_kmc.sh

The output was:

*
Error: Cannot open temporary file /Users/satta/foss/iva/tmp.run_kmc.8_00xctz/kmc_00253.bin

This error apparently comes from KMC itself; all the previous temporary files have length zero. All dependencies are also installed from Homebrew.

Note that the tests work fine on Linux with KMC 2.3, so this may be an OS X quirk.
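
One possible cause, which this report does not confirm, is the per-process limit on open files: the failing temporary file is numbered 00253, which is close to a soft limit of 256, and another report on this page raises the limit with ulimit -n 2048 before running iva. The sketch below only checks the current soft limit from Python. The function name soft_limit_allows and the example file counts are hypothetical, and how many files KMC actually opens is not stated anywhere in this document.

```python
import resource


def soft_limit_allows(files_needed, soft_limit=None):
    """Return True if files_needed open files fit under the soft limit.

    Hypothetical diagnostic helper; when soft_limit is None the current
    RLIMIT_NOFILE soft limit of this process is looked up.
    """
    if soft_limit is None:
        soft_limit, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # An unlimited setting can never be the bottleneck.
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return files_needed < soft_limit


if __name__ == '__main__':
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print('soft limit:', soft, 'hard limit:', hard)
```

If the soft limit turns out to be low, raising it in the shell before running the tests is a cheap experiment to rule this explanation in or out.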

Influenza segments missing

Hi,
For most cases, IVA works great for flu samples, but sometimes, if the depth is not very high (<50x), a segment (usually PB2) is completely missing from the contigs. I have been experimenting with --seed_ext_min_cov 1 --seed_min_kmer_cov 10, and that seems to work. Is it a good idea to set --seed_ext_min_cov 1?
Thanks

thread option not passed to kmc

Hi,

I noticed that IVA's thread option (which is used for SMALT) is not passed down to KMC (__run_kmc() and _run_kmc_with_script()), even though it also supports multithreading via -t. No big problem, but an easy to implement improvement.

Thanks for IVA
Andreas

chimeric/misassembled contigs generated by IVA?

Hi all,
I am interested in using IVA to assemble RNA virus genomes.
What is the reported level of chimeric contigs yielded using this assembler?
Does IVA provide an improved performance regarding misassembled contigs?

Thank you,
Guillermo

Blank sequence causes TypeError

Thanks for publishing this assembly tool, it's been very useful for us. I wanted to let you know about a problem we ran into with some bad input data that was hard to track down.

If one of the input reads has a blank sequence, then the pysam reader reads it in as None instead of a blank string. Then, I see the following error:

$ iva -f 2140A-HCV_S17_L001_R1_001.fastq -r 2140A-HCV_S17_L001_R2_001.fastq scratch
Traceback (most recent call last):
  File "/mnt/data/don/git/MiCall/venv_micall/bin/iva", line 286, in <module>
    assembly.read_pair_extend(reads_prefix, 'iteration')
  File "/mnt/data/don/git/MiCall/venv_micall/lib/python3.6/site-packages/iva/assembly.py", line 423, in read_pair_extend
    self._read_pair_extension_iterations(current_reads_prefix, out_prefix + '.' + str(i))
  File "/mnt/data/don/git/MiCall/venv_micall/lib/python3.6/site-packages/iva/assembly.py", line 358, in _read_pair_extension_iterations
    bases_added = self._extend_with_reads(reads_prefix, out_prefix + '.1', no_map_contigs)
  File "/mnt/data/don/git/MiCall/venv_micall/lib/python3.6/site-packages/iva/assembly.py", line 340, in _extend_with_reads
    bases_added = self._extend_contigs_with_bam(bam, out_prefix=reads_prefix)
  File "/mnt/data/don/git/MiCall/venv_micall/lib/python3.6/site-packages/iva/assembly.py", line 183, in _extend_contigs_with_bam
    print(mapping.sam_to_fasta(sam), file=fa_out1)
  File "/mnt/data/don/git/MiCall/venv_micall/lib/python3.6/site-packages/pyfastaq/sequences.py", line 420, in __str__
    return '>' + self.id + '\n' + '\n'.join(self.seq[i:i+Fasta.line_length] for i in range(0, len(self), Fasta.line_length))
  File "/mnt/data/don/git/MiCall/venv_micall/lib/python3.6/site-packages/pyfastaq/sequences.py", line 173, in __len__
    return len(self.seq)
TypeError: object of type 'NoneType' has no len()
$

To reproduce this error, unzip the attached file, and try it out. This file is a minimal example with about 30 Hepatitis C reads that will assemble successfully if you remove the last read.
2140A-HCV_S17.zip

It would be nice if mapping.sam_to_fasta() either checked for None and wrote out a blank sequence, or if the earlier code checked for blank sequences and failed with a more helpful error message.
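
The first suggestion, treating a missing sequence as a blank one, could look roughly like the sketch below. It is a stand-alone illustration rather than a patch to IVA or pyfastaq: the function name fasta_record and its line_length parameter are hypothetical, and it formats a record from a plain name and sequence instead of a pysam object.

```python
def fasta_record(name, seq, line_length=60):
    """Format one FASTA record, treating a missing sequence as empty.

    Hypothetical helper; seq may be None, which is what the report says
    the reader returns for a read with a blank sequence.
    """
    if seq is None:
        seq = ''
    # Wrap the sequence into fixed-width lines; an empty sequence gives none.
    lines = [seq[i:i + line_length] for i in range(0, len(seq), line_length)]
    return '>' + name + '\n' + '\n'.join(lines)
```

The alternative suggested above, failing early with a clear message, would replace the `seq = ''` line with a raised exception that names the offending read.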

Thanks again.

Excessive number of iterations

I am using IVA to assemble sequence reads from multiple samples from the same virus. While most samples will assemble in hours, some are taking days with hundreds of iterations without ever completing (I killed a few jobs after a week of running). Is this expected behavior? If so, is there a way to limit the number of iterations and successfully output the contigs that were able to be assembled? Any ideas what aspects of the data might cause this behavior?

test run_gage fails

When I do "python setup.py test", the "run_gage" test fails with the following message:

Traceback (most recent call last):
  File "/Users/myname/iva/iva/tests/qc_external_test.py", line 36, in test_run_gage
    self.assertEqual(got_lines[1:], expected_lines[1:])
AssertionError: Lists differ: ['Con[223 chars]n', 'Chaff bases: 0\n', 'Duplicated Reference [529 chars]0\n'] != ['Con[223 chars]n', 'Genome Size: 3000\n', 'Assembly Size: 299[870 chars]s\n']

I am running the tests on a university-owned CentOS 7.4 machine where I do not have root privileges. I tried running the tests with both Python 2 and 3, and both gave me this error. I proceeded with the installation anyway, and the iva executable seems to work fine.

Attached is the temporary folder created for the test. I am not sure what other information I should provide at this point; please let me know and I will provide it. Thanks for your help.
tmp.qc_external_test_run_gage.zip

iva test failing with samtools1.4

$ iva --test --trimmomatic /data/Trimmomatic-0.36/trimmomatic-0.36.jar /home/user123/ivatest/
Running iva in test mode...
Copied input test files into here: /home/user123/ivatest
Current working directory: /data/home/user123/ivatest
Running iva on the test data with the command:
/opt/python3.6/bin/iva --threads 1 --trimmomatic /data/Trimmomatic-0.36/trimmomatic-0.36.jar --pcr_primers hiv_pcr_primers.fa -f reads_1.fq.gz -r reads_2.fq.gz iva.out
The following command failed with exit code 1
/opt/python3.6/bin/iva --threads 1 --trimmomatic /data/Trimmomatic-0.36/trimmomatic-0.36.jar --pcr_primers hiv_pcr_primers.fa -f reads_1.fq.gz -r reads_2.fq.gz iva.out

The output was:

Traceback (most recent call last):
  File "/opt/python3.6/bin/iva", line 129, in <module>
    iva.external_progs.get_all_versions(iva.external_progs.assembly_progs)
  File "/opt/python3.6/lib/python3.6/site-packages/iva/external_progs.py", line 103, in get_all_versions
    raise Error('Found version ' + version + ' of ' + prog + ' but must be at least ' + minimum_versions[prog] + '. Cannot continue')
iva.external_progs.Error: Found version 0.1.18 of samtools but must be at least 0.1.19. Cannot continue

support an explicit output argument

Please add support for a -o or --output_dir argument for the output folder. I am currently trying to run IVA as part of an automated pipeline and get an error with this command:

iva --threads $T --verbose 2 -f $forwards -r $reverse $output_dir

$output_dir is a path to the output directory

I get this error:
iva: error: unrecognized arguments assembly/samplename

How can this be resolved?
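
A possible explanation lies in how the command is written. The usage text describes -v/--verbose as a flag to be repeated "up to three times", which suggests a counting flag that takes no value. Under that assumption, the "2" after --verbose is read as the output directory, and the real directory is left over as an unrecognized argument. The sketch below reproduces this with a reduced, hypothetical parser; it is not IVA's actual argument parser.

```python
import argparse


def build_parser():
    # Hypothetical reduced parser. Assumption: -v/--verbose is a counting
    # flag, as the "use up to three times" wording in the help suggests.
    parser = argparse.ArgumentParser(prog='iva')
    parser.add_argument('-f', '--reads_fwd')
    parser.add_argument('-r', '--reads_rev')
    parser.add_argument('-t', '--threads', type=int, default=1)
    parser.add_argument('-v', '--verbose', action='count', default=0)
    parser.add_argument('outdir')
    return parser


parser = build_parser()

# "--verbose 2": the flag takes no value, so "2" becomes the output
# directory and the intended directory is left over.
bad, leftovers = parser.parse_known_args(
    ['--threads', '4', '--verbose', '2',
     '-f', 'r1.fq', '-r', 'r2.fq', 'assembly/sample'])

# Repeating the flag sets the verbosity level instead.
good, no_leftovers = parser.parse_known_args(
    ['--threads', '4', '-vv',
     '-f', 'r1.fq', '-r', 'r2.fq', 'assembly/sample'])
```

If this is indeed the cause, writing -vv (or --verbose --verbose) in place of --verbose 2 should let the existing positional output directory be parsed as intended.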

Failed to make first seed. Cannot continue

I am running into the error
"Failed to make first seed. Cannot continue"

I am running on mac with the command

ulimit -n 2048; iva --threads 8 -f read1_val_1.fq.gz -r read2_val_2.fq.gz Assembly

samtools errors caused by mapping.py

Hi all,

changing these two lines in mapping.py
seems to prevent an error thrown by samtools
which made IVA runs exit.

old:

    #sort_cmd = 'samtools sort -@' + str(threads) + ' -m ' + str(thread_mem) + 'M ' + intermediate_bam + ' ' + out_prefix

new:

    sort_cmd = 'samtools sort -@' + str(threads) + ' -m ' + str(thread_mem) + 'M ' + intermediate_bam + ' -o' + final_bam

old:

    #index_cmd = 'samtools index ' + out_prefix + '.bam'

new:

    index_cmd = 'samtools index ' + final_bam

Cheers,
Katharina

IVA run error

The following command failed with exit code 1
bash run_kmc.sh

The output was:

**
Error: Cannot open temporary file /Users/Nicholas/Desktop/shiver/MyOutputDirectory/tmp.run_kmc.9_d1enoh/kmc_00253.bin

iva --test: Error! File "hiv_pcr_primers.fa" not found. Cannot continue

Hi,

I tried installing iva via pip3 or with docker on my Mac and each time I tried to run iva --test testdir, it gives me this:

Running iva in test mode...
Copied input test files into here: /Users/user/testdir
Current working directory: /Users/user/testdir
Running iva on the test data with the command:
/Library/Frameworks/Python.framework/Versions/3.8/bin/iva --threads 1 --pcr_primers hiv_pcr_primers.fa -f reads_1.fq.gz -r reads_2.fq.gz iva.out
The following command failed with exit code 1
/Library/Frameworks/Python.framework/Versions/3.8/bin/iva --threads 1 --pcr_primers hiv_pcr_primers.fa -f reads_1.fq.gz -r reads_2.fq.gz iva.out

The output was:

Error! File "hiv_pcr_primers.fa" not found. Cannot continue
Error! File "hiv_pcr_primers.fa" not found. Cannot continue
Error in input reads files. Cannot continue

The file hiv_pcr_primers.fa exists in testdir. Why can't it find it?

I've also tried using iva with real data and I get the same "file not found" errors.

Thanks for your assistance!

Support samtools 1.3

"samtools sort" in version 1.3 dropped support for old usage. Need to make IVA compatible with samtools 0.1.19-1.3.

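One way to stay compatible with both usages is to choose the command form from the detected samtools version. The sketch below assumes the version is already available as a tuple of integers. The function name samtools_sort_cmd and its parameters are hypothetical, and this is not the change that was made in IVA. The -@ and -m options are the ones that appear in the failing command reported elsewhere on this page.

```python
def samtools_sort_cmd(version, in_bam, out_prefix, threads=1, mem_mb=500):
    """Build a 'samtools sort' command for the given samtools version.

    version is a tuple of ints such as (0, 1, 19) or (1, 3, 1).
    Hypothetical helper written for illustration only.
    """
    base = ['samtools', 'sort', '-@', str(threads), '-m', str(mem_mb) + 'M']
    if version >= (1, 3):
        # Newer usage: the output file is named explicitly with -o.
        return base + ['-o', out_prefix + '.bam', in_bam]
    # Legacy usage: input BAM followed by an output prefix; samtools
    # appends '.bam' to the prefix itself.
    return base + [in_bam, out_prefix]
```

Either way the sorted file ends up at out_prefix + '.bam', so a following samtools index call does not need to know which form was used.
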
iva --version does not give the correct version

I have installed the latest version of iva (1.0.11) from GitHub; however:
$ iva --version
1.0.8

I am 100% certain that I am using the latest installed version of iva:
$ which iva
/usr/local/bin/lmod/iva/1.0.11/venv/bin/iva

Funnily enough, we had the same issue with 1.0.10 and 1.0.9-2.

How can we be sure that we are using the correct version?

Kind regards,
Elisabeth

Dead link to sanger virtual machine on main page (https://sanger-pathogens.github.io/iva/)

Under "Introduction" at https://sanger-pathogens.github.io/iva/ there is a link in the sentence: "IVA is installed on the Sanger pathogens virtual machine" that points to http://sanger-pathogens.github.io/pathogens-vm/, which is dead.

IVA: kmc error

Hi,

I tried the following command:

iva --trimmomatic trimmomatic-0.39.jar --adapters ../adapters.fasta -f ../AKULO_S1_L001_R1_001.fastq.gz -r ../AKULO_S1_L001_R2_001.fastq.gz MyOutputDirectory/ --threads 96

But it failed with this error:

The following command failed with exit code 1
bash run_kmc.sh

The output was:

KMC dump ver. 2.1.1 (2015-01-22)

Usage:
kmc_dump [options] <kmc_database> <output_file>
Parameters:
<kmc_database> - kmer_counter's output
Options:
-ci<value> - print k-mers occurring less than <value> times
-cx<value> - print k-mers occurring more of than <value> times

I am using Ubuntu 20.04. The info.txt file contains:

/home/admin1/anaconda3/bin/iva --trimmomatic trimmomatic-0.39.jar --adapters ../adapters.fasta -f ../AKULO_S1_L001_R1_001.fastq.gz -r ../AKULO_S1_L001_R2_001.fastq.gz MyOutputDirectory/ --threads 96
IVA version 1.0.8
Using kmc version 2.1.1
Using kmc_dump version 2.1.1
Using nucmer version UNKNOWN ... I tried running this to get the version: "nucmer --version" and the output didn't match this regular expression: "^NUCmer \(NUCleotide MUMmer\) version (.*)$"
Using samtools version 1.10
Using smalt version 0.7.6

Any help is appreciated,

Waqas.

Different results on repeated runs

Thanks for publishing IVA, it's been really helpful in our project. However, when I try to assemble the same reads multiple times, I sometimes get different results. Here's a small example that tries to assemble 200 read pairs that were randomly generated from the first 2000 bases of HIV's pol gene. The test script assembles them 20 times, and prints the location of each contig within the pol gene. You can see that the contig sizes and locations change from run to run.

$ sudo docker run -it --rm sangerpathogens/iva
root@8d1ab295e484:/# wget -qq https://raw.githubusercontent.com/cfe-lab/MiCall/55f622e236286f2d003b255bc11a8b1080168335/micall/tests/microtest/2180A-HIV_S22_L001_R1_001.fastq https://raw.githubusercontent.com/cfe-lab/MiCall/55f622e236286f2d003b255bc11a8b1080168335/micall/tests/microtest/2180A-HIV_S22_L001_R2_001.fastq https://raw.githubusercontent.com/cfe-lab/MiCall/55f622e236286f2d003b255bc11a8b1080168335/micall/tests/repeat_iva.py
root@8d1ab295e484:/# python3 repeat_iva.py 
contig.00001: 125-1080, contig.00002: 1383-1917, contig.00003: 1031-1474
contig.00001: 125-1071, contig.00002: 1031-1474, contig.00003: 1447-1806
contig.00001: 125-1080, contig.00002: 1031-1474, contig.00003: 1447-1806
contig.00001: 125-1080, contig.00002: 1383-1917, contig.00003: 1031-1474
contig.00001: 125-1071, contig.00002: 1383-1917, contig.00003: 1031-1474
contig.00001: 125-1080, contig.00002: 1383-1917, contig.00003: 1031-1474
contig.00001: 125-1071, contig.00002: 1031-1474, contig.00003: 1447-1806
contig.00001: 125-1080, contig.00002: 1031-1474, contig.00003: 1447-1806
contig.00001: 125-1080, contig.00002: 1383-1917, contig.00003: 1031-1474
contig.00001: 125-1071, contig.00002: 1383-1917, contig.00003: 1031-1474
contig.00001: 125-1080, contig.00002: 1031-1474, contig.00003: 1447-1806
contig.00001: 125-1071, contig.00002: 1031-1474, contig.00003: 1447-1806
contig.00001: 125-1071, contig.00002: 1031-1474, contig.00003: 1447-1806
contig.00001: 125-1080, contig.00002: 1383-1917, contig.00003: 1031-1474
contig.00001: 125-1071, contig.00002: 1383-1917, contig.00003: 1031-1474
contig.00001: 125-1071, contig.00002: 1031-1474, contig.00003: 1447-1806
contig.00001: 125-1080, contig.00002: 1031-1474, contig.00003: 1447-1806
contig.00001: 125-1071, contig.00002: 1383-1917, contig.00003: 1031-1474
contig.00001: 125-1080, contig.00002: 1383-1917, contig.00003: 1031-1474
contig.00001: 125-1071, contig.00002: 1383-1917, contig.00003: 1031-1474

I traced through the IVA code, and it looks like the problem is inconsistent results from the smalt mapper. I reported the problem to the smalt developers, but I haven't heard back yet.

In our project, we've worked around the problem by switching from smalt to bowtie2, so I'll create a pull request with our bowtie2 version.

runtime error

Hello,
I am trying to use IVA on my MacBook Pro M1 (most recent OS).
The installation went without problems, but I got this error:

iva  -f R1_001.fastq.gz -r  R2_001.fastq.gz  iva_out
RuntimeError: 
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

any suggestions?
thank you!

iva version check of external programs fails

IVA using Python3.6 (Homebrew) on OSX El Capitan gives me the following error. As a workaround, I've just commented out lines 102 and 103 of external_progs.py.

Traceback (most recent call last):
  File "/usr/local/bin/iva", line 129, in <module>
    iva.external_progs.get_all_versions(iva.external_progs.assembly_progs)
  File "/usr/local/lib/python3.6/site-packages/iva/external_progs.py", line 102, in get_all_versions
    if prog in minimum_versions and LooseVersion(version) < LooseVersion(minimum_versions[prog]):
  File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/distutils/version.py", line 52, in __lt__
    c = self._cmp(other)
  File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/distutils/version.py", line 337, in _cmp
    if self.version < other.version:
TypeError: '<' not supported between instances of 'str' and 'int'
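
The traceback shows LooseVersion failing when a text component of one version is compared with a numeric component of the other. The report does not say which version string triggered it. One way to avoid this class of error, sketched below under that caveat, is to compare only the leading numeric components as tuples of integers. The function names version_tuple and meets_minimum are hypothetical and do not come from the IVA source.

```python
import re


def version_tuple(version):
    """Extract leading dot-separated integers, e.g. '1.3.1' -> (1, 3, 1).

    Anything after the numeric prefix (suffixes such as '-r123') is
    ignored. A string with no numeric prefix gives an empty tuple.
    Trailing zeros are not normalised, so '1.3' sorts below '1.3.0'.
    """
    match = re.match(r'\d+(?:\.\d+)*', version.strip())
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split('.'))


def meets_minimum(found, minimum):
    """Return True if version string found is at least minimum."""
    return version_tuple(found) >= version_tuple(minimum)
```

Because every component is an integer, '1.10' correctly compares as newer than '1.9', and an unparseable string such as 'UNKNOWN' compares as older than any real version instead of raising an exception.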

kmc std::bad_alloc

Running iva --test TestIVA inside a virtualbox running Ubuntu 16.04 I get the following error:

Copied input test files into here: /home/user/TestIVA
Current working directory: /home/user/TestIVA
Running iva on the test data with the command:
/home/user/.local/bin/iva --threads 1 --pcr_primers hiv_pcr_primers.fa -f reads_1.fq.gz -r reads_2.fq.gz iva.out
The following command failed with exit code 1
/home/user/.local/bin/iva --threads 1 --pcr_primers hiv_pcr_primers.fa -f reads_1.fq.gz -r reads_2.fq.gz iva.out

The output was:

The following command failed with exit code 134
bash run_kmc.sh

The output was:

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
run_kmc.sh: line 2: 16564 Aborted                 (core dumped) kmc -fa -m4 -k95 -sf1 -ci25 -cs1000000 -cx1000000 /home/user/TestIVA/iva.out/tmp.common_kmers.ile75uwy/reads.fa kmc_out $PWD > /dev/null

The installation instructions involve

wget http://sun.aei.polsl.pl/kmc/download/kmc_dump

which pulls version 2.3; via http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&project=kmc&subpage=download version 3.0 is available. Using that, same problem. Not sure if this is a kmc bug or an IVA bug; please advise. Cheers

IVA: error: unrecognized arguments: path_to_output_directory

I am running into this error while using IVA.
IVA: error: unrecognized arguments: path_to_output_directory

the commandline arguments are
iva --threads 10 --verbose 2 -f forward_reads.gz -r reverse_reads.gz output_directory

problem with samtools version

The following command failed with exit code 127
smalt map -n 10 -O -i 800 -y 0.9 /assembly/tmp.process_seeds.qh2fsh25/out.map_index reads_1.fa reads_2.fa | samtools view -bS -T reference_in.fasta - > /assembly/tmp.process_seeds.qh2fsh25/out.unsorted.bam

The output was:

samtools: error while loading shared libraries: libcrypto.so.1.0.0: cannot open shared object file: No such file or directory

problem with samtools?
I did install it with Bioconda

Could not retrieve index file

Hello All,
This is Mani. I have downloaded IVA and tried to assemble a viral genome using the basic command, but I get the error below. I also ran the test command and got the same error; I have pasted this one because it is short, and I can paste the complete output if needed. Running the tool in a different conda environment gives the same result. Can anyone help me and explain what I am doing wrong?

iva -f /home/jncasr/Assembly/10000FS1_R1.fastq -r /home/jncasr/Assembly/10000FS1_R2.fastq /home/jncasr/Assembly/out3
[E::idx_find_and_load] Could not retrieve index file for '/home/jncasr/Assembly/out3/tmp.common_kmers.ojoyvaqg/map.bam'
[E::idx_find_and_load] Could not retrieve index file for '/home/jncasr/Assembly/out3/tmp.common_kmers.euu5g4nh/map.bam'
[E::idx_find_and_load] Could not retrieve index file for '/home/jncasr/Assembly/out3/tmp.common_kmers.0xrz17tl/map.bam'
[E::idx_find_and_load] Could not retrieve index file for '/home/jncasr/Assembly/out3/tmp.common_kmers.em0i3o9k/map.bam'
[E::idx_find_and_load] Could not retrieve index file for '/home/jncasr/Assembly/out3/tmp.common_kmers.2jwlxggi/map.bam'
[E::idx_find_and_load] Could not retrieve index file for '/home/jncasr/Assembly/out3/tmp.common_kmers._z8m8dz0/map.bam'
[E::idx_find_and_load] Could not retrieve index file for '/home/jncasr/Assembly/out3/tmp.common_kmers.a8se87pz/map.bam'
[E::idx_find_and_load] Could not retrieve index file for '/home/jncasr/Assembly/out3/tmp.common_kmers.ixckra93/map.bam'
[E::idx_find_and_load] Could not retrieve index file for '/home/jncasr/Assembly/out3/tmp.common_kmers.qijajw4g/map.bam'
Failed to make first seed. Cannot continue

Thanks,
Manisenthil Shanmugam.
