oshlack / mintie

Method for Identifying Novel Transcripts and Isoforms using Equivalence classes, in cancer and rare disease.

License: MIT License

R 6.04% Python 78.92% Groovy 8.79% Shell 6.25%
rna-seq cancer gene-fusions duplications cryptic-variants structural-variation transcriptomics rare-disease

mintie's People

Contributors

mcmero, nadiadavidson, wdecoster

mintie's Issues

Running MINTIE without control and without bpipe

Hi,

It is mentioned in the documentation that MINTIE can be run without control data as follows:

bpipe run @$MINTIEDIR/params.txt $MINTIEDIR/MINTIE.groovy $cases

Can MINTIE be run without control data and without bpipe like this?:

mintie -w -p params.txt cases/*.fastq.gz

One or more parallel stages aborted

I have installed the tools successfully, but I can't run more than one case simultaneously with bpipe. Every time several cases reach the assemble stage simultaneously, only the last processed case continues to run, while the others stop with the details below:

====================== Pipeline Failed ===========================

One or more parallel stages aborted. The following messages were reported:

---------------------------------------- assemble  ( 14 )  -----------------------------------------

Command in stage assemble failed with exit status = 137 : 

rlens=`zcat 14/trim1.fastq.gz 14/trim2.fastq.gz |
    awk -v mrl=76 'BEGIN {minlen = mrl; maxlen = 0} {
        if (NR % 4 == 2) {
            rlen = length($1) ;
            if (rlen > maxlen) {maxlen = rlen}
            if (rlen < minlen) {minlen = rlen}
        }} END {print minlen" "maxlen}'` ;
min_rlen=${rlens% *} ;
max_rlen=${rlens#* } ;
if [ ! -d /asnas/wangqf_group/yanln/MINTIE/14/14/SOAPassembly ]; then
    mkdir /asnas/wangqf_group/yanln/MINTIE/14/14/SOAPassembly ;
fi ;
cd /asnas/wangqf_group/yanln/MINTIE/14/14/SOAPassembly ;
echo "max_rd_len=$max_rlen" > config.config ;
echo -e "[LIB]\nq1=../../14/trim1.fastq.gz\nq2=../../14/trim2.fastq.gz" >> config.config ;
if [ -e SOAP.fasta ]; then rm SOAP.fasta ; fi ;
for k in 29 49 69 ; do
    if [ $k -gt $min_rlen ]; then
        echo "WARNING: Kmer size $k exceeds minimum read length ${min_rlen}. Please double check parameters." ;
    else
        /xtdisk/liuxin_group/suipp/ITD/RNA_seq/MINTIE-v0.3.0/tools/bin/soapdenovotrans pregraph -s config.config -o outputGraph_$k -K $k -p 6 ;
        /xtdisk/liuxin_group/suipp/ITD/RNA_seq/MINTIE-v0.3.0/tools/bin/soapdenovotrans contig -g outputGraph_$k ;
        cat outputGraph_$k.contig | sed "s/^>/>k${k}_/g" >> SOAP.fasta ;
    fi ;
done ;
cd ../../ ;
/xtdisk/liuxin_group/suipp/ITD/RNA_seq/MINTIE-v0.3.0/tools/bin/dedupe in=14/SOAPassembly/SOAP.fasta out=stdout.fa threads=6 overwrite=true |
    /xtdisk/liuxin_group/suipp/ITD/RNA_seq/MINTIE-v0.3.0/tools/bin/fasta_formatter |
    awk '!/^>/ { next } { getline seq } length(seq) > 100 { print $0 "\n" seq }' > 14/14_denovo_filt.fasta ;
if [ ! -s 14/14_denovo_filt.fasta ] ; then
    rm 14/14_denovo_filt.fasta ;
    echo "ERROR: de novo assembled contigs fasta file is empty." ;
    echo "Please check paths for SOAPdenovoTrans, dedupe and fasta" ;
    echo "formatter are correct, and their dependencies are installed." ;
fi ;

----------------------------------------------------------------------------------------------------

Use 'bpipe errors' to see output from failed commands.

The content of the parameter file is:

-p threads=6
-p assembly_mem=100
-p assembler=soap
-p scores=33
-p min_read_length=76
-p min_contig_len=100
-p minQScore=20
-p Ks=29,49,69
-p min_gap=3
-p min_clip=20
-p min_match=30,0.3
-p min_logfc=2
-p min_cpm=0.1
-p fdr=0.05
-p sort_ram=4G
-p gene_filter=
-p var_filter=
-p splice_motif_mismatch=1
-p fastqCaseFormat=cases/%_R*.fastq.gz
-p fastqControlFormat=controls/%_R*.fastq.gz
-p assemblyFasta=
-p run_de_step=true

I don't know why I can't run several cases simultaneously. Looking forward to your reply. Thanks.
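
For context on the failure above: by shell convention, an exit status above 128 means the command was terminated by a signal, so 137 corresponds to signal 9 (SIGKILL). One common cause, though not confirmed in this report, is the kernel or a scheduler killing a process that exceeded its memory limit, which would be consistent with several assemblies running at once. A minimal sketch for decoding such statuses (the helper name is hypothetical):

```python
import signal


def describe_exit_status(status: int) -> str:
    """Describe a shell exit status, decoding the 128+N signal convention."""
    if status == 0:
        return "success"
    if status > 128:
        signum = status - 128
        try:
            # Map the signal number to its symbolic name where the platform knows it.
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with error code {status}"


print(describe_exit_status(137))  # on Linux: terminated by SIGKILL
```

If memory is the cause, lowering bpipe's concurrency or running fewer cases at once may be worth trying before changing anything else.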

Recommended performance optimisation for filter_fasta.py

Hi, when running with differential expression turned off, filter_fasta.py took an extraordinarily long time (in fact, it exceeded the cluster's allocated time of 20 hrs).

Upon closer inspection, it appears this is due to the record.id in tx_list[col_id].values check, which scans the whole array for every FASTA record. I've provided an alternative approach.

Before

tx_list = pd.read_csv(tx_list_file, sep='\t', header=header)

handle = open(fasta_file, 'r')
for record in SeqIO.parse(handle, 'fasta'):
    record.description = record.id
    if record.id in tx_list[col_id].values:
        sys.stdout.write(record.format('fasta'))
handle.close()

After

tx_list = pd.read_csv(tx_list_file, sep='\t', header=header)

lookup_list = set(tx_list[col_id].values.tolist())

handle = open(fasta_file, 'r')
for record in SeqIO.parse(handle, 'fasta'):
    record.description = record.id
    if record.id in lookup_list:
        sys.stdout.write(record.format('fasta'))
handle.close()

Brief test with a 2000-line FASTA (roughly 13x reduction in wall-clock runtime, from 48.3 s to 3.6 s)

√ % wc -l sample_denovo_filt_2000.fasta
    2000 sample_denovo_filt_2000.fasta

(venv_mintie)  MINTIE
√ % time python filter_fasta_previous.py sample_denovo_filt_2000.fasta  eq_classes_de.txt --col_id contig > sample_de_contigs_prev.fasta
python filter_fasta_previous.py sample_denovo_filt_2000.fasta  --col_id conti  48.28s user 0.63s system 101% cpu 48.308 total

(venv_mintie)  MINTIE
√ % time python filter_fasta.py sample_denovo_filt_2000.fasta  eq_classes_de.txt --col_id contig > sample_de_contigs.fasta            
python filter_fasta.py sample_denovo_filt_2000.fasta eq_classes_de.txt  conti  3.81s user 0.46s system 119% cpu 3.582 total

(venv_mintie)  MINTIE
√ % diff sample_de_contigs.fasta sample_de_contigs_prev.fasta

(venv_mintie)  MINTIE
√ % echo $?                                                  
0 (i.e. files are identical)
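
The same idea can be illustrated without pandas or Biopython. The sketch below is not the MINTIE script: it is a standard-library approximation (the function name and sample records are made up) that builds the ID set once and then does a constant-time membership test per record. Unlike the original script, it leaves the header line unchanged rather than trimming the description.

```python
import io


def filter_fasta(fasta_lines, wanted_ids):
    """Yield header and sequence lines of records whose ID is in wanted_ids."""
    wanted = set(wanted_ids)  # built once; membership tests are O(1) on average
    keep = False
    for line in fasta_lines:
        if line.startswith(">"):
            # The record ID is the first whitespace-delimited token after '>'.
            fields = line[1:].split()
            record_id = fields[0] if fields else ""
            keep = record_id in wanted
        if keep:
            yield line


fasta = io.StringIO(">k49_1 len=5\nACGTA\n>k49_2\nGGGCC\nTTT\n>k49_3\nAAAAA\n")
print("".join(filter_fasta(fasta, ["k49_1", "k49_3"])), end="")
```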

Can't manually construct reference for hg19

Hi, I tried running this with hg19. However, the CHESS database does not support this version. Can you give me some suggestions on how to construct an hg19 reference for this tool?

Thanks

Cluster Setup using conda environments

Hello,

You state in your Documentation that Mintie needs to be installed using the conda base environment if used on a cluster. On our cluster this is a bit tricky since there are conflicting packages and in general users may want to use environments to avoid conflicts with other packages they use.
Is there a way one could propagate environments? Is this limitation due to the generation of the submission scripts through bpipe? I'm not familiar with bpipe, but from a first glance it looked as if there is not an option yet to easily extend the submission scripts.

Thank you

Docker image?

Do you happen to have a docker image ready for analysis?

Stage post-process fail

Hi Marek,
I've updated the MINTIE code to the latest that is on github as of yesterday (18 May 2020) and now I get the following error within the stage post-process:
==================================== Stage post_process (Rh18) =====================================
/usr/local/lib/python3.7/site-packages/pandas/compat/__init__.py:84: UserWarning: Could not import the lzma module. Your installed Python is incomplete. Attempting to use lzma compression will result in a RuntimeError.
  warnings.warn(msg)
Traceback (most recent call last):
  File "/usr/local/MINTIE/annotate/post_process.py", line 180, in <module>
    main()
  File "/usr/local/MINTIE/annotate/post_process.py", line 167, in main
    contigs = add_de_info(contigs, de_results)
  File "/usr/local/MINTIE/annotate/post_process.py", line 92, in add_de_info
    contigs = contigs.drop_duplicates(ignore_index=True)
TypeError: drop_duplicates() got an unexpected keyword argument 'ignore_index'
Cleaned up file Rh18/Rh18_results.tsv to .bpipe/trash/Rh18_results.tsv.1
ERROR: stage post_process failed: Command in stage post_process failed with exit status = 1 :

/usr/local/bin/python3.7 /usr/local/MINTIE/annotate/post_process.py Rh18 Rh18/novel_contigs_info.tsv Rh18/eq_classes_de.txt Rh18/vaf_estimates.txt --log /data/SarcomaCellLines/Analysis/Rh18/MINTIE/Rh18/postprocess.log > Rh18/Rh18_results.tsv

These are the parameters I'm using:
" -p threads=8 -p genome_mem=8000000000 -p assembly_mem=8 -p assembler=soap -p scores=33 -p min_read_length=50 -p max_read_length=150 -p minQScore=20 -p Ks=79,49 -p min_gap=7 -p min_clip=20 -p min_match=30,0.3 -p min_logfc=2 -p min_cpm=0.1 -p fdr=0.05 -p sort_ram=4G -p gene_filter= -p var_filter= -p assemblyFasta= -p test_mode=false -p fastqCaseFormat=cases/%_R*.fastq.gz -p fastqControlFormat=controls/%_R*.fastq.gz"

Thank you!
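
The TypeError above is consistent with an older pandas installation: the ignore_index argument of DataFrame.drop_duplicates was added in pandas 1.0.0. Upgrading pandas is one fix. Alternatively, here is a sketch of an equivalent call that also works on older versions (the helper name and sample data are made up for illustration and are not part of MINTIE):

```python
import pandas as pd


def drop_duplicates_reindexed(df: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicate rows and renumber the index from 0.

    Equivalent to df.drop_duplicates(ignore_index=True), but avoids the
    keyword that pandas versions before 1.0.0 do not accept.
    """
    return df.drop_duplicates().reset_index(drop=True)


# Illustrative data only: two identical rows and one distinct row.
contigs = pd.DataFrame(
    {"contig_id": ["k49_1", "k49_1", "k49_2"], "logFC": [2.5, 2.5, 3.1]}
)
deduped = drop_duplicates_reindexed(contigs)
print(deduped.index.tolist())  # [0, 1]
```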

Cannot setup reference

Hi, I installed mintie from conda (as suggested in the wiki):

mamba create -c conda-forge -c bioconda -n mintie mintie

Info on the package:

# Name                    Version              Build         Channel
mintie                    0.4.1                hdfd78af_0    bioconda

But when I try to set up the reference with mintie -r, it fails.

Here are some "interesting" parts of the log:

Generating references...                                                                                                                                                                                          
                                                                                                         
genome_fasta not found, setting this up...                                                                                                                                                                        
--2022-11-28 16:54:12--  http://ccb.jhu.edu/chess/data/hg38_p8.fa.gz
Resolving ccb.jhu.edu (ccb.jhu.edu)... 128.220.233.141                                                                                                                                                            
Connecting to ccb.jhu.edu (ccb.jhu.edu)|128.220.233.141|:80... connected.
HTTP request sent, awaiting response... 404 Not Found                                                    
2022-11-28 16:54:12 ERROR 404: Not Found.                                                                
                                                    
gzip: hg38_p8.fa.gz: No such file or directory      
[E::fai_build3_core] Failed to open the file hg38_p8.fa                                                                                                                                                           
[faidx] Could not build fai index hg38_p8.fa.fai
...
FileNotFoundError: [Errno 2] No such file or directory: 'chess2.2.gtf'
cat: tx2gene.success: No such file or directory
Checking that all required references were setup:
WARNING: genome_fasta could not be found!!!! You will need to setup genome_fasta manually, then add its path to references.groovy
WARNING: tx_annotation could not be found!!!! You will need to setup tx_annotation manually, then add its path to references.groovy
WARNING: trans_fasta could not be found!!!! You will need to setup trans_fasta manually, then add its path to references.groovy
WARNING: ann_info could not be found!!!! You will need to setup ann_info manually, then add its path to references.groovy
WARNING: tx2gene could not be found!!!! You will need to setup tx2gene manually, then add its path to references.groovy
gmap_refdir looks like it has been setup
gmap_genome looks like it has been setup
**********************************************************
WARNING: One or more command did not install successfully. See warning messages above. You will need to correct this before running MINTIE.

Here is the full log.

Thanks,

How to use MINTIE with Uncompressed FASTQ Files?

The datasets I am working with consist of a large collection of paired-end FASTQ files, which are not in the .gz compressed format. When I attempt to run MINTIE with these uncompressed files by modifying the params.txt file (-p fastqCaseFormat=cases/%_R*.fastq; -p fastqControlFormat=controls/%_R*.fastq), I encounter an error message that says, “Stage fastq_dedupe unable to locate one or more inputs specified by 'from' ending with [*.gz]”.

I understand that MINTIE is currently configured to work with gzipped fastq files. However, given the volume of data I am dealing with, it would be incredibly helpful if I could use MINTIE directly on uncompressed FASTQ files. This would save a significant amount of time that would otherwise be spent on file conversion. If it is possible to modify MINTIE to enable the direct use of FASTQ files, I would greatly appreciate your guidance on how to make this happen.

Replace gmap with minimap2 in align_contigs_against_genome

Many false positives seem to be a result of poor alignment of contigs to the genome, which results in bad annotation, e.g.
k49_1134199 0 chr21 44220603 3 281S16M6810N5M2435N6M6199N13M50029N16M1I6M32I32M21I4M4911N3M16052N8M8047N10M2202N2I11M1I25M3I22M6D11M750N10M1163N4M1809N3M1264N7M24156N4M6825N5M3920N2M12899N7M4539N3M7819N8M10199N7M1D1M4453N8M5667N2M11730N3M55386N4M402730N2M6445N3I6M15389N1M13514N9M1I47M2N4I126M15D1279M * 00 CAGCGCTCCTGGCCCCCCGAAGTCCCAGAGCTGCTGACCCCCACCCCAGCTGCATCAGAGAGCCTGTCTGGGGCCAAGGTTGCCAGAGATTTCTGAAGACACAGCTTGTTCCTTGTTCTTGGCTGGTGGGTGCACAAGGACTTCTGGAAGGGATTTAGACGGGGCTGAGTGCTAGGATTAAAGTGGGGATGGGAGTACGGCAACAGAAAAACCTGGGAGCTAGCAATGCACCCAGCCCTTGACTGTGCCCTGGTGGACAGCCGAGCTGTGGCTCTAGCGTGAGCCAGTGCCTTCCTGTCCCTGCCAAGGGTGAGGCCAGAGTTGGCCCCGAGGCTAATGTTTCAGTGGGTGAGATTAGGTCGGCCGTACAGAGGCCGGTGGGCTCCCTGACATCCCTTCCAGGCAACCTGAAAGCACTGAAATAGCTTATGGCCCTGTGCCAGGGACCTTGGCCCAAGCTGCTGACCTCCAGGGTGGGGAGGGAGCTACCCCCAGGAGAAGAGTCACTCAGACAGCAGTATGAGCAAGCCAGCCAGCAGCTCCGTGCCTGCACCCAGCTCAGGGGAATCCCAGGGGGTTCAGATGCCCAGGAAGGAAAAGGGGACAGCGCTACTGCTATGGAATGAGACCACCACTTCTCCTGTTGTCCTTCCCAGCTTCTCCCCAACCTCCCCTTTTCCCTAGTTTATAAGACAGGAGAAAAGGGAGAAAGCAAAAAGCTGGAAAGAAACAGAAGTAAGATAAATAGCTAGACGACCTTGGCGCCACCACCTGGCCCTGGTGGTTAAAATGATAATAATATTAACCCCTGACCAAAACGACTGGTGTTATCTGTAAATCCCAGACATTGTGTGAGAAAGCACCGTAAAACTTTTTGTCCTATTAGCTGATGTGTGTAGCCCCCAGTCACGTTCCTCACGCTTACTTGATCTATTATGACCCTTTCACGTGGACCCCTTAGAGTTGTAAGCTCTTAAAAGGGCTAGGAATTTCTTTTTCGGGGAGCTCGGCTCTTAAGACGCAAGTCTGCTGACACTCCTGGCCAAATAAAGCCCTTCCTTCTTTAACCGAGTGTCTGAGGAATTCTGTCTGCGGCTTGTCCGGCTACAACGGTGCTGGAGCCCAGACTCTCAGGGAAAGGAACCCGAGCCGTCAGAAAACCATCTGATTCCAGGCTGGGGCAAGGGACATGGAGATGGGCCTGCAGCATCATGTTGCTCCAGAAAGCAAGAAAGTGCTCAGAACGGTAGAACGGGGATGCATGGACAGGACACGCAGCCAGACCTAGCGGATTTGAGCATCTCGGGGAAGAAAGGACAGCCACAGATCATGCACTACTGAACAAAATAAAACTGTGGGTCACGCTGATGAGAGAGAGGCTGCAGAGAAGGAGAGACCCTTCCTTAGGTTGGCAGCCGTGAGTGGCAGGCGGGGACCAGCACGGCACCAATCTGCAGCCATCGCAGTGATGGCGGCTTCAGGCGGGGACCTCCGCGGATGCTGAGCCTGCGGGTGCGATTTGATGAGGGCAGAACCTCACCAGCCCACAGTGGCTGCGAGGGGATCATGCAGCGGGATGGGGAGGCCGGGGGGATGCCGTCTCAGCAGAGCCGTCCACGCTGACCTCATCAAGACTGGGACGGGGCCACAGCAGTGCCTCTCATGGGCACTTAGGACACCGTCACTGAGGGGC
TCCTGCCAAAGCACACCTGAGTCCAGGCAGAGGAAACTCCAGACAAGACCCCCGAGGGTCATGCTACAAAGCTGCTCTCCTGACTTCCTCAGAAACGCCCAAGGACAGGAAAGACAAAGAAAGCTGAGGACTTGTCCAGATTCAAGAAGCCCAAGGAGACGGCTGAGCGTAGGGCGAGCCTGGGTGAGGAGATTCAGAGCGTTAGACGGCTGAGCGCAGTGTGTGAACCTGGGTTAGGAGATTTGGGGCCTGAGATGGCTGAGTGCAGGGTGAGCCTGAGTGAGGAGATTCTGAGCCTGAGACAGCTGAGCACAGGGTGAGCCTGGGTGACAAAATCCACCAGGAAAATATGCTCACGAAGACATCATTGGGACAACCAATAAAATATGCGT * MD:Z:35AG4G1T8AC19C1C1CG3CC1GCT2CC2A7G24T4TCT3A2CCTC2GCT1A1T1T6C1T2TGAGGG2C1^GGGACA1CA1G48G17^G4A1G43C2C3C3TT7A1CC33A14T29C3T18C16C1A6T5TT^ATTATTATTATTAAC13T19T11A3A6TT1C8C3T2G1C8CA4A10A5C3A3G2CT6C1CA9T23C313C14A784 NH:i:1HI:i:1 NM:i:175 SM:i:40 XQ:i:40 X2:i:0 XO:Z:UU

The read should align to chr21:43268915-43270392 and chr2:231884096-231893280

Replacing gmap with minimap2 in the following stage could be a simple improvement (but would require a fair amount of validation work). Keen to hear your thoughts, @mcmero.

align_contigs_against_genome = {
    def sample_name = branch.name
    output.dir = sample_name
    produce('aligned_contigs_against_genome.sam'){
        exec """
            $gmap -D $gmap_refdir -d $gmap_genome -f samse -t $threads -x $min_gap --max-intronlength-ends=500000 -n 0 $input.fasta > $output
        """, "align_contigs_against_genome"
    }
}
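
As a rough illustration of what the replacement command might look like, the sketch below assembles a minimap2 invocation using its spliced-alignment preset (-x splice), SAM output (-a) and thread count (-t). This is an assumption-laden sketch, not a tested drop-in: the file names are placeholders, the helper is hypothetical, and the gmap options in the stage above (-x $min_gap, --max-intronlength-ends, -n 0) have no one-to-one minimap2 equivalents here, so downstream annotation would need to be validated.

```python
import shlex


def minimap2_command(genome_fasta, contigs_fasta, threads):
    """Build an argument list for spliced alignment of contigs with minimap2."""
    return [
        "minimap2",
        "-a",            # write SAM output
        "-x", "splice",  # preset for spliced alignment
        "-t", str(threads),
        genome_fasta,
        contigs_fasta,
    ]


cmd = minimap2_command("genome.fa", "sample/de_contigs.fasta", 8)
print(shlex.join(cmd))
# minimap2 -a -x splice -t 8 genome.fa sample/de_contigs.fasta
```

In a pipeline stage, the printed command would be redirected to the SAM output file in the same way the gmap command is above.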

Missing references.groovy file

Dear Marek,

I installed MINTIE using Mamba-forge (and micromamba; I tried both) on WSL (Ubuntu). I tried to run a test in a folder outside of the base folder and got an error:
A script, requested to be loaded from file '/home/drshamy/micromamba/envs/mintie/share/mintie-0.4.2-0/references.groovy', could not be accessed.
I looked in the referred folder and there is no file called references.groovy.
Should there be a file called references.groovy? Can you help me to solve the issue?
Thank you very much in advance.

The pattern provided did not match

Hello,
I am trying to use Mintie (v4.2-0) with a mouse reference (I managed to create the groovy file); however, I keep getting the error below. My pattern seems correct. Please let me know if I am missing anything here. Could it be because I have "-" in my sample IDs? If so, is there a way to get around that?

ERROR: The pattern provided 'cases/%_R*.fastq.gz' did not match any of the files provided as input [cases/SampleXR-01-04_R1.fastq.gz, cases/SampleXR-01-04_R2.fastq.gz, cases/SampleXR-01-05_R1.fastq.gz, cases/SampleXR-01-05_R2.fastq.gz, cases/SampleXR-01-06_R1.fastq.gz, cases/SampleXR-01-06_R2.fastq.gz, controls/SampleXR-01-01_R1.fastq.gz, controls/SampleXR-01-01_R2.fastq.gz, controls/SampleXR-01-02_R1.fastq.gz, controls/SampleXR-01-02_R2.fastq.gz, controls/SampleXR-01-03_R1.fastq.gz, controls/SampleXR-01-03_R2.fastq.gz]

Here is my param.txt


-p threads=16
-p assembly_mem=200
-p assembler=soap
-p scores=33
-p min_read_length=80
-p min_contig_len=100
-p minQScore=20
-p Ks=79,49
-p min_gap=7
-p min_clip=20
-p min_match=30,0.3
-p min_logfc=5
-p min_cpm=0.1
-p fdr=0.05
-p sort_ram=4G
-p gene_filter=
-p var_filter=
-p splice_motif_mismatch=4
-p fastqCaseFormat=cases/%_R*.fastq.gz
-p fastqControlFormat=controls/%_R*.fastq.gz
-p assemblyFasta=
-p run_de_step=true

Also, is there an option to provide a custom reference.groovy, besides replacing it in $MINTIEDIR?

Pipeline failed at Assemble stage

Dear Marek,

I have installed Mintie using Mamba on WSL (Ubuntu). Although I managed to install the reference genome (mintie -r), I'm facing an error at the assemble stage when running the test dataset (after mintie -t). The trimming stage seems to work correctly (I got trimmed output files), but the pipeline fails to create the "denovo_filt.fasta" and/or "SOAP.fasta" files (those files are missing). I also tried installing Mintie manually, including all packages required by the pipeline (I added references to the "tool.groovy" file), as well as the Trinity and rnaSPAdes packages. However, I faced exactly the same problem. Do you have any idea what might be causing this error? Do I need to reconfigure the SOAPdenovo-Trans parameters somehow? I have run and tested all packages in the pipeline individually and they all work correctly, so I have no other option but to ask for your help, as I'm very keen to use Mintie in my project (it might be exactly what I need).
Thank you in advance.

error on test run

Hi Marek:

I have mintie installed via conda, and I set up the reference using the updated sh file.
mintie -r and mintie -h both run without issue.
But the test run failed with the following Groovy error. Could you have a look at what is wrong with my installation?

OpenJDK 64-Bit Server VM warning: Options -Xverify:none and -noverify were deprecated in JDK 13 and will likely be removed in a future release.
java.lang.NoClassDefFoundError: Could not initialize class org.codehaus.groovy.vmplugin.v7.Java7
	at org.codehaus.groovy.vmplugin.VMPluginFactory.<clinit>(VMPluginFactory.java:43)
	at org.codehaus.groovy.reflection.GroovyClassValueFactory.<clinit>(GroovyClassValueFactory.java:35)
	at org.codehaus.groovy.reflection.ClassInfo.<clinit>(ClassInfo.java:107)
	at org.codehaus.groovy.reflection.ReflectionCache.getCachedClass(ReflectionCache.java:95)
	at org.codehaus.groovy.reflection.ReflectionCache.<clinit>(ReflectionCache.java:39)
	at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.registerMethods(MetaClassRegistryImpl.java:209)
	at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.<init>(MetaClassRegistryImpl.java:107)
	at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.<init>(MetaClassRegistryImpl.java:85)
	at groovy.lang.GroovySystem.<clinit>(GroovySystem.java:36)
	at org.codehaus.groovy.runtime.InvokerHelper.<clinit>(InvokerHelper.java:86)
	at org.codehaus.groovy.runtime.callsite.CallSiteArray.createCallStaticSite(CallSiteArray.java:74)
	at org.codehaus.groovy.runtime.callsite.CallSiteArray.createCallSite(CallSiteArray.java:161)
	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:115)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:127)
	at bpipe.Runner.<clinit>(Runner.groovy:58)
	at bpipe.Runner9.main(Runner9.java:47)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at org.codehaus.groovy.tools.GroovyStarter.rootLoader(GroovyStarter.java:110)
	at org.codehaus.groovy.tools.GroovyStarter.main(GroovyStarter.java:128)
java.lang.reflect.InvocationTargetException
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at org.codehaus.groovy.tools.GroovyStarter.rootLoader(GroovyStarter.java:110)
	at org.codehaus.groovy.tools.GroovyStarter.main(GroovyStarter.java:128)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.codehaus.groovy.reflection.ReflectionCache
	at org.codehaus.groovy.runtime.dgmimpl.NumberNumberMetaMethod.<clinit>(NumberNumberMetaMethod.java:33)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
	at java.base/java.lang.reflect.ReflectAccess.newInstance(ReflectAccess.java:128)
	at java.base/jdk.internal.reflect.ReflectionFactory.newInstance(ReflectionFactory.java:347)
	at java.base/java.lang.Class.newInstance(Class.java:645)
	at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.createMetaMethodFromClass(MetaClassRegistryImpl.java:257)
	at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.<init>(MetaClassRegistryImpl.java:110)
	at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.<init>(MetaClassRegistryImpl.java:85)
	at groovy.lang.GroovySystem.<clinit>(GroovySystem.java:36)
	at org.codehaus.groovy.runtime.InvokerHelper.<clinit>(InvokerHelper.java:86)
	at org.codehaus.groovy.runtime.callsite.CallSiteArray.createCallStaticSite(CallSiteArray.java:74)
	at org.codehaus.groovy.runtime.callsite.CallSiteArray.createCallSite(CallSiteArray.java:161)
	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:115)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:127)
	at bpipe.Runner.<clinit>(Runner.groovy:58)
	at bpipe.Runner9.main(Runner9.java:47)
	... 6 more
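
The java.base/ frames and the JDK 13 deprecation warning in the trace above show that a recent JDK is in use. One plausible cause, not confirmed in this report, is that the Groovy runtime bundled with bpipe predates that JDK; installing an older Java in the environment would be a way to test that hypothesis. A small sketch for extracting the major version from the text printed by java -version (the sample strings are illustrative, not taken from this report):

```python
import re


def java_major_version(version_output: str):
    """Return the Java major version from `java -version` text, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Legacy scheme: "1.8.0_312" means Java 8.
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


print(java_major_version('openjdk version "17.0.3" 2022-04-19'))  # 17
print(java_major_version('openjdk version "1.8.0_312"'))  # 8
```

Note that java -version writes to standard error, so that is the stream to capture when feeding real output to the function.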

IIT file /data/miniconda3/envs/mintie/share/mintie-0.4.2-0/ref/gmap_genome/gmap_genome.chromosome.iit is not valid

Hi,
I am trying to use Mintie on a cloud platform (Nimbus): Operating System: Ubuntu 18.04.5 LTS; Kernel: Linux 4.15.0-206-generic; Architecture: x86-64. I installed Mintie using the suggested method (using mamba). The installation completed without issue and I also successfully downloaded the reference files.

I am now trying to run mintie on the test data using

$ mintie -t
$ mintie -w -p test_params.txt cases/*.fastq.gz controls/*.fastq.gz

However, I am running into the following issue relating to a gmap_genome file

====================================================================================================
|                              Starting Pipeline at 2023-06-06 08:06                               |
====================================================================================================

================================ Stage fastq_dedupe (allvars-case) =================================

==================================== Stage trim (allvars-case) =====================================

================================== Stage assemble (allvars-case) ===================================

============================= Stage create_salmon_index (allvars-case) =============================

================================= Stage run_salmon (allvars-case) ==================================

================================ Stage run_salmon (allvars-control) ================================

=========================== Stage create_ec_count_matrix (allvars-case) ============================

=================================== Stage run_de (allvars-case) ====================================

========================== Stage filter_on_significant_ecs (allvars-case) ==========================

======================== Stage align_contigs_against_genome (allvars-case) =========================
Note: gmap.avx2 does not exist.  For faster speed, may want to compile package on an AVX2 machine
GMAP version 2023-04-28 called with args: gmap.sse42 -D /data/miniconda3/envs/mintie/share/mintie-0.4.2-0/ref -d gmap_genome -f samse -t 2 -x 7 --max-intronlength-ends=500000 -n 0 allvars-case/de_contigs.fasta
Checking compiler assumptions for SSE2: 6B8B4567 327B23C6 xor=59F066A1
Checking compiler assumptions for SSE4.1: -103 -58 max=198 => compiler zero extends
Checking compiler options for SSE4.2: 6B8B4567 __builtin_clz=1 __builtin_ctz=0 _mm_popcnt_u32=17 __builtin_popcount=17 
Finished checking compiler assumptions
IIT file /data/miniconda3/envs/mintie/share/mintie-0.4.2-0/ref/gmap_genome/gmap_genome.chromosome.iit is not valid
ERROR: Command failed with exit status = 9 : 

gmap -D /data/miniconda3/envs/mintie/share/mintie-0.4.2-0/ref -d gmap_genome -f samse -t 2 -x 7 --max-intronlength-ends=500000 -n 0 allvars-case/de_contigs.fasta > allvars-case/aligned_contigs_against_genome.sam 


========================================= Pipeline Failed ==========================================

One or more parallel stages aborted. The following messages were reported: 

Branch allvars-case in stage Unknown reported message:

Command failed with exit status = 9 : 

gmap -D /data/miniconda3/envs/mintie/share/mintie-0.4.2-0/ref -d gmap_genome -f samse -t 2 -x 7 --max-intronlength-ends=500000 -n 0 allvars-case/de_contigs.fasta > allvars-case/aligned_contigs_against_genome.sam

Use 'bpipe errors' to see output from failed commands.

It looks like the gmap_genome references were not created properly, as /data/miniconda3/envs/mintie/share/mintie-0.4.2-0/ref/gmap_genome/gmap_genome.chromosome.iit does not exist.

If you have any suggestions for how I can troubleshoot this, that would be great.

Thanks

SLURM submission limited to PBSpro installs

Hi Marek,

Thank you for sharing your wonderful pipeline! MINTIE is working very well in local mode submitted to our queue through an interactive job.

We have had problems with the cluster implementation as BPIPE requires the qstat -x flag included in PBSpro. Our qstat install is packaged with slurm-torque 18.08.4-1.el7.

Execution Command
nohup srun mintie -w -p params.txt cases/*.fastq.gz controls/*.fastq.gz &
Successfully submits 'fastq_dedup' to the queue as 1 job per sample.

Error
Pipeline hangs after successful completion of 'fastq_dedup'. SLURM exit status is COMPLETE and output fastq are generated.

Outputs in .bpipe/bpipe.log:

bpipe.Utils	[38]	INFO	|11:57:27 Executing command: qstat -x 5451428 
bpipe.executor.TorqueStatusMonitor	[38]	WARNING	|11:57:27 Error occurred in processing torque output: java.lang.Exception: Error parsing torque output: unexpected error: Unknown option: x 

Environment
The MINTIE installation is version 0.3.9, installed via miniconda3/mamba. The package versions are in the YAML here:
mintie.yml.txt

The BPIPE scheduling configuration is as follows:

executor="slurm"

//controls the total number of procs MINTIE can spawn
//if running locally, ensure that concurrency is not to
//set to more than the number of procs available. If
//running on a cluster, this can be increased
concurrency=10

//following commands are for running on a cluster
walltime="5-20:00:00"
queue="bigmem"
mem_param="mem-per-cpu"
memory="30"
proc_mode=1
usePollerFileWatcher=true
useLegacyTorqueJobPolling=true
procs=10
account="grayl"

//add server-specific module to load
modules="miniconda3"

commands {

Thank you in advance for taking a look at this.
Lesley

annotate_contigs fails in run_test

Hi!

Thank you for your preprint. I was trying to test the pipeline but faced the following issue. Could you help?

============================== Stage annotate_contigs (allvars-case) ===============================
Traceback (most recent call last):
  File "/home/usr/data/mintie/MINTIE-v0.3.5/annotate/annotate_contigs.py", line 720, in <module>
    main()
  File "/home/usr/data/mintie/MINTIE-v0.3.5/annotate/annotate_contigs.py", line 717, in main
    annotate_contigs(args)
  File "/home/usr/data/mintie/MINTIE-v0.3.5/annotate/annotate_contigs.py", line 659, in annotate_contigs
    bam = pysam.AlignmentFile(args.bam_file, 'rc')
  File "pysam/libcalignmentfile.pyx", line 742, in pysam.libcalignmentfile.AlignmentFile.__cinit__
  File "pysam/libcalignmentfile.pyx", line 991, in pysam.libcalignmentfile.AlignmentFile._open
ValueError: file has no sequences defined (mode='rc') - is it SAM/BAM format? Consider opening with check_sq=False
Cleaned up file allvars-case/annotated_contigs.vcf to .bpipe/trash/annotated_contigs.vcf.3
Cleaned up file allvars-case/annotated_contigs_info.tsv to .bpipe/trash/annotated_contigs_info.tsv.3
ERROR: stage annotate_contigs failed: Command in stage annotate_contigs failed with exit status = 1 :

/home/usr/miniconda3/envs/mintie/bin/python /home/usr/data/mintie/MINTIE-v0.3.5/annotate/annotate_contigs.py allvars-case allvars-case/aligned_contigs_against_genome.bam /home/usr/data/mintie/MINTIE-v0.3.5/ref/chess2.2.info /home/usr/data/mintie/MINTIE-v0.3.5/ref/chess2.2.gtf allvars-case/annotated_contigs.bam allvars-case/annotated_contigs_info.tsv --minClip 20 --minGap 7 --minMatch 30,0.3 --log /home/usr/data/mintie/MINTIE-v0.3.5/allvars-case/annotate.log > allvars-case/annotated_contigs.vcf
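
The pysam error above is raised when the alignment file's header declares no reference sequences (@SQ lines), which can happen if an upstream alignment step produced an empty or truncated file. Assuming the SAM file written by the alignment stage is still available, a standard-library sketch for counting @SQ header lines (the function name and sample lines are illustrative):

```python
def count_sq_lines(sam_lines):
    """Count @SQ (reference sequence) lines in a SAM header."""
    count = 0
    for line in sam_lines:
        if not line.startswith("@"):
            break  # the header ends at the first alignment record
        if line.startswith("@SQ\t"):
            count += 1
    return count


# Illustrative SAM content: two header lines and one alignment record.
example = [
    "@HD\tVN:1.0\tSO:unsorted\n",
    "@SQ\tSN:chr21\tLN:1000\n",
    "k49_1\t0\tchr21\t100\t40\t50M\t*\t0\t0\t*\t*\n",
]
print(count_sq_lines(example))  # 1
```

With a real file, the function can be given the open file handle, e.g. count_sq_lines(open(path)). A count of 0 would point to the alignment stage rather than to annotate_contigs.py.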

Could not initialize class org.codehaus.groovy.vmplugin.v7.Java7

Hi,

I am trying to use Mintie on a cloud platform (Nimbus): Operating System: Ubuntu 18.04.5 LTS; Kernel: Linux 4.15.0-206-generic; Architecture: x86-64. I installed Mintie using the suggested method (using mamba). The installation completed without issue and I also successfully downloaded the reference files.

I am now trying to run mintie on the test data using

$ mintie -t
$ mintie -w -p test_params.txt cases/*.fastq.gz controls/*.fastq.gz

However, I keep getting this java issue:

OpenJDK 64-Bit Server VM warning: Options -Xverify:none and -noverify were deprecated in JDK 13 and will likely be removed in a future release.
java.lang.NoClassDefFoundError: Could not initialize class org.codehaus.groovy.vmplugin.v7.Java7
        at org.codehaus.groovy.vmplugin.VMPluginFactory.<clinit>(VMPluginFactory.java:43)
        at org.codehaus.groovy.reflection.GroovyClassValueFactory.<clinit>(GroovyClassValueFactory.java:35)
        at org.codehaus.groovy.reflection.ClassInfo.<clinit>(ClassInfo.java:107)
        at org.codehaus.groovy.reflection.ReflectionCache.getCachedClass(ReflectionCache.java:95)
        at org.codehaus.groovy.reflection.ReflectionCache.<clinit>(ReflectionCache.java:39)
        at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.registerMethods(MetaClassRegistryImpl.java:209)
        at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.<init>(MetaClassRegistryImpl.java:107)
        at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.<init>(MetaClassRegistryImpl.java:85)
        at groovy.lang.GroovySystem.<clinit>(GroovySystem.java:36)
        at org.codehaus.groovy.runtime.InvokerHelper.<clinit>(InvokerHelper.java:86)
        at org.codehaus.groovy.runtime.callsite.CallSiteArray.createCallStaticSite(CallSiteArray.java:74)
        at org.codehaus.groovy.runtime.callsite.CallSiteArray.createCallSite(CallSiteArray.java:161)
        at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:115)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:127)
        at bpipe.Runner.<clinit>(Runner.groovy:58)
        at bpipe.Runner9.main(Runner9.java:47)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:568)
        at org.codehaus.groovy.tools.GroovyStarter.rootLoader(GroovyStarter.java:110)
        at org.codehaus.groovy.tools.GroovyStarter.main(GroovyStarter.java:128)
java.lang.reflect.InvocationTargetException
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:568)
        at org.codehaus.groovy.tools.GroovyStarter.rootLoader(GroovyStarter.java:110)
        at org.codehaus.groovy.tools.GroovyStarter.main(GroovyStarter.java:128)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.codehaus.groovy.reflection.ReflectionCache
        at org.codehaus.groovy.runtime.dgmimpl.NumberNumberMetaMethod.<clinit>(NumberNumberMetaMethod.java:33)
        at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77)
        at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
        at java.base/java.lang.reflect.ReflectAccess.newInstance(ReflectAccess.java:128)
        at java.base/jdk.internal.reflect.ReflectionFactory.newInstance(ReflectionFactory.java:347)
        at java.base/java.lang.Class.newInstance(Class.java:645)
        at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.createMetaMethodFromClass(MetaClassRegistryImpl.java:257)
        at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.<init>(MetaClassRegistryImpl.java:110)
        at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.<init>(MetaClassRegistryImpl.java:85)
        at groovy.lang.GroovySystem.<clinit>(GroovySystem.java:36)
        at org.codehaus.groovy.runtime.InvokerHelper.<clinit>(InvokerHelper.java:86)
        at org.codehaus.groovy.runtime.callsite.CallSiteArray.createCallStaticSite(CallSiteArray.java:74)
        at org.codehaus.groovy.runtime.callsite.CallSiteArray.createCallSite(CallSiteArray.java:161)
        at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:115)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:127)
        at bpipe.Runner.<clinit>(Runner.groovy:58)
        at bpipe.Runner9.main(Runner9.java:47)
        ... 6 more

Could you please help me figure out why this is happening?

Thanks!
Chiara
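
The stack frames above (`java.base/...`, plus the warning about options deprecated in JDK 13) show that a recent JDK is in use, while the class that fails to initialize is Groovy's `vmplugin.v7.Java7`. One plausible reading, not confirmed in this thread, is that the Groovy runtime bundled with bpipe predates the installed JDK. A minimal sketch for checking the installed major version follows; the function names and the `max_supported` threshold are hypothetical, since the thread does not state which versions are supported.

```python
import re


def java_major_version(version_output):
    """Extract the major Java version from the text printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("could not find a version string")
    first, second = match.group(1), match.group(2)
    # Java 8 and earlier report themselves as 1.x (e.g. "1.8.0_292")
    if first == "1" and second is not None:
        return int(second)
    return int(first)


def is_supported(version_output, max_supported):
    # max_supported is a hypothetical threshold chosen by the user
    return java_major_version(version_output) <= max_supported
```

Note that `java -version` writes to stderr, so that is the stream to capture when feeding its output to `java_major_version`.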

Question about interpretation of cpos

Hello, and thanks for creating such a useful tool. I had a question about understanding the "cpos" output, particularly when the alignment includes gaps. For instance, I have the following result:

chr1 pos1 strand1 chr2 pos2 strand2 variant_type overlapping_genes variant_id partner_id vars_in_contig VAF varsize contig_varsize cpos contig_len contig_cigar
chr1 973327 - chr1 973499 - RI PLEKHN1 k49_231483a . 1 0.5953358 173 173 141 695 48M137N137M436N150M175N360M

The retained intron position in question here is between the two rightmost exons in the attached screenshot from IGV.

[Image: IGV screenshot, "Screen Shot 2021-08-23 at 2 20 05 PM"]

However, if I were to annotate that cpos myself, I'd say it was at 477 (specifically it looks like the RI is from 477 - 477+173). I'm not sure I see how 141 is calculated.

Thanks for the help,

Rachel
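
To make the coordinate question concrete, the `contig_cigar` from the row above can be walked block by block. This is only a sketch of standard CIGAR arithmetic; it does not claim to reproduce how MINTIE derives `cpos`. Contig offsets are 0-based and half-open, and reference offsets are relative to the leftmost aligned base. The function name `aligned_blocks` is illustrative.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
QUERY_OPS = set("MIS=X")  # operations that consume contig (query) bases
REF_OPS = set("MDN=X")    # operations that consume reference bases


def aligned_blocks(cigar):
    """Yield (contig_start, contig_end, ref_start, ref_end) for each aligned block."""
    contig_pos = 0
    ref_pos = 0
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op in "M=X":
            yield (contig_pos, contig_pos + length, ref_pos, ref_pos + length)
        if op in QUERY_OPS:
            contig_pos += length
        if op in REF_OPS:
            ref_pos += length


blocks = list(aligned_blocks("48M137N137M436N150M175N360M"))
for block in blocks:
    print(block)
```

The four aligned blocks sum to 695 contig bases, which matches `contig_len`. Because SAM stores minus-strand alignments in reference orientation, offsets counted this way run from the reference-left end of the contig, which may differ from offsets counted along the assembled contig sequence.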

Pipeline fail at "annotate contigs"

Hi,

Congratulations on the recent bioRxiv - I really enjoyed it and (hopefully) look forward to applying it in my work.

I am currently trying to test it out on my non-human non-model organism, and I am getting an error that I am struggling to troubleshoot at the "annotate contigs" stage. The error is below. Is there something obvious I should look at that I am missing?

Any advice is greatly appreciated.
Kind regards,
Steve

================================= Stage annotate_contigs (7554789) =================================
Traceback (most recent call last):
File "/nfs/users/nfs_s/sd21/lustre118_link/software/TRANSCRIPTOME/MINTIE/annotate/annotate_contigs.py", line 28, in <module>
from utils import cached, init_logging, exit_with_error
File "/lustre/scratch118/infgen/team133/sd21/software/TRANSCRIPTOME/MINTIE/annotate/utils.py", line 17
print("{} ERROR: {}, exiting".format(PROGRAM_NAME, message), file=sys.stderr)
^
SyntaxError: invalid syntax
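
A `SyntaxError` on `print(..., file=sys.stderr)` is what a Python 2 interpreter reports, because `print` is a statement there. That would suggest the stage was launched with Python 2 rather than Python 3. A minimal guard that could be used to confirm which interpreter is running is sketched below; the helper name is illustrative and not part of MINTIE.

```python
import sys


def supports_print_function(version_info=sys.version_info):
    """True when print(..., file=...) parses without a __future__ import."""
    return version_info[0] >= 3


if not supports_print_function():
    sys.exit("This script needs Python 3; found %d.%d" % sys.version_info[:2])
```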

Installation problem (environment)

Hi.
I am trying to install MINTIE using conda, but it consistently fails at the "Solving environment" step. Could you please help with this? Please see below:

(base) C:\Users\ysnuy>conda create -c conda-forge -c bioconda -n mintie mintie
Collecting package metadata (current_repodata.json): done
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: |
Found conflicts! Looking for incompatible packages.
This can take several minutes.  Press CTRL-C to abort.
failed
UnsatisfiableError:
(base) C:\Users\ysnuy>

Can't install/run test

Can't install or run test.
Using miniconda3 (Miniconda3-latest-Linux-x86_64.sh from 16/1/20)
python 3.7.4

Issues:
#1 samtools wouldn't install with the automatic script
#2 pysam python module wouldn't compile for this version of python/conda.

Key error: 'chrchr19'

Hello,

I'm currently trying to run a simulation through MINTIE using the run_simu.py files in the simu folder. I have resolved most of the issues I have encountered except the following.

python run_simu.py params.ini --log simulog.txt
Traceback (most recent call last):
  File "/home/neuro/miniconda3/envs/mintie/run_simu.py", line 345, in <module>
    main()
  File "/home/neuro/miniconda3/envs/mintie/run_simu.py", line 342, in main
    simulate(args)
  File "/home/neuro/miniconda3/envs/mintie/run_simu.py", line 321, in simulate
    available_genes = simulate_fusions(simp, params, available_genes, valid_txs,
  File "/home/neuro/miniconda3/envs/mintie/run_simu.py", line 124, in simulate_fusions
    fus_parts = simu.write_fusion((tx1, tx2), (gene1, gene2), all_exons, paths['genome_fasta'], \
  File "/home/neuro/miniconda3/envs/mintie/simu.py", line 595, in write_fusion
    block_seq = get_random_block(chr1, gene_trees, genome_fasta, block_range)
  File "/home/neuro/miniconda3/envs/mintie/simu.py", line 397, in get_random_block
    chr_range = chr_sizes[('chr%s' % chrom)]
KeyError: 'chrchr19'

I'm not entirely sure how to proceed.

Thanks for the help!
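
The key `chrchr19` suggests that line 397 of `simu.py` prepends `chr` (via `'chr%s' % chrom`) to a chromosome name that already carries the prefix, presumably because the annotation in use has `chr`-prefixed names. A minimal sketch of a prefix-safe lookup is shown below; `with_chr_prefix` is a hypothetical helper and the dictionary entry is a toy value, not a real chromosome length.

```python
def with_chr_prefix(chrom):
    """Return the chromosome name with exactly one leading 'chr'."""
    name = str(chrom)
    return name if name.startswith("chr") else "chr%s" % name


# Toy stand-in for the chr_sizes mapping seen in the traceback
chr_sizes = {"chr19": 100}

# Both spellings resolve to the same key
size_a = chr_sizes[with_chr_prefix("chr19")]
size_b = chr_sizes[with_chr_prefix(19)]
```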

Running MINTIE with Single end dataset

Hi,

My RNA-seq dataset is single-end and I want to run MINTIE on it, but I get the following error:

stage fastq_dedupe failed: Insufficient inputs: at least 2 inputs are expected with extension .gz but only 1 are available

How can I run MINTIE with a single-end dataset?

In order to test this, I tried to run the test dataset after deleting the R2 files from cases and controls:

mintie -w -p test_params.txt cases/*.fastq.gz controls/*.fastq.gz

In test_params.txt, I made the following changes:

-p fastqCaseFormat=cases/%_R1.fastq.gz
-p fastqControlFormat=controls/%_R1.fastq.gz

I get the same error. Could you let me know how to run the tool on a single-end dataset?

Thanks,
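
The `fastq_dedupe` message ("at least 2 inputs are expected") is consistent with the stage expecting two FASTQ files per sample. A minimal sketch for spotting samples that lack a mate before launching the pipeline is given below. It assumes files are named `<sample>_R1.fastq.gz` and `<sample>_R2.fastq.gz` (the `_R2` pattern is inferred by analogy with the `_R1` pattern in the post), and the function name is illustrative.

```python
import re
from collections import defaultdict

MATE_PATTERN = re.compile(r"^(?P<sample>.+)_R(?P<mate>[12])\.fastq\.gz$")


def samples_missing_a_mate(filenames):
    """Return sample names that do not have both an R1 and an R2 file."""
    mates = defaultdict(set)
    for name in filenames:
        base = name.rsplit("/", 1)[-1]  # drop any directory component
        match = MATE_PATTERN.match(base)
        if match:
            mates[match.group("sample")].add(match.group("mate"))
    return sorted(s for s, found in mates.items() if found != {"1", "2"})
```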

Error in Testing MINTIE

Dear Author,
I have manually installed MINTIE 0.4.2 and set up the references without error. However, running mintie -w -p test_params.txt cases/*.fastq.gz controls/*.fastq.gz produced the error below. I would appreciate your help.
mntieTest.3rdtime.log

╒══════════════════════════════════════════════════════════════════════════════════════════════════╕
| Starting Pipeline at 2024-02-02 21:04 |
╘══════════════════════════════════════════════════════════════════════════════════════════════════╛

================================ Stage fastq_dedupe (allvars-case) =================================
Reads before:
423500 allvars-case/temp1.fastq
Reads after:
423400 allvars-case/allvars-case.1.fastq

==================================== Stage trim (allvars-case) =====================================
TrimmomaticPE: Started with arguments:
-threads 2 -phred33 allvars-case/allvars-case.1.fastq.gz allvars-case/allvars-case.2.fastq.gz allvars-case/trim1.fastq /dev/null allvars-case/trim2.fastq /dev/null LEADING:20 TRAILING:20 MINLEN:80
Input Read Pairs: 105850 Both Surviving: 105850 (100.00%) Forward Only Surviving: 0 (0.00%) Reverse Only Surviving: 0 (0.00%) Dropped: 0 (0.00%)
TrimmomaticPE: Completed successfully

================================== Stage assemble (allvars-case) ===================================
bash: line 1: 1461 Segmentation fault /mnt/d/mintie_240129/MINTIE-0.4.2_240129/MINTIE-0.4.2/tools/bin/soapdenovotrans pregraph -s config.config -o outputGraph_$k -K $k -p 2
ERROR: stage assemble failed: Command in stage assemble failed with exit status = 139 :

rlens=gunzip -c allvars-case/trim1.fastq.gz allvars-case/trim2.fastq.gz | awk -v mrl=80 'BEGIN {minlen = mrl; maxlen = 0} { if (NR % 4 == 2) { rlen = length($1) ; if (rlen > maxlen) {maxlen = rlen} if (rlen < minlen) {minlen = rlen} }} END {print minlen" "maxlen}' ; min_rlen=${rlens% } ; max_rlen=${rlens# } ; if [ ! -d allvars-case/SOAPassembly ]; then mkdir allvars-case/SOAPassembly ; fi ; cd allvars-case/SOAPassembly ; echo "max_rd_len=$max_rlen" > config.config ; echo -e "[LIB]\nq1=../../allvars-case/trim1.fastq.gz\nq2=../../allvars-case/trim2.fastq.gz" >> config.config ; if [ -e SOAP.fasta ]; then rm SOAP.fasta ; fi ; for k in 79 49 ; do if [ $k -gt $min_rlen ]; then echo "WARNING: Kmer size $k exceeds minimum read length ${min_rlen}. Please double check parameters." ; else /mnt/d/mintie_240129/MINTIE-0.4.2_240129/MINTIE-0.4.2/tools/bin/soapdenovotrans pregraph -s config.config -o outputGraph_$k -K $k -p 2 ; /mnt/d/mintie_240129/MINTIE-0.4.2_240129/MINTIE-0.4.2/tools/bin/soapdenovotrans contig -g outputGraph_$k ; cat outputGraph_$k.contig | sed "s/^>/>k${k}_/g" >> SOAP.fasta ; fi ; done ; cd ../../ ; /mnt/d/mintie_240129/MINTIE-0.4.2_240129/MINTIE-0.4.2/tools/bin/dedupe in=allvars-case/SOAPassembly/SOAP.fasta out=stdout.fa threads=2 overwrite=true | /mnt/d/mintie_240129/MINTIE-0.4.2_240129/MINTIE-0.4.2/tools/bin/fasta_formatter | awk '!/^>/ { next } { getline seq } length(seq) > 100 { print $0 "\n" seq }' > allvars-case/allvars-case_denovo_filt.fasta ; if [ ! -s allvars-case/allvars-case_denovo_filt.fasta ] ; then rm allvars-case/allvars-case_denovo_filt.fasta ; echo "ERROR: de novo assembled contigs fasta file is empty." ; echo "Please check paths for SOAPdenovoTrans, dedupe and fasta" ; echo "formatter are correct, and their dependencies are installed." ; fi ;

========================================= Pipeline Failed ==========================================

In stage Unknown: One or more parallel stages aborted. The following messages were reported:

----------------------------------- assemble ( allvars-case ) ------------------------------------

Command in stage assemble failed with exit status = 139 :

rlens=gunzip -c allvars-case/trim1.fastq.gz allvars-case/trim2.fastq.gz | awk -v mrl=80 'BEGIN {minlen = mrl; maxlen = 0} { if (NR % 4 == 2) { rlen = length($1) ; if (rlen > maxlen) {maxlen = rlen} if (rlen < minlen) {minlen = rlen} }} END {print minlen" "maxlen}' ; min_rlen=${rlens% } ; max_rlen=${rlens# } ; if [ ! -d allvars-case/SOAPassembly ]; then mkdir allvars-case/SOAPassembly ; fi ; cd allvars-case/SOAPassembly ; echo "max_rd_len=$max_rlen" > config.config ; echo -e "[LIB]\nq1=../../allvars-case/trim1.fastq.gz\nq2=../../allvars-case/trim2.fastq.gz" >> config.config ; if [ -e SOAP.fasta ]; then rm SOAP.fasta ; fi ; for k in 79 49 ; do if [ $k -gt $min_rlen ]; then echo "WARNING: Kmer size $k exceeds minimum read length ${min_rlen}. Please double check parameters." ; else /mnt/d/mintie_240129/MINTIE-0.4.2_240129/MINTIE-0.4.2/tools/bin/soapdenovotrans pregraph -s config.config -o outputGraph_$k -K $k -p 2 ; /mnt/d/mintie_240129/MINTIE-0.4.2_240129/MINTIE-0.4.2/tools/bin/soapdenovotrans contig -g outputGraph_$k ; cat outputGraph_$k.contig | sed "s/^>/>k${k}_/g" >> SOAP.fasta ; fi ; done ; cd ../../ ; /mnt/d/mintie_240129/MINTIE-0.4.2_240129/MINTIE-0.4.2/tools/bin/dedupe in=allvars-case/SOAPassembly/SOAP.fasta out=stdout.fa threads=2 overwrite=true | /mnt/d/mintie_240129/MINTIE-0.4.2_240129/MINTIE-0.4.2/tools/bin/fasta_formatter | awk '!/^>/ { next } { getline seq } length(seq) > 100 { print $0 "\n" seq }' > allvars-case/allvars-case_denovo_filt.fasta ; if [ ! -s allvars-case/allvars-case_denovo_filt.fasta ] ; then rm allvars-case/allvars-case_denovo_filt.fasta ; echo "ERROR: de novo assembled contigs fasta file is empty." ; echo "Please check paths for SOAPdenovoTrans, dedupe and fasta" ; echo "formatter are correct, and their dependencies are installed." ; fi ;


Use 'bpipe errors' to see output from failed commands.

/mnt/d/mintie_240129/MINTIE-0.4.2_240129/MINTIE-0.4.2# bpipe errors

============================== Found 1 failed commands from run 1318 ===============================

====================================== Command assemble (26) =======================================

Command : rlens=gunzip -c allvars-case/trim1.fastq.gz allvars-case/trim2.fastq.gz | awk -v mrl=80 'BEGIN {minlen = mrl; maxlen = 0} { if (NR % 4 == 2) { rlen = length($1) ; if (rlen > maxlen) {maxlen = rlen} if (rlen < minlen) {minlen = rlen} }} END {print minlen" "maxlen}' ; min_rlen=${rlens% } ; max_rlen=${rlens# } ; if [ ! -d allvars-case/SOAPassembly ]; then
mkdir allvars-case/SOAPassembly ; fi ; cd allvars-case/SOAPassembly ; echo "max_rd_len=$max_rlen" > config.config ; echo -e "[LIB]\nq1=../../allvars-case/trim1.fastq.gz\nq2=../../allvars-case/trim2.fastq.gz" >> config.config ; if [ -e SOAP.fasta ]; then rm SOAP.fasta ; fi ; for k in 79 49 ; do if [ $k -gt $min_rlen ]; then echo "WARNING: Kmer size $k exceeds minimum read length ${min_rlen}. Please double check parameters." ; else /mnt/d/mintie_240129/MINTIE-0.4.2_240129/MINTIE-0.4.2/tools/bin/soapdenovotrans pregraph -s config.config -o outputGraph_$k -K $k -p 2 ; /mnt/d/mintie_240129/MINTIE-0.4.2_240129/MINTIE-0.4.2/tools/bin/soapdenovotrans contig -g outputGraph_$k ; cat outputGraph_$k.contig | sed "s/^>/>k${k}_/g" >> SOAP.fasta ; fi ;
done ; cd ../../ ; /mnt/d/mintie_240129/MINTIE-0.4.2_240129/MINTIE-0.4.2/tools/bin/dedupe in=allvars-case/SOAPassembly/SOAP.fasta out=stdout.fa threads=2 overwrite=true | /mnt/d/mintie_240129/MINTIE-0.4.2_240129/MINTIE-0.4.2/tools/bin/fasta_formatter | awk '!/^>/ { next } { getline seq } length(seq) > 100 { print $0 "\n" seq }' > allvars-case/allvars-case_denovo_filt.fasta ; if [ ! -s allvars-case/allvars-case_denovo_filt.fasta ] ; then rm allvars-case/allvars-case_denovo_filt.fasta ; echo "ERROR: de novo assembled contigs fasta file is empty." ; echo "Please check paths for SOAPdenovoTrans, dedupe and fasta" ; echo "formatter are correct, and their dependencies are installed." ; fi ;
Started : Fri Feb 02 21:05:10 JST 2024
Stopped : Fri Feb 02 21:05:10 JST 2024
Exit Code : 139
Config:
Name | Value
----------------------------------
procs | 8
memory | 180
max_per_command_threads | 16
executor | local
mem_param | mem
name | assemble
proc_mode | 1
usePollerFileWatcher | true
walltime | 20:00:00
queue | batch
concurrency | 16

Output :

    bash: line 1:  1461 Segmentation fault      /mnt/d/mintie_240129/MINTIE-0.4.2_240129/MINTIE-0.4.2/tools/bin/soapdenovotrans pregraph -s config.config -o outputGraph_$k -K $k -p 2
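
Exit statuses above 128 reported by the shell usually encode a signal number (status minus 128). A small sketch that decodes them, which can help tell a crash from an ordinary non-zero exit, is shown below; the function name is illustrative.

```python
import signal


def describe_exit_status(status):
    """Translate a shell exit status into a signal name when it encodes one."""
    if status > 128:
        try:
            return signal.Signals(status - 128).name
        except ValueError:
            return "unknown signal %d" % (status - 128)
    return "exit code %d" % status


print(describe_exit_status(139))
```

On Linux, 139 decodes to SIGSEGV, which agrees with the "Segmentation fault" line from `soapdenovotrans pregraph` in the log above.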

Issue running the test pipeline

I have just deployed mintie using mamba and get this error. Is there a Java version that needs to be pinned in the conda recipe?

mintie -w -p test_params.txt cases/*.fastq.gz controls/*.fastq.gz
OpenJDK 64-Bit Server VM warning: Options -Xverify:none and -noverify were deprecated in JDK 13 and will likely be removed in a future release.
java.lang.NoClassDefFoundError: Could not initialize class org.codehaus.groovy.vmplugin.v7.Java7
at org.codehaus.groovy.vmplugin.VMPluginFactory.<clinit>(VMPluginFactory.java:43)
at org.codehaus.groovy.reflection.GroovyClassValueFactory.<clinit>(GroovyClassValueFactory.java:35)
at org.codehaus.groovy.reflection.ClassInfo.<clinit>(ClassInfo.java:107)
at org.codehaus.groovy.reflection.ReflectionCache.getCachedClass(ReflectionCache.java:95)
at org.codehaus.groovy.reflection.ReflectionCache.<clinit>(ReflectionCache.java:39)
at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.registerMethods(MetaClassRegistryImpl.java:209)
at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.<init>(MetaClassRegistryImpl.java:107)
at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.<init>(MetaClassRegistryImpl.java:85)
at groovy.lang.GroovySystem.<clinit>(GroovySystem.java:36)
at org.codehaus.groovy.runtime.InvokerHelper.<clinit>(InvokerHelper.java:86)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.createCallStaticSite(CallSiteArray.java:74)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.createCallSite(CallSiteArray.java:161)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:115)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:127)
at bpipe.Runner.<clinit>(Runner.groovy:58)
at bpipe.Runner9.main(Runner9.java:47)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at org.codehaus.groovy.tools.GroovyStarter.rootLoader(GroovyStarter.java:110)
at org.codehaus.groovy.tools.GroovyStarter.main(GroovyStarter.java:128)
Caused by: java.lang.ExceptionInInitializerError: Exception org.codehaus.groovy.GroovyBugError [in thread "main"]
at org.codehaus.groovy.vmplugin.v7.Java7.<clinit>(Java7.java:45)
at java.base/jdk.internal.misc.Unsafe.ensureClassInitialized0(Native Method)
at java.base/jdk.internal.misc.Unsafe.ensureClassInitialized(Unsafe.java:1160)
at java.base/jdk.internal.reflect.MethodHandleAccessorFactory.ensureClassInitialized(MethodHandleAccessorFactory.java:340)
at java.base/jdk.internal.reflect.MethodHandleAccessorFactory.newConstructorAccessor(MethodHandleAccessorFactory.java:103)
at java.base/jdk.internal.reflect.ReflectionFactory.newConstructorAccessor(ReflectionFactory.java:173)
at java.base/java.lang.reflect.Constructor.acquireConstructorAccessor(Constructor.java:549)
at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
at java.base/java.lang.reflect.ReflectAccess.newInstance(ReflectAccess.java:132)
at java.base/jdk.internal.reflect.ReflectionFactory.newInstance(ReflectionFactory.java:259)
at java.base/java.lang.Class.newInstance(Class.java:755)
at org.codehaus.groovy.vmplugin.VMPluginFactory.createPlugin(VMPluginFactory.java:57)
at org.codehaus.groovy.vmplugin.VMPluginFactory.<clinit>(VMPluginFactory.java:39)
... 20 more
java.lang.reflect.InvocationTargetException
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:118)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at org.codehaus.groovy.tools.GroovyStarter.rootLoader(GroovyStarter.java:110)
at org.codehaus.groovy.tools.GroovyStarter.main(GroovyStarter.java:128)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.codehaus.groovy.reflection.ReflectionCache
at org.codehaus.groovy.runtime.dgmimpl.NumberNumberMetaMethod.<clinit>(NumberNumberMetaMethod.java:33)
at java.base/jdk.internal.misc.Unsafe.ensureClassInitialized0(Native Method)
at java.base/jdk.internal.misc.Unsafe.ensureClassInitialized(Unsafe.java:1160)
at java.base/jdk.internal.reflect.MethodHandleAccessorFactory.ensureClassInitialized(MethodHandleAccessorFactory.java:340)
at java.base/jdk.internal.reflect.MethodHandleAccessorFactory.newConstructorAccessor(MethodHandleAccessorFactory.java:103)
at java.base/jdk.internal.reflect.ReflectionFactory.newConstructorAccessor(ReflectionFactory.java:173)
at java.base/java.lang.reflect.Constructor.acquireConstructorAccessor(Constructor.java:549)
at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
at java.base/java.lang.reflect.ReflectAccess.newInstance(ReflectAccess.java:132)
at java.base/jdk.internal.reflect.ReflectionFactory.newInstance(ReflectionFactory.java:259)
at java.base/java.lang.Class.newInstance(Class.java:755)
at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.createMetaMethodFromClass(MetaClassRegistryImpl.java:257)
at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.<init>(MetaClassRegistryImpl.java:110)
at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.<init>(MetaClassRegistryImpl.java:85)
at groovy.lang.GroovySystem.<clinit>(GroovySystem.java:36)
at org.codehaus.groovy.runtime.InvokerHelper.<clinit>(InvokerHelper.java:86)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.createCallStaticSite(CallSiteArray.java:74)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.createCallSite(CallSiteArray.java:161)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:115)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:127)
at bpipe.Runner.<clinit>(Runner.groovy:58)
at bpipe.Runner9.main(Runner9.java:47)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
... 3 more
Caused by: java.lang.ExceptionInInitializerError: Exception java.lang.NoClassDefFoundError: Could not initialize class org.codehaus.groovy.vmplugin.v7.Java7 [in thread "main"]
at org.codehaus.groovy.vmplugin.VMPluginFactory.<clinit>(VMPluginFactory.java:43)
at org.codehaus.groovy.reflection.GroovyClassValueFactory.<clinit>(GroovyClassValueFactory.java:35)
at org.codehaus.groovy.reflection.ClassInfo.<clinit>(ClassInfo.java:107)
at org.codehaus.groovy.reflection.ReflectionCache.getCachedClass(ReflectionCache.java:95)
at org.codehaus.groovy.reflection.ReflectionCache.<clinit>(ReflectionCache.java:39)
at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.registerMethods(MetaClassRegistryImpl.java:209)
at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.<init>(MetaClassRegistryImpl.java:107)
... 14 more

refine_annotations.py: 'DataFrame' object has no attribute 'valid_motif'

Hi,

After successfully running MINTIE on several samples, the pipeline raised the following error for one of them (I have replaced a confidential sample identifier with "<sample>").
If necessary, I can share a file by email. Could you let me know which file would be most informative for you? Or is there something else you would like me to do to debug?

Greetz,
Wouter

================================== Command refine_contigs (1999) ===================================

Command    : python /home/wdecoster/mintie/MINTIE/annotate/refine_annotations.py             <sample>/annotated_contigs_info.tsv <sample>/annotated_contigs.vcf <sample>/annotated_contigs.bam /home/wdecoster/mintie/MINTIE/ref/chess2.2.gtf             /home/wdecoster/mintie/MINTIE/ref/hg38.fa <sample>/novel_contigs             --minClip 20             --minGap 7             --mismatches 0             --log /home/wdecoster/mintie/<sample>/refine.log > <sample>/novel_contigs.vcf ; samtools index <sample>/novel_contigs.bam ; python /home/wdecoster/mintie/MINTIE/util/filter_fasta.py <sample>/de_contigs.fasta <sample>/novel_contigs_info.tsv --col_id contig_id > <sample>/novel_contigs.fasta ;
Started    : Mon Sep 21 13:23:24 CEST 2020
Stopped    : Mon Sep 21 13:23:33 CEST 2020
Exit Code  : 1
Config: 
                   Name           |  Value  
          ----------------------------------
          max_per_command_threads | 16      
          executor                | local   
          concurrency             | 32      
          walltime                | 20:00:00
          queue                   | batch   
          mem_param               | mem     
          memory                  | 8       
          proc_mode               | 1       
          usePollerFileWatcher    | true    
          manualPollerSleepTime   | 30000   
          maxFileNameLength       | 2048    
          name                    | python  
          procs                   | 1       

Output    : 

        Traceback (most recent call last):
          File "/home/wdecoster/mintie/MINTIE/annotate/refine_annotations.py", line 572, in <module>
            main()
          File "/home/wdecoster/mintie/MINTIE/annotate/refine_annotations.py", line 564, in main
            keep_contigs = get_contigs_to_keep(args)
          File "/home/wdecoster/mintie/MINTIE/annotate/refine_annotations.py", line 510, in get_contigs_to_keep
            is_novel_exon = np.logical_and(is_novel_exon, contigs.valid_motif)
          File "/home/wdecoster/miniconda3/envs/mintie/lib/python3.8/site-packages/pandas/core/generic.py", line 5136, in __getattr__
            return object.__getattribute__(self, name)
        AttributeError: 'DataFrame' object has no attribute 'valid_motif'
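
The `AttributeError` means the `contigs` DataFrame has no `valid_motif` column when line 510 runs. A minimal sketch for checking which expected columns are absent from a TSV header is shown below. That the DataFrame is loaded from `<sample>/annotated_contigs_info.tsv` is an assumption based on it being the first argument of the command, and `missing_columns` is an illustrative helper, not part of MINTIE.

```python
import csv


def missing_columns(tsv_path, required):
    """Return the names in `required` that are absent from the TSV header row."""
    with open(tsv_path, newline="") as handle:
        header = next(csv.reader(handle, delimiter="\t"), [])
    return [name for name in required if name not in header]
```

For example, `missing_columns("<sample>/annotated_contigs_info.tsv", ["valid_motif"])` returning a non-empty list would indicate that the column was never written upstream.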

Guidance around key parameters

It would be great to get some guidance on the impact of these parameters. It's not immediately clear how increasing or decreasing them affects sensitivity.

-p Ks=79,49 comma-separated kmer lengths for de novo assembly. This option affects SOAPdenovotrans and rnaSPAdes (but not Trinity). Please ensure that your read length is longer than ALL kmer lengths specified.
-p min_read_length=50 minimum read length for trimming. NOTE: Please ensure this is greater than your minimum kmer length.
-p min_contig_len=100 minimum length required for an assembled contig to be kept.
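
The two constraints stated in the parameter descriptions above can be checked mechanically before a run. The sketch below encodes only those two rules as written (read length longer than all kmers, `min_read_length` greater than the smallest kmer); it says nothing about sensitivity. The function names are illustrative.

```python
def parse_ks(value):
    """Parse a comma-separated kmer list such as '79,49' into integers."""
    return [int(k) for k in value.split(",") if k.strip()]


def check_assembly_params(ks, min_read_length, read_length):
    """Return a list of problems with the given assembly parameters."""
    problems = []
    too_long = [k for k in ks if k >= read_length]
    if too_long:
        problems.append(
            "kmer length(s) %s are not shorter than the read length %d"
            % (too_long, read_length)
        )
    if min_read_length <= min(ks):
        problems.append(
            "min_read_length %d is not greater than the smallest kmer %d"
            % (min_read_length, min(ks))
        )
    return problems
```

With the values shown above (`Ks=79,49`, `min_read_length=50`) and 100 bp reads, the check returns no problems; with 76 bp reads it flags the 79-mer.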

Appropriate gmap version not found

Hi,

NOTE: I've attached the full job.err.log which includes the output of bpipe errors.

When running MINTIE in a Docker container, I run into the following error, where an appropriate version of gmap cannot be found (extract of job.err.log):

2020-04-21T09:06:45.112323100Z ============================ Command align_contigs_against_genome (27) =============================
2020-04-21T09:06:45.115255265Z 
2020-04-21T09:06:45.117204227Z Command    : /app/MINTIE/tools/bin/gmap -D /app/MINTIE/ref -d gmap_genome -f samse -t 8 -n 0 P006304-303189/de_contigs.fasta > P006304-303189/aligned_contigs_against_genome.sam
2020-04-21T09:06:45.127157281Z Started    : Tue Apr 21 09:06:11 UTC 2020
2020-04-21T09:06:45.127434586Z Stopped    : Tue Apr 21 09:06:12 UTC 2020
2020-04-21T09:06:45.127639364Z Exit Code  : 255
2020-04-21T09:06:45.127707622Z Config: 
2020-04-21T09:06:45.161273655Z                    Name           |            Value            
2020-04-21T09:06:45.161399397Z           ------------------------------------------------------
2020-04-21T09:06:45.164737361Z           max_per_command_threads | 16                          
2020-04-21T09:06:45.165661290Z           executor                | local                       
2020-04-21T09:06:45.166065481Z           concurrency             | 240                         
2020-04-21T09:06:45.166621049Z           walltime                | 20:00:00                    
2020-04-21T09:06:45.167308134Z           queue                   | batch                       
2020-04-21T09:06:45.167641414Z           mem_param               | mem                         
2020-04-21T09:06:45.167936725Z           memory                  | 32                          
2020-04-21T09:06:45.168888023Z           proc_mode               | 1                           
2020-04-21T09:06:45.169282940Z           usePollerFileWatcher    | true                        
2020-04-21T09:06:45.169641991Z           manualPollerSleepTime   | 30000                       
2020-04-21T09:06:45.169982765Z           maxFileNameLength       | 2048                        
2020-04-21T09:06:45.170468281Z           procs                   | 8                           
2020-04-21T09:06:45.170750694Z           name                    | align_contigs_against_genome
2020-04-21T09:06:45.170985378Z 
2020-04-21T09:06:45.170993714Z Output    : 
2020-04-21T09:06:45.171030044Z 
2020-04-21T09:06:45.171429324Z 	Note: /app/MINTIE/tools/bin/gmap.sse42 does not exist.  For faster speed, may want to compile package on an SSE4.2 machine
2020-04-21T09:06:45.171448412Z 	Note: /app/MINTIE/tools/bin/gmap.sse41 does not exist.  For faster speed, may want to compile package on an SSE4.1 machine
2020-04-21T09:06:45.171456511Z 	Note: /app/MINTIE/tools/bin/gmap.ssse3 does not exist.  For faster speed, may want to compile package on an SSSE3 machine
2020-04-21T09:06:45.171462305Z 	Note: /app/MINTIE/tools/bin/gmap.sse2 does not exist.  For faster speed, may want to compile package on an SSE2 machine
2020-04-21T09:06:45.171467892Z 	Note: /app/MINTIE/tools/bin/gmap.nosimd does not exist.  For faster speed, may want to compile package on an non-SIMD machine
2020-04-21T09:06:45.171473323Z 	Error: appropriate GMAP version not found
2020-04-21T09:06:45.171478341Z

I checked in my /MINTIE/tools/bin and gmap is there.

Here is the list of /MINTIE/tools:

$ ls /app/MINTIE/tools
FastUniq
SOAPdenovo-Trans-bin-v1.03
Trimmomatic-0.39
bbmap
bedtools
bin
bpipe-0.9.9.5
gmap-2019-09-12
hisat2-2.1.0
salmon-latest_linux_x86_64
samtools-1.9
share

And /MINTIE/tools/bin:

$ ls /app/MINTIE/tools/bin | egrep '^g'
get-genome
getreads
gff3_genes
gff3_introns
gff3_splicesites
gi2ancestors
gi2taxid
gitable
gmap
gmap.avx2
gmap_build
gmap_compress
gmap_process
gmap_reassemble
gmap_uncompress
gmapindex
gmapl
gmapl.avx2
grademerge
gradesam
gsnap
gsnap.avx2
gsnapl
gsnapl.avx2
gtf_genes
gtf_introns
gtf_splicesites
gtf_transcript_splicesites
gvf_iit

I installed all of the dependency programs using:
$ ./install_linux64.sh

Which ended:

Checking that all required tools were installed:
bpipe looks like it has been installed
fastuniq looks like it has been installed
dedupe looks like it has been installed
trimmomatic looks like it has been installed
fasta_formatter looks like it has been installed
samtools looks like it has been installed
bedtools looks like it has been installed
soapdenovotrans looks like it has been installed
salmon looks like it has been installed
hisat looks like it has been installed
gmap looks like it has been installed
**********************************************************
All commands installed successfully!

And I installed the reference files using:
$ ./setup_references_hg38.sh

Which ended:

Checking that all required references were setup:
genome_fasta looks like it has been setup
trans_fasta looks like it has been setup
tx_annotation looks like it has been setup
ann_info looks like it has been setup
tx2gene looks like it has been setup
gmap_refdir looks like it has been setup
gmap_genome looks like it has been setup
**********************************************************
All commands installed successfully!
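
A possible check for the GMAP error above (a sketch, not something confirmed in this thread): the wrapper's notes list the variants it looked for (sse42, sse41, ssse3, sse2, nosimd), while the directory listing contains only `gmap.avx2`. One plausible explanation is that the binaries were compiled on a machine with AVX2 and are now running on a CPU without it. The snippet below compares the `gmap.<suffix>` files that are present with the CPU flags in `/proc/cpuinfo`. The suffix-to-flag mapping and the helper names are assumptions made for illustration.

```python
from pathlib import Path

# Assumed mapping from gmap binary suffix to the /proc/cpuinfo flag name.
SUFFIX_TO_CPU_FLAG = {
    "avx2": "avx2",
    "sse42": "sse4_2",
    "sse41": "sse4_1",
    "ssse3": "ssse3",
    "sse2": "sse2",
}


def cpu_flags(cpuinfo_text):
    """Return the set of CPU flags found in the text of /proc/cpuinfo."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()


def installed_variants(bin_names):
    """Return the suffixes of the gmap.<suffix> binaries in a file listing."""
    return {n.split(".", 1)[1] for n in bin_names if n.startswith("gmap.")}


def usable_variants(bin_names, flags):
    """Return installed variants whose required CPU flag is available."""
    return sorted(
        v for v in installed_variants(bin_names)
        if v == "nosimd" or SUFFIX_TO_CPU_FLAG.get(v) in flags
    )


if __name__ == "__main__":
    bin_dir = Path("/app/MINTIE/tools/bin")  # path taken from the log above
    cpuinfo = Path("/proc/cpuinfo")          # Linux only
    if bin_dir.is_dir() and cpuinfo.exists():
        names = [p.name for p in bin_dir.iterdir()]
        print(usable_variants(names, cpu_flags(cpuinfo.read_text())))
```

An empty list would be consistent with the "appropriate GMAP version not found" message; in that case, rebuilding GMAP on the machine that actually runs the pipeline may be worth trying.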

references.groovy file when running test

I've recently tried installing and running mintie on the test dataset and have run into a few issues.

I installed mintie with mamba using the following:

mamba create -c conda-forge -c bioconda -n mintie mintie

This gives me mintie=0.3.9-0

I then ran the command to set up the test data:

mintie -t

Which made the following files:

├── cases
│   ├── allvars-case_R1.fastq.gz
│   └── allvars-case_R2.fastq.gz
├── controls
│   ├── allvars-control_R1.fastq.gz
│   └── allvars-control_R2.fastq.gz
└── test_params.txt

2 directories, 5 files

Then I ran the following mintie command:

mintie -w -p test_params.txt cases/*.fastq.gz controls/*.fastq.gz

I get the following output:

=========================================== Bpipe Error ============================================

An error occurred executing your pipeline:

A script, requested to be loaded from file '/home/jcole/miniconda3/envs/mintie/share/mintie-0.3.9-0/references.groovy', could not be accessed.


Please see the details below for more information.

========================================== Error Details ===========================================

bpipe.PipelineError: A script, requested to be loaded from file '/home/jcole/miniconda3/envs/mintie/share/mintie-0.3.9-0/references.groovy', could not be accessed.
        at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at MINTIE.groovy.run(MINTIE.groovy:15)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)


====================================================================================================

More details about why this error occurred may be available in the full log file .bpipe/bpipe.log

Indeed, references.groovy does not exist. Is there something I need to do to generate that file?
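
To confirm which files the pipeline can actually see, a small check like the following may help. It is only a sketch: it assumes the conda layout shown in the error message (`share/mintie-<version>/references.groovy` under the environment prefix) and reads the prefix from `CONDA_PREFIX`. It does not create the file, and how MINTIE generates `references.groovy` is not covered in this thread.

```python
import os
from pathlib import Path


def find_references_groovy(env_prefix):
    """List share/mintie-*/references.groovy files under a conda env prefix."""
    return sorted(Path(env_prefix).glob("share/mintie-*/references.groovy"))


def is_readable_file(path):
    """True if path is an existing regular file the current user can read."""
    return Path(path).is_file() and os.access(path, os.R_OK)


if __name__ == "__main__":
    prefix = os.environ.get("CONDA_PREFIX", "")
    hits = find_references_groovy(prefix) if prefix else []
    if not hits:
        print("no references.groovy found under", prefix or "(CONDA_PREFIX unset)")
    for hit in hits:
        print(hit, "readable" if is_readable_file(hit) else "NOT readable")
```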

No space left on device

Hi,
I installed MINTIE using the suggested method (mamba). The installation completed without issue, and the reference files downloaded successfully. I ran MINTIE on the test data and it also finished successfully. However, on my real data I can only run 1 case vs 1 control, just like the test data. When I try to run MINTIE with 3 controls, I always get "No space left on device".
============================= Found 1 failed commands from run 925835 ==============================

==================================== Command fastq_dedupe (374) ====================================

Command    : gunzip -c cases/allvars-case_R1.fastq.gz > allvars-case/temp1.fastq ; gunzip -c cases/allvars-case_R2.fastq.gz > allvars-case/temp2.fastq ; echo allvars-case/temp1.fastq > allva echo allvars-case/temp2.fastq >> allvars-case/fastq.list ;             echo "Reads before:" ; wc -l allvars-case/temp1.fastq ;             fastuniq -i allvars-case/fastq.list -o allvars-castq -p allvars-case/allvars-case.2.fastq ;             echo "Reads after:" ; wc -l allvars-case/allvars-case.1.fastq ;             gzip allvars-case/allvars-case.1.fastq allvars-case/allvars-       rm allvars-case/fastq.list allvars-case/temp1.fastq allvars-case/temp2.fastq
Started    : Mon Jun 26 08:30:38 CST 2023
Stopped    : Mon Jun 26 08:59:01 CST 2023
Exit Code  : 1
Config: 
                   Name           |    Value    
          --------------------------------------
          procs                   | 1           
          memory                  | 160         
          max_per_command_threads | 16          
          executor                | local       
          mem_param               | mem         
          name                    | fastq_dedupe
          proc_mode               | 1           
          usePollerFileWatcher    | true        
          walltime                | 20:00:00    
          queue                   | batch       
          concurrency             | 16       

Output :

Reads before:
209526696 allvars-case/temp1.fastq
Reads after:
138463916 allvars-case/allvars-case.1.fastq

gzip: allvars-case/allvars-case.1.fastq.gz: No space left on device

If you have any suggestions for how I can troubleshoot this, that would be great.
Also, how much disk space is required to run MINTIE?
Thanks
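
A rough way to check disk headroom before launching, based only on the `fastq_dedupe` command in the log above: that stage writes uncompressed copies of both input files (`temp1.fastq`, `temp2.fastq`) and then uncompressed deduplicated outputs before gzipping them. The sketch below streams each `.gz` file to measure its uncompressed size and compares the total with the free space in the working directory. The `factor=2.0` rule of thumb is an assumption for this one stage only; later stages such as assembly and alignment will need more space, and the thread gives no official figure.

```python
import gzip
import shutil


def reads_from_line_count(n_lines):
    """A FASTQ record spans 4 lines, so `wc -l` output // 4 gives reads."""
    return n_lines // 4


def uncompressed_size(gz_path, chunk=1 << 20):
    """Stream a .gz file and return its uncompressed size in bytes."""
    total = 0
    with gzip.open(gz_path, "rb") as fh:
        while True:
            block = fh.read(chunk)
            if not block:
                break
            total += len(block)
    return total


def free_bytes(path="."):
    """Free space, in bytes, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free


def enough_space(gz_paths, workdir=".", factor=2.0):
    """Assumed rule of thumb: need `factor` x total uncompressed input size."""
    needed = factor * sum(uncompressed_size(p) for p in gz_paths)
    return free_bytes(workdir) >= needed
```

Note that the "Reads before" and "Reads after" values in the log are line counts from `wc -l`, so the 209526696 lines correspond to `reads_from_line_count(209526696)`, i.e. 52381674 reads in that file before deduplication.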

Expected output file all/eq_classes_de.txt in stage run_de (all) could not be found

When I set up the fastq files and ran ‘mintie -w -p test_params.txt cases/*.fastq.gz controls/*.fastq.gz’, the file eq_classes_de.txt was not produced.

In the end, it showed:

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

filter, lag

The following objects are masked from ‘package:base’:

intersect, setdiff, setequal, union

Attaching package: ‘data.table’

The following objects are masked from ‘package:dplyr’:

between, first, last

Loading required package: limma
Error: Insufficient controls. Please run MINTIE with at least 1 controls.
ERROR: Expected output file all/eq_classes_de.txt in stage run_de (all) could not be found

========================================= Pipeline Failed ==========================================

In stage Unknown: One or more parallel stages aborted. The following messages were reported:

------------------------------------------- Unknown all --------------------------------------------

Expected output file all/eq_classes_de.txt in stage run_de (all) could not be found


Use 'bpipe errors' to see output from failed commands.

(mintie) PowerEdge-T440:~$ bpipe errors

============================== Found 0 failed commands from run 57035 ==============================
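
Since the R step stopped with "Insufficient controls", it may be worth verifying that the control glob matches paired files before starting the pipeline. The sketch below does only that. It assumes the `_R1.fastq.gz` / `_R2.fastq.gz` naming used by the test data; how MINTIE itself decides which inputs count as controls is not shown in this thread, so this is a pre-flight check and not a reproduction of its logic.

```python
import glob
import os

READ_SUFFIXES = {"_R1.fastq.gz": "R1", "_R2.fastq.gz": "R2"}  # assumed naming


def sample_name(path):
    """Strip the directory and the _R1/_R2 suffix; None if the name differs."""
    base = os.path.basename(path)
    for suffix in READ_SUFFIXES:
        if base.endswith(suffix):
            return base[: -len(suffix)]
    return None


def paired_samples(paths):
    """Return sorted sample names that have both an R1 and an R2 file."""
    seen = {}
    for path in paths:
        name = sample_name(path)
        if name is None:
            continue
        mate = "R1" if path.endswith("_R1.fastq.gz") else "R2"
        seen.setdefault(name, set()).add(mate)
    return sorted(n for n, mates in seen.items() if mates == {"R1", "R2"})


def preflight(case_glob="cases/*.fastq.gz", control_glob="controls/*.fastq.gz"):
    """Return (case samples, control samples) matched by the two patterns."""
    return (paired_samples(glob.glob(case_glob)),
            paired_samples(glob.glob(control_glob)))


if __name__ == "__main__":
    cases, controls = preflight()
    print("cases:", cases)
    print("controls:", controls)
    if not controls:
        print("no paired control files matched; check the controls/ path")
```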
