oshlack / mintie
Method for Identifying Novel Transcripts and Isoforms using Equivalence classes, in cancer and rare disease.
License: MIT License
Hi,
It is mentioned in the documentation that MINTIE can be run without control data as follows:
bpipe run @$MINTIEDIR/params.txt $MINTIEDIR/MINTIE.groovy $cases
Can MINTIE be run without control data and without bpipe like this?:
mintie -w -p params.txt cases/*.fastq.gz
I have installed the tools successfully, but I can't run more than one case at a time with bpipe. Every time several cases reach the assemble stage simultaneously, only the last processed case continues to run, while the others stop with the details below:
====================== Pipeline Failed ===========================
One or more parallel stages aborted. The following messages were reported:
---------------------------------------- assemble ( 14 ) -----------------------------------------
Command in stage assemble failed with exit status = 137 :
rlens=`zcat 14/trim1.fastq.gz 14/trim2.fastq.gz | awk -v mrl=76 'BEGIN {minlen = mrl; maxlen = 0} { if (NR % 4 == 2) { rlen = length($1) ; if (rlen > maxlen) {maxlen = rlen} if (rlen < minlen) {minlen = rlen} }} END {print minlen" "maxlen}'` ; min_rlen=${rlens% *} ; max_rlen=${rlens#* } ; if [ ! -d /asnas/wangqf_group/yanln/MINTIE/14/14/SOAPassembly ]; then mkdir /asnas/wangqf_group/yanln/MINTIE/14/14/SOAPassembly ; fi ; cd /asnas/wangqf_group/yanln/MINTIE/14/14/SOAPassembly ; echo "max_rd_len=$max_rlen" > config.config ; echo -e "[LIB]\nq1=../../14/trim1.fastq.gz\nq2=../../14/trim2.fastq.gz" >> config.config ; if [ -e SOAP.fasta ]; then rm SOAP.fasta ; fi ; for k in 29 49 69 ; do if [ $k -gt $min_rlen ]; then echo "WARNING: Kmer size $k exceeds minimum read length ${min_rlen}. Please double check parameters." ; else /xtdisk/liuxin_group/suipp/ITD/RNA_seq/MINTIE-v0.3.0/tools/bin/soapdenovotrans pregraph -s config.config -o outputGraph_$k -K $k -p 6 ; /xtdisk/liuxin_group/suipp/ITD/RNA_seq/MINTIE-v0.3.0/tools/bin/soapdenovotrans contig -g outputGraph_$k ; cat outputGraph_$k.contig | sed "s/^>/>k${k}_/g" >> SOAP.fasta ; fi ; done ; cd ../../ ; /xtdisk/liuxin_group/suipp/ITD/RNA_seq/MINTIE-v0.3.0/tools/bin/dedupe in=14/SOAPassembly/SOAP.fasta out=stdout.fa threads=6 overwrite=true | /xtdisk/liuxin_group/suipp/ITD/RNA_seq/MINTIE-v0.3.0/tools/bin/fasta_formatter | awk '!/^>/ { next } { getline seq } length(seq) > 100 { print $0 "\n" seq }' > 14/14_denovo_filt.fasta ; if [ ! -s 14/14_denovo_filt.fasta ] ; then rm 14/14_denovo_filt.fasta ; echo "ERROR: de novo assembled contigs fasta file is empty." ; echo "Please check paths for SOAPdenovoTrans, dedupe and fasta" ; echo "formatter are correct, and their dependencies are installed." ; fi ;
----------------------------------------------------------------------------------------------------
Use 'bpipe errors' to see output from failed commands.
The content of the parameter file is:
-p threads=6
-p assembly_mem=100
-p assembler=soap
-p scores=33
-p min_read_length=76
-p min_contig_len=100
-p minQScore=20
-p Ks=29,49,69
-p min_gap=3
-p min_clip=20
-p min_match=30,0.3
-p min_logfc=2
-p min_cpm=0.1
-p fdr=0.05
-p sort_ram=4G
-p gene_filter=
-p var_filter=
-p splice_motif_mismatch=1
-p fastqCaseFormat=cases/%_R*.fastq.gz
-p fastqControlFormat=controls/%_R*.fastq.gz
-p assemblyFasta=
-p run_de_step=true
I don't know why I can't run several cases simultaneously. Looking forward to your reply. Thanks.
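A note on the exit status reported above: by shell convention, a status above 128 means the command was terminated by a signal, and 137 is 128 + 9, i.e. signal 9 (SIGKILL). SIGKILL is often sent by the kernel's out-of-memory killer or by a scheduler enforcing a memory limit, so several assemblies running at once may be exhausting memory. That cause is only an assumption here; the log does not confirm it. A minimal sketch for decoding such statuses (the function name is hypothetical):

```python
import signal


def describe_exit_status(status: int) -> str:
    """Describe a shell-style exit status using the 128 + N signal convention."""
    if status == 0:
        return "success"
    if status > 128:
        signum = status - 128
        try:
            # Map the signal number to its symbolic name on this platform.
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not defined on this platform.
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with error code {status}"


# The status reported by the failed assemble stage.
print(describe_exit_status(137))
```

If memory is the cause, lowering the number of cases assembled concurrently would be one thing to try.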
This is really a great tool, is it suitable for plants?
Hi, when running with differential expression turned off, filter_fasta.py
took extraordinarily long (in fact, it exceeded the cluster's allocated time of 20 hrs).
Upon closer inspection, this appears to be due to the record.id in tx_list[col_id].values
check, which scans the whole array for every record. I've provided an alternative approach.
Before
tx_list = pd.read_csv(tx_list_file, sep='\t', header=header)
handle = open(fasta_file, 'r')
for record in SeqIO.parse(handle, 'fasta'):
    record.description = record.id
    if record.id in tx_list[col_id].values:
        sys.stdout.write(record.format('fasta'))
handle.close()
After
tx_list = pd.read_csv(tx_list_file, sep='\t', header=header)
lookup_list = set(tx_list[col_id].values.tolist())
handle = open(fasta_file, 'r')
for record in SeqIO.parse(handle, 'fasta'):
    record.description = record.id
    if record.id in lookup_list:
        sys.stdout.write(record.format('fasta'))
handle.close()
Brief test with a 2000-line FASTA (roughly 13x reduction in wall-clock time, 48.3 s vs 3.6 s)
√ % wc -l sample_denovo_filt_2000.fasta
2000 sample_denovo_filt_2000.fasta
(venv_mintie) MINTIE
√ % time python filter_fasta_previous.py sample_denovo_filt_2000.fasta eq_classes_de.txt --col_id contig > sample_de_contigs_prev.fasta
python filter_fasta_previous.py sample_denovo_filt_2000.fasta --col_id conti 48.28s user 0.63s system 101% cpu 48.308 total
(venv_mintie) MINTIE
√ % time python filter_fasta.py sample_denovo_filt_2000.fasta eq_classes_de.txt --col_id contig > sample_de_contigs.fasta
python filter_fasta.py sample_denovo_filt_2000.fasta eq_classes_de.txt conti 3.81s user 0.46s system 119% cpu 3.582 total
(venv_mintie) MINTIE
√ % diff sample_de_contigs.fasta sample_de_contigs_prev.fasta
(venv_mintie) MINTIE
√ % echo $?
0 (i.e. files are identical)
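To illustrate why the change helps, here is a small self-contained sketch comparing membership tests against a sequence and against a set. It uses only the standard library, with a plain list standing in for the pandas `.values` array (both are scanned linearly by `in`); the contig IDs and sizes are made up for illustration and are not taken from the real data.

```python
import timeit

# Hypothetical contig IDs standing in for the tx_list column.
ids = [f"k49_{i}" for i in range(20000)]
as_list = list(ids)   # linear scan on every lookup
as_set = set(ids)     # hash lookup, constant time on average

# 200 query IDs: half are present in ids, half are not.
queries = [f"k49_{i}" for i in range(0, 40000, 200)]


def count_hits(container):
    """Count how many query IDs are found in the container."""
    return sum(1 for q in queries if q in container)


list_time = timeit.timeit(lambda: count_hits(as_list), number=3)
set_time = timeit.timeit(lambda: count_hits(as_set), number=3)

# Both containers must give the same answer; only the cost differs.
assert count_hits(as_list) == count_hits(as_set)
print(f"list: {list_time:.4f}s  set: {set_time:.4f}s")
```

The gap grows with the number of IDs in the lookup table, which would be consistent with the slowdown seen on full-size inputs.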
The following line fails:
wget --no-check-certificate http://ccb.jhu.edu/software/hisat2/dl/hisat2-2.1.0-Linux_x86_64.zip
Could replace with:
wget "https://cloud.biohpc.swmed.edu/index.php/s/hisat2-210-Linux_x86_64/download" -O hisat2-2.1.0-Linux_x86_64.zip
Hi, I tried running this with hg19. However, the CHESS database does not support this genome version. Can you give me some suggestions on how to construct an hg19 reference for this tool?
Thanks
Hello,
You state in your Documentation that Mintie needs to be installed using the conda base environment if used on a cluster. On our cluster this is a bit tricky since there are conflicting packages and in general users may want to use environments to avoid conflicts with other packages they use.
Is there a way one could propagate environments? Is this limitation due to the generation of the submission scripts through bpipe? I'm not familiar with bpipe, but from a first glance it looked as if there is not an option yet to easily extend the submission scripts.
Thank you
Do you happen to have a docker image ready for analysis?
Hi Marek,
I've updated the MINTIE code to the latest that is on github as of yesterday (18 May 2020) and now I get the following error within the stage post-process:
==================================== Stage post_process (Rh18) =====================================
/usr/local/lib/python3.7/site-packages/pandas/compat/__init__.py:84: UserWarning: Could not import the lzma module. Your installed Python is incomplete. Attempting to use lzma compression will result in a RuntimeError.
warnings.warn(msg)
Traceback (most recent call last):
File "/usr/local/MINTIE/annotate/post_process.py", line 180, in <module>
main()
File "/usr/local/MINTIE/annotate/post_process.py", line 167, in main
contigs = add_de_info(contigs, de_results)
File "/usr/local/MINTIE/annotate/post_process.py", line 92, in add_de_info
contigs = contigs.drop_duplicates(ignore_index=True)
TypeError: drop_duplicates() got an unexpected keyword argument 'ignore_index'
Cleaned up file Rh18/Rh18_results.tsv to .bpipe/trash/Rh18_results.tsv.1
ERROR: stage post_process failed: Command in stage post_process failed with exit status = 1 :
/usr/local/bin/python3.7 /usr/local/MINTIE/annotate/post_process.py Rh18 Rh18/novel_contigs_info.tsv Rh18/eq_classes_de.txt Rh18/vaf_estimates.txt --log /data/SarcomaCellLines/Analysis/Rh18/MINTIE/Rh18/postprocess.log > Rh18/Rh18_results.tsv
These are the parameters I'm using:
" -p threads=8 -p genome_mem=8000000000 -p assembly_mem=8 -p assembler=soap -p scores=33 -p min_read_length=50 -p max_read_length=150 -p minQScore=20 -p Ks=79,49 -p min_gap=7 -p min_clip=20 -p min_match=30,0.3 -p min_logfc=2 -p min_cpm=0.1 -p fdr=0.05 -p sort_ram=4G -p gene_filter= -p var_filter= -p assemblyFasta= -p test_mode=false -p fastqCaseFormat=cases/%_R*.fastq.gz -p fastqControlFormat=controls/%_R*.fastq.gz"
Thank you!
Hi, I installed mintie from conda (as suggested in the wiki):
mamba create -c conda-forge -c bioconda -n mintie mintie
Info on the package:
# Name Version Build Channel
mintie 0.4.1 hdfd78af_0 bioconda
But when I try to set up the reference with mintie -r, it fails.
Here are some "interesting" parts of the log:
Generating references...
genome_fasta not found, setting this up...
--2022-11-28 16:54:12-- http://ccb.jhu.edu/chess/data/hg38_p8.fa.gz
Resolving ccb.jhu.edu (ccb.jhu.edu)... 128.220.233.141
Connecting to ccb.jhu.edu (ccb.jhu.edu)|128.220.233.141|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2022-11-28 16:54:12 ERROR 404: Not Found.
gzip: hg38_p8.fa.gz: No such file or directory
[E::fai_build3_core] Failed to open the file hg38_p8.fa
[faidx] Could not build fai index hg38_p8.fa.fai
...
FileNotFoundError: [Errno 2] No such file or directory: 'chess2.2.gtf'
cat: tx2gene.success: No such file or directory
Checking that all required references were setup:
WARNING: genome_fasta could not be found!!!! You will need to setup genome_fasta manually, then add its path to references.groovy
WARNING: tx_annotation could not be found!!!! You will need to setup tx_annotation manually, then add its path to references.groovy
WARNING: trans_fasta could not be found!!!! You will need to setup trans_fasta manually, then add its path to references.groovy
WARNING: ann_info could not be found!!!! You will need to setup ann_info manually, then add its path to references.groovy
WARNING: tx2gene could not be found!!!! You will need to setup tx2gene manually, then add its path to references.groovy
gmap_refdir looks like it has been setup
gmap_genome looks like it has been setup
**********************************************************
WARNING: One or more command did not install successfully. See warning messages above. You will need to correct this before running MINTIE.
Here the full log.
Thanks,
The datasets I am working with consist of a large collection of paired-end FASTQ files that are not gzip-compressed. When I attempt to run MINTIE on these uncompressed files by modifying the params.txt file (-p fastqCaseFormat=cases/%_R*.fastq; -p fastqControlFormat=controls/%_R*.fastq), I encounter an error message that says, “Stage fastq_dedupe unable to locate one or more inputs specified by 'from' ending with [*.gz]”.
I understand that MINTIE is currently configured to work with gzipped fastq files. However, given the volume of data I am dealing with, it would be incredibly helpful if I could use MINTIE directly on uncompressed FASTQ files. This would save a significant amount of time that would otherwise be spent on file conversion. If it is possible to modify MINTIE to enable the direct use of FASTQ files, I would greatly appreciate your guidance on how to make this happen.
Many false positives seem to result from poor alignment of contigs to the genome, which leads to bad annotation, e.g.
k49_1134199 0 chr21 44220603 3 281S16M6810N5M2435N6M6199N13M50029N16M1I6M32I32M21I4M4911N3M16052N8M8047N10M2202N2I11M1I25M3I22M6D11M750N10M1163N4M1809N3M1264N7M24156N4M6825N5M3920N2M12899N7M4539N3M7819N8M10199N7M1D1M4453N8M5667N2M11730N3M55386N4M402730N2M6445N3I6M15389N1M13514N9M1I47M2N4I126M15D1279M * 00 CAGCGCTCCTGGCCCCCCGAAGTCCCAGAGCTGCTGACCCCCACCCCAGCTGCATCAGAGAGCCTGTCTGGGGCCAAGGTTGCCAGAGATTTCTGAAGACACAGCTTGTTCCTTGTTCTTGGCTGGTGGGTGCACAAGGACTTCTGGAAGGGATTTAGACGGGGCTGAGTGCTAGGATTAAAGTGGGGATGGGAGTACGGCAACAGAAAAACCTGGGAGCTAGCAATGCACCCAGCCCTTGACTGTGCCCTGGTGGACAGCCGAGCTGTGGCTCTAGCGTGAGCCAGTGCCTTCCTGTCCCTGCCAAGGGTGAGGCCAGAGTTGGCCCCGAGGCTAATGTTTCAGTGGGTGAGATTAGGTCGGCCGTACAGAGGCCGGTGGGCTCCCTGACATCCCTTCCAGGCAACCTGAAAGCACTGAAATAGCTTATGGCCCTGTGCCAGGGACCTTGGCCCAAGCTGCTGACCTCCAGGGTGGGGAGGGAGCTACCCCCAGGAGAAGAGTCACTCAGACAGCAGTATGAGCAAGCCAGCCAGCAGCTCCGTGCCTGCACCCAGCTCAGGGGAATCCCAGGGGGTTCAGATGCCCAGGAAGGAAAAGGGGACAGCGCTACTGCTATGGAATGAGACCACCACTTCTCCTGTTGTCCTTCCCAGCTTCTCCCCAACCTCCCCTTTTCCCTAGTTTATAAGACAGGAGAAAAGGGAGAAAGCAAAAAGCTGGAAAGAAACAGAAGTAAGATAAATAGCTAGACGACCTTGGCGCCACCACCTGGCCCTGGTGGTTAAAATGATAATAATATTAACCCCTGACCAAAACGACTGGTGTTATCTGTAAATCCCAGACATTGTGTGAGAAAGCACCGTAAAACTTTTTGTCCTATTAGCTGATGTGTGTAGCCCCCAGTCACGTTCCTCACGCTTACTTGATCTATTATGACCCTTTCACGTGGACCCCTTAGAGTTGTAAGCTCTTAAAAGGGCTAGGAATTTCTTTTTCGGGGAGCTCGGCTCTTAAGACGCAAGTCTGCTGACACTCCTGGCCAAATAAAGCCCTTCCTTCTTTAACCGAGTGTCTGAGGAATTCTGTCTGCGGCTTGTCCGGCTACAACGGTGCTGGAGCCCAGACTCTCAGGGAAAGGAACCCGAGCCGTCAGAAAACCATCTGATTCCAGGCTGGGGCAAGGGACATGGAGATGGGCCTGCAGCATCATGTTGCTCCAGAAAGCAAGAAAGTGCTCAGAACGGTAGAACGGGGATGCATGGACAGGACACGCAGCCAGACCTAGCGGATTTGAGCATCTCGGGGAAGAAAGGACAGCCACAGATCATGCACTACTGAACAAAATAAAACTGTGGGTCACGCTGATGAGAGAGAGGCTGCAGAGAAGGAGAGACCCTTCCTTAGGTTGGCAGCCGTGAGTGGCAGGCGGGGACCAGCACGGCACCAATCTGCAGCCATCGCAGTGATGGCGGCTTCAGGCGGGGACCTCCGCGGATGCTGAGCCTGCGGGTGCGATTTGATGAGGGCAGAACCTCACCAGCCCACAGTGGCTGCGAGGGGATCATGCAGCGGGATGGGGAGGCCGGGGGGATGCCGTCTCAGCAGAGCCGTCCACGCTGACCTCATCAAGACTGGGACGGGGCCACAGCAGTGCCTCTCATGGGCACTTAGGACACCGTCACTGAGGGGC
TCCTGCCAAAGCACACCTGAGTCCAGGCAGAGGAAACTCCAGACAAGACCCCCGAGGGTCATGCTACAAAGCTGCTCTCCTGACTTCCTCAGAAACGCCCAAGGACAGGAAAGACAAAGAAAGCTGAGGACTTGTCCAGATTCAAGAAGCCCAAGGAGACGGCTGAGCGTAGGGCGAGCCTGGGTGAGGAGATTCAGAGCGTTAGACGGCTGAGCGCAGTGTGTGAACCTGGGTTAGGAGATTTGGGGCCTGAGATGGCTGAGTGCAGGGTGAGCCTGAGTGAGGAGATTCTGAGCCTGAGACAGCTGAGCACAGGGTGAGCCTGGGTGACAAAATCCACCAGGAAAATATGCTCACGAAGACATCATTGGGACAACCAATAAAATATGCGT * MD:Z:35AG4G1T8AC19C1C1CG3CC1GCT2CC2A7G24T4TCT3A2CCTC2GCT1A1T1T6C1T2TGAGGG2C1^GGGACA1CA1G48G17^G4A1G43C2C3C3TT7A1CC33A14T29C3T18C16C1A6T5TT^ATTATTATTATTAAC13T19T11A3A6TT1C8C3T2G1C8CA4A10A5C3A3G2CT6C1CA9T23C313C14A784 NH:i:1HI:i:1 NM:i:175 SM:i:40 XQ:i:40 X2:i:0 XO:Z:UU
The read should align to chr21:43268915-43270392 and chr2:231884096-231893280
Replacing GMAP with minimap2 in the following stage could be a simple improvement (but would require a fair amount of validation work). Keen on your thoughts @mcmero
align_contigs_against_genome = {
    def sample_name = branch.name
    output.dir = sample_name
    produce('aligned_contigs_against_genome.sam'){
        exec """
        $gmap -D $gmap_refdir -d $gmap_genome -f samse -t $threads -x $min_gap --max-intronlength-ends=500000 -n 0 $input.fasta > $output
        """, "align_contigs_against_genome"
    }
}
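Independently of which aligner is used, alignments like the one above could be flagged from the CIGAR string alone: it contains dozens of N (skipped region) operations separating aligned blocks of only a few bases. The sketch below is a heuristic illustration, not part of MINTIE; the function names and both thresholds are hypothetical and would need tuning against real data.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM specification).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def summarise_cigar(cigar: str) -> dict:
    """Count N operations, their total length, and aligned (M/=/X) bases."""
    introns = 0
    intron_bases = 0
    aligned_bases = 0
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op == "N":
            introns += 1
            intron_bases += n
        elif op in "M=X":
            aligned_bases += n
    return {"introns": introns, "intron_bases": intron_bases,
            "aligned_bases": aligned_bases}


def looks_suspicious(cigar: str, max_introns: int = 10,
                     min_mean_block: float = 20.0) -> bool:
    """Flag alignments with many introns or very short aligned blocks on average."""
    s = summarise_cigar(cigar)
    blocks = s["introns"] + 1  # N operations split the alignment into blocks
    return s["introns"] > max_introns or s["aligned_bases"] / blocks < min_mean_block


# The first few operations of the CIGAR string quoted in the issue.
print(summarise_cigar("281S16M6810N5M2435N6M"))
print(looks_suspicious("281S16M6810N5M2435N6M"))
```

A filter like this would only hide the symptom; whether such contigs should be realigned or dropped is a separate question.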
Dear Marek,
I installed MINTIE using Mamba-forge (and micromamba, I tried both) on my WSL (Ubuntu). I tried to run a test in a folder outside of base folder and got an error:
A script, requested to be loaded from file '/home/drshamy/micromamba/envs/mintie/share/mintie-0.4.2-0/references.groovy', could not be accessed.
I looked in the referred folder and there is no file called references.groovy.
Should there be a file called references.groovy? Can you help me to solve the issue?
Thank you very much in advance.
Hello,
I am trying to use Mintie (v4.2-0) with a mouse reference (I managed to create the groovy file); however, I keep getting the error below. My pattern seems correct. Please let me know if I am missing anything here. Could it be because I have "-" in my sample IDs? If so, is there a way to get around that?
ERROR: The pattern provided 'cases/%_R*.fastq.gz' did not match any of the files provided as input [cases/SampleXR-01-04_R1.fastq.gz, cases/SampleXR-01-04_R2.fastq.gz, cases/SampleXR-01-05_R1.fastq.gz, cases/SampleXR-01-05_R2.fastq.gz, cases/SampleXR-01-06_R1.fastq.gz, cases/SampleXR-01-06_R2.fastq.gz, controls/SampleXR-01-01_R1.fastq.gz, controls/SampleXR-01-01_R2.fastq.gz, controls/SampleXR-01-02_R1.fastq.gz, controls/SampleXR-01-02_R2.fastq.gz, controls/SampleXR-01-03_R1.fastq.gz, controls/SampleXR-01-03_R2.fastq.gz]
Here is my param.txt
-p threads=16
-p assembly_mem=200
-p assembler=soap
-p scores=33
-p min_read_length=80
-p min_contig_len=100
-p minQScore=20
-p Ks=79,49
-p min_gap=7
-p min_clip=20
-p min_match=30,0.3
-p min_logfc=5
-p min_cpm=0.1
-p fdr=0.05
-p sort_ram=4G
-p gene_filter=
-p var_filter=
-p splice_motif_mismatch=4
-p fastqCaseFormat=cases/%_R*.fastq.gz
-p fastqControlFormat=controls/%_R*.fastq.gz
-p assemblyFasta=
-p run_de_step=true
Also, is there an option to provide a custom references.groovy, besides replacing the one in $MINTIEDIR?
Dear Marek,
I have installed MINTIE using Mamba on WSL (Ubuntu). Although I managed to install the reference genome (mintie -r), I'm getting an error at the assemble stage when running the test dataset (after mintie -t). The trimming stage seems to work correctly (I got trimmed output files), but the pipeline fails to create the "denovo_filt.fasta" and/or "SOAP.fasta" files (those files are missing). I also tried installing MINTIE manually, including all packages required by the pipeline (adding their paths to the "tool.groovy" file), as well as Trinity and rnaSPAdes. However, I ran into exactly the same problem. Do you have any idea what might be causing this error? Do I need to reconfigure the SOAPdenovo-Trans parameters somehow? I have run and tested all packages in the pipeline individually and they all work correctly. I'm very keen to use MINTIE in my project (it might be exactly what I need), so I would appreciate your help in solving this problem.
Thank you in advance.
Hi Marek:
I have mintie installed via conda, and set up the reference using the updated sh file.
mintie -r
and mintie -h
all run without issue.
But the test run failed with following Groovy error, could you have a look what is wrong with my installation?
OpenJDK 64-Bit Server VM warning: Options -Xverify:none and -noverify were deprecated in JDK 13 and will likely be removed in a future release.
java.lang.NoClassDefFoundError: Could not initialize class org.codehaus.groovy.vmplugin.v7.Java7
at org.codehaus.groovy.vmplugin.VMPluginFactory.<clinit>(VMPluginFactory.java:43)
at org.codehaus.groovy.reflection.GroovyClassValueFactory.<clinit>(GroovyClassValueFactory.java:35)
at org.codehaus.groovy.reflection.ClassInfo.<clinit>(ClassInfo.java:107)
at org.codehaus.groovy.reflection.ReflectionCache.getCachedClass(ReflectionCache.java:95)
at org.codehaus.groovy.reflection.ReflectionCache.<clinit>(ReflectionCache.java:39)
at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.registerMethods(MetaClassRegistryImpl.java:209)
at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.<init>(MetaClassRegistryImpl.java:107)
at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.<init>(MetaClassRegistryImpl.java:85)
at groovy.lang.GroovySystem.<clinit>(GroovySystem.java:36)
at org.codehaus.groovy.runtime.InvokerHelper.<clinit>(InvokerHelper.java:86)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.createCallStaticSite(CallSiteArray.java:74)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.createCallSite(CallSiteArray.java:161)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:115)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:127)
at bpipe.Runner.<clinit>(Runner.groovy:58)
at bpipe.Runner9.main(Runner9.java:47)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at org.codehaus.groovy.tools.GroovyStarter.rootLoader(GroovyStarter.java:110)
at org.codehaus.groovy.tools.GroovyStarter.main(GroovyStarter.java:128)
java.lang.reflect.InvocationTargetException
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at org.codehaus.groovy.tools.GroovyStarter.rootLoader(GroovyStarter.java:110)
at org.codehaus.groovy.tools.GroovyStarter.main(GroovyStarter.java:128)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.codehaus.groovy.reflection.ReflectionCache
at org.codehaus.groovy.runtime.dgmimpl.NumberNumberMetaMethod.<clinit>(NumberNumberMetaMethod.java:33)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
at java.base/java.lang.reflect.ReflectAccess.newInstance(ReflectAccess.java:128)
at java.base/jdk.internal.reflect.ReflectionFactory.newInstance(ReflectionFactory.java:347)
at java.base/java.lang.Class.newInstance(Class.java:645)
at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.createMetaMethodFromClass(MetaClassRegistryImpl.java:257)
at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.<init>(MetaClassRegistryImpl.java:110)
at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.<init>(MetaClassRegistryImpl.java:85)
at groovy.lang.GroovySystem.<clinit>(GroovySystem.java:36)
at org.codehaus.groovy.runtime.InvokerHelper.<clinit>(InvokerHelper.java:86)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.createCallStaticSite(CallSiteArray.java:74)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.createCallSite(CallSiteArray.java:161)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:115)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:127)
at bpipe.Runner.<clinit>(Runner.groovy:58)
at bpipe.Runner9.main(Runner9.java:47)
... 6 more
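The warning at the top of this trace says the -Xverify:none option was deprecated in JDK 13, so the Java on the path is at least that recent. One possible cause, stated here as an assumption rather than a confirmed diagnosis, is that the Groovy runtime bundled with bpipe does not initialise on such a recent JDK. A first diagnostic step might be to confirm which major Java version the environment picks up. The sketch below parses the banner printed by `java -version`; the function name is hypothetical and the banner strings are typical examples, not taken from this report.

```python
import re


def java_major_version(banner: str):
    """Extract the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', banner)
    if match is None:
        return None
    major = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x (e.g. "1.8.0_292").
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


# Typical banner lines (illustrative examples only).
print(java_major_version('openjdk version "17.0.3" 2022-04-19'))
print(java_major_version('openjdk version "1.8.0_292"'))
```

If the version turns out to be recent, installing an older JDK into the same conda environment would be one thing to try.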
Would be great to outline which software versions this has been tested with
Hi,
I am trying to use Mintie on a cloud platform (Nimbus): Operating System: Ubuntu 18.04.5 LTS; Kernel: Linux 4.15.0-206-generic; Architecture: x86-64. I installed Mintie using the suggested method (using mamba). The installation completed without issue and I also successfully downloaded the reference files.
I am now trying to run mintie on the test data using
$ mintie -t
$ mintie -w -p test_params.txt cases/*.fastq.gz controls/*.fastq.gz
However, I am running into the following issue relating to a gmap_genome file:
====================================================================================================
| Starting Pipeline at 2023-06-06 08:06 |
====================================================================================================
================================ Stage fastq_dedupe (allvars-case) =================================
==================================== Stage trim (allvars-case) =====================================
================================== Stage assemble (allvars-case) ===================================
============================= Stage create_salmon_index (allvars-case) =============================
================================= Stage run_salmon (allvars-case) ==================================
================================ Stage run_salmon (allvars-control) ================================
=========================== Stage create_ec_count_matrix (allvars-case) ============================
=================================== Stage run_de (allvars-case) ====================================
========================== Stage filter_on_significant_ecs (allvars-case) ==========================
======================== Stage align_contigs_against_genome (allvars-case) =========================
Note: gmap.avx2 does not exist. For faster speed, may want to compile package on an AVX2 machine
GMAP version 2023-04-28 called with args: gmap.sse42 -D /data/miniconda3/envs/mintie/share/mintie-0.4.2-0/ref -d gmap_genome -f samse -t 2 -x 7 --max-intronlength-ends=500000 -n 0 allvars-case/de_contigs.fasta
Checking compiler assumptions for SSE2: 6B8B4567 327B23C6 xor=59F066A1
Checking compiler assumptions for SSE4.1: -103 -58 max=198 => compiler zero extends
Checking compiler options for SSE4.2: 6B8B4567 __builtin_clz=1 __builtin_ctz=0 _mm_popcnt_u32=17 __builtin_popcount=17
Finished checking compiler assumptions
IIT file /data/miniconda3/envs/mintie/share/mintie-0.4.2-0/ref/gmap_genome/gmap_genome.chromosome.iit is not valid
ERROR: Command failed with exit status = 9 :
gmap -D /data/miniconda3/envs/mintie/share/mintie-0.4.2-0/ref -d gmap_genome -f samse -t 2 -x 7 --max-intronlength-ends=500000 -n 0 allvars-case/de_contigs.fasta > allvars-case/aligned_contigs_against_genome.sam
========================================= Pipeline Failed ==========================================
One or more parallel stages aborted. The following messages were reported:
Branch allvars-case in stage Unknown reported message:
Command failed with exit status = 9 :
gmap -D /data/miniconda3/envs/mintie/share/mintie-0.4.2-0/ref -d gmap_genome -f samse -t 2 -x 7 --max-intronlength-ends=500000 -n 0 allvars-case/de_contigs.fasta > allvars-case/aligned_contigs_against_genome.sam
Use 'bpipe errors' to see output from failed commands.
It looks like the gmap_genome references were not created properly, as /data/miniconda3/envs/mintie/share/mintie-0.4.2-0/ref/gmap_genome/gmap_genome.chromosome.iit does not exist.
If you have any suggestions for how I can troubleshoot this, that would be great.
Thanks
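For anyone hitting the same error, a quick pre-flight check on the GMAP index can save a full pipeline run. The sketch below only tests whether the chromosome IIT file named in the log is present and non-empty; it cannot tell whether GMAP would consider the file valid. The function name is hypothetical, and the directory layout is assumed from the path in the error message.

```python
import tempfile
from pathlib import Path


def check_gmap_index(ref_dir, genome="gmap_genome"):
    """Return a problem description for the chromosome IIT file, or None if it looks OK."""
    # Layout assumed from the log: <ref_dir>/<genome>/<genome>.chromosome.iit
    iit = Path(ref_dir) / genome / f"{genome}.chromosome.iit"
    if not iit.is_file():
        return f"missing: {iit}"
    if iit.stat().st_size == 0:
        return f"empty: {iit}"
    return None


# Demo on a throwaway directory that contains no index at all.
with tempfile.TemporaryDirectory() as ref_dir:
    print(check_gmap_index(ref_dir))
```

In practice the first argument would be the ref directory from the log, i.e. the value passed to gmap with -D.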
Hi Marek,
Thank you for sharing your wonderful pipeline! MINTIE is working very well in local mode submitted to our queue through an interactive job.
We have had problems with the cluster implementation as BPIPE requires the qstat -x flag included in PBSpro. Our qstat install is packaged with slurm-torque 18.08.4-1.el7.
Execution Command
nohup srun mintie -w -p params.txt cases/*.fastq.gz controls/*.fastq.gz &
Successfully submits 'fastq_dedup' to the queue as 1 job per sample.
Error
Pipeline hangs after successful completion of 'fastq_dedup'. SLURM exit status is COMPLETE and output fastq are generated.
Outputs in .bpipe/bpipe.log:
bpipe.Utils [38] INFO |11:57:27 Executing command: qstat -x 5451428
bpipe.executor.TorqueStatusMonitor [38] WARNING |11:57:27 Error occurred in processing torque output: java.lang.Exception: Error parsing torque output: unexpected error: Unknown option: x
Environment
The MINTIE installation is for version 0.3.9 installed via miniconda3/mamba. The package version are in the yaml here:
mintie.yml.txt
The BPIPE scheduling configuration is as follows:
executor="slurm"
//controls the total number of procs MINTIE can spawn
//if running locally, ensure that concurrency is not to
//set to more than the number of procs available. If
//running on a cluster, this can be increased
concurrency=10
//following commands are for running on a cluster
walltime="5-20:00:00"
queue="bigmem"
mem_param="mem-per-cpu"
memory="30"
proc_mode=1
usePollerFileWatcher=true
useLegacyTorqueJobPolling=true
procs=10
account="grayl"
//add server-specific module to load
modules="miniconda3"
commands {
Thank you in advance for taking a look at this.
Lesley
Hi!
Thank you for your preprint. I was trying to test the pipeline but faced the following issue. Could you help?
============================== Stage annotate_contigs (allvars-case) ===============================
Traceback (most recent call last):
File "/home/usr/data/mintie/MINTIE-v0.3.5/annotate/annotate_contigs.py", line 720, in <module>
main()
File "/home/usr/data/mintie/MINTIE-v0.3.5/annotate/annotate_contigs.py", line 717, in main
annotate_contigs(args)
File "/home/usr/data/mintie/MINTIE-v0.3.5/annotate/annotate_contigs.py", line 659, in annotate_contigs
bam = pysam.AlignmentFile(args.bam_file, 'rc')
File "pysam/libcalignmentfile.pyx", line 742, in pysam.libcalignmentfile.AlignmentFile.__cinit__
File "pysam/libcalignmentfile.pyx", line 991, in pysam.libcalignmentfile.AlignmentFile._open
ValueError: file has no sequences defined (mode='rc') - is it SAM/BAM format? Consider opening with check_sq=False
Cleaned up file allvars-case/annotated_contigs.vcf to .bpipe/trash/annotated_contigs.vcf.3
Cleaned up file allvars-case/annotated_contigs_info.tsv to .bpipe/trash/annotated_contigs_info.tsv.3
ERROR: stage annotate_contigs failed: Command in stage annotate_contigs failed with exit status = 1 :
/home/usr/miniconda3/envs/mintie/bin/python /home/usr/data/mintie/MINTIE-v0.3.5/annotate/annotate_contigs.py allvars-case allvars-case/aligned_contigs_against_genome.bam /home/usr/data/mintie/MINTIE-v0.3.5/ref/chess2.2.info /home/usr/data/mintie/MINTIE-v0.3.5/ref/chess2.2.gtf allvars-case/annotated_contigs.bam allvars-case/annotated_contigs_info.tsv --minClip 20 --minGap 7 --minMatch 30,0.3 --log /home/usr/data/mintie/MINTIE-v0.3.5/allvars-case/annotate.log > allvars-case/annotated_contigs.vcf
Hi,
I am trying to use Mintie on a cloud platform (Nimbus): Operating System: Ubuntu 18.04.5 LTS; Kernel: Linux 4.15.0-206-generic; Architecture: x86-64. I installed Mintie using the suggested method (using mamba). The installation completed without issue and I also successfully downloaded the reference files.
I am now trying to run mintie on the test data using
$ mintie -t
$ mintie -w -p test_params.txt cases/*.fastq.gz controls/*.fastq.gz
However, I keep getting this java issue:
OpenJDK 64-Bit Server VM warning: Options -Xverify:none and -noverify were deprecated in JDK 13 and will likely be removed in a future release.
java.lang.NoClassDefFoundError: Could not initialize class org.codehaus.groovy.vmplugin.v7.Java7
at org.codehaus.groovy.vmplugin.VMPluginFactory.<clinit>(VMPluginFactory.java:43)
at org.codehaus.groovy.reflection.GroovyClassValueFactory.<clinit>(GroovyClassValueFactory.java:35)
at org.codehaus.groovy.reflection.ClassInfo.<clinit>(ClassInfo.java:107)
at org.codehaus.groovy.reflection.ReflectionCache.getCachedClass(ReflectionCache.java:95)
at org.codehaus.groovy.reflection.ReflectionCache.<clinit>(ReflectionCache.java:39)
at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.registerMethods(MetaClassRegistryImpl.java:209)
at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.<init>(MetaClassRegistryImpl.java:107)
at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.<init>(MetaClassRegistryImpl.java:85)
at groovy.lang.GroovySystem.<clinit>(GroovySystem.java:36)
at org.codehaus.groovy.runtime.InvokerHelper.<clinit>(InvokerHelper.java:86)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.createCallStaticSite(CallSiteArray.java:74)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.createCallSite(CallSiteArray.java:161)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:115)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:127)
at bpipe.Runner.<clinit>(Runner.groovy:58)
at bpipe.Runner9.main(Runner9.java:47)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at org.codehaus.groovy.tools.GroovyStarter.rootLoader(GroovyStarter.java:110)
at org.codehaus.groovy.tools.GroovyStarter.main(GroovyStarter.java:128)
java.lang.reflect.InvocationTargetException
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at org.codehaus.groovy.tools.GroovyStarter.rootLoader(GroovyStarter.java:110)
at org.codehaus.groovy.tools.GroovyStarter.main(GroovyStarter.java:128)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.codehaus.groovy.reflection.ReflectionCache
at org.codehaus.groovy.runtime.dgmimpl.NumberNumberMetaMethod.<clinit>(NumberNumberMetaMethod.java:33)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
at java.base/java.lang.reflect.ReflectAccess.newInstance(ReflectAccess.java:128)
at java.base/jdk.internal.reflect.ReflectionFactory.newInstance(ReflectionFactory.java:347)
at java.base/java.lang.Class.newInstance(Class.java:645)
at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.createMetaMethodFromClass(MetaClassRegistryImpl.java:257)
at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.<init>(MetaClassRegistryImpl.java:110)
at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.<init>(MetaClassRegistryImpl.java:85)
at groovy.lang.GroovySystem.<clinit>(GroovySystem.java:36)
at org.codehaus.groovy.runtime.InvokerHelper.<clinit>(InvokerHelper.java:86)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.createCallStaticSite(CallSiteArray.java:74)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.createCallSite(CallSiteArray.java:161)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:115)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:127)
at bpipe.Runner.<clinit>(Runner.groovy:58)
at bpipe.Runner9.main(Runner9.java:47)
... 6 more
Could you please help me figure out why this is happening?
Thanks!
Chiara
Hello, and thanks for creating such a useful tool. I had a question about understanding the "cpos" output, particularly when the alignment includes gaps. For instance, I have the following result:
chr1 | pos1 | strand1 | chr2 | pos2 | strand2 | variant_type | overlapping_genes | variant_id | partner_id | vars_in_contig | VAF | varsize | contig_varsize | cpos | contig_len | contig_cigar |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
chr1 | 973327 | - | chr1 | 973499 | - | RI | PLEKHN1 | k49_231483a | . | 1 | 0.5953358 | 173 | 173 | 141 | 695 | 48M137N137M436N150M175N360M |
The retained intron position in question here is between the two rightmost exons in the attached screenshot from IGV.
However, if I were to annotate that cpos myself, I'd say it was at 477 (specifically it looks like the RI is from 477 - 477+173). I'm not sure I see how 141 is calculated.
Thanks for the help,
Rachel
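For readers who want to check the arithmetic, the sketch below walks the `contig_cigar` string from the table and lists where each aligned block sits in contig coordinates. It assumes the usual SAM meaning of the operators (M consumes both contig and reference, N consumes only the reference). The helper name `cigar_blocks` is made up for illustration; this is not MINTIE's own cpos calculation and does not by itself explain the reported value of 141.

```python
import re

def cigar_blocks(cigar):
    """Return (op, length, contig_start, contig_end) for each CIGAR operation.

    Assumes standard SAM semantics: M, I, S, = and X consume the contig
    (query); N, D, H and P do not. Coordinates are 0-based, end-exclusive.
    """
    consumes_query = set("MIS=X")
    blocks = []
    pos = 0
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        length = int(length)
        end = pos + length if op in consumes_query else pos
        blocks.append((op, length, pos, end))
        pos = end
    return blocks

cigar = "48M137N137M436N150M175N360M"
blocks = cigar_blocks(cigar)
contig_len = blocks[-1][3]
# Contig offsets at which each N (reference gap) falls.
junctions = [start for op, _, start, _ in blocks if op == "N"]
print(contig_len, junctions)
```

The M lengths sum to 695, which matches `contig_len`, and the three gaps fall at contig offsets 48, 185 and 335. Neither 141 nor 477 coincides with one of these offsets.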
Hi,
Congratulations on the recent bioRxiv - I really enjoyed it and (hopefully) look forward to applying it in my work.
I am currently trying to test it out on my non-human non-model organism, and I am getting an error that I am struggling to troubleshoot at the "annotate contigs" stage. The error is below. Is there something obvious I should look at that I am missing?
Any advice is greatly appreciated.
Kind regards,
Steve
================================= Stage annotate_contigs (7554789) =================================
Traceback (most recent call last):
File "/nfs/users/nfs_s/sd21/lustre118_link/software/TRANSCRIPTOME/MINTIE/annotate/annotate_contigs.py", line 28, in <module>
from utils import cached, init_logging, exit_with_error
File "/lustre/scratch118/infgen/team133/sd21/software/TRANSCRIPTOME/MINTIE/annotate/utils.py", line 17
print("{} ERROR: {}, exiting".format(PROGRAM_NAME, message), file=sys.stderr)
^
SyntaxError: invalid syntax
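A caret under `file=sys.stderr` in a `print(...)` call is the usual symptom of a Python 2 interpreter parsing Python 3 code. A minimal check, assuming nothing about MINTIE itself (the function name `describe_interpreter` is hypothetical), is to report which interpreter is actually running:

```python
import sys

def describe_interpreter(version_info=sys.version_info, executable=sys.executable):
    """Summarise an interpreter and whether it parses print(..., file=...) by default."""
    major, minor = version_info[0], version_info[1]
    return {
        "executable": executable,
        "version": "{}.{}".format(major, minor),
        # Python 2 only accepts this syntax with `from __future__ import print_function`.
        "supports_print_function": major >= 3,
    }

info = describe_interpreter()
print(info)
```

Running this with the same `python` that the pipeline stage invokes shows whether that stage is picking up a Python 2 installation.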
Hi.
I am trying to install MINTIE using conda, but it consistently fails at the "Solving environment" step. Could you please help with this? Please see below:
(base) C:\Users\ysnuy>conda create -c conda-forge -c bioconda -n mintie mintie
Collecting package metadata (current_repodata.json): done
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: |
Found conflicts! Looking for incompatible packages.
This can take several minutes. Press CTRL-C to abort.
failed
UnsatisfiableError:
(base) C:\Users\ysnuy>
Hello,
I'm currently trying to run a simulation through MINTIE using the run_simu.py files in the simu folder. I have resolved most of the issues I have encountered except the following.
python run_simu.py params.ini --log simulog.txt
Traceback (most recent call last):
File "/home/neuro/miniconda3/envs/mintie/run_simu.py", line 345, in <module>
main()
File "/home/neuro/miniconda3/envs/mintie/run_simu.py", line 342, in main
simulate(args)
File "/home/neuro/miniconda3/envs/mintie/run_simu.py", line 321, in simulate
available_genes = simulate_fusions(simp, params, available_genes, valid_txs,
File "/home/neuro/miniconda3/envs/mintie/run_simu.py", line 124, in simulate_fusions
fus_parts = simu.write_fusion((tx1, tx2), (gene1, gene2), all_exons, paths['genome_fasta'], \
File "/home/neuro/miniconda3/envs/mintie/simu.py", line 595, in write_fusion
block_seq = get_random_block(chr1, gene_trees, genome_fasta, block_range)
File "/home/neuro/miniconda3/envs/mintie/simu.py", line 397, in get_random_block
chr_range = chr_sizes[('chr%s' % chrom)]
KeyError: 'chrchr19'
I'm not entirely sure how to proceed.
Thanks for the help!
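The key `'chrchr19'` suggests that `chrom` already carried a `chr` prefix when `'chr%s' % chrom` was applied, for example because the annotation uses UCSC-style names. A minimal sketch of a guard, assuming that is the cause; `with_chr_prefix` and the size table are hypothetical and not part of simu.py:

```python
def with_chr_prefix(chrom):
    """Return the chromosome name with exactly one leading 'chr'."""
    chrom = str(chrom)
    return chrom if chrom.startswith("chr") else "chr" + chrom

# Hypothetical size table; the length is a made-up value for the example.
chr_sizes = {"chr19": 1000}

for name in ("19", "chr19"):
    key = with_chr_prefix(name)
    print(name, "->", key, chr_sizes[key])
```

Both spellings resolve to the same key, so the lookup no longer depends on which naming style the input files use.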
Hi,
My rnaseq dataset is single end and I want to run MINTIE with it. I get the error that
stage fastq_dedupe failed: Insufficient inputs: at least 2 inputs are expected with extension .gz but only 1 are available
How can I run MINTIE with a single-end dataset?
In order to test, I tried to run the test dataset after deleting the R2 files from cases and controls, as follows:
mintie -w -p test_params.txt cases/*.fastq.gz controls/*.fastq.gz
In test_params.txt, I made the following changes:
-p fastqCaseFormat=cases/%_R1.fastq.gz
-p fastqControlFormat=controls/%_R1.fastq.gz
I get the same error. Could you let me know how to run the tool on a single-end dataset?
Thanks,
Dear Author
I have manually installed MINTIE 0.4.2 and set up the references without error. However, running mintie -w -p test_params.txt cases/*.fastq.gz controls/*.fastq.gz produced the error below. I would appreciate your help.
mntieTest.3rdtime.log
╒══════════════════════════════════════════════════════════════════════════════════════════════════╕
| Starting Pipeline at 2024-02-02 21:04 |
╘══════════════════════════════════════════════════════════════════════════════════════════════════╛
================================ Stage fastq_dedupe (allvars-case) =================================
Reads before:
423500 allvars-case/temp1.fastq
Reads after:
423400 allvars-case/allvars-case.1.fastq
==================================== Stage trim (allvars-case) =====================================
TrimmomaticPE: Started with arguments:
-threads 2 -phred33 allvars-case/allvars-case.1.fastq.gz allvars-case/allvars-case.2.fastq.gz allvars-case/trim1.fastq /dev/null allvars-case/trim2.fastq /dev/null LEADING:20 TRAILING:20 MINLEN:80
Input Read Pairs: 105850 Both Surviving: 105850 (100.00%) Forward Only Surviving: 0 (0.00%) Reverse Only Surviving: 0 (0.00%) Dropped: 0 (0.00%)
TrimmomaticPE: Completed successfully
================================== Stage assemble (allvars-case) ===================================
bash: line 1: 1461 Segmentation fault /mnt/d/mintie_240129/MINTIE-0.4.2_240129/MINTIE-0.4.2/tools/bin/soapdenovotrans pregraph -s config.config -o outputGraph_$k -K $k -p 2
ERROR: stage assemble failed: Command in stage assemble failed with exit status = 139 :
rlens=gunzip -c allvars-case/trim1.fastq.gz allvars-case/trim2.fastq.gz | awk -v mrl=80 'BEGIN {minlen = mrl; maxlen = 0} { if (NR % 4 == 2) { rlen = length($1) ; if (rlen > maxlen) {maxlen = rlen} if (rlen < minlen) {minlen = rlen} }} END {print minlen" "maxlen}'
; min_rlen=${rlens% } ; max_rlen=${rlens# } ; if [ ! -d allvars-case/SOAPassembly ]; then mkdir allvars-case/SOAPassembly ; fi ; cd allvars-case/SOAPassembly ; echo "max_rd_len=$max_rlen" > config.config ; echo -e "[LIB]\nq1=../../allvars-case/trim1.fastq.gz\nq2=../../allvars-case/trim2.fastq.gz" >> config.config ; if [ -e SOAP.fasta ]; then rm SOAP.fasta ; fi ; for k in 79 49 ; do if [ $k -gt $min_rlen ]; then echo "WARNING: Kmer size
========================================= Pipeline Failed ==========================================
In stage Unknown: One or more parallel stages aborted. The following messages were reported:
----------------------------------- assemble ( allvars-case ) ------------------------------------
Command in stage assemble failed with exit status = 139 :
rlens=gunzip -c allvars-case/trim1.fastq.gz allvars-case/trim2.fastq.gz | awk -v mrl=80 'BEGIN {minlen = mrl; maxlen = 0} { if (NR % 4 == 2) { rlen = length($1) ; if (rlen > maxlen) {maxlen = rlen} if (rlen < minlen) {minlen = rlen} }} END {print minlen" "maxlen}'
; min_rlen=${rlens% } ; max_rlen=${rlens# } ; if [ ! -d allvars-case/SOAPassembly ]; then mkdir allvars-case/SOAPassembly ; fi ; cd allvars-case/SOAPassembly ; echo "max_rd_len=$max_rlen" > config.config ; echo -e "[LIB]\nq1=../../allvars-case/trim1.fastq.gz\nq2=../../allvars-case/trim2.fastq.gz" >> config.config ; if [ -e SOAP.fasta ]; then rm SOAP.fasta ; fi ; for k in 79 49 ; do if [ $k -gt $min_rlen ]; then echo "WARNING: Kmer size
Use 'bpipe errors' to see output from failed commands.
/mnt/d/mintie_240129/MINTIE-0.4.2_240129/MINTIE-0.4.2# bpipe errors
============================== Found 1 failed commands from run 1318 ===============================
====================================== Command assemble (26) =======================================
Command : rlens=gunzip -c allvars-case/trim1.fastq.gz allvars-case/trim2.fastq.gz | awk -v mrl=80 'BEGIN {minlen = mrl; maxlen = 0} { if (NR % 4 == 2) { rlen = length($1) ; if (rlen > maxlen) {maxlen = rlen} if (rlen < minlen) {minlen = rlen} }} END {print minlen" "maxlen}'
; min_rlen=${rlens% } ; max_rlen=${rlens# } ; if [ ! -d allvars-case/SOAPassembly ]; then
mkdir allvars-case/SOAPassembly ; fi ; cd allvars-case/SOAPassembly ; echo "max_rd_len=$max_rlen" > config.config ; echo -e "[LIB]\nq1=../../allvars-case/trim1.fastq.gz\nq2=../../allvars-case/trim2.fastq.gz" >> config.config ; if [ -e SOAP.fasta ]; then rm SOAP.fasta ; fi ; for k in 79 49 ; do if [ $k -gt $min_rlen ]; then echo "WARNING: Kmer size
done ; cd ../../ ; /mnt/d/mintie_240129/MINTIE-0.4.2_240129/MINTIE-0.4.2/tools/bin/dedupe in=allvars-case/SOAPassembly/SOAP.fasta out=stdout.fa threads=2 overwrite=true | /mnt/d/mintie_240129/MINTIE-0.4.2_240129/MINTIE-0.4.2/tools/bin/fasta_formatter | awk '!/^>/ { next } { getline seq } length(seq) > 100 { print $0 "\n" seq }' > allvars-case/allvars-case_denovo_filt.fasta ; if [ ! -s allvars-case/allvars-case_denovo_filt.fasta ] ; then rm allvars-case/allvars-case_denovo_filt.fasta ; echo "ERROR: de novo assembled contigs fasta file is empty." ; echo "Please check paths for SOAPdenovoTrans, dedupe and fasta" ; echo "formatter are correct, and their dependencies are installed." ; fi ;
Started : Fri Feb 02 21:05:10 JST 2024
Stopped : Fri Feb 02 21:05:10 JST 2024
Exit Code : 139
Config:
Name | Value
----------------------------------
procs | 8
memory | 180
max_per_command_threads | 16
executor | local
mem_param | mem
name | assemble
proc_mode | 1
usePollerFileWatcher | true
walltime | 20:00:00
queue | batch
concurrency | 16
Output :
bash: line 1: 1461 Segmentation fault /mnt/d/mintie_240129/MINTIE-0.4.2_240129/MINTIE-0.4.2/tools/bin/soapdenovotrans pregraph -s config.config -o outputGraph_$k -K $k -p 2
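Bash reports a command that was terminated by signal N as exit status 128+N, which is how the exit status 139 above relates to the "Segmentation fault" line. A small sketch for decoding such statuses (the function name `describe_exit_status` is hypothetical):

```python
import signal

def describe_exit_status(status):
    """Map a shell exit status to a signal name when it follows the 128+N convention."""
    if status > 128:
        try:
            return signal.Signals(status - 128).name
        except ValueError:
            return "unknown signal {}".format(status - 128)
    return "exit code {} (no signal)".format(status)

print(describe_exit_status(139))
```

On Linux, 139 decodes to SIGSEGV (128 + 11), meaning the soapdenovotrans binary itself crashed rather than returning an error of its own.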
I have just deployed MINTIE using mamba and get this error. Is there a Java version that needs to be pinned in the conda recipe?
mintie -w -p test_params.txt cases/*.fastq.gz controls/*.fastq.gz
OpenJDK 64-Bit Server VM warning: Options -Xverify:none and -noverify were deprecated in JDK 13 and will likely be removed in a future release.
java.lang.NoClassDefFoundError: Could not initialize class org.codehaus.groovy.vmplugin.v7.Java7
at org.codehaus.groovy.vmplugin.VMPluginFactory.(VMPluginFactory.java:43)
at org.codehaus.groovy.reflection.GroovyClassValueFactory.(GroovyClassValueFactory.java:35)
at org.codehaus.groovy.reflection.ClassInfo.(ClassInfo.java:107)
at org.codehaus.groovy.reflection.ReflectionCache.getCachedClass(ReflectionCache.java:95)
at org.codehaus.groovy.reflection.ReflectionCache.(ReflectionCache.java:39)
at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.registerMethods(MetaClassRegistryImpl.java:209)
at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.(MetaClassRegistryImpl.java:107)
at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.(MetaClassRegistryImpl.java:85)
at groovy.lang.GroovySystem.(GroovySystem.java:36)
at org.codehaus.groovy.runtime.InvokerHelper.(InvokerHelper.java:86)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.createCallStaticSite(CallSiteArray.java:74)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.createCallSite(CallSiteArray.java:161)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:115)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:127)
at bpipe.Runner.(Runner.groovy:58)
at bpipe.Runner9.main(Runner9.java:47)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at org.codehaus.groovy.tools.GroovyStarter.rootLoader(GroovyStarter.java:110)
at org.codehaus.groovy.tools.GroovyStarter.main(GroovyStarter.java:128)
Caused by: java.lang.ExceptionInInitializerError: Exception org.codehaus.groovy.GroovyBugError [in thread "main"]
at org.codehaus.groovy.vmplugin.v7.Java7.(Java7.java:45)
at java.base/jdk.internal.misc.Unsafe.ensureClassInitialized0(Native Method)
at java.base/jdk.internal.misc.Unsafe.ensureClassInitialized(Unsafe.java:1160)
at java.base/jdk.internal.reflect.MethodHandleAccessorFactory.ensureClassInitialized(MethodHandleAccessorFactory.java:340)
at java.base/jdk.internal.reflect.MethodHandleAccessorFactory.newConstructorAccessor(MethodHandleAccessorFactory.java:103)
at java.base/jdk.internal.reflect.ReflectionFactory.newConstructorAccessor(ReflectionFactory.java:173)
at java.base/java.lang.reflect.Constructor.acquireConstructorAccessor(Constructor.java:549)
at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
at java.base/java.lang.reflect.ReflectAccess.newInstance(ReflectAccess.java:132)
at java.base/jdk.internal.reflect.ReflectionFactory.newInstance(ReflectionFactory.java:259)
at java.base/java.lang.Class.newInstance(Class.java:755)
at org.codehaus.groovy.vmplugin.VMPluginFactory.createPlugin(VMPluginFactory.java:57)
at org.codehaus.groovy.vmplugin.VMPluginFactory.(VMPluginFactory.java:39)
... 20 more
java.lang.reflect.InvocationTargetException
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:118)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at org.codehaus.groovy.tools.GroovyStarter.rootLoader(GroovyStarter.java:110)
at org.codehaus.groovy.tools.GroovyStarter.main(GroovyStarter.java:128)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.codehaus.groovy.reflection.ReflectionCache
at org.codehaus.groovy.runtime.dgmimpl.NumberNumberMetaMethod.(NumberNumberMetaMethod.java:33)
at java.base/jdk.internal.misc.Unsafe.ensureClassInitialized0(Native Method)
at java.base/jdk.internal.misc.Unsafe.ensureClassInitialized(Unsafe.java:1160)
at java.base/jdk.internal.reflect.MethodHandleAccessorFactory.ensureClassInitialized(MethodHandleAccessorFactory.java:340)
at java.base/jdk.internal.reflect.MethodHandleAccessorFactory.newConstructorAccessor(MethodHandleAccessorFactory.java:103)
at java.base/jdk.internal.reflect.ReflectionFactory.newConstructorAccessor(ReflectionFactory.java:173)
at java.base/java.lang.reflect.Constructor.acquireConstructorAccessor(Constructor.java:549)
at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
at java.base/java.lang.reflect.ReflectAccess.newInstance(ReflectAccess.java:132)
at java.base/jdk.internal.reflect.ReflectionFactory.newInstance(ReflectionFactory.java:259)
at java.base/java.lang.Class.newInstance(Class.java:755)
at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.createMetaMethodFromClass(MetaClassRegistryImpl.java:257)
at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.(MetaClassRegistryImpl.java:110)
at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.(MetaClassRegistryImpl.java:85)
at groovy.lang.GroovySystem.(GroovySystem.java:36)
at org.codehaus.groovy.runtime.InvokerHelper.(InvokerHelper.java:86)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.createCallStaticSite(CallSiteArray.java:74)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.createCallSite(CallSiteArray.java:161)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:115)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:127)
at bpipe.Runner.(Runner.groovy:58)
at bpipe.Runner9.main(Runner9.java:47)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
... 3 more
Caused by: java.lang.ExceptionInInitializerError: Exception java.lang.NoClassDefFoundError: Could not initialize class org.codehaus.groovy.vmplugin.v7.Java7 [in thread "main"]
at org.codehaus.groovy.vmplugin.VMPluginFactory.(VMPluginFactory.java:43)
at org.codehaus.groovy.reflection.GroovyClassValueFactory.(GroovyClassValueFactory.java:35)
at org.codehaus.groovy.reflection.ClassInfo.(ClassInfo.java:107)
at org.codehaus.groovy.reflection.ReflectionCache.getCachedClass(ReflectionCache.java:95)
at org.codehaus.groovy.reflection.ReflectionCache.(ReflectionCache.java:39)
at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.registerMethods(MetaClassRegistryImpl.java:209)
at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.(MetaClassRegistryImpl.java:107)
... 14 more
Hi,
After successfully using MINTIE for some samples, running the pipeline for one sample raised the error below (I have replaced a confidential sample identifier with "<sample>").
If necessary I can share a file (through email). Could you let me know which file would be most informative for you? Or is there something else you want me to do to debug?
Greetz,
Wouter
================================== Command refine_contigs (1999) ===================================
Command : python /home/wdecoster/mintie/MINTIE/annotate/refine_annotations.py <sample>/annotated_contigs_info.tsv <sample>/annotated_contigs.vcf <sample>/annotated_contigs.bam /home/wdecoster/mintie/MINTIE/ref/chess2.2.gtf /home/wdecoster/mintie/MINTIE/ref/hg38.fa <sample>/novel_contigs --minClip 20 --minGap 7 --mismatches 0 --log /home/wdecoster/mintie/<sample>/refine.log > <sample>/novel_contigs.vcf ; samtools index <sample>/novel_contigs.bam ; python /home/wdecoster/mintie/MINTIE/util/filter_fasta.py <sample>/de_contigs.fasta <sample>/novel_contigs_info.tsv --col_id contig_id > <sample>/novel_contigs.fasta ;
Started : Mon Sep 21 13:23:24 CEST 2020
Stopped : Mon Sep 21 13:23:33 CEST 2020
Exit Code : 1
Config:
Name | Value
----------------------------------
max_per_command_threads | 16
executor | local
concurrency | 32
walltime | 20:00:00
queue | batch
mem_param | mem
memory | 8
proc_mode | 1
usePollerFileWatcher | true
manualPollerSleepTime | 30000
maxFileNameLength | 2048
name | python
procs | 1
Output :
Traceback (most recent call last):
File "/home/wdecoster/mintie/MINTIE/annotate/refine_annotations.py", line 572, in <module>
main()
File "/home/wdecoster/mintie/MINTIE/annotate/refine_annotations.py", line 564, in main
keep_contigs = get_contigs_to_keep(args)
File "/home/wdecoster/mintie/MINTIE/annotate/refine_annotations.py", line 510, in get_contigs_to_keep
is_novel_exon = np.logical_and(is_novel_exon, contigs.valid_motif)
File "/home/wdecoster/miniconda3/envs/mintie/lib/python3.8/site-packages/pandas/core/generic.py", line 5136, in __getattr__
return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'valid_motif'
It would be great to get some guidance on the impact of these parameters. It's not immediately clear how increasing or decreasing them affects sensitivity.
-p Ks=79,49 comma-separated kmer lengths for de novo assembly. This option affects SOAPdenovotrans and rnaSPAdes (but not Trinity). Please ensure that your read length is longer than ALL kmer lengths specified.
-p min_read_length=50 minimum read length for trimming. NOTE: Please ensure this is greater than your minimum Kmer length.
-p min_contig_len=100 minimum length required for an assembled contig to be kept.
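The parameter notes above ask that every kmer length be no longer than the reads, and the assemble command in the logs on this page applies the same test (`if [ $k -gt $min_rlen ]`) before printing its kmer warning. The sketch below reproduces only that consistency check; it says nothing about sensitivity, and the function name is hypothetical.

```python
def kmers_exceeding_read_length(ks, min_read_length):
    """Return the kmer sizes from a comma-separated string that exceed min_read_length."""
    sizes = [int(k) for k in ks.split(",") if k.strip()]
    return [k for k in sizes if k > min_read_length]

# Values quoted in the parameter list: Ks=79,49 and min_read_length=50.
too_long = kmers_exceeding_read_length("79,49", 50)
print(too_long)
```

With the values quoted above, 79 is flagged: a read trimmed to 50 bases is shorter than that kmer, which is the situation the pipeline's warning is meant to catch.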
Hi,
NOTE: I've attached the full job.err.log which includes the output of bpipe errors.
When running MINTIE in a Docker container, I run into the following error, where an appropriate version of gmap cannot be found (extract of job.err.log):
2020-04-21T09:06:45.112323100Z ============================ Command align_contigs_against_genome (27) =============================
2020-04-21T09:06:45.115255265Z
2020-04-21T09:06:45.117204227Z Command : /app/MINTIE/tools/bin/gmap -D /app/MINTIE/ref -d gmap_genome -f samse -t 8 -n 0 P006304-303189/de_contigs.fasta > P006304-303189/aligned_contigs_against_genome.sam
2020-04-21T09:06:45.127157281Z Started : Tue Apr 21 09:06:11 UTC 2020
2020-04-21T09:06:45.127434586Z Stopped : Tue Apr 21 09:06:12 UTC 2020
2020-04-21T09:06:45.127639364Z Exit Code : 255
2020-04-21T09:06:45.127707622Z Config:
2020-04-21T09:06:45.161273655Z Name | Value
2020-04-21T09:06:45.161399397Z ------------------------------------------------------
2020-04-21T09:06:45.164737361Z max_per_command_threads | 16
2020-04-21T09:06:45.165661290Z executor | local
2020-04-21T09:06:45.166065481Z concurrency | 240
2020-04-21T09:06:45.166621049Z walltime | 20:00:00
2020-04-21T09:06:45.167308134Z queue | batch
2020-04-21T09:06:45.167641414Z mem_param | mem
2020-04-21T09:06:45.167936725Z memory | 32
2020-04-21T09:06:45.168888023Z proc_mode | 1
2020-04-21T09:06:45.169282940Z usePollerFileWatcher | true
2020-04-21T09:06:45.169641991Z manualPollerSleepTime | 30000
2020-04-21T09:06:45.169982765Z maxFileNameLength | 2048
2020-04-21T09:06:45.170468281Z procs | 8
2020-04-21T09:06:45.170750694Z name | align_contigs_against_genome
2020-04-21T09:06:45.170985378Z
2020-04-21T09:06:45.170993714Z Output :
2020-04-21T09:06:45.171030044Z
2020-04-21T09:06:45.171429324Z Note: /app/MINTIE/tools/bin/gmap.sse42 does not exist. For faster speed, may want to compile package on an SSE4.2 machine
2020-04-21T09:06:45.171448412Z Note: /app/MINTIE/tools/bin/gmap.sse41 does not exist. For faster speed, may want to compile package on an SSE4.1 machine
2020-04-21T09:06:45.171456511Z Note: /app/MINTIE/tools/bin/gmap.ssse3 does not exist. For faster speed, may want to compile package on an SSSE3 machine
2020-04-21T09:06:45.171462305Z Note: /app/MINTIE/tools/bin/gmap.sse2 does not exist. For faster speed, may want to compile package on an SSE2 machine
2020-04-21T09:06:45.171467892Z Note: /app/MINTIE/tools/bin/gmap.nosimd does not exist. For faster speed, may want to compile package on an non-SIMD machine
2020-04-21T09:06:45.171473323Z Error: appropriate GMAP version not found
2020-04-21T09:06:45.171478341Z
I checked in my /MINTIE/tools/bin and gmap is there.
Here is the list of /MINTIE/tools:
$ ls /app/MINTIE/tools
FastUniq
SOAPdenovo-Trans-bin-v1.03
Trimmomatic-0.39
bbmap
bedtools
bin
bpipe-0.9.9.5
gmap-2019-09-12
hisat2-2.1.0
salmon-latest_linux_x86_64
samtools-1.9
share
And /MINTIE/tools/bin:
$ ls /app/MINTIE/tools/bin | egrep '^g'
get-genome
getreads
gff3_genes
gff3_introns
gff3_splicesites
gi2ancestors
gi2taxid
gitable
gmap
gmap.avx2
gmap_build
gmap_compress
gmap_process
gmap_reassemble
gmap_uncompress
gmapindex
gmapl
gmapl.avx2
grademerge
gradesam
gsnap
gsnap.avx2
gsnapl
gsnapl.avx2
gtf_genes
gtf_introns
gtf_splicesites
gtf_transcript_splicesites
gvf_iit
I have all of the dependency programs installed and references installed using:
$ ./install_linux64.sh
Which ended:
Checking that all required tools were installed:
bpipe looks like it has been installed
fastuniq looks like it has been installed
dedupe looks like it has been installed
trimmomatic looks like it has been installed
fasta_formatter looks like it has been installed
samtools looks like it has been installed
bedtools looks like it has been installed
soapdenovotrans looks like it has been installed
salmon looks like it has been installed
hisat looks like it has been installed
gmap looks like it has been installed
**********************************************************
All commands installed successfully!
And I installed the reference files using:
$ ./setup_references_hg38.sh
Which ended:
Checking that all required references were setup:
genome_fasta looks like it has been setup
trans_fasta looks like it has been setup
tx_annotation looks like it has been setup
ann_info looks like it has been setup
tx2gene looks like it has been setup
gmap_refdir looks like it has been setup
gmap_genome looks like it has been setup
**********************************************************
All commands installed successfully!
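The notes in the log name five `gmap.<variant>` binaries that the gmap wrapper looked for, while the directory listing shows only `gmap.avx2`. The sketch below makes that comparison explicit. It assumes the wrapper dispatches to one of these variant binaries according to the CPU features it detects, which is an inference from the log rather than something taken from GMAP's documentation; the function names are hypothetical.

```python
import os

# Suffixes named in the "does not exist" notes in the log.
REQUESTED_VARIANTS = ["sse42", "sse41", "ssse3", "sse2", "nosimd"]

def gmap_variants(filenames):
    """Split REQUESTED_VARIANTS into (present, missing) given a list of file names."""
    names = set(filenames)
    present = [v for v in REQUESTED_VARIANTS if "gmap." + v in names]
    missing = [v for v in REQUESTED_VARIANTS if "gmap." + v not in names]
    return present, missing

def gmap_variants_in_dir(bin_dir):
    """Same check against a real directory, e.g. the tools/bin folder."""
    return gmap_variants(os.listdir(bin_dir))

# File names taken from the tools/bin listing in the post.
listing = ["gmap", "gmap.avx2", "gmap_build", "gmapl", "gmapl.avx2"]
present, missing = gmap_variants(listing)
print("present:", present, "missing:", missing)
```

None of the five requested variants is present in the listing. That would be consistent with gmap having been compiled on a machine with AVX2 and then run on one without it, though the post does not confirm this.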
I've recently tried installing and running mintie on the test dataset and have run into a few issues.
I installed MINTIE with mamba using the following:
mamba create -c conda-forge -c bioconda -n mintie mintie
This gives me mintie=0.3.9-0
I then ran the following to set up the test data:
mintie -t
Which made the following files:
├── cases
│ ├── allvars-case_R1.fastq.gz
│ └── allvars-case_R2.fastq.gz
├── controls
│ ├── allvars-control_R1.fastq.gz
│ └── allvars-control_R2.fastq.gz
└── test_params.txt
2 directories, 5 files
Then, I ran the following mintie command:
mintie -w -p test_params.txt cases/*.fastq.gz controls/*.fastq.gz
I get the following output:
=========================================== Bpipe Error ============================================
An error occurred executing your pipeline:
A script, requested to be loaded from file '/home/jcole/miniconda3/envs/mintie/share/mintie-0.3.9-0/references.groovy', could not be accessed.
Please see the details below for more information.
========================================== Error Details ===========================================
bpipe.PipelineError: A script, requested to be loaded from file '/home/jcole/miniconda3/envs/mintie/share/mintie-0.3.9-0/references.groovy', could not be accessed.
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at MINTIE.groovy.run(MINTIE.groovy:15)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
====================================================================================================
More details about why this error occurred may be available in the full log file .bpipe/bpipe.log
Indeed, references.groovy does not exist. Is there something I need to do to generate that file?
Hi,
I installed MINTIE using the suggested method (mamba). The installation completed without issue, and the reference files downloaded successfully. Running MINTIE on the test data also finished successfully. However, on my real data I can only run 1 case against 1 control, as with the test data. Whenever I try to run MINTIE with 3 controls, I get "No space left on device".
============================= Found 1 failed commands from run 925835 ==============================
==================================== Command fastq_dedupe (374) ====================================
Command : gunzip -c cases/allvars-case_R1.fastq.gz > allvars-case/temp1.fastq ; gunzip -c cases/allvars-case_R2.fastq.gz > allvars-case/temp2.fastq ; echo allvars-case/temp1.fastq > allva echo allvars-case/temp2.fastq >> allvars-case/fastq.list ; echo "Reads before:" ; wc -l allvars-case/temp1.fastq ; fastuniq -i allvars-case/fastq.list -o allvars-castq -p allvars-case/allvars-case.2.fastq ; echo "Reads after:" ; wc -l allvars-case/allvars-case.1.fastq ; gzip allvars-case/allvars-case.1.fastq allvars-case/allvars- rm allvars-case/fastq.list allvars-case/temp1.fastq allvars-case/temp2.fastq
Started : Mon Jun 26 08:30:38 CST 2023
Stopped : Mon Jun 26 08:59:01 CST 2023
Exit Code : 1
Config:
Name | Value
--------------------------------------
procs | 1
memory | 160
max_per_command_threads | 16
executor | local
mem_param | mem
name | fastq_dedupe
proc_mode | 1
usePollerFileWatcher | true
walltime | 20:00:00
queue | batch
concurrency | 16
Output :
Reads before:
209526696 allvars-case/temp1.fastq
Reads after:
138463916 allvars-case/allvars-case.1.fastq
gzip: allvars-case/allvars-case.1.fastq.gz: No space left on device
If you have any suggestions for how I can troubleshoot this, that would be great.
Also I want to know how much space is required to run MINTIE.
Thanks
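The fastq_dedupe command in the log decompresses both FASTQ files to temporary files and writes deduplicated copies before compressing them, so several uncompressed copies of each sample sit on disk at once. The source gives no figure for how much space is needed, so the sketch below only reports what is free on the working filesystem. The threshold passed to `enough_space` is something the user would have to choose, and both function names are hypothetical.

```python
import shutil

def free_gib(path="."):
    """Return free space on the filesystem holding `path`, in GiB."""
    return shutil.disk_usage(path).free / 2**30

def enough_space(path, required_gib):
    """True if at least `required_gib` GiB are free under `path`."""
    return free_gib(path) >= required_gib

print("free GiB in working directory: {:.1f}".format(free_gib(".")))
```

Checking this in the directory where MINTIE writes its per-sample folders, before adding more controls, shows whether the failure is simply the output volume filling up.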
When I set up the fastq files and run ‘mintie -w -p test_params.txt cases/*.fastq.gz controls/*.fastq.gz’, I can't find the file called eq_classes_de.txt.
In the end, it showed:
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
Attaching package: ‘data.table’
The following objects are masked from ‘package:dplyr’:
between, first, last
Loading required package: limma
Error: Insufficient controls. Please run MINTIE with at least 1 controls.
ERROR: Expected output file all/eq_classes_de.txt in stage run_de (all) could not be found
========================================= Pipeline Failed ==========================================
In stage Unknown: One or more parallel stages aborted. The following messages were reported:
------------------------------------------- Unknown all --------------------------------------------
Expected output file all/eq_classes_de.txt in stage run_de (all) could not be found
Use 'bpipe errors' to see output from failed commands.
(mintie) PowerEdge-T440:~$ bpipe errors
============================== Found 0 failed commands from run 57035 ==============================