Hi. LTR_retriever stops with a RepeatMasker error caused by a sequence ID longer than 50 characters in the file xxx.fa.mod.ltrTE.trunc. The ID in question is >LSRX01000097.1:1074794..1082843|LSRX01000097.1:1074380..1083242[IN]
Note that the original sequence IDs are not particularly long, but the large coordinate numbers appended to them push the IDs over the limit. Is there a fix for this problem? Below is the whole output, including the RepeatMasker test run.
Thanks, Claudio
##########################
LTR_retriever v1.8.0
##########################
Contributors: Shujun Ou, Ning Jiang
Please cite: Ou S, Jiang N: LTR_retriever: a highly accurate and sensitive program for identification of long terminal repeat retrotransposons. Plant Physiology 2018, 176:1410-1422
Parameters: -genome /home/manager/BigShare/dinos/11-200.fa -infinder /home/manager/LTR_Finder/source/11-200.finder.scn
Thu May 31 19:51:58 CST 2018 Dependency checking: All passed!
Thu May 31 19:52:52 CST 2018 The longest sequence ID in the genome contains 109 characters, which is longer than the limit (15)
Trying to reformat seq IDs...
Attempt 1...
Thu May 31 19:53:12 CST 2018 Seq ID conversion successful!
Thu May 31 19:53:12 CST 2018 Start to convert inputs...
Total candidates: 173
Total uniq candidates: 173
Thu May 31 19:53:25 CST 2018 Module 1: Start to clean up candidates...
Sequences with 10 missing bp or 0.8 missing data rate will be discarded.
Sequences containing tandem repeats will be discarded.
Thu May 31 19:53:31 CST 2018 145 clean candidates remained
Thu May 31 19:53:31 CST 2018 Modules 2-5: Start to analyze the structure of candidates...
The terminal motif, TSD, boundary, orientation, age, and superfamily will be identified in this step.
Thu May 31 19:54:13 CST 2018 Intact LTR-RT found: 118
Can't remove /home/manager/BigShare/dinos/11-200.fa.mod.ltrTE.pass.clust: Text file busy, skipping file.
Thu May 31 19:54:30 CST 2018 Module 6: Start to analyze truncated LTR-RTs...
Truncated LTR-RTs without the intact version will be retained in the LTR-RT library.
Use -notrunc if you don't want to keep them.
Thu May 31 19:54:30 CST 2018 4 truncated LTR-RTs found
Can't remove /home/manager/BigShare/dinos/11-200.fa.mod.ltrTE.trunc: Text file busy, skipping file.
Warning: LOC list /home/manager/BigShare/dinos/11-200.fa.mod.ltrTE.veryfalse is empty.
ERROR: RepeatMasker is not running properly!
Please check the file /home/manager/BigShare/dinos/11-200.fa.mod.ltrTE.mask.lib and /home/manager/BigShare/dinos/11-200.fa.mod.ltrTE.trunc and test run:
RepeatMasker -e ncbi -q -pa 4 -no_is -norna -nolow -div 40 -lib /home/manager/BigShare/dinos/11-200.fa.mod.ltrTE.mask.lib -cutoff 225 /home/manager/BigShare/dinos/11-200.fa.mod.ltrTE.trunc
Please report errors to https://github.com/oushujun/LTR_retriever/issues
Program halt!
manager@sb:~/RepeatMasker$ ./RepeatMasker -e ncbi -q -pa 4 -no_is -norna -nolow -div 40 -lib /home/manager/BigShare/dinos/11-200.fa.mod.ltrTE.mask.lib -cutoff 225 /home/manager/BigShare/dinos/11-200.fa.mod.ltrTE.trunc
RepeatMasker version open-4.0.7
Search Engine: NCBI/RMBLAST [ 2.2.27+ ]
Master RepeatMasker Database: /home/manager/RepeatMasker/Libraries/RepeatMaskerLib.embl ( Complete Database: dc20170127-rb20170127 )
Custom Repeat Library: /home/manager/BigShare/dinos/11-200.fa.mod.ltrTE.mask.lib
analyzing file /home/manager/BigShare/dinos/11-200.fa.mod.ltrTE.trunc
FastaDB::_cleanIndexAndCompact(): Fasta file contains a sequence identifier which is too long ( max id length = 50 )
at ./RepeatMasker line 718.
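
In case it helps with reproducing this, here is a rough Python sketch I would use to find the offending IDs in a FASTA file and, as a possible stopgap, swap them for short placeholders. It is not part of LTR_retriever or RepeatMasker. The function names `long_ids` and `shorten_ids` and the `longid_` prefix are made up for illustration, and I am assuming the ID is the header text up to the first whitespace and that the 50-character limit from the error message above is the one that matters. I have not checked whether LTR_retriever's later steps would accept renamed IDs.

```python
MAX_ID_LEN = 50  # limit quoted in the RepeatMasker error message above


def _seq_id(header_line):
    """Return the ID of a FASTA header line (assumed: text up to first whitespace)."""
    parts = header_line[1:].split()
    return parts[0] if parts else ""


def long_ids(fasta_lines, max_len=MAX_ID_LEN):
    """List the sequence IDs that are longer than max_len characters."""
    return [
        _seq_id(line)
        for line in fasta_lines
        if line.startswith(">") and len(_seq_id(line)) > max_len
    ]


def shorten_ids(fasta_lines, max_len=MAX_ID_LEN, prefix="longid_"):
    """Replace over-long IDs with numbered placeholders.

    Returns (new_lines, mapping), where mapping goes from each original ID
    to its placeholder so the original names can be restored afterwards.
    The hypothetical prefix is assumed not to clash with existing IDs.
    """
    mapping = {}
    new_lines = []
    for line in fasta_lines:
        if line.startswith(">"):
            seq_id = _seq_id(line)
            if len(seq_id) > max_len:
                if seq_id not in mapping:
                    mapping[seq_id] = f"{prefix}{len(mapping) + 1}"
                # Keep any description that followed the ID on the header line.
                rest = line[1:].rstrip("\n")[len(seq_id):]
                line = ">" + mapping[seq_id] + rest + "\n"
        new_lines.append(line)
    return new_lines, mapping


if __name__ == "__main__":
    import sys

    # Usage: python check_ids.py some_file.fa  (script name is arbitrary)
    with open(sys.argv[1]) as handle:
        for seq_id in long_ids(handle):
            print(len(seq_id), seq_id)
```

Running the script on the .trunc file should print each over-long ID with its length, which makes it easy to see how many entries are affected before deciding on a fix.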