Hello,
First, thank you for your work. I would like to test your tool on other subfamilies of transposable elements, so I am trying to follow the instructions, but I have some questions.
I have a single multi-FASTA file with the mm10 genome, separate FASTA files for the individual chromosomes, and a MUSCLE alignment of my list of transposable elements in ClustalW format.
So I ran the construct tool first and then, when running scan on either the complete genome or just one chromosome, the program gets killed. Checking the memory, I saw that it reached more than 200 GB of RAM. When I tested with chrF.fa and train.aln, the program worked well. So I think my FASTA files are too big for the program (1.18 GB for the complete genome, 60-200 MB for the individual chromosomes). Is there a way to run it on the whole genome?
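As a possible workaround in the meantime, I could split each chromosome into smaller overlapping windows and run scan on each piece separately. Below is a minimal Python sketch of that idea. The helper names (read_fasta, split_windows, write_windows) and the window and overlap sizes are my own assumptions, not part of your tool, and I do not know whether scan gives equivalent results on windowed input.

```python
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from a FASTA stream."""
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the record name.
            name = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def split_windows(name: str, seq: str, window: int,
                  overlap: int = 0) -> Iterator[Tuple[str, str]]:
    """Yield (record_name, subsequence) for consecutive windows.

    The record name encodes the 0-based start and exclusive end
    as "name:start-end" (a hypothetical naming convention).
    """
    if window <= overlap:
        raise ValueError("window must be larger than overlap")
    step = window - overlap
    start = 0
    while start < len(seq):
        end = min(start + window, len(seq))
        yield f"{name}:{start}-{end}", seq[start:end]
        if end == len(seq):
            break
        start += step


def write_windows(src: TextIO, dst: TextIO, window: int,
                  overlap: int = 0, width: int = 60) -> int:
    """Write every window of every record in src to dst; return the count."""
    count = 0
    for name, seq in read_fasta(src):
        for rec_name, sub in split_windows(name, seq, window, overlap):
            dst.write(f">{rec_name}\n")
            # Wrap sequence lines at a fixed width.
            for i in range(0, len(sub), width):
                dst.write(sub[i:i + width] + "\n")
            count += 1
    return count
```

With this approach the coordinates reported for each window would be relative to that window, so they would need to be shifted by the start offset stored in the record name, and hits in overlapping regions would need to be deduplicated.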
Also, do the scripts in the pipeline directory have to be run after construct and scan, or are they a totally separate part?
The exact commands that I used are (here with the toy_test files):
construct -o train.construct.model -u -v train.aln
scan -o train.scan.bed -v -c chrF.fa train.construct.model
Thank you for your time and your work.
Best.
Pierre-Emmanuel