
svmu's People

Contributors

mahulchak, youreprettygood


svmu's Issues

install issue

When I run the "make" step, I get errors like this:
g++ -g -Wall -std=c++0x -c svlib.cpp
svlib.cpp:4: error: expected nested-name-specifier before ‘chroms’
svlib.cpp:4: error: ‘chroms’ has not been declared
svlib.cpp:4: error: expected ‘;’ before ‘=’ token
svlib.cpp:4: error: expected unqualified-id before ‘=’ token
svlib.cpp:5: error: expected nested-name-specifier before ‘ccov’
svlib.cpp:5: error: ‘ccov’ has not been declared
svlib.cpp:5: error: expected ‘;’ before ‘=’ token
svlib.cpp:5: error: expected unqualified-id before ‘=’ token
svlib.cpp:6: error: expected nested-name-specifier before ‘vq’
svlib.cpp:6: error: ‘vq’ has not been declared
svlib.cpp:6: error: expected ‘;’ before ‘=’ token
svlib.cpp:6: error: expected unqualified-id before ‘=’ token
svlib.cpp:16: error: ‘ccov’ does not name a type
svlib.cpp: In function ‘void findInnie(std::vector<mI, std::allocator<mI> >&, mI&)’:
svlib.cpp:39: warning: comparison between signed and unsigned integer expressions
svlib.cpp: In function ‘void findInnieQ(std::vector<mI, std::allocator<mI> >&, mI&)’:
svlib.cpp:93: warning: comparison between signed and unsigned integer expressions
svlib.cpp: In function ‘void findInnieLast(std::vector<mI, std::allocator<mI> >&, mI&)’:
svlib.cpp:132: warning: comparison between signed and unsigned integer expressions
svlib.cpp: At global scope:
svlib.cpp:147: error: variable or field ‘storeCords’ declared void
svlib.cpp:147: error: ‘ccov’ was not declared in this scope
svlib.cpp:147: error: ‘masterRef’ was not declared in this scope
svlib.cpp:147: error: ‘ccov’ was not declared in this scope
svlib.cpp:147: error: ‘masterQ’ was not declared in this scope
svlib.cpp:147: error: expected primary-expression before ‘&’ token
svlib.cpp:147: error: ‘mi’ was not declared in this scope
make: *** [svlib.o] Error 1

lastz or Mummer

Hello mahulchak!
In the latest version, you use the alignment results of both MUMmer and lastz. When the alignment results differ between these two programs, which one is used to detect SVs?

Get none results

Hi there,
When I ran SVMU on the nucmer alignment, I got empty results apart from the cords.prefix.txt file.

My commands were:

alignment using nucmer

nucmer -l 80 ara.fa nh.fa -p nh_ara

run svmu

svmu nh_ara.delta nh.fa ara.fa l last_out.txt prefix

The file 'last_out.txt' is an empty file.

Thx
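
One thing worth ruling out here is argument order: nucmer was given ara.fa first (reference) and nh.fa second (query), while the svmu command lists nh.fa before ara.fa. A nucmer delta file records the reference and query paths on its first line, so the order can be checked before running svmu. The following is a minimal sketch, not part of svmu; the function names are hypothetical, it compares base file names only, and it assumes the paths contain no spaces.

```python
import os

def delta_inputs(delta_path):
    """Return (reference, query) paths recorded on the first line of a delta file."""
    with open(delta_path) as handle:
        fields = handle.readline().split()
    if len(fields) != 2:
        raise ValueError("unexpected delta header: %r" % fields)
    return fields[0], fields[1]

def order_matches(delta_path, ref_fasta, query_fasta):
    """True if the FASTA files are passed in the same order nucmer used."""
    ref, qry = delta_inputs(delta_path)
    # Compare base names only, since the delta header usually stores full paths.
    same = lambda a, b: os.path.basename(a) == os.path.basename(b)
    return same(ref, ref_fasta) and same(qry, query_fasta)
```

Whether a swapped order explains the empty output in this case is not confirmed; it is simply a cheap check to make first.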

LastZ fail (big genome) & SVMU segmentation fault

Hello,
I'm encountering an issue that several people have raised here, but I can't find how they fixed it.
I am not able to run lastz because the genomes are too big, so I am running svmu with an empty file in its place.
svmu starts running, but after a while it stops with "Segmentation fault (core dumped) svmu".
Any idea what I can do?
Thanks for your help,
Claire

CNV type in sv.txt

There are 8 types of CNV in sv.txt, namely: CNV-NQ, CNV-NR, CNV-Q, CNV-R, nCNV-NQ, nCNV-NR, nCNV-Q and nCNV-R. Can you tell me exactly what they mean?

Possible solution to 'core dumped' error?

Hello,

I am trying to run svmu on three Drosophila genomes, following the basic instructions posted in this repository. I tried several times, but the jobs were always aborted with a 'core dumped' error. Then I looked at the stderr file and found this message:

terminate called after throwing an instance of 'std::invalid_argument'
  what():  stoi

The 'invalid_argument' error made me think that maybe I was not specifying the correct number/format of arguments. Then I noticed that in the basic running instructions only 6 arguments are needed to run svmu:

svmu sam2ref.mm.delta ref.fasta sample.fasta snp_mode sam_lastz.txt prefix

But when I executed svmu with no arguments it gives me:

Usage: /media/DATA5/Javier/software/svmu/svmu_exp/svmu/svmu foo.delta ref.fasta query.fasta cutoff mode(h/l) last_out.txt prefix

As you can see, the binary itself asks for 7 arguments, including an additional cutoff parameter. I remember from an old version that there was a parameter for the number of unique syntenic blocks used to find SVs, which can take values of 5, 10 or 100. I ended up running something like this, and svmu seems to be running:

svmu delta_file reference.fasta target.fasta 5 l lastz_out.txt prefix

Now my questions are: were the 'cutoff' and 'mode' parameters, and the way I placed them on the command line, correct for svmu to function properly? Are they actually taken into consideration?

Looking forward to hearing back from you.

Javier,
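
The usage line quoted above lists seven positional arguments. A small wrapper can check a command line against that layout before launching svmu, which makes a misplaced or missing argument easier to spot than a stoi exception. This is a sketch only: the function name is hypothetical, the rule that cutoff must be an integer is an assumption based on the stoi error, and whether a given svmu build actually uses each argument still needs the author's confirmation.

```python
# Field names copied from the usage line printed by the binary in this report.
USAGE_FIELDS = ["foo.delta", "ref.fasta", "query.fasta",
                "cutoff", "mode(h/l)", "last_out.txt", "prefix"]

def check_svmu_args(args):
    """Return a list of problems found in the arguments that follow 'svmu'."""
    problems = []
    if len(args) != len(USAGE_FIELDS):
        problems.append("expected %d arguments (%s), got %d"
                        % (len(USAGE_FIELDS), " ".join(USAGE_FIELDS), len(args)))
        return problems
    cutoff, mode = args[3], args[4]
    # Assumption: cutoff is parsed as an integer, as the stoi error suggests.
    if not cutoff.isdigit():
        problems.append("cutoff should be an integer, got %r" % cutoff)
    if mode not in ("h", "l"):
        problems.append("mode should be 'h' or 'l', got %r" % mode)
    return problems
```

Applied to the two commands in this report, the seven-argument command passes these checks, while the six-argument form from the basic instructions is flagged for its argument count.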

svmu exit without any error information

Hi,
I have a newly assembled plant genome and I'm trying to map it to the reference genome and call large indels. I ran the genome comparison with MUMmer 4.0 using the following command:
nucmer -t 25 --mum --noextend -p test ref.fa sample.fa
Then I ran svmu with the command:
svmu test.delta ref.fa sample.fa 5 -h
While svmu was writing the sv.txt files, it exited without any error message. svmu was run with a single thread and 256 GB of RAM.
Is this caused by a memory issue?

segmentation fault (core dumped) with the current version

I used the command "~/software/svmu/svmu test.delta 140001.fas 140028.fas h sam_lastz.txt test". It returned "segmentation fault (core dumped)" and produced 3 empty files: cm.test.txt, cnv_all.test.txt and cords.test.txt.
I have tried on three servers, and all showed the same problem.
Maybe the problem is sam_lastz.txt? There are 8 columns in this file rather than the 6 described in the manual.
If I replace sam_lastz.txt with a nonexistent file name like "sam_txt", it seems to run fine and produces 6 files, as below:

drwxrwxr-x 2 bac bac 4096 Jan 24 00:50 ./
drwxrwxrwx 26 bac bac 4096 Jan 23 23:48 ../
-rw------- 1 bac bac 38016751 Jan 23 21:00 140001.fas
-rw------- 1 bac bac 37523890 Jan 23 21:01 140028.fas
-rw-rw-r-- 1 bac bac 169630 Jan 24 00:50 cm.test.txt
-rw-rw-r-- 1 bac bac 644808 Jan 24 00:50 cnv_all.test.txt
-rw-rw-r-- 1 bac bac 1014481 Jan 24 00:50 cords.test.txt
-rw-rw-r-- 1 bac bac 4196598 Jan 23 21:49 sam_lastz.txt
-rw-rw-r-- 1 bac bac 0 Jan 24 00:50 small.test.txt
-rw-rw-r-- 1 bac bac 289469 Jan 24 00:50 sv.test.txt
-rw-rw-r-- 1 bac bac 1050793 Jan 23 21:04 test.delta
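
To confirm how many columns the lastz file really has, a quick tally of field counts per line can help. The sketch below is generic and makes no claim about which layout svmu expects; the function name is hypothetical, and it assumes whitespace-separated fields with '#' marking header lines, as in lastz's general output format.

```python
from collections import Counter

def column_counts(path):
    """Count how many data lines have each number of whitespace-separated fields."""
    counts = Counter()
    with open(path) as handle:
        for line in handle:
            if not line.strip() or line.startswith("#"):
                continue  # skip blank lines and header lines
            counts[len(line.split())] += 1
    return dict(counts)
```

A result with more than one key would indicate that the file mixes line layouts, which is worth knowing before comparing it against the manual.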

The coordinates of CNV in sv.prefix.txt

Hello,
I used SVMU for SV detection based on the MUMmer result and found that, in the sv.prefix.txt file, the length of a CNV was the same in the reference and query genomes, as below:
REF_CHROM REF_START REF_END SV_TYPE Q_CHROM Q_START Q_END ID LEN COV_REF COV_Q
A01 356308 356314 CNV-R A01 245745 245751 0000000023 6 1 2

356314-356308 = 245751-245745 = 6, but COV_REF and COV_Q are different. Does this mean that the segment in the query has 2 copies in the reference genome? I would expect (ref_end-ref_start)/(query_end-query_start) to equal 2 in this case.
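
The arithmetic in the question can be reproduced from the row itself. This is a minimal sketch, assuming the whitespace-separated column order shown in the header line above; the function name is hypothetical.

```python
HEADER = ("REF_CHROM REF_START REF_END SV_TYPE Q_CHROM Q_START Q_END "
          "ID LEN COV_REF COV_Q").split()

def span_summary(row):
    """Compute reference and query span lengths for one sv.prefix.txt row."""
    rec = dict(zip(HEADER, row.split()))
    ref_len = int(rec["REF_END"]) - int(rec["REF_START"])
    q_len = int(rec["Q_END"]) - int(rec["Q_START"])
    return {"ref_len": ref_len, "q_len": q_len,
            "cov_ref": int(rec["COV_REF"]), "cov_q": int(rec["COV_Q"])}

row = "A01 356308 356314 CNV-R A01 245745 245751 0000000023 6 1 2"
print(span_summary(row))
```

For the quoted row both spans come out as 6 while the coverage values are 1 and 2, which is exactly the mismatch being asked about.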

How to define a SV type

Can you tell me how to define an insertion, a deletion or a CNV from the MUMmer results?

lastz error

Hello professor,
When I use lastz to generate the *_lastz.txt file with
lastz *.genomic.fa[multiple] *.genomic.fa[multiple] --chain --format=general:name1,strand1,start1,end1,name2,strand2,start2,end2 > *_lastz.txt
the following problem occurs:
"FAILURE: in add_segment()
table size (4,869,542,152 for 101,448,794 segments) exceeds allocation limit of 4,294,967,279;
consider raising scoring threshold (--hspthresh or --exact) or breaking your target sequence into smaller pieces"
It looks like "raising scoring threshold" might work? The default of "--hspthresh" is 3000, so what value should I set? Could you give me some advice? Thank you very much!
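
The numbers in the error message give a rough idea of how far the segment count has to drop. Dividing the reported table size by the segment count suggests about 48 bytes per segment plus a small fixed header; that layout is inferred from this one message and is an assumption, not something taken from the lastz documentation. Under that assumption:

```python
TABLE_SIZE = 4869542152      # bytes, from the error message
SEGMENTS = 101448794         # segments, from the error message
LIMIT = 4294967279           # allocation limit, from the error message

# Inferred layout (assumption): fixed header + fixed bytes per segment.
BYTES_PER_SEGMENT = 48
HEADER_BYTES = TABLE_SIZE - SEGMENTS * BYTES_PER_SEGMENT

def max_segments(limit=LIMIT):
    """Largest segment count whose table would fit under the limit."""
    return (limit - HEADER_BYTES) // BYTES_PER_SEGMENT

allowed = max_segments()
# Percentage by which the segment count would have to shrink.
reduction_pct = 100.0 * (SEGMENTS - allowed) / SEGMENTS
print(allowed, round(reduction_pct, 1))
```

On this estimate the alignment would need to produce roughly 12% fewer segments to fit. Which --hspthresh value achieves that depends on the genomes and would have to be found by trial; splitting the target into smaller pieces, as the message also suggests, avoids the guesswork.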

segmentation fault (terminated by signal 11)

Hello

I was interested in using svmu for my genome assembly of interest and I ran into a seg fault error. Could you take a look at this error?
My organism is a plant with a ~400 Mbp genome assembly, and I hard-masked both my sample genome and my reference genome. The reference genome has 12 chromosomes, but my sample genome is still a draft, so it has ~2000 contigs. I initially thought the seg fault occurred due to running out of memory, so I took just one chromosome of the ref genome and used nucmer to align all of the sample genome contigs to it. I'm running the commands on a server where I allocated 140 GB of memory, and I still ran into the seg fault error.

Here's what I get with /usr/bin/time -v

Command terminated by signal 11
Command being timed: "/home/PROGRAMS_AND_SCRIPTS/PROGRAMS/svmu/svmu sam2ref.mr.delta RM.chr01.fa sample.fasta 5 h"
User time (seconds): 82.17
System time (seconds): 6.53
Percent of CPU this job got: 89%
Elapsed (wall clock) time (h:mm:ss or m:ss): 1:38.90
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 6509976
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 1890746
Voluntary context switches: 1553260
Involuntary context switches: 900
Swaps: 0
File system inputs: 0
File system outputs: 1880
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0

And here's what I got from running

gdb

GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-94.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/...
Reading symbols from /home/PROGRAMS_AND_SCRIPTS/PROGRAMS/svmu/svmu...done.
[New LWP 4636]
Core was generated by `/home/PROGRAMS_AND_SCRIPTS/PROGRAMS/svmu/svmu sam2ref.mr.delta /scratch/'.
Program terminated with signal 11, Segmentation fault.
#0 0x00002b29496f2fdc in std::string::assign(std::string const&) () from /lib64/libstdc++.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-157.el7_3.1.x86_64 libgcc-4.8.5-11.el7.x86_64 libstdc++-4.8.5-11.el7.x86_64

Thanks in advance for the help.

Jae
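
The /usr/bin/time report helps test the out-of-memory hypothesis: GNU time reports the maximum resident set size in kilobytes, so it can be compared directly with the 140 GB allocation. Below is a small sketch of that conversion; the function name is hypothetical.

```python
import re

def peak_rss_gib(time_output):
    """Extract 'Maximum resident set size (kbytes)' and convert it to GiB."""
    match = re.search(r"Maximum resident set size \(kbytes\):\s*(\d+)", time_output)
    if match is None:
        raise ValueError("no maximum resident set size line found")
    return int(match.group(1)) / (1024.0 * 1024.0)

report = "Maximum resident set size (kbytes): 6509976"
print(round(peak_rss_gib(report), 2))
```

For the run above this is a little over 6 GiB, far below the 140 GB available, which suggests the crash is not simply a matter of running out of memory.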

Segmentation fault (core dumped) - lastz issues?

I am running an analysis on two genomes - one with a pseudochromosome level assembly, the other with a scaffold level assembly. Using the provided command for lastz took several days for the program to complete, ultimately providing me with an empty file. Adding the following options to the lastz command fixed this, and I was able to get lastz output files in a reasonable timeframe:

--notransition --step=20 --nogapped 

However, when I attempt to run SVMU using the output of this lastz run, I get a Segmentation fault (core dumped) error. Based on the documentation for this tool, I removed any whitespace from the input fasta files, but still got this error. As a sanity check - when the original lastz command no longer worked - I chose to take only a subsection of the input sequences and run lastz on them, which gave me output files - and used that on svmu; this provided me with results, and did not produce the Segmentation fault error. Once I saw this error, I reran this limited analysis using the new options from above, and attempting to run SVMU gave me the same Segmentation fault error.

Due to how long lastz took to run on the whole genomes without these options - and the ultimately fruitless results - I am unsure if there is another way to optimize lastz to run with these genomes. However, from my observations, it would seem as if these options are incompatible with SVMU, resulting in a Seg fault error. These commands lower the seeding sensitivity of lastz, and ignore gapped alignments; perhaps this plays some role in why it has been failing?

Do you have any recommendations for running lastz that does not prevent SVMU from completing?

Loss of chr information in cm.lastz.txt and sv.lastz.txt

Hello mahulchak,
After I ran svmu for one sample genome against two reference genomes, the cm.lastz.txt and sv.lastz.txt output for one of the reference genomes does not cover all chromosomes.

The lastz and svmu commands were the same for both references, but the nucmer commands differed, as shown below.

NUCMER for refer1

nucmer --maxmatch --noextend --prefix=Sample_ref1 ../reference_genome1.fasta ../sample.genomic.fa

NUCMER for refer2

nucmer --prefix=Sample_ref2 ../reference_genome2.fasta ../sample.genomic.fa

lastz

/path/to/lastz//lastz-distrib-1.04.03/src/lastz_32 ../reference_genome1.fasta[multiple] ../sample.genomic.fa --notransition --step=20 --nogapped --progress=1 --format=maf > lastzSample_ref1.txt

svmu

/path/to/svmu-master/svmu Sample_ref1 ../reference_genome1.fasta ../sample.genomic.fa 5 l lastzSample_ref1.txt svmuSample_ref1

The output for reference1
2.4K Aug 7 17:43 cm.lastzAD3_AD1.txt.txt
2.6G Aug 7 18:11 cnv_all.lastzAD3_AD1.txt.txt
2.4G Aug 7 18:11 cords.lastzAD3_AD1.txt.txt
0 Aug 7 18:11 small.lastzAD3_AD1.txt.txt
3.2K Aug 7 18:11 sv.lastzAD3_AD1.txt.txt

The output for reference2
3.7M Jul 19 02:59 cm.lastzAD3_AD2.txt.txt
631M Jul 19 03:02 cnv_all.lastzAD3_AD2.txt.txt
648M Jul 19 03:02 cords.lastzAD3_AD2.txt.txt
0 Jul 19 03:02 small.lastzAD3_AD2.txt.txt
11M Jul 19 03:37 sv.lastzAD3_AD2.txt.txt

The big difference is in the file sizes for reference1: cm.lastzAD3_AD1.txt.txt and sv.lastzAD3_AD1.txt.txt are small and do not contain complete information for all the chromosomes, while cnv_all and cords are large.
The reference1 and reference2 genomes are similar in size and their sequences are fairly similar. How can changing the NUCMER parameters make the results so different? How can I choose appropriate parameters to call accurate SVs for the sample?

Looking forward to hearing from you. Thanks a lot!
Best wishes,
Clement

sv.prefix.txt are empty

Hi!
SVMU is a very good tool for detecting SVs compared with others.
My command is "svmu ref_qry.filted.delta ref.fa qry.fa l null ref_qry", but some “sv.prefix.txt” files are empty with 0.4-alpha.
Thanks very much!

Segmentation fault

Hi, I want to use svmu to find structural variation between two genomes. Here are my commands:

$MUMmer/nucmer --prefix NN2Wm82 Gmax_275_v2.0.fa NN1138-2.v0.5.fa
$SVMU/svmu NN2Wm82.mm.delta Gmax_275_v2.0.fa NN1138-2.v0.5.fa NN_Wm82

but I got an error like this:

14865 Segmentation fault $SVMU/svmu NN2Wm82.mm.delta Gmax_275_v2.0.fa NN1138-2.v0.5.fa NN_Wm82.

My svmu is the newest version, and it compiled with a few warnings. There is a core.14865 file, but it's empty. Could you help me fix this error? Thank you very much.

terminate called after throwing an instance of 'std::invalid_argument' what(): stoi Aborted (core dumped)

I got an error running svmu:

 svmu sam2ref.mr.delta ref.fa query.fa 100 h > sample.small.txt

Error:

terminate called after throwing an instance of 'std::invalid_argument'
  what():  stoi
Aborted (core dumped)

I tested with n set to 5 and got the same error.
When I run

scriptmaker query.fa list_of_fa 

I get the same error:

terminate called after throwing an instance of 'std::invalid_argument'
  what():  stoi
Aborted (core dumped)

How can I solve it?

The meaning of CNV types ?

Hi,

Thank you for the SVMU tool. When checking the sv.txt results from svmu, I can't understand the meaning of the CNV types, such as CNV-Q, CNV-R, nCNV-Q and nCNV-R.

Can you explain in detail what these CNV types mean?

Thank you in advance!

segmentation fault

Hi There,

I ran into this error when executing svmu.
svmu ref.vs.sample.mm.delta ref.fa sample.fa 3
''
A little information about my Linux platform:

Configured with: ../gcc-5.2.0/configure --prefix=/opt/Modules/gcc/5.2.0 --disable-multilib
Thread model: posix
gcc version 5.2.0 (GCC)

Linux delta040 2.6.32-573.12.1.el6.x86_64 #1 SMP Tue Dec 15 21:19:08 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Hopefully, you can show me a way to fix it.

Thanks a lot.

Elzed

confused about the results: sv.txt and cnv.txt

Hi! SVMU is a pretty good tool for detecting SVs.

But I am confused about the result files, sv.xxx.txt and cnv.xxx.txt.

In my understanding, CNVs are a subset of SVs. The simple SV types are deletion (DEL), duplication (DUP), inversion (INV), insertion (INS) and translocation (TRA), while CNVs are the unbalanced variants: DEL and DUP, and sometimes also INS.

But in the results, SVs and CNVs are split into two files. Some intervals overlap between them, and some intervals are unique to cnv.xxx.txt or to sv.xxx.txt. So, for downstream analysis, may I merge these two files (merging overlapping intervals) to get more SV sites?

Thanks a lot!
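
Merging overlapping intervals from two lists, as proposed in the question, is a standard sweep over sorted intervals. The sketch below shows only that mechanical step on (chrom, start, end) tuples; it is not svmu code, the function name and the sample coordinates are made up for illustration, and whether combining the two files is biologically appropriate is a separate question for the author.

```python
def merge_intervals(intervals):
    """Merge overlapping (chrom, start, end) intervals; input may be unsorted."""
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps (or touches) the previous interval: extend it.
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged

# Hypothetical intervals standing in for rows from the two files.
sv = [("Chr1", 100, 200), ("Chr1", 500, 600)]
cnv = [("Chr1", 150, 250), ("Chr2", 100, 200)]
print(merge_intervals(sv + cnv))
```

Note that this discards the SV type of each record, so a real merge would need a rule for which annotation to keep when an sv entry and a cnv entry overlap.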

cnv_all.prefix.txt and small.prefix.txt are empty

Hi, I would like to ask you about two small problems encountered when running svmu. Here's the script I used:
nucmer --mum -p out msu.Chr6.fa r498.Chr6.fa
delta-filter -i 80 -1 out.delta > out.filter.delta

touch last_out.txt
svmu out.filter.delta msu.Chr6.fa r498.Chr6.fa h last_out.txt nucm

First, when I run this script, a table with many lines pops up on the screen, like this:
.......
gap Chr6 26089884 26090751 Chr6 27448116 27448564
Chr6 0 26062264 Chr6 27418463 27418463
gap Chr6 26091006 26092210 Chr6 27448837 27455791
Chr6 26093201 26093205 Chr6 27456766 27456770
gap Chr6 26093200 26093201 Chr6 27456770 27456770
Chr6 0 24760893 Chr6 2667338 2667668
gap Chr6 26093698 26093698 Chr6 26207565 26207566
.......
What kind of information is this? Is it important?

Second, my cnv_all.out.txt and small.out.txt are empty, is this normal? Is it because I used the empty file last_out.txt? (This is what I learned from the answers you gave to others).

To help you answer more quickly, I have attached the generated files.
New Folder.zip

Thank you for your help
Huangchao

the make problem

Hello, when I use the make command, some errors appeared, like [-Wsign-compare] and [-Wunused-variable], as follows:
"g++ -g -Wall -std=c++0x -c svlib.cpp
svlib.cpp: In function ‘void findInnie(std::vector<mI>&, mI&)’:
svlib.cpp:39:34: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
while((mi.x2 > mums[i].x1) && (i<mums.size()))
svlib.cpp:339:29: warning: unused variable ‘Dist’ [-Wunused-variable]
double d1 =0, d2 =0, d = 0,Dist= 0,rd2=0,rd=0;"
What should I do? Thanks for your attention.

SVMU analysis using short read contigs.

Dear Mahul, I intend to use SVMU to call variants between a reference and a newly sequenced strain of Drosophila. I have the contigs and also a scaffolded assembly of the strain, built from Illumina short reads. Can I proceed with my short-read assembly? I also have pseudo-chromosomes built using mscaffolder; can I use these as the query?
My current strategy is to use the pseudo-chromosomes I built with mscaffolder as the query, and I am able to run the analysis successfully. I would appreciate your input on my workflow.
Hoping for a positive response. Thank you.
