seq-gen's Introduction

This package is the generic source-code version. It can be compiled
and run on most UNIX/Linux/Mac OS X systems, and can also be compiled
for Windows. For Mac OS X, a pre-compiled version is available from
the website:

http://tree.bio.ed.ac.uk/software/Seq-Gen/

There is a manual in HTML format in the doc/ directory of this package.

On most UNIX systems, to compile, type:

cd source
make

A binary called 'seq-gen' will be created in the same directory as this
README file.

Any questions about Seq-Gen should be sent to:

	Andrew Rambaut <[email protected]>

seq-gen's People

Contributors

fredericlemoine, jazpy, kdm9, niemasd, rambaut

seq-gen's Issues

Update source code on Seq-Gen website

I'm not sure what was causing it, but I was getting a segfault when simulating sequences with the Seq-Gen version from the website (the segfault occurred in the "output" function). I compiled the GitHub source with -g so I could investigate with gdb, but the GitHub version worked just fine. There must be a bug in the website version that has since been fixed on GitHub, so it would be nice to update the version on the website, as that is the link Google finds (or just link to the GitHub repo from the website directly).

terminal whitespace in an input sequence causes failure to find tree

When relaxed PHYLIP-format input sequences are supplied as ancestral sequences, Seq-Gen allows whitespace inside sequences but fails with "Tree is missing from end of sequence file" if there is whitespace between the end of the sequence and the end of the line. For example, "ACG T" is fine but "ACG T " is not.
(Biopython inserts a space every 10 bases when outputting PHYLIP, including one at the end if the number of bases is a multiple of 10.)
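
Until the parser tolerates trailing whitespace, one possible workaround is to clean the input file before handing it to Seq-Gen. The following is a minimal sketch, not part of Seq-Gen; the function name `strip_trailing_whitespace` is hypothetical, and it assumes that removing whitespace at the end of each line is safe for your file.

```python
def strip_trailing_whitespace(text: str) -> str:
    """Remove whitespace at the end of every line, keeping interior spaces.

    Hypothetical helper: it only touches line ends, so "ACG T " becomes
    "ACG T" while the space inside the sequence is preserved.
    """
    cleaned = [line.rstrip() for line in text.splitlines()]
    return "\n".join(cleaned) + "\n"


if __name__ == "__main__":
    import sys

    # Usage: python strip_ws.py < input.phy > cleaned.phy
    sys.stdout.write(strip_trailing_whitespace(sys.stdin.read()))
```

The cleaned file can then be passed to Seq-Gen in place of the original.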

"Error reading tree number 1: Closing bracket missing"

I'm getting the error "Error reading tree number 1: Closing bracket missing", but the Newick tree I'm feeding Seq-Gen seems to be valid (I ran it through nw_distance to get branch lengths just to make sure, and nw_distance parsed it just fine). Can you help me find my issue? Here are the contents of the file I'm trying:

1 900
N2 AGGGCCAACGTGGACTGCTTGCTATGGAGGGTGTGTCGACTCCGAGTTCACCGCCTAGTCGCTTTCCTTCCCAAGGTCAGCTTCACACAATCGGGCTATAGGGAATGCCGATTTAAAAGTGGCACGTACCGCACTGGACGCTTACTTGTCTACACATTCCACACGACAAAAGGACCGTGTTGCTTTAAACTTCGTCCAGTCGGCCCCCCGTGTCATGAAGCTCGCCAATCATTGGAGGCTCAGTCGTATGCCAGTGCTTTTCTAACTCTGGTAGCATCTCTGGGATTTATTCGCGCGCCGGATCTAGACATTACCCAAAATAACGCCCACCAGAAGTCAAGGCGCATTGAGCTGGCGTGCAATTGGTATGGCCCTTATTTTAGGCTTTTAGCGCTCAGCCTAAGAGGAGGGCGTACGGTAATATATCAGCGTGGCCAATGGGTCAATGATCTGGTGCAAGGTCCGGCGGCCCAATCTCCAGCCATGGGAAAGGCGGATCGTATGGGAATATATCGCCGATGTAGCACACAGGTGGGACGACGTAGAAGACAGGTTCGGGTGCATAGGTCCGTGAACGCACAGTGTGCACCATTACGAGTACCGCTACGCGGCGTCGTTAGATACAAACTTCCTAGGGGGCCCGAAGCCAAGGGCAGTAAGTCAAGACTAAACATCGCTACAGAAGTTCCGCTTAAGAATACTGTGACGACGACCACCCTCCTATCCCTACATGCGTCGTCGCGACTAAAGATAACGCAGCTACTACTCGGGTACATCTTGCCTGCCGTTAGAATTTTGGTACCGTACAATGCAAGCCTAGCTACCTACGAGAGAATTGGAACAGTCTTGCTACTATCAAGTGGCAAACAACAACCAAGCATTTGGTACGCTTCTTATCGG
1
(((((((N10094|53|3.387644573626598:0.000000,N1335|53|3.4648137547803013:0.385846)N10093|53|3.387644573626598:2.843726,(N10096|53|3.387644573626598:0.000000,(N10098|53|6.290898771186661:0.000000,N1550|53|6.5055189042327735:1.073101)N10097|53|6.290898771186661:14.516271)N10095|53|3.387644573626598:2.843726)N1329|53|2.818899456866464:0.028835,((N1338|53|3.9619077297964496:2.390328,N10086|53|3.4838422003374294:0.000000)N10085|53|3.4838422003374294:0.527812,(N10088|53|3.387644573626598:0.000000,((N1536|53|6.545552153111298:1.273267,N10090|53|6.290898771186661:0.000000)N10089|53|6.290898771186661:12.535181,(N1610|53|6.525077022435883:1.170891,N10092|53|6.290898771186661:0.000000)N10091|53|6.290898771186661:12.535181)N1337|53|3.78386254970119:1.981090)N10087|53|3.387644573626598:0.046824)N1330|53|3.3782797020259157:2.825736)N1328|53|2.8131325115826025:3.249158,((N10082|53|3.4838422003374294:0.000000,N1333|53|4.51547779079867:5.158178)N10081|53|3.4838422003374294:4.890211,(N10084|53|3.4838422003374294:0.000000,N1325|53|5.335201493819431:9.256796)N10083|53|3.4838422003374294:4.890211)N1324|53|2.5057999939750486:1.712496)N10|53|2.1633008480336615:0.918445,(((((N10138|76|7.136413093174793:0.000000,N739|76|7.20349385579528:0.335404)N10137|76|7.136413093174793:14.832736,(N725|76|6.440686534739932:4.398717,(N770|76|7.098571866123575:3.289427,N10140|76|6.440686534739932:0.000000)N10139|76|6.440686534739932:4.398717)N719|76|5.5609431071049045:6.955386)N704|76|4.169865967577531:1.007382,(N762|76|7.136413093174793:5.185240,(N10132|63|7.026520998197115:0.000000,(N10134|63|7.678036911951752:0.000000,(N10136|63|7.750248481326306:0.000000,N746|63|7.759424299384176:0.045879)N10135|63|7.750248481326306:0.361058)N10133|63|7.678036911951752:3.257580)N10131|63|7.026520998197115:4.635779)N735|76|6.099365107581146:10.654877)N18|76|3.968389650172391:6.202349,(N833|76|6.440686534739932:13.568654,N10142|76|3.7269556498970466:0.000000)N10141|76|3.7269556498970466:4.995179)N16|76|2.727919813583346:0
.975411,(((N10130|38|4.384954202491446:0.000000,N7703|38|7.692601176576037:16.538235)N10129|38|4.384954202491446:4.099248,(((N10124|38|4.710728536659602:0.000000,N7302|38|6.745072582849662:10.171720)N10123|38|4.710728536659602:3.871776,(N10126|38|4.710728536659602:0.000000,N7543|38|5.168418140802194:2.288448)N10125|38|4.710728536659602:3.871776)N35|38|3.936373278607399:1.326461,(N32|38|5.848390579996254:7.317182,N10128|38|4.384954202491446:0.000000)N10127|38|4.384954202491446:3.569366)N26|38|3.671081001320573:0.529882)N21|38|3.5651046749252604:1.548266,(((((((N10114|35|6.109433810752977:0.000000,(N7721|35|6.68314747809591:1.384597,N10116|35|6.406228150861876:0.000000)N10115|35|6.406228150861876:1.483972)N10113|35|6.109433810752977:0.793526,N7722|35|6.109433810752977:0.793526)N7717|35|5.950728531832838:2.905630,(N10112|35|6.406228150861876:0.000000,N7718|35|6.83715584702786:2.154638)N10111|35|6.406228150861876:5.183128)N7715|35|5.369602522730817:0.276969,N10110|35|5.314208700660342:0.000000)N10109|35|5.314208700660342:0.313356,(N10104|35|5.314208700660342:0.000000,((N10106|35|6.406228150861876:0.000000,N7725|35|7.415432613936449:5.046022)N10105|35|6.406228150861876:4.617694,(N7723|35|6.542907099066029:2.167366,N10108|35|6.109433810752977:0.000000)N10107|35|6.109433810752977:3.133722)N7716|35|5.482689326353514:0.842403)N10103|35|5.314208700660342:0.313356)N42|35|5.251537410866491:1.257687,(((N1313|79|7.394220575611178:3.002561,(N1309|79|7.898448763109277:5.303604,(N10120|79|7.394220575611178:0.000000,N1315|79|7.898448763109277:2.521141)N10119|79|7.394220575611178:2.782463)N1306|79|6.837728031727479:0.220098)N1304|79|6.793708375375154:4.812583,(N10122|79|7.394220575611178:0.000000,N1322|79|7.898448763109277:2.521141)N10121|79|7.394220575611178:7.815144)N41|79|5.831191817063941:2.217568,N10118|79|5.387678251235054:0.000000)N10117|79|5.387678251235054:1.938391)N34|35|5.0:7.804189,((N7261|38|7.760882722187589:0.341408,N10102|38|7.692601176576037:0.000000)N10101|38|7.69260
1176576037:20.276384,(N7338|38|4.384954202491446:2.374260,((N10100|38|4.710728536659602:0.000000,N7045|38|5.479073165297347:3.841723)N10099|38|4.710728536659602:0.829614,N7125|38|7.692601176576037:15.738977)N37|38|4.544805745499566:3.173517)N29|38|3.910102255157772:1.363889)N23|38|3.6373244314426554:0.990811)N22|38|3.4391622608511176:0.918554)N15|38|3.2554514626661493:3.613069)N13|76|2.5328376431325754:2.766129)N8|76|1.979611885156058:3.104836,(((N10080|76|3.7269556498970466:0.000000,N20|76|4.41755168315786:3.452980)N10079|76|3.7269556498970466:4.622154,N921|76|7.136413093174793:21.669442)N11|76|2.8025247719501962:4.061246,(N12|76|5.480171134108298:8.766077,N10078|76|3.7269556498970466:0.000000)N10077|76|3.7269556498970466:8.683401)N7|76|1.9902755489963695:3.158154)N3|76|1.3586447399943462:6.610871,((N10074|21|2.0361489799714976:0.000000,(N7755|21|4.546571082984857:0.325177,N10076|21|4.481535604876424:0.000000)N10075|21|4.481535604876424:12.226933)N10073|21|2.0361489799714976:8.236285,((((N10064|21|6.427776507669709:0.000000,N8423|21|7.801600945082707:6.869122)N10063|21|6.427776507669709:17.695347,((N10066|21|4.481535604876424:0.000000,(N8217|21|6.6360575289254715:1.041405,N10068|21|6.427776507669709:0.000000)N10067|21|6.427776507669709:9.731205)N10065|21|4.481535604876424:6.404303,(N10070|21|6.427776507669709:0.000000,N8142|21|6.574277590049945:0.732505)N10069|21|6.427776507669709:16.135508)N8109|21|3.2006749810166126:1.559840)N8104|21|2.8887070425746124:5.667660,(N10072|21|2.0361489799714976:0.000000,N8099|21|2.334175756485804:1.490134)N10071|21|2.0361489799714976:1.404870)N8098|21|1.7551749986922354:1.117354,(N10060|21|2.0361489799714976:0.000000,(N8147|21|5.723852597930773:6.211585,N10062|21|4.481535604876424:0.000000)N10061|21|4.481535604876424:12.226933)N10059|21|2.0361489799714976:2.522224)N6|21|1.531704112895537:5.714061)N4|21|0.38889192335194406:1.762107)N2|21|0.036470598824739756:0.182353;
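
When chasing this kind of error, a quick independent check of the tree text can rule out the obvious causes. The sketch below is only a diagnostic aid under stated assumptions: the name `check_newick_brackets` is hypothetical, it ignores Newick quoting rules (a parenthesis inside a quoted label would be miscounted), and it says nothing about how Seq-Gen itself parses labels or line breaks.

```python
def check_newick_brackets(newick: str) -> dict:
    """Count parentheses and line breaks in a Newick string.

    Assumption: labels contain no quoted parentheses, so every "(" and ")"
    is structural.
    """
    depth = 0
    went_negative = False
    for ch in newick:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                went_negative = True  # a ")" appeared before its "("
    stripped = newick.strip()
    return {
        "open": newick.count("("),
        "close": newick.count(")"),
        "balanced": depth == 0 and not went_negative,
        "line_breaks": stripped.count("\n"),  # breaks inside the tree text
        "ends_with_semicolon": stripped.endswith(";"),
    }
```

If the brackets balance but the tree text spans several lines or uses unusual characters in its labels, those are the next things worth testing one at a time.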

Error reading tree number 1: Closing bracket missing.

Similar to #9, but I can't solve it with regex. I downloaded the nextstrain tree (Jan 4, 2021) for nCov and wanted to run TreeToReads.py with it (Newick attached below). However, the seq-gen part gives the closing bracket error. I have tried a variety of things, including renaming the taxa and resolving multifurcations:

perl -MBio::TreeIO -e '$tree=Bio::TreeIO->new(-file=>"nextstrain_ncov_global_tree.nwk")->next_tree; for($tree->get_nodes){$i++; if($_->is_Leaf){$_->id("TAXON$i");} else {$_->id("");} } print $tree->as_text("newick")."\n";' | gotree resolve > anonymized.nwk

And breaking apart long lines

cat anonymized.nwk | perl -plane 's/(.{50,}?,)/\1\n/g' > tmp.nwk

This is my seq-gen command (changing the stdin file accordingly):

seq-gen -l768000 -n1 -mGTR -a5.0 -r0.25,0.82,0.15,0.27,2.99,1.00 -f0.299236590102,0.183687135874,0.196176253934,0.32090002009 -or < tmp.nwk

But nothing seems to help so far. Any ideas?

nextstrain_ncov_global_tree.zip

Free software license?

Hi,
We currently have this project as a package in Debian.
While I agree that every file containing "code" has a BSD license at the top, the absence of a LICENSE file means the data and documentation count as non-free under the free software guidelines followed here.
Could you please add the same license in a LICENSE file and commit it?
That'd be great.

PS: Considering that PAML has also adopted a free software license, Seq-Gen could do the same without conflicts, I suppose.

Node labels in ancestral sequences

I am generating the ancestral sequences, which come with node labels. How are these node labels assigned? Where does each internal node occur in the tree?

It would be handy if the program could read the node labels in the tree - that way we know which sequence belongs to which node...

FASTA format output

Hi there,
It was announced that FASTA format output is a new feature in version 1.3.4, but I can't figure out how to set the output parameters to use it. Also, I downloaded version 1.3.4 and it seems to report itself as version 1.3.2x:

Seq-Gen-1.3.4$ seq-gen -h
Sequence Generator - seq-gen
Version 1.3.2x

Cheers,
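
If the installed binary turns out not to support FASTA output, converting the relaxed PHYLIP output afterwards is one option. This is a minimal sketch with a hypothetical function name, `relaxed_phylip_to_fasta`; it assumes a single dataset whose header line holds the number of taxa and the number of sites, followed by one "name sequence" line per taxon.

```python
def relaxed_phylip_to_fasta(text: str) -> str:
    """Convert one relaxed (sequential) PHYLIP dataset to FASTA.

    Assumptions: a single dataset, one line per taxon, and names that
    contain no whitespace. Spaces inside sequences are removed.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_taxa, n_sites = (int(x) for x in lines[0].split())
    records = []
    for ln in lines[1:1 + n_taxa]:
        name, seq = ln.split(None, 1)
        seq = "".join(seq.split())
        if len(seq) != n_sites:
            raise ValueError(f"{name}: expected {n_sites} sites, got {len(seq)}")
        records.append(f">{name}\n{seq}")
    if len(records) != n_taxa:
        raise ValueError(f"expected {n_taxa} taxa, found {len(records)}")
    return "\n".join(records) + "\n"
```

Files holding several simulated datasets would need to be split per dataset first; that step is left out here.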

Cannot pipe a tree to `seq-gen`

It is not possible to echo a string into seq-gen to use as a tree:

$ echo "(A:0.1,B:0.1,C:0.1);" | /home/sam/ware/Seq-Gen.v1.3.3/source/seq-gen -mGTR
Sequence Generator - seq-gen
Version 1.3.2x
(c) Copyright, 1996-2004 Andrew Rambaut and Nick Grassly
Department of Zoology, University of Oxford
South Parks Road, Oxford OX1 3PS, U.K.

Error reading tree number 1: .

Meanwhile, the following works:

/home/sam/ware/Seq-Gen.v1.3.3/source/seq-gen -mGTR <<< "(A:0.1,B:0.1,C:0.1);"

From what I can tell, this could be caused by multiple calls of feof(stdin) in seq-gen.c or treefile.c?
I'm not really sure what the "first pass" of stdin does, as this is also a valid input that produces sequences:

/home/sam/ware/Seq-Gen.v1.3.3/source/seq-gen -mGTR
Sequence Generator - seq-gen
Version 1.3.2x
(c) Copyright, 1996-2004 Andrew Rambaut and Nick Grassly
Department of Zoology, University of Oxford
South Parks Road, Oxford OX1 3PS, U.K.

<CTRL-D>
(A:0.1,B:0.1,C:0.1);
<CTRL-D>

Update paml related files to make seq-gen free

For a long time, seq-gen was considered non-free because of the non-free license of its PAML parts.
The situation has now changed, and PAML is released under the GPL.
Could you update the outdated PAML code in seq-gen so that it becomes completely free?
This would let me include seq-gen in the main repository of Debian. Currently I cannot do this, as the old PAML code doesn't comply with the Debian Free Software Guidelines.

Cannot give tree with "<<<"

Hello,

I'm following a pipeline that uses Seq-Gen as a subprocess within a Python script, and it passes the tree as text with <<< instead of as a file name. I've seen in other issues that this is not a problem, but for some reason it is for me. I'm running Seq-Gen 1.3.4 installed from conda, but I had the same problem with the latest version compiled from source:

seq-gen -mGTR -q -a1000 -z1686042325.0655584 -l 1000 -f0.277,0.228,0.246,0.249 -r1,1.68369,1,1,1.91645,1 <<< "(B:0.3598052239,D:0.3425989485,(A:0.4731590178,C:0.4643218278):0.0822832145);"
Error reading tree number 1: .

Not sure whether I'm doing something wrong. I tried with a simpler command, used by @SamStudio8 on Issue #4, but I get the same error:

seq-gen -mGTR <<< "(A:0.1,B0.1,C:0.1);"
Sequence Generator - seq-gen
Version 1.3.4
(c) Copyright, 1996-2017 Andrew Rambaut and Nick Grassly
Institute of Evolutionary Biology, University of Edinburgh

Originally developed at:
Department of Zoology, University of Oxford

Error reading tree number 1: .

However, saving the tree and passing it as a filename works perfectly:

echo "(B:0.3598052239,D:0.3425989485,(A:0.4731590178,C:0.4643218278):0.0822832145);" > tree.nwk
seq-gen -mGTR -q -a1000 -z1686042325.0655584 -l 1000 -f0.277,0.228,0.246,0.249 -r1,1.68369,1,1,1.91645,1 < tree.nwk
# Sequences produced...

Thus, it does not seem to be a problem with the tree format (also, the pipeline I'm following uses the exact line I pasted in the first code block), but I'm at a loss as to what else I could test.

Many thanks.

-carlos
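
Since redirecting from a saved file is reported to work, a Python pipeline could use the same route instead of a here-string. The following is a minimal sketch of that workaround, not a fix for the underlying behaviour; `write_tree_file` and `run_seq_gen` are hypothetical helper names, and the executable is assumed to be on the PATH as `seq-gen`.

```python
import os
import subprocess
import tempfile


def write_tree_file(newick: str) -> str:
    """Write the tree to a temporary file and return its path."""
    with tempfile.NamedTemporaryFile("w", suffix=".nwk", delete=False) as fh:
        fh.write(newick.strip() + "\n")
        return fh.name


def run_seq_gen(newick: str, args: list, executable: str = "seq-gen") -> str:
    """Run Seq-Gen with the tree supplied on stdin from a regular file.

    `args` holds the usual command-line options, e.g. ["-mGTR", "-q"].
    """
    path = write_tree_file(newick)
    try:
        with open(path) as tree_fh:
            result = subprocess.run(
                [executable, *args],
                stdin=tree_fh,
                capture_output=True,
                text=True,
                check=True,  # raise CalledProcessError on a non-zero exit
            )
        return result.stdout
    finally:
        os.remove(path)
```

A call such as `run_seq_gen("(A:0.1,B:0.1,C:0.1);", ["-mGTR", "-q"])` would then mirror the `< tree.nwk` form of the command.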
