alexsafatli / pylogeny

Python framework for phylogenetic tree landscapes

License: GNU General Public License v2.0

Shell 0.10% Python 90.44% C++ 3.89% C 5.11% Makefile 0.46%
python landscape heuristic phylogenetic-tree-landscapes python-framework

pylogeny's Introduction

Pylogeny

Pylogeny is a software library and code framework, written in Python (for Python 2), for phylogenetic tree reconstruction, rearrangement, and scoring, and for the manipulation, heuristic search, and analysis of the phylogenetic tree combinatorial space. Trees are scored through bindings to the libpll phylogenetic C library. The framework can also run popular heuristic programs such as FastTree and RAxML to acquire approximate ML trees.
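To give a sense of how large this combinatorial space is, the short sketch below counts the distinct unrooted binary tree topologies for a number of labelled taxa using the standard double-factorial formula (2n-5)!!. It does not use Pylogeny, and the function name `count_unrooted_trees` is made up for illustration. The code runs under both Python 2.7 and Python 3.

```python
def count_unrooted_trees(num_taxa):
    """Count unrooted binary topologies on num_taxa labelled leaves: (2n-5)!!."""
    if num_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-5 (an empty product when n = 3).
    for k in range(3, 2 * num_taxa - 4, 2):
        count *= k
    return count


# Ten taxa already give more than two million candidate topologies.
ten_taxa = count_unrooted_trees(10)
```

The count grows so quickly that exhaustive evaluation is impractical beyond a handful of taxa, which is why heuristic search over the space matters.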

This library can perform the following tasks:

  • Generate and maintain phylogenetic tree landscapes.
  • Construct and analyse heuristic methods to search these spaces.
  • Build and rearrange phylogenetic trees using preset operators (such as NNI, SPR, and TBR).
  • Score phylogenetic trees by Maximum Likelihood (calculated as log-likelihood) and Parsimony.
  • Build confidence sets of trees using the widely known CONSEL application.
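As an illustration of what parsimony scoring measures, here is a minimal sketch of Fitch's small-parsimony algorithm for a single alignment site on a rooted binary tree. It is not Pylogeny's own implementation; the nested-tuple tree encoding and the name `fitch_score` are assumptions chosen to keep the example self-contained. The code runs under both Python 2.7 and Python 3.

```python
def fitch_score(tree, states):
    """Return (state_set, cost) for one site on a rooted binary tree.

    tree   -- a leaf name (str) or a 2-tuple of subtrees (hypothetical encoding)
    states -- dict mapping each leaf name to its character at this site
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch_score(tree[0], states)
    right_set, right_cost = fitch_score(tree[1], states)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra change is needed.
        return common, left_cost + right_cost
    # The children disagree: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1


site = {"A": "G", "B": "G", "C": "T", "D": "T"}
_, grouped = fitch_score((("A", "B"), ("C", "D")), site)
_, mixed = fitch_score((("A", "C"), ("B", "D")), site)
```

Here `grouped` is 1 and `mixed` is 2: the topology that keeps identical states together needs fewer changes, so it has the better (lower) parsimony score.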

Code Example

With the code below, you can create a landscape for a given sequence alignment, add the tree acquired from FastTree to that landscape, and then perform a hill-climbing search on the landscape on the basis of parsimony.

from pylogeny.alignment import alignment
from pylogeny.landscape import landscape
from pylogeny.heuristic import parsimonyGreedy

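ali = alignment('yourAlignment.fasta') # Load the sequence alignment.
ls  = landscape(ali,starting_tree=ali.getApproxMLTree()) # Start from the FastTree tree.
heu = parsimonyGreedy(ls,ls.getRootNode()) # Hill-climbing search by parsimony.
heu.explore() # Run the search; explored trees are stored in the landscape.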

This explores the landscape, but the heuristic itself does not return anything. To get an idea of what the fitness landscape now looks like, you can query the landscape.

for tree in ls.iterTrees():
    print tree # Print all the Newick strings in the landscape.
globalMax = ls.getGlobalOptimum() # Get the name for tree with best score.
print ls.getTree(globalMax) # Print this tree (see its Newick string).

You can also see which neighboring trees were explored from the tree at which the heuristic started.

neighbors = ls.getNeighborsFor(ls.getRoot())
for neighbor in neighbors: # Neighbors are indices.
    print ls.getTree(neighbor) # Print the Newick string for that neighbor.
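For readers unfamiliar with hill climbing, the sketch below shows the general idea on a toy graph: starting from one node, repeatedly move to the best-scoring neighbor until no neighbor improves the score. It does not use Pylogeny, and `parsimonyGreedy` may differ in its details; the names `hill_climb`, `toy_neighbors`, and `toy_scores` are hypothetical. Lower scores are treated as better, as with parsimony. The code runs under both Python 2.7 and Python 3.

```python
def hill_climb(neighbors, score, start):
    """Greedy search: move to the best-scoring neighbor until none improves.

    neighbors -- dict mapping a node to a list of adjacent nodes
    score     -- dict mapping a node to its score (lower is better)
    start     -- node to begin from
    Returns the list of nodes visited, ending at a local optimum.
    """
    path = [start]
    current = start
    while True:
        candidates = neighbors.get(current, [])
        best = min(candidates, key=lambda n: score[n]) if candidates else None
        if best is None or score[best] >= score[current]:
            return path  # No neighbor is strictly better: local optimum reached.
        current = best
        path.append(current)


# Toy landscape: four "trees" with made-up parsimony scores.
toy_neighbors = {"t0": ["t1", "t2"], "t1": ["t0", "t3"], "t2": ["t0"], "t3": ["t1"]}
toy_scores = {"t0": 12, "t1": 10, "t2": 11, "t3": 9}
path = hill_climb(toy_neighbors, toy_scores, "t0")
```

In this toy case the search visits `t0`, `t1`, and `t3` in turn and stops at `t3`, whose only neighbor has a worse score. A greedy search of this kind can stop at a local optimum that is not the global one.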

Installation

Installation requires access to a UNIX-like system or terminal. Basic build tools, Python development headers, and MySQL library bindings are also required. On Ubuntu, these can be installed with the command

 sudo apt-get install python-dev build-essential libmysqlclient-dev

If you do not use a Debian or Ubuntu-derived Linux distribution, search for instructions on acquiring these for your platform.

Before continuing, install the non-Python dependency libpll. Acquire the appropriate binary of version 1.0.2 with SSE3 support, or build that version from source. A shell script in the root directory of this repository performs this installation. You can run it without downloading the repository first by performing the command:

wget https://raw.githubusercontent.com/AlexSafatli/Pylogeny/master/install-pll.sh -O - | sh

Once you have installed all of the necessary non-Python dependencies, you can install this software automatically using pip or easy_install with the command

pip install pylogeny

or

easy_install pylogeny

respectively.

Documentation

Generated documentation is found here. Tutorials and additional code examples are present in the wiki on GitHub for this project.

Citing

To cite this library, refer to the paper "Pylogeny: an open-source Python framework for phylogenetic tree reconstruction and search space heuristics", published at PeerJ, which describes its purpose.

Contributing

To contribute to this project, feel free to make a pull request and it will be reviewed by the code maintainers.

pylogeny's Issues

Substitution matrix for likelihood calculation?

Is there a way to specify a substitution matrix when calculating the likelihood of a tree? I am using amino acid data as a test set and would like to include a model of evolution in the calculation. I am assuming that, by default, Pylogeny uses a "Poisson" model or something similar to Jukes-Cantor. I would like to evaluate likelihoods using WAG, LG, JTT and even some custom matrices. Moreover, I'd like to apply separate models to each alignment site where appropriate.

Is this possible in Pylogeny? If not, is this functionality slated for future releases?

Thanks!

pip and easy_install not working

First of all, this package looks AMAZING and I cannot wait to use it. Thanks so much to all of the people involved in developing this!

I have tried installing this library using both pip and easy_install. However, using pip I get the following error:

~$sudo pip install phylogeny
Downloading/unpacking phylogeny
Could not find any downloads that satisfy the requirement phylogeny
Cleaning up...
No distributions at all found for phylogeny
Storing debug log for failure in /home/joseph/.pip/pip.log

And using easy_install:

easy_install phylogeny
Searching for phylogeny
Reading https://pypi.python.org/simple/phylogeny/
Couldn't find index page for 'phylogeny' (maybe misspelled?)
Scanning index of all packages (this may take a while)
Reading https://pypi.python.org/simple/
No local packages or download links found for phylogeny
error: Could not find suitable distribution for Requirement.parse('phylogeny')

Is this installation method not yet available?

Python3 support

It would be great to get Pylogeny working with Python3:

  • Remove dependency on mysqldb-python
    • Perhaps use mysqlclient as a drop-in replacement that works for both Python2 and Python3
  • Tweak the Python wrapper in fitch.cpp (involving Python2/3 differences in how strings are treated and int vs. long, described here)
  • Some minor syntax changes

Unfortunately, the p4 dependency also is incompatible with Python3, which requires some additional work.

Installation error: Symbol not found: _memalign

TraceBack:


---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-2-027cc924d852> in <module>()
      1 from pylogeny.alignment import alignment
----> 2 from pylogeny.landscape import landscape
      3 from pylogeny.heuristic import parsimonyGreedy
      4 
      5 ali = alignment('yourAlignment.fasta')

/usr/local/Cellar/python/2.7.8_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pylogeny-0.2.3-py2.7-macosx-10.9-x86_64.egg/pylogeny/landscape.py in <module>()
      7 import networkx
      8 from random import choice
----> 9 from scoring import getParsimonyFromProfiles as parsimony, getLogLikelihood as ll
     10 from parsimony import profile_set as profiles
     11 from bipartition import bipartition

/usr/local/Cellar/python/2.7.8_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pylogeny-0.2.3-py2.7-macosx-10.9-x86_64.egg/pylogeny/scoring.py in <module>()
      5 # E-mail: [email protected]
      6 
----> 7 import pll, parsimony, fitch, p4
      8 try:
      9     from model import DiscreteStateModel as State

/usr/local/Cellar/python/2.7.8_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pylogeny-0.2.3-py2.7-macosx-10.9-x86_64.egg/pylogeny/pll.py in <module>()
      5 # E-mail: [email protected]
      6 
----> 7 from pylibpll import *
      8 from tempfile import NamedTemporaryFile as NTempFile
      9 import os

ImportError: dlopen(/usr/local/Cellar/python/2.7.8_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pylogeny-0.2.3-py2.7-macosx-10.9-x86_64.egg/pylibpll.so, 2): Symbol not found: _memalign
  Referenced from: /usr/local/lib/libpll-generic.1.dylib
  Expected in: flat namespace
 in /usr/local/lib/libpll-generic.1.dylib

Result of pip freeze:

DendroPy==3.12.2
bitarray==0.8.1
networkx==1.9.1
numpy==1.9.0
p4==0.92.-2014.10.30-
pandas==0.14.1

Using libpll 1.0.0

On Mac OS X Yosemite (which OS are you using? I'd like to use this but can't right now).
