HyPhy: Hypothesis testing using Phylogenies

Home Page: http://www.hyphy.org

License: Other

CMake 0.14% C 29.47% C++ 24.07% Shell 0.01% Python 0.04% HyPhy 46.19% JavaScript 0.03% SWIG 0.05%
science phylogenetics c-plus-plus bioinformatics comparative-genomics evolution statistical-methods

hyphy's Introduction

HyPhy - Hypothesis testing using Phylogenies

HyPhy is an open-source software package for the analysis of genetic sequences using techniques in phylogenetics, molecular evolution, and machine learning. It features a complete graphical user interface (GUI) and a rich scripting language for limitless customization of analyses. Additionally, HyPhy features support for parallel computing environments (via message passing interface (MPI)) and it can be compiled as a shared library and called from other programming environments such as Python and R. HyPhy is the computational backbone powering datamonkey.org. Additional information is available at hyphy.org.

Quick Start

Install

conda install hyphy

Run with Command Line Arguments

hyphy <method_name> --alignment <path_to_alignment_file> <additional_method_specific_arguments>

  • <method_name> is the name of the analysis you wish to run (can be: absrel, bgm, busted, fade, fel, fubar, gard, meme, relax or slac)
  • <path_to_alignment_file> is the relative or absolute path to a FASTA, NEXUS, or PHYLIP file containing an alignment and tree
  • A list of the available <additional_method_specific_arguments> can be seen by running hyphy <method_name> --help
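The invocation pattern above can be wrapped in a small shell function. This is a minimal sketch rather than part of HyPhy: the function name run_hyphy and the HYPHY_BIN override are hypothetical, and the list of accepted method names is copied from the bullet above.

```shell
# Hypothetical wrapper: validate the method name, then call hyphy with
# the alignment and any extra method-specific arguments.
# HYPHY_BIN is an assumed override for the executable (defaults to "hyphy").
run_hyphy() {
    local method="$1" alignment="$2"
    shift 2
    case "$method" in
        absrel|bgm|busted|fade|fel|fubar|gard|meme|relax|slac) ;;
        *) echo "unknown method: $method" >&2; return 1 ;;
    esac
    "${HYPHY_BIN:-hyphy}" "$method" --alignment "$alignment" "$@"
}
```

For example, run_hyphy gard /path/to/file --rv GDD would expand to hyphy gard --alignment /path/to/file --rv GDD. Setting HYPHY_BIN=echo prints the arguments instead of running the analysis.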

or

Run in Interactive Mode

hyphy -i

Building from Source

Requirements

  • cmake >= 3.12
  • gcc >= 4.9
  • libcurl
  • libpthread
  • openmp (can be installed on mac via brew install libomp)
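The minimum versions listed above can be checked before configuring. The following is a minimal sketch, not part of HyPhy: the helper names version_ge and check_build_tools are hypothetical, and it assumes that cmake --version and gcc -dumpversion report versions in their usual dotted form.

```shell
# Return success if dotted version $1 is >= dotted version $2.
# Compares the first three numeric fields; missing fields count as 0.
version_ge() {
    local a="$1.0.0" b="$2.0.0" i x y
    for i in 1 2 3; do
        x=$(printf '%s' "$a" | cut -d. -f"$i")
        y=$(printf '%s' "$b" | cut -d. -f"$i")
        # Drop any non-numeric suffix such as "-rc1"
        x=${x%%[!0-9]*}; y=${y%%[!0-9]*}
        x=${x:-0}; y=${y:-0}
        [ "$x" -gt "$y" ] && return 0
        [ "$x" -lt "$y" ] && return 1
    done
    return 0
}

# Hypothetical check against the minimum versions listed above.
check_build_tools() {
    local cmake_v gcc_v
    cmake_v=$(cmake --version | head -n 1 | awk '{print $3}')
    gcc_v=$(gcc -dumpversion)
    version_ge "$cmake_v" 3.12 || echo "cmake $cmake_v is older than 3.12"
    version_ge "$gcc_v" 4.9 || echo "gcc $gcc_v is older than 4.9"
}
```

Calling check_build_tools prints a line for each tool that falls below its minimum and prints nothing otherwise. The fields are compared numerically, so 3.5 is correctly treated as older than 3.12.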

Download

You can download a specific release here or clone this repo with

git clone https://github.com/veg/hyphy.git

Change your directory to the downloaded/cloned directory

cd hyphy

Build

cmake .

make install

Additional Options for Building from Source

Build Systems

If you prefer to use other build systems, such as Xcode, configure using the -G switch

cmake -G Xcode .

CMake supports a number of build system generators; feel free to peruse these and use them if you wish.

If you are on an OS X platform, you can specify which OS X SDK to use

cmake -DCMAKE_OSX_SYSROOT=/Developer/SDKs/MacOSX10.9.sdk/ .

If building on a heterogeneous cluster with some nodes that do not support auto-vectorization

cmake -DNOAVX=ON ..

If you're on a UNIX-compatible system, and you're comfortable with GNU make, then run make with one of the following build targets:

  • MP or hyphy - build a HyPhy executable (This used to be "HYPHYMP" but is now just "hyphy") using pthreads to do multiprocessing
  • MPI - build a HyPhy executable (HYPHYMPI) using MPI to do multiprocessing
  • HYPHYMPI - build a HyPhy executable (HYPHYMPI) using openMPI
  • LIB - build a HyPhy library (libhyphy_mp) using pthreads to do multiprocessing
  • GTEST - build HyPhy's gtest testing executable (HYPHYGTEST)

Example (MPI build of hyphy using openMPI)

Ensure that you have openmpi installed and available on your path. To check, run cmake . and look for output similar to the following:

-- Found MPI_C: /opt/scyld/openmpi/1.6.3/gnu/lib/libmpi.so;/usr/lib64/libibverbs.so;/usr/lib64/libdat.so;/usr/lib64/librt.so;/usr/lib64/libnsl.so;/usr/lib64/libutil.so;/usr/lib64/libm.so;/usr/lib64/libtorque.so;/usr/lib64/libm.so;/usr/lib64/libnuma.so;/usr/lib64/librt.so;/usr/lib64/libnsl.so;/usr/lib64/libutil.so;/usr/lib64/libm.so

-- Found MPI_CXX: /opt/scyld/openmpi/1.6.3/gnu/lib/libmpi_cxx.so;/opt/scyld/openmpi/1.6.3/gnu/lib/libmpi.so;/usr/lib64/libibverbs.so;/usr/lib64/libdat.so;/usr/lib64/librt.so;/usr/lib64/libnsl.so;/usr/lib64/libutil.so;/usr/lib64/libm.so;/usr/lib64/libtorque.so;/usr/lib64/libm.so;/usr/lib64/libnuma.so;/usr/lib64/librt.so;/usr/lib64/libnsl.so;/usr/lib64/libutil.so;/usr/lib64/libm.so
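The check described above can be automated by scanning the cmake output for both lines. This is a rough sketch under the assumption that the output keeps the "-- Found MPI_C:" and "-- Found MPI_CXX:" prefixes shown above; the function name mpi_detected is hypothetical.

```shell
# Read cmake output on stdin and succeed only if both the C and C++
# MPI libraries were reported as found.
mpi_detected() {
    local out
    out=$(cat)
    printf '%s\n' "$out" | grep -q '^-- Found MPI_C:' &&
    printf '%s\n' "$out" | grep -q '^-- Found MPI_CXX:'
}

# Example use (commented out so that sourcing this file does not run cmake):
# cmake . 2>&1 | mpi_detected || echo "MPI was not detected by cmake"
```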

Then run

make HYPHYMPI

And then run make install to install the software

make install

  • hyphy will be installed at /location/of/choice/bin
  • libhyphy_mp.(so/dylib/dll) will be installed at /location/of/choice/lib
  • HyPhy's standard library of batchfiles will go into /location/of/choice/lib/hyphy

Testing

Run make test after configuring the build with cmake.

Benchmarks for CMake Tests

Benchmarks, using GitHub Actions, can be found at http://hyphy.org/bench

Executable Location

By default, HyPhy installs into /usr/local but it can be installed on any location of your system by providing an installation prefix

cmake -DCMAKE_INSTALL_PREFIX:PATH=/location/of/choice .

For example, this configuration will install hyphy at /opt/hyphy

mkdir -p /opt/hyphy

cmake -DCMAKE_INSTALL_PREFIX:PATH=/opt/hyphy .
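After installing into a custom prefix, the shell will not find the executable unless the prefix's bin directory is on PATH. The sketch below assumes the /opt/hyphy prefix from the example above and the bin layout described earlier; prepend_path is a hypothetical helper name.

```shell
# Add a directory to the front of PATH unless it is already listed.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;               # already present, leave PATH unchanged
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

prepend_path /opt/hyphy/bin
```

Running command -v hyphy afterwards is one way to confirm which executable the shell resolves.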

Building Documentation

make docs
cd docs
python3 -m http.server

CLI notes

As noted in the documentation, hyphy can be run as a command line tool. For many analyses, the hyphy CLI returns useful help messages showing which parameter values can be set to specify your analysis. For example, running hyphy gard --help

hyphy gard --help 

Available analysis command line options
---------------------------------------
Use --option VALUE syntax to invoke
If a [reqired] option is not provided on the command line, the analysis will prompt for its value
[conditionally required] options may or not be required based on the values of other options

type
	The type of data to perform screening on
	default value: nucleotide

code
	Genetic code to use (for codon alignments)
	default value: Universal
	applies to: Choose Genetic Code

alignment [required]
	Sequence alignment to screen for recombination

model
	The substitution model to use
	default value: JTT

rv
	Site to site rate variation
	default value: None

max-breakpoints
	Maximum number of breakpoints to consider
	default value: 10000

rate-classes
	How many site rate classes to use
	default value: 4

output
	Write the resulting JSON to this file (default is to save to the same path as the alignment file + 'GARD.json')
	default value: gard.defaultJsonFilePath [computed at run time]

mode
	Run mode (Normal or Faster)
	default value: Normal

output-lf
	Write the best fitting HyPhy analysis snapshot to (default is to save to the same path as the alignment file + 'best-gard')
	default value: gard.defaultFitFilePath [computed at run time]

will show you the options that can be set for the gard analysis. For instance, one could specify a gard run on the command line with the following command

hyphy gard --alignment /path/to/file --rv GDD --mode Faster --rate-classes 3

While this is a useful feature, older analyses do not always have the same level of command line support. For instance, the acd analysis does not have CLI support, so if one runs the help command

hyphy acd --help 

Available analysis command line options
---------------------------------------
Use --option VALUE syntax to invoke
If a [reqired] option is not provided on the command line, the analysis will prompt for its value
[conditionally required] options may or not be required based on the values of other options

No annotated keyword arguments are available for this analysis

one will see that there are no options available. In this case, you can use a different CLI specification: the CLI accepts, as positional arguments, all of the options that an interactive session asks for. For example, the acd analysis could be run with

hyphy acd Universal <alignment.fa> MG94CUSTOMCF3X4 Global 012345 <treefile> Estimate

where the options are specified in the exact order that they are asked for in the interactive session. This will work for all hyphy analyses and provides a less readable but more flexible way to run hyphy analyses.
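Because positional answers are hard to read at the call site, one option is to wrap them in a function that names the parts that vary. This is a minimal sketch, not part of HyPhy: run_acd and the HYPHY_BIN override are hypothetical, and the fixed answers are copied verbatim from the acd command above.

```shell
# Hypothetical wrapper around the positional acd invocation shown above.
# The answers are passed in the order the interactive session asks for them;
# only the alignment and tree paths vary between runs.
run_acd() {
    local alignment="$1" tree="$2"
    "${HYPHY_BIN:-hyphy}" acd Universal "$alignment" MG94CUSTOMCF3X4 Global 012345 "$tree" Estimate
}
```

If the analysis changes the order of its prompts, the wrapper has to be updated to match, which is the main cost of the positional style.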

hyphy's People

Contributors

andrewkern, artpoon, davebx, fd00, hverdonk, jehops, jonchang, kjlevitz, lgtm-migrator, lhui2010, lkremer, mdsmith, mgalardini, mr-c, mylez, nlhepler, rpatel, sjspielman, spond, srwis, stephenshank, stevenweaver

hyphy's Issues

Turn message.log OFF by default

It may be advisable to have message.log files turned off by default; whenever you spawn MPI jobs on the cluster that run a while, you end up with megabytes of log files per node.

Bugs in AVL1*AVL2

If done multiple times, will still segfault (code 11). Need to find a self-contained example.

matrix==matrix missing

There is no way to test whether two matrices are identical. I've added an operator to the matrix class in the matrix_unit_test branch of my fork, but it hasn't been written for all types yet. Therefore, assigning to me.

!1 returns 1

>!1
1
>a = 1
NO RETURN VALUE
>!a
0

!1 should return 0

Remove xcode directory

Is there a reason why we have the xcode/ directory in git?
I was under the impression that we generate an Xcode project with cmake.

I'd like to remove it.

Need a None object

Need to define a None object in HyPhy. It already implicitly exists (e.g. anything that returns an instance of the base _MathObject class is effectively a None). What is missing is a language constant None.

massive mac gui fail to compile

I just pulled the master in an attempt to keep my BGM development current, but now I can't compile any Mac targets. I get hundreds of errors such as:

/Users/apoon/git/hyphy/src/gui/mac/HYPlatformComponent.cpp:174:0 /Users/apoon/git/hyphy/src/gui/mac/HYPlatformComponent.cpp:174: error: cannot convert 'GtkWidget*' to 'OpaqueControlRef*' for argument '1' to 'void SetControl32BitMinimum(OpaqueControlRef*, SInt32)'

sigh.

Columns() returns incorrect dimension for Formula string matrix

mixedBN = {};

mixedBN["G"] = {};

(mixedBN["G"])["CPDFs"] = {{
"Random({{-2.0, 1.0}}, {"PDF":"Gaussian", "ARG0":Inverse({{1,0},{0,1}}) * (Random({{1.0}}, {"PDF":"InverseWishart", "ARG0":{{1.0}} }))[0] }) ",
"Random({{2.0, -1.0}}, {"PDF":"Gaussian", "ARG0":Inverse({{1,0},{0,1}}) * (Random({{1.0}}, {"PDF":"InverseWishart", "ARG0":{{1.0}} }))[0] }) ",
}};

Columns((mixedBN["G"])["CPDFs"])
3

((mixedBN["G"])["CPDFs"])[0]
Random({{-2.0, 1.0}}, {"PDF":"Gaussian", "ARG0":Inverse({{1,0},{0,1}}) * (Random({{1.0}}, {"PDF":"InverseWishart", "ARG0":{{1.0}} }))[0] })

((mixedBN["G"])["CPDFs"])[1]
Random({{2.0, -1.0}}, {"PDF":"Gaussian", "ARG0":Inverse({{1,0},{0,1}}) * (Random({{1.0}}, {"PDF":"InverseWishart", "ARG0":{{1.0}} }))[0] })

((mixedBN["G"])["CPDFs"])[0]
Random({{-2.0, 1.0}}, {"PDF":"Gaussian", "ARG0":Inverse({{1,0},{0,1}}) * (Random({{1.0}}, {"PDF":"InverseWishart", "ARG0":{{1.0}} }))[0] })

((mixedBN["G"])["CPDFs"])[2]
**** CRASH ****

at _Matrix::MAccess line 4872:
return (_PMathObj)entryFla->Compute()->makeDynamic();

Buggy OpenMP in _LikelihoodFunction::ComputeBlock()

Whilst running the 454 UDS pipeline, we get this with MP2, but not with DEBUG. Interesting, eh?

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x00000009045f5740
[Switching to process 25604 thread 0x1903]
0x00000001002c4cb1 in _LikelihoodFunction::ComputeBlock ()
(gdb) bt
#0 0x00000001002c4cb1 in _LikelihoodFunction::ComputeBlock ()
#1 0x0000000100002da3 in gomp_thread_start ()
#2 0x00007fff8ca318bf in _pthread_start ()
#3 0x00007fff8ca34b75 in thread_start ()

Nicer AStyle

See if AStyle can be run with nicer settings.

Standardize

Standardize the Batch Language in order to follow operators that are widely accepted paradigms

installing hyphy on redhat Enterprise Linux 6.2

Hi,

I tried to install HyPhy on RHEL 6.2 but encountered the following errors when I try to compile the code:

make MP2
Scanning dependencies of target HYPHYMP
[ 0%] Building CXX object CMakeFiles/HYPHYMP.dir/src/core/alignment.cpp.o
[ 6%] Building CXX object CMakeFiles/HYPHYMP.dir/src/core/calcnode.cpp.o
[ 6%] Building CXX object CMakeFiles/HYPHYMP.dir/src/core/avllistx.cpp.o
[ 6%] Building CXX object CMakeFiles/HYPHYMP.dir/src/core/avllist.cpp.o
[ 12%] Building CXX object CMakeFiles/HYPHYMP.dir/src/core/likefunc.cpp.o
[ 12%] Building CXX object CMakeFiles/HYPHYMP.dir/src/core/batchlanhelpers.cpp.o
[ 12%] Building CXX object CMakeFiles/HYPHYMP.dir/src/core/category.cpp.o
[ 18%] Building CXX object CMakeFiles/HYPHYMP.dir/src/core/operation.cpp.o
[ 18%] Building CXX object CMakeFiles/HYPHYMP.dir/src/core/formula.cpp.o
[ 18%] Building CXX object CMakeFiles/HYPHYMP.dir/src/core/constant.cpp.o
[ 25%] Building CXX object CMakeFiles/HYPHYMP.dir/src/core/matrix.cpp.o
/usr/local/packages/hyphy/veg-hyphy-8b48fc7/src/core/matrix.cpp: In member function ‘void _Matrix::Multiply(_Matrix&, _Matrix&)’:
/usr/local/packages/hyphy/veg-hyphy-8b48fc7/src/core/matrix.cpp:3680: error: ‘secondArg’ not specified in enclosing parallel
/usr/local/packages/hyphy/veg-hyphy-8b48fc7/src/core/matrix.cpp:3679: error: enclosing parallel
/usr/local/packages/hyphy/veg-hyphy-8b48fc7/src/core/matrix.cpp:3702: error: ‘storage’ not specified in enclosing parallel
/usr/local/packages/hyphy/veg-hyphy-8b48fc7/src/core/matrix.cpp:3679: error: enclosing parallel
make[3]: *** [CMakeFiles/HYPHYMP.dir/src/core/matrix.cpp.o] Error 1
make[2]: *** [CMakeFiles/HYPHYMP.dir/all] Error 2
make[1]: *** [CMakeFiles/MP2.dir/rule] Error 2
make: *** [MP2] Error 2

Any suggestion on the problem is greatly appreciated.

Will

Create a text console

Features should include:

  • History
  • Tab Completion
  • Scrolling
  • Proper error output?
  • Copy
  • Multi-line paste execution
  • HyPhy version and initial diagnostics.
  • Citation reminder
  • Tutorial

BGM crashing HyPhy on exit

When there is only one BGM (now BayesianGraphicalModel) object tracked in the _List object bgmList and the user attempts to overwrite it with a new instance, deletion of the original BGM object causes HyPhy to call the _List destructor and crash. Similarly, when quitting HyPhy, the BGM object is released, which also calls the _List destructor and causes a crash.

Memory leaks

I'm accumulating massive amounts of memory leaks in low-level stuff like ExecuteFormula() and MemReallocate(), apparently from running AlignSequences() within a loop.
Will try to track down today.
0- Art.

Uninformative error message

function testFunction () {
    fprintf (stdout, "Hello, this is a test\n");
    return 0;
}

testFunction = 0;

It should indeed not be possible now to assign a value to a function, but the error message simply states

Can't assign like this

which makes debugging hard.

Buggy HBL parser 2

Expressions that use inline object definitions (e.g. matrices, dictionaries, substituted strings) are evaluated once and not updated if the inline object definitions change between calls.

For example

for (k = 0; k < 10; k+=1)
{
    str = ""+k;
    substitutedString = "`str`-substituted";
    fprintf (stdout, substitutedString, "\n");
}

Prints 0-substituted 10 times.

Location for doxygen

I've been considering where to put doxygen generated pages.

Even if they are mostly just stubs at the moment, it would be useful to have this available for easy navigation and representation of the codebase, and it is relatively little effort to setup.

We can either put this on the hyphy.org domain, or we can just make this as part of the github pages.

The url for the github pages would look like this:

http://veg.github.com/hyphy/

Using this method:
https://gist.github.com/825950

This can be assigned back to me once a decision has been made.

[Bug][GUI] File opening

When opening a file, the default is "All Readable Documents".

However, this results in grayed out files that are perfectly executable.

On Mac OSX 10.7

Reproduce:
Go to File->Open->Open Batch File
Navigate to proper bf file.
Notice unselectable.
Enable must be changed to "All Documents"

Matrix >= -1

{{0,7,11,14}{7,0,6,9}{11,6,0,7}{14,9,7,0}}>=-1

Will cause a segfault. Perhaps it should fail gracefully?

Need a DEBUG build...

We should have a make option which turns off optimizations and uses '-g' flags to run in gdb and valgrind.

Buggy HBL parser 1

In HBL, an assignment like this

a = "" + {{1,2}} + "2";

barfs with error

Attempting to operate on an undefined value; this is probably a result of an earlier 'soft' error condition

Associative List multiplication.

Assignment of second factor variable after two associative lists being multiplied causes crash

> a = {"key":"value"};
> b = {"key2":"value2", "key3":"value3"};
> a*b;
> b

Will cause a segfault.

[Bug][HBL] Modulo operation on Matrix

In HBL, an operation like this

a = {{1,2,3,4,5}{5,4,3,2,1}}
b = {{1,2,3,4,5}}
a%b

Will cause a segmentation fault. Perhaps it should fail gracefully.

Path-related characters in file names

HyPhy doesn't correctly handle paths which contain characters that can be interpreted as directory delimiters on any platform, e.g. a path containing the '\' character should keep that character on a Mac/Linux platform, instead of treating it as a directory separator.

More generally, ProcessFileName needs to be updated to handle escape characters, etc.

Function modifies matrix parameter passed by value

A call to proposalDistribution modifies m because of the [] operation.

m = {100,1}["1/100"];
nm = proposalDistribution (m, 100, 50);

assert (Abs(m-nm)>0);

//------------------------------------------------------------------------------------------------//

function proposalDistribution (proposed_weights, dim, sites) {
    idx    = Random (0, dim)$1;
    idx2   = Random (0, dim)$1;
    while (idx == idx2) {
        idx2   = Random (0, dim)$1;     
    }
    change = Random (0,1/sites);

    proposed_weights[idx]  += (-change);
    proposed_weights[idx2] += change;

    return proposed_weights;
}

gcc version check needed

The default cmake compiler flag (-flto) is not supported by gcc4.2 on OS X 10.6. Perhaps we should do a gcc version check?

"Run" in Xcode does not properly work.

If I build and try to run a debug version of iHyphyDebug, I get the error that "No executable file" is specified.
I'm guessing Cmake has to be updated somewhere to resolve the issue.

Random number generator seeding

The current seeding mechanism (seconds since epoch start) fails when multiple processes are launched at the same time (e.g. MPI), because they will all generate the same stream of numbers. This can be solved by using a more robust seed (e.g. from /dev/random, or something unique per process, such as a combination of time, process ID, and memory available to the process).
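The idea can be illustrated outside of HyPhy with a short shell sketch. This is not HyPhy's implementation: the function names mix_seed and urandom_seed are hypothetical, the multiplier is an arbitrary odd constant, and the sketch reads /dev/urandom rather than /dev/random to avoid blocking.

```shell
# Combine a timestamp ($1) and a process ID ($2) into a 32-bit seed.
# Multiplying the PID by an odd constant spreads it across all 32 bits,
# so two processes started in the same second get different seeds.
mix_seed() {
    echo $(( ($1 ^ ($2 * 2654435761)) & 0xFFFFFFFF ))
}

# Alternative: read 4 bytes from the kernel's random source.
urandom_seed() {
    od -An -N4 -tu4 /dev/urandom | tr -d ' \n'
}

seed=$(mix_seed "$(date +%s)" "$$")
echo "seed: $seed"
```

The time and PID mix is reproducible when both inputs are logged, which helps when a run has to be replayed; the /dev/urandom variant gives up that property in exchange for not depending on either value.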

[HBL] Document every command.

Create a wiki page for each command.
Document to best of ability.
If needs further work, mark the category as TODO.
Properly categorize each command.

Create mock-up of new design

Before doing this, I need to define:
All current windows that can be displayed.
Those that are menu items, and those that are prompts from the Batch Language.
