python-panther's Introduction

PANTHER_API

Use the PANTHER system to run statistical overrepresentation tests from Python.

See http://pantherdb.org/help/PANTHERhelp.jsp#V. for the PANTHER webservices instructions.

Documentation

Requires:

  • pandas
  • requests
  • bs4
  • html5lib

Command-line usage:

usage: panther_api.py [-h] [--organism ORGANISM] [--test_type TEST_TYPE]
                      [--annotation_option ANNOTATION_OPTION]
                      inputfile outputfile

PANTHER overrepresentation test of a gene list.

positional arguments:
  inputfile             Gene list file. One ID per line.
  outputfile            File to save results

optional arguments:
  -h, --help            show this help message and exit
  --organism ORGANISM   Organism for reference/background
  --test_type TEST_TYPE
                        One of FISHER or BINOMIAL
  --annotation_option ANNOTATION_OPTION
                        Annotation option, see table below
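
The input file is a plain list with one gene ID per line. A minimal sketch of writing and reading such a file is given here, assuming that blank lines and surrounding whitespace should be ignored. The script's own parsing is not shown in this README, and `read_gene_list` and `write_gene_list` are hypothetical helper names.

```python
from pathlib import Path


def read_gene_list(path):
    """Return the gene IDs in `path`, one per line, skipping blank lines."""
    ids = []
    for line in Path(path).read_text().splitlines():
        gene_id = line.strip()
        if gene_id:  # assumption: empty lines carry no ID and are ignored
            ids.append(gene_id)
    return ids


def write_gene_list(path, gene_ids):
    """Write gene IDs to `path`, one ID per line."""
    Path(path).write_text("\n".join(gene_ids) + "\n")
```

A file produced by `write_gene_list` has the shape expected for the `inputfile` argument.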

Annotation options

Option          Annotation Dataset
pathway         PANTHER Pathways
panther_mf      PANTHER GO-Slim Molecular Function
panther_bp      PANTHER GO-Slim Biological Process
panther_cc      PANTHER GO-Slim Cellular Component
panther_pc      PANTHER Protein Class
fullgo_mf_comp  GO molecular function complete
fullgo_bp_comp  GO biological process complete
fullgo_cc_comp  GO cellular component complete
reactome        Reactome pathways
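
The option names above can be checked before a request is sent. The following is a minimal sketch that copies the table into a dictionary. `ANNOTATION_OPTIONS` and `describe_annotation_option` are hypothetical names and are not taken from panther_api.py.

```python
# Option names and dataset labels copied from the table above.
ANNOTATION_OPTIONS = {
    "pathway": "PANTHER Pathways",
    "panther_mf": "PANTHER GO-Slim Molecular Function",
    "panther_bp": "PANTHER GO-Slim Biological Process",
    "panther_cc": "PANTHER GO-Slim Cellular Component",
    "panther_pc": "PANTHER Protein Class",
    "fullgo_mf_comp": "GO molecular function complete",
    "fullgo_bp_comp": "GO biological process complete",
    "fullgo_cc_comp": "GO cellular component complete",
    "reactome": "Reactome pathways",
}


def describe_annotation_option(option):
    """Return the dataset label for `option`, or raise ValueError if unknown."""
    try:
        return ANNOTATION_OPTIONS[option]
    except KeyError:
        valid = ", ".join(sorted(ANNOTATION_OPTIONS))
        raise ValueError(
            f"Unknown annotation option {option!r}; expected one of: {valid}"
        ) from None
```

Failing early with the list of valid names avoids spending a web service call on a mistyped option.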

Examples

./panther_api.py example_0.txt example_0_panther.txt

Reference size:         21042
Number IDs mapped:      66
Number IDs not mapped:  4

Output:

            name                            # in reference  # in list   # expected in list  fold_enrichment direction   pvalue      FDR
GO:0000902  cell morphogenesis              696             11          2.22                4.95            +           1.31E-05    8.60E-03
GO:0001701  in utero embryonic development  347             7           1.11                6.32            +           1.31E-04    4.58E-02
GO:0001890  placenta development            151             5           .48                 10.38           +           1.39E-04    4.78E-02
GO:0001892  embryonic placenta development  87              5           .28                 18.01           +           1.11E-05    7.63E-03
GO:0002009  morphogenesis of an epithelium  420             11          1.34                8.21            +           1.06E-07    2.80E-04
GO:0002064  epithelial cell development     186             6           .59                 10.11           +           3.34E-05    1.76E-02
GO:0003382  epithelial cell morphogenesis   35              4           .11                 35.81           +           7.07E-06    6.57E-03
GO:0007043  cell-cell junction assembly     104             5           .33                 15.07           +           2.53E-05    1.48E-02
GO:0007155  cell adhesion                   916             13          2.92                4.45            +           6.10E-06    6.42E-03
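
The count columns are related in the usual way for an overrepresentation test: the expected count scales a term's frequency in the reference by the size of the gene list, fold enrichment is the observed count divided by the expected count, and the direction is + when the term is overrepresented. A minimal sketch of that arithmetic follows. The function names are hypothetical, and the values PANTHER reports may differ slightly from this calculation, for example because of rounding.

```python
def expected_count(n_in_reference, reference_size, list_size):
    """Expected number of list genes with a term, given its reference frequency."""
    return n_in_reference * list_size / reference_size


def fold_enrichment(n_in_list, expected):
    """Observed count divided by expected count."""
    return n_in_list / expected


def direction(n_in_list, expected):
    """'+' for overrepresentation, '-' otherwise."""
    return "+" if n_in_list > expected else "-"


# First row of the table: 11 genes observed, 2.22 expected.
print(round(fold_enrichment(11, 2.22), 2))  # about 4.95, as reported above
print(direction(11, 2.22))                  # +
```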

./panther_api.py example_1.txt example_1_panther.txt

Reference size:         21042
Number IDs mapped:      21
Number IDs not mapped:  2
No statistically significant results.

No output.

clusters_to_panther.py

This script takes as its argument an output file generated by a clustering algorithm (i.e. one line per cluster, tab-delimited genes), generates the necessary gene list files, and calls the main panther_api.py function on each of them.

Instructions

usage: clusters_to_panther.py [-h] [--remove_version] [--organism ORGANISM]
                                [--test_type TEST_TYPE]
                                [--annotation_option ANNOTATION_OPTION]
                                [--min_size MIN_SIZE]
                                [--start_cluster START_CLUSTER]
                                inputfile outputprefix

PANTHER overenrichment tests on the output of a clustering algorithm.

positional arguments:
  inputfile             Paraclique file. One cluster per line, tab seperated
                        gene IDs.
  outputprefix          Prefix to give to output gene lists and enrichment
                        result files. Output will be saved to
                        OUTPUTPREFIX_cluster_<num>.txt and
                        OUTPUTPREFIX_cluster_<num>.panther

optional arguments:
  -h, --help            show this help message and exit
  --remove_version      Whether to remove version number from gene IDs
  --organism ORGANISM   Organism for reference/background.
  --test_type TEST_TYPE
                        One of FISHER or BINOMIAL
  --annotation_option ANNOTATION_OPTION
                        Annotation option, see table in code
  --min_size MIN_SIZE   Minimum cluster size to test
  --start_cluster START_CLUSTER
                        Line in file to start at (from 0)
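
The gene-list step can be sketched as follows. This is not the script's own code. It assumes that a version number is a suffix after the last "." in a gene ID, that `--min_size` is inclusive, and that clusters are numbered by their line in the file, starting from 0. `strip_version` and `split_clusters` are hypothetical names, and the default values are placeholders.

```python
def strip_version(gene_id):
    """Drop a trailing '.<version>' suffix (assumed format), e.g. 'ID.2' -> 'ID'."""
    return gene_id.rsplit(".", 1)[0]


def split_clusters(inputfile, outputprefix, min_size=0, start_cluster=0,
                   remove_version=False):
    """Write one gene list file per cluster and return the paths written."""
    written = []
    with open(inputfile) as handle:
        for num, line in enumerate(handle):
            # One cluster per line, gene IDs separated by tabs.
            genes = [g for g in line.rstrip("\n").split("\t") if g]
            if num < start_cluster or len(genes) < min_size:
                continue
            if remove_version:
                genes = [strip_version(g) for g in genes]
            path = f"{outputprefix}_cluster_{num}.txt"
            with open(path, "w") as out:
                out.write("\n".join(genes) + "\n")
            written.append(path)
    return written
```

Each file written this way has one ID per line, which is the input format panther_api.py expects.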

Examples

./clusters_to_panther.py --remove_version --min_size 8 cluster_example.txt cluster_example.out

Cluster 0
Reference size:         21042
Number IDs mapped:      66
Number IDs not mapped:  4
-----------------------------

Cluster 1
Reference size:         21042
Number IDs mapped:      21
Number IDs not mapped:  2
No statistically significant results.
No results to save
-----------------------------

Cluster 2
Reference size:         21042
Number IDs mapped:      20
Number IDs not mapped:  1
-----------------------------

If a "Session Exceeded" error occurs, you can restart from the last cluster tested by passing the --start_cluster option.
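
If the console output of an interrupted run was saved, the cluster to restart from can be read from the "Cluster <num>" headers it contains. A minimal sketch, assuming the output was captured as text in the format shown in the example above; `last_cluster_in_log` and `restart_command` are hypothetical helpers:

```python
import re


def last_cluster_in_log(log_text):
    """Return the number in the last 'Cluster <num>' header, or None if absent."""
    numbers = re.findall(r"^Cluster (\d+)", log_text, flags=re.MULTILINE)
    return int(numbers[-1]) if numbers else None


def restart_command(log_text, inputfile, outputprefix):
    """Build the argument list that reruns from the last cluster seen in the log."""
    last = last_cluster_in_log(log_text)
    start = 0 if last is None else last
    return ["./clusters_to_panther.py", "--start_cluster", str(start),
            inputfile, outputprefix]
```

Any other options used in the original run, such as --min_size or --remove_version, would need to be added to the returned list so the rerun matches it.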

python-panther's People

Contributors

carissableker
