biomartpy

A simple interface for accessing BioMart from Python (Python -> rpy2 -> R's biomaRt -> pandas.DataFrame). It was originally written to build a lookup table mapping gene IDs to various attributes for downstream work.

Install from PyPI:

$ pip install biomartpy

Or from GitHub:

$ git clone git@github.com:daler/biomartpy.git
$ cd biomartpy
$ python setup.py develop

Choose a mart (use list_marts() to decide):

>>> mart_name = 'ensembl'

Choose a dataset (use list_datasets(mart_name) to decide):

>>> dataset = 'dmelanogaster_gene_ensembl'

Choose some attributes (use list_attributes(mart_name, dataset) to decide):

>>> attributes = ['flybase_gene_id', 'flybasename_gene', 'description']

Get a pandas.DataFrame as a lookup table, indexed by the first attribute in the provided list:

>>> df = make_lookup(mart_name, dataset, attributes=attributes)

Use .ix to extract rows:

>>> df.ix['FBgn0031209']
flybasename_gene                                                Ir21a
description         Ionotropic receptor 21a [Source:FlyBase gene n...
Name: FBgn0031209

>>> df.ix['FBgn0031209']['flybasename_gene']
'Ir21a'
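
Note that pandas deprecated .ix and removed it in version 1.0; label-based lookups on newer versions use .loc instead. The following is a minimal sketch of the same lookup with .loc. It assumes a small hand-built DataFrame standing in for the result of make_lookup (no BioMart query is made), with values copied from this README's examples:

```python
import pandas as pd

# Stand-in rows for a BioMart result; values are copied from this README.
rows = [
    {"flybase_gene_id": "FBgn0031208", "flybasename_gene": "CG11023"},
    {"flybase_gene_id": "FBgn0031209", "flybasename_gene": "Ir21a"},
]
attributes = ["flybase_gene_id", "flybasename_gene"]

# Index by the first attribute, mirroring the lookup-table layout described above.
df = pd.DataFrame(rows, columns=attributes).set_index(attributes[0])

# Label-based row lookup; returns a Series for a unique index label.
row = df.loc["FBgn0031209"]

# Row and column labels together return a single value.
name = df.loc["FBgn0031209", "flybasename_gene"]
print(name)
```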

Filters and values can be provided either the way R expects (filters is a list, and values is a list of lists with one list per filter) or as a more convenient dictionary. Here, only these IDs are retrieved, and only for chromosome 2L:

>>> filters = {
...     'flybase_gene_id': ['FBgn0031208', 'FBgn0002121', 'FBgn0031209', 'FBgn0051973'],
...     'chromosome_name': ['2L']}
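
To make the relationship between the two forms concrete, here is a rough sketch of how a filter dictionary corresponds to the R-style pair of a filter list and a list of lists of values. The helper split_filters is hypothetical and written only for illustration; this README does not show how biomartpy performs the conversion internally:

```python
def split_filters(filters):
    """Hypothetical helper: split {filter: [values]} into two parallel lists."""
    names = list(filters)  # dictionaries keep insertion order in Python 3.7+
    values = [list(filters[name]) for name in names]
    return names, values


filters = {
    "flybase_gene_id": ["FBgn0031208", "FBgn0002121", "FBgn0031209", "FBgn0051973"],
    "chromosome_name": ["2L"],
}

# filter_names[i] is the filter whose allowed values are filter_values[i].
filter_names, filter_values = split_filters(filters)
print(filter_names)
print(filter_values)
```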

Set up attributes (chromosome_name is included here to confirm that the results are correct, but attributes and filters do not have to match):

>>> attributes = ['flybase_gene_id', 'flybasename_gene', 'chromosome_name']

Get data:

>>> df = make_lookup(
...     mart_name=mart_name,
...     dataset=dataset,
...     attributes=attributes,
...     filters=filters)

Check results:

>>> df
                flybasename_gene chromosome_name
flybase_gene_id
FBgn0002121               l(2)gl              2L
FBgn0031208              CG11023              2L
FBgn0031209                Ir21a              2L
FBgn0051973                 Cda5              2L
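
The same check can be done programmatically. This is a minimal sketch that uses plain tuples copied from the printed table rather than a live BioMart query; the variable names are illustrative and not part of biomartpy:

```python
# IDs and chromosome requested through the filters in the example above.
requested_ids = {"FBgn0031208", "FBgn0002121", "FBgn0031209", "FBgn0051973"}
requested_chromosomes = {"2L"}

# Rows copied from the printed table:
# (flybase_gene_id, flybasename_gene, chromosome_name)
rows = [
    ("FBgn0002121", "l(2)gl", "2L"),
    ("FBgn0031208", "CG11023", "2L"),
    ("FBgn0031209", "Ir21a", "2L"),
    ("FBgn0051973", "Cda5", "2L"),
]

# Rows that fall outside the requested IDs or chromosomes.
unexpected = [
    r for r in rows
    if r[0] not in requested_ids or r[2] not in requested_chromosomes
]

# Requested IDs that did not come back at all.
missing = requested_ids - {r[0] for r in rows}

print("unexpected rows:", unexpected)
print("missing IDs:", sorted(missing))
```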
