dynamictable's Introduction

MapTracker Graph Database

MapTracker is a massive graph database: over a terabyte on disk, with 1.2B nodes, 2.0B edges, and 3.5B metadata assignments. It is used at BMS to "resolve X to Y": given an object of "type X", find, in a qualitative way, all "related" objects of "type Y". This is done using an aggressively normalized triple store and a large set of rules that dictate which kinds of edges are reasonable to traverse when going from X to Y.

MapTracker is generally not used "on its own"; rather, it is a component in other tools. Examples available here are:

  • Chem-Bio Hopper - "Hop" from biology to chemistry, or vice-versa, using published chemical activities
  • Hypergeometric Affy - Given a set of "interesting" (generally overexpressed) Affymetrix probesets, run Fisher's Exact Test to identify ontologies that appear overrepresented in the set.
  • Standardize Gene - Given a set of gene identifiers (e.g. symbols), attempt to determine what they "really are" (i.e. given messy gene symbols, convert them to rigorous gene accessions)
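
The over-representation test mentioned for Hypergeometric Affy can be illustrated with a small sketch. This is not the tool's own code: it is a minimal, standard-library version of a one-sided Fisher's Exact Test (the upper tail of the hypergeometric distribution), and the probeset IDs, the term name, and both function names are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(k, n, K, N):
    """One-sided Fisher's Exact Test p-value, P(X >= k).

    N: probesets in the background, K: background probesets carrying the term,
    n: probesets in the "interesting" set, k: interesting probesets carrying the term.
    """
    total = comb(N, n)
    # Sum the hypergeometric probabilities for every outcome at least as extreme as k.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total

def enriched_terms(selected, background, annotations):
    """Return {term: p-value} for each ontology term in `annotations`."""
    selected = selected & background
    results = {}
    for term, members in annotations.items():
        members = members & background
        k = len(selected & members)
        results[term] = hypergeom_enrichment_p(k, len(selected), len(members), len(background))
    return results

# Hypothetical toy data: 20 probesets, 5 annotated with one term, 4 selected.
BACKGROUND = {f"ps{i}" for i in range(20)}
ANNOTATIONS = {"termA": {"ps0", "ps1", "ps2", "ps3", "ps4"}}
SELECTED = {"ps0", "ps1", "ps2", "ps10"}
P_VALUES = enriched_terms(SELECTED, BACKGROUND, ANNOTATIONS)
```

A small p-value for a term suggests it appears in the selected set more often than chance alone would explain. A real analysis over many terms would also need a multiple-testing correction, which this sketch omits.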

The schema (tables) is relatively simple. What has made MapTracker particularly powerful is:

  • Careful normalization of loaded data
  • Segregation of nodes into namespaces, which ameliorates collisions, particularly with identifiers like gene symbols
  • Exhaustive logic defining valid connections from X to Y (for example, RNA to probeset)
  • Generic transitive logic that lets X-to-Y be automatically merged with Y-to-Q and Q-to-W in order to find X-to-W. Such "chains" mean that only fundamental connections need to be defined, yet the network can be (safely, rationally) explored far beyond its expected "neighbors"
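
The namespace and chaining ideas in the list above can be sketched in a few lines. This is an illustrative toy, not MapTracker's rule engine or schema: the namespaces, identifiers, `ALLOWED_HOPS` table, and function names are all assumptions made for the example. Nodes are (namespace, identifier) pairs, only "fundamental" hops between namespaces are declared, and longer X-to-W resolutions are derived by chaining those hops.

```python
from collections import defaultdict, deque

# Hypothetical fundamental connections between namespaces (directional).
ALLOWED_HOPS = {("Symbol", "Gene"), ("Gene", "RNA"), ("RNA", "Probeset")}

# Hypothetical toy edges; each node is a (namespace, identifier) pair.
EDGES = defaultdict(set)
for src, dst in [
    (("Symbol", "ABC1"), ("Gene", "G1")),
    (("Gene", "G1"), ("RNA", "R1")),
    (("Gene", "G1"), ("RNA", "R2")),
    (("Gene", "G1"), ("Chemical", "C9")),  # exists, but not on any allowed chain below
    (("RNA", "R1"), ("Probeset", "P1")),
    (("RNA", "R2"), ("Probeset", "P2")),
]:
    EDGES[src].add(dst)

def namespace_path(start_ns, goal_ns):
    """Breadth-first search over ALLOWED_HOPS for the shortest namespace chain."""
    queue = deque([[start_ns]])
    seen = {start_ns}
    while queue:
        path = queue.popleft()
        if path[-1] == goal_ns:
            return path
        for a, b in sorted(ALLOWED_HOPS):
            if a == path[-1] and b not in seen:
                seen.add(b)
                queue.append(path + [b])
    return None  # no chain of allowed hops connects the two namespaces

def resolve(node, goal_ns):
    """Follow edges hop by hop along the namespace chain; return the reachable nodes."""
    path = namespace_path(node[0], goal_ns)
    if path is None:
        return set()
    frontier = {node}
    for next_ns in path[1:]:
        # Keep only targets in the namespace the current hop is allowed to reach.
        frontier = {dst for src in frontier for dst in EDGES[src] if dst[0] == next_ns}
    return frontier

PROBESETS = resolve(("Symbol", "ABC1"), "Probeset")
```

Because the gene symbol lives in its own namespace, a symbol that happens to collide with some other identifier stays a distinct node. The Gene-to-Chemical edge is ignored here because no declared hop makes it part of a Symbol-to-Probeset chain.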

The image below is an auto-generated network, created by exploreSelf.pl from a sample of 20,000 random edges from the database. It represents, at a high level, the common node-edge-node triples held by the database.

[Image: Network overview]

All edges are part of a controlled vocabulary. Most (though not all) are directional. The edges in the above sample include:

[Image: Edge overview]
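
The sampling approach behind the network overview can be approximated as follows. This is a rough sketch of the general idea only, not a port of exploreSelf.pl: the edge-type names, the toy edge list, and the `summarize_triples` function are hypothetical. It draws a random sample of edges and counts how often each (source namespace, edge type, target namespace) triple occurs.

```python
import random
from collections import Counter

def summarize_triples(edges, sample_size, seed=0):
    """Count namespace-level triples in a random sample of edges.

    `edges` is a list of ((src_ns, src_id), edge_type, (dst_ns, dst_id)) tuples.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same sample
    sample = rng.sample(edges, min(sample_size, len(edges)))
    return Counter(
        (src_ns, edge_type, dst_ns)
        for (src_ns, _), edge_type, (dst_ns, _) in sample
    )

# Hypothetical toy edges with made-up edge-type names.
TOY_EDGES = [
    (("Symbol", "ABC1"), "names", ("Gene", "G1")),
    (("Gene", "G1"), "has_transcript", ("RNA", "R1")),
    (("Gene", "G1"), "has_transcript", ("RNA", "R2")),
    (("RNA", "R1"), "probed_by", ("Probeset", "P1")),
    (("RNA", "R2"), "probed_by", ("Probeset", "P2")),
]
SUMMARY = summarize_triples(TOY_EDGES, 20000)
```

The resulting counts could serve as edge weights in a namespace-level overview graph, with one node per namespace and one edge per distinct triple.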
