enhancer-snakemake-demo's Introduction

Overview

This is a Snakefile demo that identifies candidate enhancer regions in mouse embryonic liver using data from PMID: 22763441. The method used here differs from the one the original authors used to identify enhancers.

The only input needed is an internet connection, since the data files are downloaded from GEO. The output is a set of 2-, 3-, 4-, and 5-state models from ChromHMM.

Prepare

  • Install Snakemake: https://bitbucket.org/johanneskoester/snakemake. Snakemake blends the best of Bash, Python, and Makefiles.
  • Download ChromHMM from http://compbio.mit.edu/ChromHMM/.
  • Edit config.py to point to the paths on disk.
  • Prepare the BED files you'd like to check against chromatin states by adding them to the compare/links directory. The directory already contains some related ENCODE data (created by running the get-data.py script) and some positive mouse enhancers from enhancer.lbl.gov (created by running enhancer.lbl.gov.py). These files are small enough to include in the repo, so the Snakefile does not download them.
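
As a rough illustration of the config.py step, the sketch below shows one way such a file could be laid out, together with a check that each configured path exists. The setting names (`CHROMHMM_JAR`, `COMPARE_DIR`) and the placeholder paths are hypothetical; the actual config.py in the repo may use different names.

```python
import os

# Hypothetical settings: these names and paths are placeholders for
# illustration, not the ones used by the repo's actual config.py.
CONFIG = {
    "CHROMHMM_JAR": "/path/to/ChromHMM/ChromHMM.jar",
    "COMPARE_DIR": "compare/links",
}


def missing_paths(config):
    """Return the config keys whose paths do not exist on disk."""
    return sorted(key for key, path in config.items() if not os.path.exists(path))


if __name__ == "__main__":
    for key in missing_paths(CONFIG):
        print(f"{key} points to a missing path: {CONFIG[key]}")
```

Running a check like this before the pipeline can catch a wrong path early, instead of partway through a long run.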

Run

  • Run snakemake -npr as a dry-run to see what will be run.
  • Run snakemake -pr -j$N, where $N is the number of CPUs, to run the pipeline.
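
The two steps above can also be driven from Python. This is a minimal sketch, assuming `snakemake` is on the PATH and that the installed release accepts the flags quoted above; the helper names `snakemake_command` and `run_pipeline` are made up for this example.

```python
import os
import subprocess


def snakemake_command(dry_run, jobs=None):
    """Build the snakemake command line described in the Run steps."""
    if dry_run:
        # -n: dry run, -p: print shell commands, -r: print the reason for each job
        return ["snakemake", "-npr"]
    # Default to the number of CPUs reported by the OS, falling back to 1.
    jobs = jobs or os.cpu_count() or 1
    return ["snakemake", "-pr", f"-j{jobs}"]


def run_pipeline():
    """Dry-run first; start the real run only if the dry run succeeds."""
    subprocess.run(snakemake_command(dry_run=True), check=True)
    subprocess.run(snakemake_command(dry_run=False), check=True)
```

Calling `run_pipeline()` from the directory containing the Snakefile would perform both steps; `check=True` makes a failed dry run raise before any real work starts.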

Output

For each number of states $s, see:

  • output/$s-state/webpage_$s.html for the states,
  • output/$s-state/*_enrichment.png for the enrichment with supplied BED files,
  • output/$s-state/*_dense.bed for a BED file to upload to UCSC.
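
To check which of these files a run has produced, something like the following sketch could be used. It assumes the default `output` directory and that the dense BED files sit in the same `$s-state` directory as the other outputs; the function names are hypothetical.

```python
import glob
import os

STATE_COUNTS = (2, 3, 4, 5)  # the model sizes named in the Overview


def output_patterns(s, root="output"):
    """Return the expected output path or glob pattern per kind for an s-state model."""
    state_dir = os.path.join(root, f"{s}-state")
    return {
        "webpage": os.path.join(state_dir, f"webpage_{s}.html"),
        "enrichment": os.path.join(state_dir, "*_enrichment.png"),
        # Assumption: dense BED files live alongside the other outputs.
        "dense_bed": os.path.join(state_dir, "*_dense.bed"),
    }


def find_outputs(root="output"):
    """Map each state count to the output files that currently exist on disk."""
    found = {}
    for s in STATE_COUNTS:
        found[s] = {
            kind: sorted(glob.glob(pattern))
            for kind, pattern in output_patterns(s, root).items()
        }
    return found
```

An empty list for any kind in `find_outputs()` suggests that part of the pipeline has not finished for that model size.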

For example, state 2 in this 4-state model has the strongest emission parameters for marks we'd expect over enhancers, and the strongest enrichment for other factors we'd expect to be at or near enhancers.

image

image

enhancer-snakemake-demo's People

Contributors

daler

enhancer-snakemake-demo's Issues

Only getting half the data from enhancer.lbl.gov

It seems as though you are only grabbing the coordinates in the FASTA file that are annotated as Mouse. However, this only captures about half of the data. The enhancer.lbl.gov maintainers do provide the corresponding human and mouse coordinates on the web interface, but not in the FASTA file. I would suggest grabbing the human data and doing a liftOver to map human coordinates to mouse in order to get all of the tested enhancers.
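
The first step of that suggestion, separating human from mouse entries, might look like the sketch below. It assumes the FASTA headers follow a `>Species|chrom:start-end | ...` layout, which is an assumption about the download format rather than something confirmed here. Coordinates are kept exactly as written in the header, and the liftOver step itself is not shown.

```python
import re
from collections import defaultdict

# Assumed header layout: ">Species|chrom:start-end | element N | ..."
HEADER_RE = re.compile(
    r"^>(?P<species>[^|]+)\|(?P<chrom>[^:\s]+):(?P<start>\d+)-(?P<end>\d+)"
)


def regions_by_species(fasta_lines):
    """Group (chrom, start, end) tuples from FASTA headers by species label."""
    regions = defaultdict(list)
    for line in fasta_lines:
        match = HEADER_RE.match(line)
        if match is None:
            continue  # sequence line, or a header in an unexpected layout
        regions[match["species"].strip()].append(
            (match["chrom"], int(match["start"]), int(match["end"]))
        )
    return dict(regions)
```

The regions collected under the human label would then be the ones to pass through a liftOver to mouse coordinates, while those under the mouse label can be used directly.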
