Genes for Project Cognoma

This repository creates the set of genes to be used in Project Cognoma. The human subset of Entrez Gene is the basis of Cognoma genes. All genes in Cognoma should be converted to Entrez GeneIDs (using a preferred variable name of entrez_gene_id).

When encountering genes in Project Cognoma, identify which of the following approaches should be applied:

  • If the input genes are only in symbols, open an issue to discuss mapping options.
  • If the input genes contain chromosome and symbol information, use chromosome-symbol-map.tsv to map the genes to Entrez GeneIDs.
  • If the genes are already encoded as Entrez GeneIDs, update the GeneIDs to their most recent versions using updater.tsv, then remove any GeneIDs that are not in genes.tsv.
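
The second and third approaches can be sketched in Python as follows. This is a minimal illustration, not the project's own code: the column names (`old_entrez_gene_id`, `new_entrez_gene_id`, `entrez_gene_id`, `chromosome`, `symbol`) and the helper names are assumptions, so check them against the actual headers of updater.tsv, genes.tsv, and chromosome-symbol-map.tsv before use.

```python
import csv
import io


def read_tsv(handle):
    """Read a tab-separated file handle into a list of dict rows."""
    return list(csv.DictReader(handle, delimiter="\t"))


def map_chromosome_symbol(pairs, map_rows):
    """Map (chromosome, symbol) pairs to Entrez GeneIDs.

    Assumes map_rows has 'chromosome', 'symbol', and 'entrez_gene_id'
    columns (hypothetical names). Unmapped pairs yield None.
    """
    lookup = {
        (row["chromosome"], row["symbol"]): row["entrez_gene_id"]
        for row in map_rows
    }
    return [lookup.get((str(chrom), symbol)) for chrom, symbol in pairs]


def update_gene_ids(gene_ids, updater_rows, gene_rows):
    """Update outdated GeneIDs, then drop IDs that are absent from genes.tsv.

    Assumes updater_rows has 'old_entrez_gene_id' and 'new_entrez_gene_id'
    columns and gene_rows has an 'entrez_gene_id' column (hypothetical names).
    """
    old_to_new = {
        row["old_entrez_gene_id"]: row["new_entrez_gene_id"]
        for row in updater_rows
    }
    valid = {row["entrez_gene_id"] for row in gene_rows}
    seen = set()
    updated = []
    for gene_id in gene_ids:
        # IDs are compared as strings because csv yields string values.
        current = old_to_new.get(str(gene_id), str(gene_id))
        # Keep each valid GeneID once, preserving input order.
        if current in valid and current not in seen:
            seen.add(current)
            updated.append(current)
    return updated
```

With files on disk, the rows would come from something like `read_tsv(open("data/updater.tsv"))`. Dropping duplicates after the update is a design choice made here because several retired GeneIDs may point to the same current GeneID; skip that step if repeated rows are meaningful in your data.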

Downloads and data

The raw (downloaded) data is stored in the download directory. versions.json contains timestamps for the raw data. The raw data is tracked in this repository because the Entrez Gene FTP site neither versions nor archives its files.

Created data is stored in the data directory. Applications should use the processed data rather than the raw data, if possible. Applications are strongly encouraged to use versioned (commit-hash-containing) links when accessing data from this repository.
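
As a rough sketch of what a versioned link could look like, the helper below builds a commit-pinned raw-file URL. The owner and repository names (`cognoma`, `genes`), the GitHub raw-content host, and the function name are assumptions for illustration; the commit hash must be supplied by the caller.

```python
import re


def versioned_url(commit, path, owner="cognoma", repo="genes"):
    """Build a commit-pinned URL for a file in the repository.

    The default owner/repo values are assumptions. The commit must be a
    hexadecimal hash (7 to 40 characters), so mutable branch names such
    as 'master' are rejected.
    """
    if not re.fullmatch(r"[0-9a-f]{7,40}", commit):
        raise ValueError(f"not a commit hash: {commit!r}")
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{commit}/{path}"
```

Pinning to a commit hash means the file behind the link cannot change when the repository is updated, which keeps downstream analyses reproducible.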

Execution

Use the following commands to run the analysis, inside the environment specified by environment.yml:

# To run the entire analysis
python 1.download.py
python 2.process.py

# To run just the data processing
python 2.process.py

In general, we don't anticipate redownloading the data frequently. If you submit a pull request to create additional datasets, please do not execute 1.download.py.
