
tcga_example's Introduction

A Complete Hands-on Guide to TCGA

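First, work through the tutorial series I published on the 生信技能树 platform to learn the basics of TCGA on your own. This takes at least 14 hours of sustained study. For the table of contents, see: TCGA basics portal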

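If you need video lectures, you are welcome to buy my NetEase Cloud Classroom course: https://study.163.com/course/introduction/1006067243.htm (please do not buy it unless you really need it; thank you for understanding)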

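Exploring TCGA data comes down to three basic needs:

  • Group samples by some indicator (for example, whether a given gene is mutated, or tumor stage) and compare how a gene of interest behaves (expression, mutation, methylation) across the groups.
  • Use statistical methods, such as survival analysis or differential analysis, to assess the importance of a gene of interest.
  • Examine the relationship between two genes of interest: correlation, regulation, or something else.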
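The first and third needs can be sketched in a few lines. The following is a minimal Python illustration, not code from this repository (which works in R): the sample records, the gene names `GENE_A` and `GENE_B`, and the helper functions are all hypothetical, and the values are made up rather than taken from TCGA.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean

# Toy, made-up values for illustration only (not TCGA data).
samples = [
    {"id": "S1", "stage": "I", "GENE_A": 2.0, "GENE_B": 1.0},
    {"id": "S2", "stage": "I", "GENE_A": 4.0, "GENE_B": 2.0},
    {"id": "S3", "stage": "III", "GENE_A": 6.0, "GENE_B": 3.0},
    {"id": "S4", "stage": "III", "GENE_A": 8.0, "GENE_B": 4.0},
]


def group_means(rows, group_key, gene):
    """Mean expression of `gene` within each level of `group_key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[gene])
    return {level: mean(values) for level, values in groups.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Need 1: compare a gene of interest between sample groups.
stage_means = group_means(samples, "stage", "GENE_A")

# Need 3: correlation between two genes of interest.
r = pearson([s["GENE_A"] for s in samples], [s["GENE_B"] for s in samples])
```

A real analysis would follow the group comparison with a statistical test and would use far more samples. The sketch only shows the shape of the computation.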

Hands-on: miRNA in KIRC

First, learn the background of KIRC, one of the cancer types in the TCGA project; see the PPT

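Then read through the article we reproduce in this exercise: Integrated genomic analysis identifies subclasses and prognosis signatures of kidney cancer. There is nothing special about this article; it is purely an example, and there are as many as 3,000 similar articles.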

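From the article we learn the basic steps of a TCGA data-mining project:

  • Download the relevant TCGA data, mainly choosing among 6 types of data according to cancer type, for example the clinical and miRNA data for KIRC; there are 8 data centers to choose from.
  • Split the patient cohort into a training set and a test set; you may also need to search the GEO database for additional datasets to mine.
  • Run a round of statistical analyses, such as differential analysis, survival analysis, lasso regression, and random forests, to find the target gene set.
  • Follow with a round of visualization to show that the gene set is clearly meaningful, including forest plots, heatmaps, and volcano plots.
  • Derive a risk-prediction formula from the final gene set, and visualize how the risk factors relate to one another.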
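The cohort split and the risk formula in the steps above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the article's or this repository's method: the risk score is assumed to be a weighted sum of expression values, the median is assumed as the high/low cutoff, and the miRNA names, coefficients, and patient values are hypothetical.

```python
import random
from statistics import median


def split_cohort(patient_ids, train_fraction=0.5, seed=0):
    """Shuffle patient IDs reproducibly and split into (train, test)."""
    ids = sorted(patient_ids)
    random.Random(seed).shuffle(ids)
    cut = int(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]


def risk_score(expression, coefficients):
    """Weighted sum of expression values: sum(coef_i * expr_i)."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def label_by_median(scores):
    """Label each patient 'high' or 'low' risk relative to the median score."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical coefficients, e.g. as they might come out of a fitted model.
coefficients = {"miR_X": 0.5, "miR_Y": -0.25}

# Made-up expression values for four hypothetical patients.
cohort = {
    "P1": {"miR_X": 2.0, "miR_Y": 4.0},
    "P2": {"miR_X": 6.0, "miR_Y": 4.0},
    "P3": {"miR_X": 4.0, "miR_Y": 8.0},
    "P4": {"miR_X": 8.0, "miR_Y": 0.0},
}

train_ids, test_ids = split_cohort(cohort)
scores = {pid: risk_score(expr, coefficients) for pid, expr in cohort.items()}
labels = label_by_median(scores)
```

In practice the coefficients would be estimated on the training set only, and the resulting formula would then be evaluated on the test set.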

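Advanced TCGA analysis

This mainly covers all TCGA data types, including:

  • DNA Sequencing (somatic mutation data in MAF format, from both whole-genome and whole-exome sequencing)
  • miRNA Sequencing (expression matrix)
  • Protein Expression (expression matrix)
  • mRNA Sequencing (sequencing-based expression matrix)
  • Total RNA Sequencing (expression matrix)
  • Array-based Expression (microarray-based expression matrix)
  • DNA Methylation (27K/450K/850K methylation arrays, or WGBS)
  • Copy Number (mainly SNP6.0 arrays, plus copy number variation computed from sequencing)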

First, tools such as maftools can be used to visualize the MAF-format somatic mutation data from whole-genome and whole-exome sequencing.
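The maftools code itself is not shown here. As a rough idea of the kind of summary that precedes such a visualization, the following minimal Python sketch (not maftools, and not code from this repository) counts how many distinct samples carry a non-silent mutation in each gene. `Hugo_Symbol`, `Tumor_Sample_Barcode`, and `Variant_Classification` are standard MAF columns; the rows, the sample barcodes, and the helper function are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# A tiny made-up MAF excerpt (tab-delimited); real MAF files have many more columns.
maf_text = (
    "#version 2.4\n"
    "Hugo_Symbol\tTumor_Sample_Barcode\tVariant_Classification\n"
    "VHL\tSAMPLE_1\tMissense_Mutation\n"
    "VHL\tSAMPLE_2\tNonsense_Mutation\n"
    "PBRM1\tSAMPLE_1\tFrame_Shift_Del\n"
    "VHL\tSAMPLE_1\tSilent\n"
    "BAP1\tSAMPLE_3\tSilent\n"
)


def mutated_samples_per_gene(handle, skip=("Silent",)):
    """Count distinct samples with a non-skipped mutation in each gene."""
    # Drop '#' comment lines before handing the rest to the CSV parser.
    data_lines = (line for line in handle if not line.startswith("#"))
    samples_by_gene = defaultdict(set)
    for row in csv.DictReader(data_lines, delimiter="\t"):
        if row["Variant_Classification"] in skip:
            continue
        samples_by_gene[row["Hugo_Symbol"]].add(row["Tumor_Sample_Barcode"])
    return {gene: len(ids) for gene, ids in samples_by_gene.items()}


counts = mutated_samples_per_gene(io.StringIO(maf_text))

# Most frequently mutated genes first; ties broken alphabetically.
ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
```

Counting samples rather than rows keeps a gene with several mutations in one sample from being over-weighted, which is the usual convention when ranking genes for a mutation summary plot.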

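Web tools roundup

There are too many to list exhaustively.

The point is not to explain how to use these web tools. Once you truly understand the background of the TCGA project and the patterns in its data, the design logic of the web tools becomes easy to see. More importantly, you can then make sensible use of them and build customized, in-depth analyses on top of them in R.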

tcga_example's People

Contributors

jmzeng1314


tcga_example's Issues

cbind,cl

When I run the command "cl_df <- t(do.call(cbind,cl))", the process stops with "Error in (function (..., deparse.level = 1) : number of rows of matrices must match (see arg 5)".

env_set

Could you provide a conda environment for us? A tcga_env.yaml file would be ideal. Thanks.
