teachingmaterial's Introduction

RNA-seq data analysis practical

This tutorial illustrates how to use standalone tools together with R and Bioconductor to analyse RNA-seq data. Keep in mind that this is a rapidly evolving field and that this document is not intended as a review of the many tools available for each step; instead, it covers one of the many existing workflows for analysing this type of data.

We will be working with a subset of a publicly available dataset from Drosophila melanogaster, which is available both in the Sequence Read Archive (SRP001537, raw data) and in Bioconductor (pasilla package, processed data). For more information about this dataset, please refer to the original publication (Brooks et al. 2010).

The tools and R packages that we will be using during the practical are listed below (see Software requirements) and the necessary data files can be found here. After downloading and uncompressing the tar.gz file, you should have the following directory structure on your computer:

RNAseq
|-- reference               # reference info (e.g. genome sequence and annotation)
`-- data
    |-- raw                 # raw data: fastq files
    |-- mapped              # mapped data: BAM files
    `-- demultiplexing      # extra fastq files for the demultiplexing section
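
As a quick sanity check after uncompressing the archive, the layout above can be verified with a few lines of code. The following is only a minimal sketch in Python (the practical itself relies on R and UNIX tools); the function name `missing_dirs` is hypothetical, and the script assumes the archive was unpacked into a directory called RNAseq inside the current working directory.

```python
from pathlib import Path

# Sub-directories taken from the layout listed above.
EXPECTED_DIRS = (
    "reference",
    "data/raw",
    "data/mapped",
    "data/demultiplexing",
)


def missing_dirs(root):
    """Return the expected sub-directories that are absent under `root`."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    # Assumption: the archive was unpacked in the current working directory.
    missing = missing_dirs("RNAseq")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Directory structure looks complete.")
```

If any directory is reported as missing, the archive was probably only partially extracted or was unpacked somewhere else.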

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License. This means that you are able to copy, share and modify the work, as long as the result is distributed under the same license.

Table of contents

  1. Dealing with raw data
    1. The FASTQ format
    2. Quality assessment (QA)
    3. Filtering FASTQ files
    4. De-multiplexing samples
    5. Aligning reads to the genome
  2. Dealing with aligned data
    1. The SAM/BAM format
    2. Visualising aligned reads
    3. Filtering BAM files
    4. Gene-centric analyses:
      1. Counting reads overlapping annotated genes
        • With htseq-count
        • With R
        • Alternative approaches
      2. Normalising counts
        • With RPKMs
        • With DESeq2
      3. Differential gene expression
    5. Exon-centric analyses:
      1. Differential exon usage
    6. Transcript-centric analyses:
      1. Identification, annotation and visualisation of splicing switch events

Software requirements

Note: depending on the topics covered in the course, some of these tools might not be used.

Other resources

Course data

Getting started in R and UNIX

More on RNA-seq and NGS

Acknowledgments

This tutorial was inspired by material developed by Ângela Gonçalves, Nicolas Delhomme, Simon Anders and Martin Morgan, whom I would like to thank and acknowledge. Special thanks must go to Ângela Gonçalves and Mitra Barzine, with whom I have been teaching, and to Gabriella Rustici, for always finding a way to organise a new course.
