Dyskerin manuscript

Code Repository for the Data Analyses of Co-Transcriptional Pseudouridylation

Requirements

To successfully execute the code in this repository, the following software and packages are needed:

  • R >= 4.1.0
  • R packages (installable via BiocManager, from Bioconductor or CRAN):
    • Rsamtools
    • GenomicAlignments
    • GenomicFeatures
    • GenomeInfoDb
    • rtracklayer
    • data.table
    • ggpubr
    • cowplot
    • ggsci
    • scales
    • ggthemes
    • ggrepel
    • ggVennDiagram
    • viridis
    • tximport
    • DESeq2
    • IHW
    • org.Hs.eg.db
    • UpSetR
    • GGally
    • outliers
  • [OPTIONAL] deepTools
  • [OPTIONAL] SparK
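
As a quick sanity check of the R requirement listed above, a minimal shell sketch might look like the following. The function names `version_ge` and `check_r_version` are hypothetical and not part of this repository, and the sketch assumes a `sort` that supports `-V` (version sort, e.g. GNU coreutils) and that `Rscript` is on the PATH.

```shell
# Hypothetical helper: succeeds when version $1 is >= version $2.
# Assumes a sort that supports -V (version sort), e.g. GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: reports whether the installed R satisfies the
# minimum version stated in the requirements (4.1.0).
check_r_version() {
    local min="4.1.0" found
    if ! command -v Rscript >/dev/null 2>&1; then
        echo "Rscript not found on PATH"
        return 1
    fi
    found="$(Rscript -e 'cat(as.character(getRversion()))')"
    if version_ge "$found" "$min"; then
        echo "R $found satisfies >= $min"
    else
        echo "R $found is older than $min"
        return 1
    fi
}
```

Running `check_r_version` inside the activated environment prints a one-line verdict and returns a non-zero status when R is missing or too old.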

Preliminary actions

Obtain the required script and files

Download or clone the repository and move to the working directory:

cd Dyskerin_manuscript

[OPTIONAL] Setup a conda environment

To install Miniconda, a free minimal installer for conda, follow the official Miniconda installation instructions.

Assuming that conda is available on the system, use the environment.yml file to generate a minimal working environment comprising R and some basic packages:

conda env create -f environment.yml

The installation of the additional packages has to be performed from within the newly generated environment:

conda activate dyskerin_manuscript

Install Bioconductor packages

Open an R session:

R

Install Bioconductor

if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(version = "3.16")

Install missing packages

BiocManager::install(c("GenomicFeatures", "GenomeInfoDb", "org.Hs.eg.db", "tximport", "DESeq2",  "IHW", "apeglm"))

Prepare the inputs

Download the GENCODE v27 annotation file (GFF3 format), rename it gencode.v27.annotation.gff3.gz, and place it into data/Annotation.

Generate the RepeatMasker annotation (TSV format) using the UCSC Table Browser, rename it hg38_repeatMasker.tsv.gz, and place it into data/Annotation.

Some of the files used by the scripts were too large to upload to this repository. They must either be retrieved from the GEO repository (GSE211202) or generated (e.g. mapped BAM files, normalised BigWig coverage files) before they can be read into R or deepTools.

NOTE: The code can be executed even without BAM files, but some steps might not return the expected output.
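
Before opening R, it can help to confirm that the two annotation files named above are in place. The sketch below is one way to do that; `missing_inputs` is a hypothetical helper name, and the only paths it checks are the file names and the data/Annotation directory given in this section.

```shell
# Hypothetical helper: lists the expected annotation files that are missing
# (or empty) under a repository root, given as $1 (default: current directory).
# File names and the data/Annotation directory come from the text above.
missing_inputs() {
    local root="${1:-.}" f status=0
    for f in gencode.v27.annotation.gff3.gz hg38_repeatMasker.tsv.gz; do
        # -s is true only for files that exist and are non-empty
        if [ ! -s "$root/data/Annotation/$f" ]; then
            echo "missing: data/Annotation/$f"
            status=1
        fi
    done
    return "$status"
}
```

Called as `missing_inputs .` from the repository root, it prints one line per absent file and returns a non-zero status if any is missing.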

[CRUCIAL] Define local paths

The main R markdown file covers most of the analyses and visualisations. Some of its headers are marked as USER_ACTION: these flag sections or statements that the user must modify so that the files required for those steps can be located and imported. Briefly, these include the RepeatMasker annotation, the Salmon quantification paths, and the BAM files for the different datasets. For convenience, the rosetta table file, which contains labels and paths to the required BAM files, is read into R, so make sure to modify it according to your setup.
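
After editing the rosetta table, the paths it lists can be checked from the shell before running the R markdown file. The sketch below is illustrative only: it assumes the table is tab-separated with a header row and that BAM paths sit in a single column (column 2 by default), which may not match the actual file layout. The helper name `missing_bams` is hypothetical.

```shell
# Hypothetical helper: prints every path in column $2 (default: 2) of a
# tab-separated table ($1) with a header row that does not exist on disk.
# The column layout of the real rosetta table is an assumption here.
missing_bams() {
    table="$1"
    col="${2:-2}"
    # Skip the header row and empty cells, then test each path in turn.
    awk -F '\t' -v c="$col" 'NR > 1 && $c != "" { print $c }' "$table" |
    {
        status=0
        while IFS= read -r path; do
            if [ ! -e "$path" ]; then
                echo "not found: $path"
                status=1
            fi
        done
        # The status of this last command is the status of the function.
        [ "$status" -eq 0 ]
    }
}
```

For a table whose paths live in a different column, pass the column number as the second argument, e.g. `missing_bams rosetta.tsv 3` (the file name here is a placeholder).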

Genomic coverage profiles

All genomic coverage profiles were generated using SparK (see the createGenomicPlots.sh script for the procedure).

Genomic metadata profiles

Genome coverage normalisations and genomic metadata matrices were generated using deepTools (see the createNormalisedTracks.sh and createMatrixPlots.sh scripts for the procedure). The metadata matrices were then imported into R and plotted.
