csoneson / armor Goto Github PK

View Code? Open in Web Editor NEW

156.0 16.0 34.0 29.59 MB

Light-weight Snakemake workflow for preprocessing and statistical analysis of RNA-seq data

License: MIT License

R 55.67% Python 44.33%

rna-seq workflow

armor's People

Contributors

Stargazers

Watchers

armor's Issues

consistent chunk names in Rmd files

Name all code chunks in Rmd templates
Keep consistent naming scheme (e.g. dashes rather than underscores)
Add method identifier (edgeR/DRIMSeq) to chunk name?

Input exception wit SE reads

hi, for some reason my SE input is calling on trimgalorePE instead of SE?
Log:
Building DAG of jobs...
MissingInputException in line 297 of /home/ruhrig/ARMOR/Snakefile:
Missing input files for rule trimgalorePE:
/nfs/scratch/RNA-seq/Human/carrier/fastq/SRR3583291 SE MCF7_control_None.fastq.gz

Metadata.txt:
names type condition
SRR3583291 SE MCF7_control
SRR3583292 SE MCF7_RA
SRR3583293 SE BT474_control
SRR3583294 SE BT474_RA

Config:
metatxt: /nfs/scratch/ARMOR-runs/metadata_carrier.txt
design: "~ 0 + condition"
contrast: conditionMCF7_RA-conditionMCF7_control,conditionBT474_RA-conditionBT474_control
genesets: H,C5
ncores: 7
FASTQ: /nfs/scratch/RNA-seq/Human/carrier/fastq
fqext1: 1
fqext2: 2
fqsuffix: fastq
output: /nfs/scratch/ARMOR-runs/carrierout
useCondaR: True
Rbin: R
run_trimming: True
run_STAR: True
run_DRIMSeq: True
run_camera: False

Make nice diagram of workflow for paper

groupvar

Do we still need the groupvar (specified in the config file)? It is only used to create the bwCond column of the final SCEs, and in the end any column could be used there.

extend DRIMSeq Rmd text

Make sure gitinstall rule runs first

CalledProcessError in EdgeR

Hi, I'm not sure what this means? I've previously run the workflow successfully with another dataset. In this case I'm getting the following error:
RuleException:
CalledProcessError in line 526 of /home/ruhrig/ARMOR/Snakefile:
Command 'source /home/ruhrig/miniconda3/bin/activate '/home/ruhrig/ARMOR/.snakemake/conda/17597c04'; set -euo pipefail; R CMD BATCH --no-restore --no-save "--args se='/nfs/scratch/ARMOR-runs/williamsonout/outputR/tximeta_se.rds' organism='Homo_sapiens' design='~0+condition' contrast='UV_8_hr-Untreated,UV_24_hr-Untreated' rmdtemplate='scripts/edgeR_dge.Rmd' outputdir='/nfs/scratch/ARMOR-runs/williamsonout/outputR' outputfile='edgeR_dge.html'" scripts/run_render.R /nfs/scratch/ARMOR-runs/williamsonout/Rout/run_dge_edgeR.Rout' returned non-zero exit status 1.
File "/home/ruhrig/ARMOR/Snakefile", line 526, in __rule_edgeR
File "/home/ruhrig/miniconda3/lib/python3.6/concurrent/futures/thread.py", line 56, in run
[Tue Mar 19 00:19:26 2019]
Finished job 9.
34 of 62 steps (55%) done
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message

Gencode / Ensembl

I am a bit confused regarding parameters in the Reference annotation details-Section of config.yaml:

What does Ensembl vs Gencode mean? What is its influence on the analysis?
Why does one even have to provide organism, build and release? My GTF is depending on an denovo assembly of the transcriptome which got merged with an already existing annotation of the genome.

My organism is Apis mellifera and I the annotation and the genome assembly is fetched from NCBI (ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/003/254/395/GCF_003254395.2_Amel_HAv3.1). How do I have to set the parameters in this case?

Vignette of public data set analysis

airway data?

About threads/cores

In the configuration file, config.yaml, says

## The maximal number of cores to use for FastQC, STAR, Salmon and DRIMSeq.                  
## Note that the actual number of cores available to Snakemake is determined by              
## the --cores argument when it is invoked.
ncores: 1

and then in some rules you have:

threads:
	config["ncores"]

I got confused a couple of moths ago when I didn't know snakemake as I do know. As you say, with --cores you specify the maximal number of cores used by snakemake, so when your are running the rule star_SE with snakemake --cores 12 and you specify in the config file ncores: 2, you'll be running 6 instances of STAR in parallel, but each instance will be using 2 cores, you won't be running STAR in one file with 12 cores (one instance). I didn't get that from the config file. I'm saying this because it's important to know how many input files you have (approximately) to set ncores properly and, I think, many times is better to set threads independently in each rule because of this and also, because different software has different performance when running with multiple cores in parallel. My intention isn't that you change the behavior of this pipeline, It is rather to try to avoid confusing (unintentionally) some users (like me). Maybe you can add some information in the wiki or in the config file about this.

Unit testing

Use Travis CI?
Test different versions of config.yaml?
Use a predefined output to test, put it in github?

`Ncpu` in `install.packages()` ?

Can we improve R package installation times using Ncpu? Study this.

input and output directories

In the config file, allow the user to specify an input directory (where the fastq.gz files are), as well as an output directory (recommended default .) where the output will be generated.

Add bias parameters to Salmon call

Add bias parameters (and potentially also other arguments) to Salmon call

Add success/error messages in Snakefile

E.g., like in https://twitter.com/khanaziz84/status/1082203240752734208 (probably printing a success/failure message to the console is enough)

Clean up `multi_experiment` branch

If we want to go this route, we could/should:

clean up the RMD files to get rid of the XXXX and ******* parts
expose other variables, like geneset subsets, DRIMSeq filtering to config.yaml?
"arbitrary design and multiple contrasts" ?

https://github.com/csoneson/ARMOR/tree/multi_experiment

Error when running the example dataset

I ran snakemake --use-conda on the example dataset to test my installation.

I get the following error:

Quitting from lines 396-398 (edgeR_dge.Rmd) 
Error in msigdbr(species = organism, category = genesets) : 
  please specify only one category at a time

clarify how paths in config.yaml need to be specified

Unless we can find a good solution to allow relative paths, stress that all paths must be either absolute or relative to the location of the Snakefile.

Add documentation about .Renviron file

If not using conda R, need to set the path to the library you want to use. Also (important!), the library directory must exist, otherwise it will not be used by R.

useCondaR variable

make sure packages are installed in .libPaths()[1]
separate environments (one with R, one with everything else)
in config file: useCondaR (TRUE/FALSE) -> in the Snakefile, set variable with environment to use for the conda directive in the R rules

Test across operating systems

Add system-dependent installation instructions

Export `SummarizedExperiment` object

Export SummarizedExperiment object (with data and DGE results) for use with e.g. iSEE.

Changes to SCEs

• Somewhere (already in the tximeta script?) we should/could set the rownames of the SEs to geneID__geneSymbol or so (I'd prefer not to use . as the separator since it may already be in the gene ID if it's versioned, but __ should not be in any of the individual components)
• When adding the edgeR/DRIMSeq results, we could add an additional column mlog10PValue or something, containing -log10(PValue). This makes it easier to make volcano plots.

extend the checkinputs to check all inputs

R packages not available from conda

This could be e.g. new packages or packages for which one would like a specific version which is not available via conda. How do we handle this? Maybe one option would be to have an R script with dedicated code to install all such packages, and a rule to run this script and install the packages in the environment?

Error when using bacterial fa and gtf file in ARMOR

Dear Authors and Experts ARMOR,

I am a microbiologist, I am learning RNAseq analysis. However, when I run the file, I had the problem when the ARMOR was running:
Activating conda environment: /home/leo/ARMOR/.snakemake/conda/10da7374
Jul 01 15:58:02 ..... started STAR run
Jul 01 15:58:02 ..... loading genome
Jul 01 15:58:03 ..... started mapping
/bin/bash: line 1: 37248 Segmentation fault (core dumped) STAR --genomeDir /home/leo/ARMOR/E109_E453/E109_E453_starindex.idx --readFilesIn /home/leo/ECOPAIR_RNAseq_rawdata/E109_453_RNAseq_output/FASTQtrimmed/E453_4_2_R1_val_1.fq.gz /home/leo/ECOPAIR_RNAseq_rawdata/E109_453_RNAseq_output/FASTQtrimmed/E453_4_2_R2_val_2.fq.gz --runThreadN 18 --outFileNamePrefix /home/leo/ECOPAIR_RNAseq_rawdata/E109_453_RNAseq_output/STAR/E453_4_2/E453_4_2_ --outSAMtype BAM SortedByCoordinate --readFilesCommand gunzip -c
[Wed Jul 1 15:58:05 2020]
Error in rule starPE:
jobid: 57
output: /home/leo/ECOPAIR_RNAseq_rawdata/E109_453_RNAseq_output/STAR/E453_4_2/E453_4_2_Aligned.sortedByCoord.out.bam
log: /home/leo/ECOPAIR_RNAseq_rawdata/E109_453_RNAseq_output/logs/STAR_E453_4_2.log (check log file(s) for error message)
conda-env: /home/leo/ARMOR/.snakemake/conda/10da7374
shell:
echo 'STAR version:
' > /home/leo/ECOPAIR_RNAseq_rawdata/E109_453_RNAseq_output/logs/STAR_E453_4_2.log; STAR --version >> /home/leo/ECOPAIR_RNAseq_rawdata/E109_453_RNAseq_output/logs/STAR_E453_4_2.log; STAR --genomeDir /home/leo/ARMOR/E109_E453/E109_E453_starindex.idx --readFilesIn /home/leo/ECOPAIR_RNAseq_rawdata/E109_453_RNAseq_output/FASTQtrimmed/E453_4_2_R1_val_1.fq.gz /home/leo/ECOPAIR_RNAseq_rawdata/E109_453_RNAseq_output/FASTQtrimmed/E453_4_2_R2_val_2.fq.gz --runThreadN 18 --outFileNamePrefix /home/leo/ECOPAIR_RNAseq_rawdata/E109_453_RNAseq_output/STAR/E453_4_2/E453_4_2_ --outSAMtype BAM SortedByCoordinate --readFilesCommand gunzip -c
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Removing output files of failed job starPE since they might be corrupted:
/home/leo/ECOPAIR_RNAseq_rawdata/E109_453_RNAseq_output/STAR/E453_4_2/E453_4_2_Aligned.sortedByCoord.out.bam
[Wed Jul 1 16:04:31 2020]
Error in rule tximeta:
jobid: 62
output: /home/leo/ECOPAIR_RNAseq_rawdata/E109_453_RNAseq_output/outputR/tximeta_se.rds
log: /home/leo/ECOPAIR_RNAseq_rawdata/E109_453_RNAseq_output/Rout/tximeta_se.Rout (check log file(s) for error message)
conda-env: /home/leo/ARMOR/.snakemake/conda/9f6359b5
shell:
R CMD BATCH --no-restore --no-save "--args salmondir='/home/leo/ECOPAIR_RNAseq_rawdata/E109_453_RNAseq_output/salmon' json='/home/leo/ARMOR/E109_E453/E109_E453_saslmonindex.sidx.json' metafile='/home/leo/ARMOR/E109_E453/metadata_E109_E453.txt' outrds='/home/leo/ECOPAIR_RNAseq_rawdata/E109_453_RNAseq_output/outputR/tximeta_se.rds' annotation='Ensembl' organism='Ecoli'" scripts/run_tximeta.R /home/leo/ECOPAIR_RNAseq_rawdata/E109_453_RNAseq_output/Rout/tximeta_se.Rout
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Error! The Snakemake workflow aborted.
Complete log: /home/leo/ARMOR/.snakemake/log/2020-07-01T152546.626254.snakemake.log

I am not a bioinformatic expert. I do not know how to fix the problem,
When you have time, could you please let me know how to solve this problem?

Thank you in advance,
Kind regards,
Le Phuong.

FASTQ file sample suffix

Is it unnecessarily restrictive to assume that FASTQ files are of the form <sample>_{R1/R2}.fastq.gz? I.e., should we allow the user to specify the {R1/R2} part in the config file (just {1/2} is also very common)?

Make trimming optional

Software version numbers

We discussed this, and probably it's best to remove the version numbers from the environment.yaml file, so that users will not be stuck with old versions. Any user can always specify the versions they want (to make their own analyses reproducible), and we can provide a list of the versions that we have tested the workflow with. In the Wiki, we can also show how to export an environment file from a specific conda environment.

trouble with running the example dataset

First time trying to use the pipeline on a server.
OS is Linux Mint 18.1
Python 3.5.2
Snakemake installed (pip3), conda installed.
I'm trying to run the example set, but when I run snakemake --use-conda, I get the following error.

File "/usr/local/lib/python3.5/dist-packages/snakemake/init.py", line 624, in snakemake
batch=batch,
File "/usr/local/lib/python3.5/dist-packages/snakemake/workflow.py", line 728, in execute
quiet=list_conda_envs,
File "/usr/local/lib/python3.5/dist-packages/snakemake/dag.py", line 260, in create_conda_envs
env.create(dryrun)
File "/usr/local/lib/python3.5/dist-packages/snakemake/conda.py", line 227, in create
conda = Conda(self._singularity_img)
File "/usr/local/lib/python3.5/dist-packages/snakemake/conda.py", line 338, in new
inst.init(singularity_img=singularity_img)
File "/usr/local/lib/python3.5/dist-packages/snakemake/conda.py", line 352, in init
self.info = json.loads(shell.check_output(self._get_cmd("conda info --json")))
File "/usr/lib/python3.5/json/init.py", line 312, in loads
s.class.name))
TypeError: the JSON object must be str, not 'bytes'

Any ideas on what could be wrong?

provide example data

Provide a link to some small example data that can be used to test that the workflow works and does what it is intended to do (perhaps also a way to check the output).

pandoc

Do we need to include pandoc and pandoc-citeproc in the environment file, to make sure that the Rmarkdown files will compile?

error in rule bigwig

When running ARMOR on some data I get an error when it is trying to generate the bigwig files.
Error in rule bigwig: jobid: 51 output: Jin/output/STARbigwig/p63KO35_Aligned.sortedByCoord.out.bw log: Jin/output/logs/bigwig_p63KO35.log (check log file(s) for error message) conda-env: /home/ryan/ARMOR/.snakemake/conda/74957150 shell: echo 'bedtools version: ' > Jin/output/logs/bigwig_p63KO35.log; bedtools --version >> Jin/output/logs/bigwig_p63KO35.log; bedtools genomecov -split -ibam Jin/output/STAR/p63KO35/p63KO35_Aligned.sortedByCoord.out.bam -bg | sort -k1,1 -k2,2n > Jin/output/STARbigwig/p63KO35_Aligned.sortedByCoord.out.bedGraph; bedGraphToBigWig Jin/output/STARbigwig/p63KO35_Aligned.sortedByCoord.out.bedGraph /home/ryan/ARMOR/reference/human/STARIndex/Homo_sapiens.GRCh38.V30.STAR.idx/chrNameLength.txt Jin/output/STARbigwig/p63KO35_Aligned.sortedByCoord.out.bw; rm -f Jin/output/STARbigwig/p63KO35_Aligned.sortedByCoord.out.bedGraph

In the terminal i see:
Jin/output/STARbigwig/p63KO35_Aligned.sortedByCoord.out.bedGraph is not case-sensitive sorted at line 38881834. Please use "sort -k1,1 -k2,2n" with LC_COLLATE=C, or bedSort and try again.

when looking at that specific line, we see:

`sed -n '38881833,38881835p' p63KO35_Aligned.sortedByCoord.out.bedGraph

chrY 57215886 57215994 1

GL000008.2 83526 83545 1

GL000008.2 85565 85566 1
`
I am using the Genome sequence, primary assembly (GRCh38) from gencode.

Not sure how to proceed. Thanks!

Add 'camera' for gene set testing?

what set of gene sets to use? MSigDB?
link to Ensembl identifiers (MSigDB has Entrez ids)
how to integrate with existing BioC annotation packages

trouble running the pipeline for a bacterial species

Hi,

after successfully running the pipeline for human and zebrafish, I've encountered some problems with a bacterial species. After going through (hopefully all) issues here that were already encountered and solved on github, I can't really find a solution for this one, so maybe you can help?
The error occurs at tximeta, here's an error Output, apparently there is a problem with .makesplicing. Any ideas on what may be causing this?

args <- (commandArgs(trailingOnly = TRUE))
for (i in seq_len(length(args))) {

```
eval(parse(text = args[[i]]))
```
}

suppressPackageStartupMessages({

```
library(dplyr)
```
```
library(tximport)
```
```
library(tximeta)
```
```
library(SingleCellExperiment)
```
})

print(salmondir)
[1] "example_data/output/salmon"
print(json)
[1] "example_data/reference/SalmonIndex/Cg.sidx.json"
print(metafile)
[1] "example_data/metadata.txt"
print(outrds)
[1] "example_data/output/outputR/tximeta_se.rds"
print(annotation)
[1] "Gencode"
print(organism)
[1] "Corynebacterium_glutamicum"

Load json linkedTxome

loadLinkedTxome(json)
saving linkedTxome in bfc (first time)

Read metadata

metadata <- read.delim(metafile, header = TRUE, as.is = TRUE, sep = "\t")

List Salmon directories

salmonfiles <- paste0(salmondir, "/", metadata$names, "/quant.sf")
names(salmonfiles) <- metadata$names

Add file column to metadata and import annotated abundances

In transcript level

coldata <- cbind(metadata, files = salmonfiles, stringsAsFactors = FALSE)
st <- tximeta::tximeta(coldata)
importing quantifications
reading in files with read_tsv
1 2 3 4
found matching linked transcriptome:
[ GENCODE - Corynebacterium glutamicum - release 1 ]
building TxDb with 'GenomicFeatures' package
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... Error in .make_splicings(exons, cds, stop_codons) :
some CDS cannot be mapped to an exon
Calls: ... makeTxDbFromGFF -> makeTxDbFromGRanges -> .make_splicings
In addition: Warning message:
In .get_cds_IDX(mcols0$type, mcols0$phase) :
The "phase" metadata column contains non-NA values for features of type
stop_codon. This information was ignored.
Execution halted

best

Anze

"Methods" document

Make a summary document describing the provided workflow in words (with references)

extend checks of input variables

Extend checks of input variables (design, contrast, etc) in run_render.R/generate_report.R. In addition to just checking that these are character scalars, check that they are formatted correctly and give an informative error message otherwise.

Convert statistical analysis scripts into R/markdown templates

Provide edgeR and DRIMSeq analysis code as .Rmd files.
Save figures in png/pdf
Export results of the analyses

Ensembl -> Gencode

Change the default annotation catalog from Ensembl to Gencode. Avoids the need to merge cDNA and ncRNA fasta files, and ensures that the gtf and fasta file have the same set of transcripts.

Affects at least the following files:

README.md (instructions for downloading reference files)
config.yaml (remove cdna and ncrna, use only txome)
Snakefile (no need to merge cDNA and ncRNA files)
generate_tx2gene.R (transcript ID format is different)
run_dge_edgeR.R and run_dtu_drimseq.R (set ignoreAfterBar=TRUE in tximport call - alternatively run Salmon indexing with gencode flag)

To make things more general (allow as many species as possible), we could allow both Ensembl and Gencode (potentially requiring the user to merge the cDNA and ncRNA files on their own beforehand, if they want to use both). Then, in both cases we would only require an input txome, and remove the cDNA/ncRNA merging.

tximeta error

Hi all,

I just git cloned a copy of ARMOR and invoked the workflow with
snakemake --cores 10

It came to an error :
rule tximeta:
input: example_data/output/Rout/pkginstall_state.txt, example_data/output/salmon/SRR1039508/quant.sf, example_data/output/salmon/SRR1039509/quant.sf, example_data/output/salmon/SRR1039512/quant.sf, example_data/output/salmon/SRR1039513/quant.sf, example_data/metadata.txt, example_data/reference/SalmonIndex/Homo_sapiens.GRCh38.93_0.8.2/hash.bin, example_data/reference/SalmonIndex/Homo_sapiens.GRCh38.93_0.8.2.json, scripts/run_tximeta.R
output: example_data/output/outputR/tximeta_se.rds
log: example_data/output/Rout/tximeta_se.Rout
jobid: 44

Error in job tximeta while creating output file example_data/output/outputR/tximeta_se.rds.
RuleException:
CalledProcessError in line 499 of /mnt/mcfiles/rlyu/Projects/Snakemake_projects/ARMOR/ARMOR/Snakefile:
Command '/usr/bin/R CMD BATCH --no-restore --no-save "--args salmondir='example_data/output/salmon' json='example_data/reference/SalmonIndex/Homo_sapiens.GRCh38.93_0.8.2.json' metafile='example_data/metadata.txt' outrds='example_data/output/outputR/tximeta_se.rds' annotation='Ensembl' organism='Homo_sapiens'" scripts/run_tximeta.R example_data/output/Rout/tximeta_se.Rout' returned non-zero exit status 1.
File "/mnt/mcfiles/rlyu/Projects/Snakemake_projects/ARMOR/ARMOR/Snakefile", line 499, in __rule_tximeta
File "/mnt/mcfiles/rlyu/Software/miniconda/miniconda3/envs/ARMOR/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Will exit after finishing currently running jobs.
Exiting because a job execution failed. Look above for error message

I looked at the Rout, it was like,

OK
Processing exons ... OK
Processing chromosomes ... Fetch seqlengths from ensembl ... Error in readLines(curl(my_url)) :
Timeout was reached: Connection timed out after 10001 milliseconds
Error in readLines(curl(my_url)) :
Timeout was reached: Connection timed out after 10001 milliseconds
Error in readLines(curl(my_url)) :
Timeout was reached: Connection timed out after 10000 milliseconds
Error in readLines(curl(my_url)) :
Timeout was reached: Connection timed out after 10000 milliseconds
Error in readLines(curl(my_url)) :
Timeout was reached: Connection timed out after 10000 milliseconds
FAIL
OK

...

sg <- summarizeToGene(st)
loading existing EnsDb created: 2019-03-28 04:55:12
obtaining transcript-to-gene mapping from TxDb
Error in tximport::summarizeToGene(txi, tx2gene, varReduce = varReduce, :
unused argument (varReduce = varReduce)
Calls: summarizeToGene
Execution halted

Please let me know if you have any idea about how to fix it.

Thanks,

Docker?

Provide a docker container with the workflow?

Complete description in Wiki (snakemake installation, any others?)

Consider moving instructions from README to a Wiki. The wiki can also include a Methods document, describing the provided workflow in words (with references).

add conda environment to individual rules

By adding the conda environment to each rule, a user can either activate the environment and run snakemake, or directly run snakemake --use-conda.

Rename wiki pages with number to put them in the right order

integrate `tximeta`

Use tximeta to retrieve reference/add gene information. This will affect run_dge_edger.R and run_dtu_drimseq.R. Also, there may not be the need for generate_tx2gene.R anymore.

`setup` rule

Add a rule to be run after setting up the config file, but before running the whole workflow. The rule could be "optional, but recommended", and should do things like:

setting up the conda environment(s)
installing R packages (i.e. run the pkginstall rule)
potentially check if all specified reference files exist
...?
If it is not run specificially, all these things will be done later in the workflow, but it can be useful to do these checks in an early stage.

New commit breaks conda activation?

I just replaced my repo with the new one to get the new -ncore update. But now when I try to run it, it hangs at this step:
(base) ruhrig@ubuntu-ruhrig:~/ARMOR$ snakemake --use-conda
/home/ruhrig/miniconda3/lib/python3.6/site-packages/snakemake/workflow.py:34: FutureWarning: read_table is deprecated, use read_csv instead, passing sep='\t'.
import snakemake.wrapper
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 1
Rules claiming more threads will be scaled down.
Job counts:
count jobs
1 DRIMSeq
1 all
4 bigwig
1 edgeR
8 fastqc
8 fastqc2
1 linkedtxome
1 multiqc
1 pkginstall
4 salmonPE
1 salmonindex
1 shiny
4 starPE
4 staridx
1 starindex
4 trimgalorePE
1 tximeta
46

[Wed Mar 20 16:57:42 2019]
rule pkginstall:
input: scripts/install_pkgs.R
output: example_data/output/Rout/pkginstall_state.txt
log: example_data/output/Rout/install_pkgs.Rout
jobid: 31
priority: 50

Activating conda environment: /home/ruhrig/ARMOR/.snakemake/conda/17597c04

^CTerminating processes on user request, this might take some time.

^CCancelling snakemake on user request.

^CError in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/home/ruhrig/miniconda3/lib/python3.6/concurrent/futures/thread.py", line 40, in _python_exit
t.join()
File "/home/ruhrig/miniconda3/lib/python3.6/threading.py", line 1056, in join
self._wait_for_tstate_lock()
File "/home/ruhrig/miniconda3/lib/python3.6/threading.py", line 1072, in _wait_for_tstate_lock
elif lock.acquire(block, timeout):
KeyboardInterrupt

Add multi-core into the R part?

study run times a bit (%STAR versus %R) on real dataset
study run times of just the R part (%edgeR, %DRIMSeq, %shiny prep, etc.)
seems most useful to have multicore in the DRIMSeq part, since Gosia already coded it. edgeR is fast enough already, I think

move example data into repo

Fix bigwig rule

Currently, the bigwig rule is not run (it should be a dependency of the shiny rule). If invoked explicitly, it gives the following error:

Target rules may not contain wildcards. Please specify concrete files or a rule without wildcards.

csoneson / armor Goto Github PK

armor's People

Contributors

Stargazers

Watchers

Forkers

armor's Issues

Load json linkedTxome

Read metadata

List Salmon directories

Add file column to metadata and import annotated abundances

In transcript level

Recommend Projects

Recommend Topics

Recommend Org