stianlagstad / chimeraviz
chimeraviz is an R package that automates the creation of chimeric RNA visualizations.
I had hoped that plotFusionReads would infer the junction sequence, but in my case fusion@gene{A,B}@junctionSequence was never set (I am using STAR-Fusion; I guess I'll have to run it with Trinity first). It would be nice if the function is.nucleotideAmount.valid returned immediately (with a clear message) if fusionJunctionSequenceLength is zero (in most cases that would be because the junction sequences weren't loaded). Below is code that implements this.
BTW, is there a way to submit pull requests? I see that the code has moved to Bioconductor, but that is not (as far as I can see) on a platform that supports pull requests.
is.nucleotideAmount.valid <- function(argument_checker, nucleotideAmount, fusion) {
  # Total number of nucleotides in the two junction sequences
  fusionJunctionSequenceLength <-
    length(fusion@geneA@junctionSequence) +
    length(fusion@geneB@junctionSequence)
  if (fusionJunctionSequenceLength == 0) {
    ArgumentCheck::addError(
      msg = "length of junction sequence is 0, have they been determined?",
      argcheck = argument_checker
    )
    # Return immediately: the range check below is meaningless without a sequence
    return(argument_checker)
  }
  if (!is.numeric(nucleotideAmount) ||
      nucleotideAmount <= 0 ||
      nucleotideAmount > fusionJunctionSequenceLength) {
    ArgumentCheck::addError(
      msg = paste0("'nucleotideAmount' must be a numeric > 0 ",
                   "and <= fusion junction sequence length."),
      argcheck = argument_checker
    )
  }
  argument_checker
}
After a recent change in the readme, the Travis build started failing. I believe this is connected to: https://travis-ci.community/t/in-r-version-4-0-0-library-path-not-writable/9744/9.
Although documented here, it should be easier to create the data needed for the protein domain plot. This issue will track my progress on this.
Initial ideas:
Hi,
I'm not able to create the reference FASTA file needed to get a BAM file for my fusion. I'm using STAR-Fusion on a Linux operating system. The error I get is:
"Error in write_fusion_reference(fusion = a[[2]], filename = referenceFilename) :
The fusion sequence length is zero, so the fusion reference sequence cannot be written."
My fusion is stored in a[[2]]. I'm not able to figure out where it's trying to get the fusion reference sequence length from. Any help is greatly appreciated. Thanks!
Regards,
Ashish
Thank you for your interest in chimeraviz. To make it easier for me to help you, please provide this information in your submitted issue:
sessionInfo()
As requested by Jakob here: #42
This feature request is for the plot_fusion and plot_transcripts plots.
Hi, can I submit a request for support for importing data from pizzly? (or providing a generic import function so data from fusion-finders not currently supported can be converted and then imported?)
Many thanks.
The 5' and 3' markers on the fusion plot seem to be off. See the plot here.
Fetching transcripts for gene partners..
..transcripts fetched.
Selecting transcripts for B3GNT6..
..found transcripts of type intergenic
Selecting transcripts for INCA1..
..found transcripts of type intergenic
As INCA1 is on the minus strand, the plot for this gene will be reversed
Error in Gviz::plotTracks(collapse = FALSE, list(trBtrack, alTrackHighlightB), :
object 'alTrackHighlightB' not found
Calls: plotTranscripts ->
Gviz::plotTracks(
collapse = FALSE, # without this gviz create cluster_X entries in the GeneRegionTrack
list(grTrackHighlightB, alTrackHighlightB),
sizes = c(5, 2),
add = TRUE,
margin = 3,
innerMargin = 0,
# Plot reverse if gene is at minus strand
reverseStrand = geneBatMinusStrand)
} else {
Gviz::plotTracks(
collapse = FALSE, # without this gviz create cluster_X entries in the GeneRegionTrack
list(trBtrack, alTrack), # <<<<<<<<< here is the stuff
sizes = c(5, 2),
add = TRUE,
margin = 3,
innerMargin = 0,
# Plot reverse if gene is at minus strand
reverseStrand = geneBatMinusStrand)
}
I got the following error message:
?plotFusionTranscriptWithProteinDomain()
Error in .helpForCall(topicExpr, parent.frame()) :
no methods for 'plotFusionTranscriptWithProteinDomain' and no documentation for it as a function
Hello,
Thank you for developing chimeraviz. I ran plot_fusion on my fusions and it worked well. Now I am trying to rerun the same command (same conda environment, same input), but it doesn't work anymore. Here are the commands I executed:
star_fusion_res <- import_starfusion("star-fusion.fusion_predictions.abridged.tsv", "hg38")
fusion <- get_fusion_by_id(star_fusion_res, 12)
edb <- ensembldb::EnsDb("Homo_sapiens.GRCh38.103.sqlite")
bamfile <- "sample1.pos.bam"
pdf(file="fusion_plot_sample1.pdf")
plot_fusion(fusion = fusion, bamfile = bamfile, edb = edb, non_ucsc = FALSE, reduce_transcripts = TRUE)
dev.off()
But when the figure is generating, I get this error:
Fetching transcripts for gene partners..
..transcripts fetched.
Fusion is interchromosomal. Plot separate!
Selecting transcripts for ACTA2..
..found transcripts of type exonBoundary
Selecting transcripts for MITF..
..found transcripts of type exonBoundary
Error in .order_seqlevels(chrom_sizes[, "chrom"]) :
!anyNA(m32) is not TRUE
Do you have any idea why the command no longer works?
Thank you in advance,
Best regards,
Damien Plassard
Here is the output of sessionInfo():
sessionInfo()
R version 4.1.3 (2022-03-10)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Debian GNU/Linux 9 (stretch)
Matrix products: default
BLAS/LAPACK: /shared/ngs/home/plassard/Conda/envs_conda_flash/chimeraviz/lib/libopenblasp-r0.3.21.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] grid stats4 stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] chimeraviz_1.20.0 data.table_1.14.6 ensembldb_2.18.1
[4] AnnotationFilter_1.18.0 GenomicFeatures_1.46.1 AnnotationDbi_1.56.2
[7] Biobase_2.54.0 Gviz_1.38.0 GenomicRanges_1.46.1
[10] Biostrings_2.62.0 GenomeInfoDb_1.30.1 XVector_0.34.0
[13] IRanges_2.28.0 S4Vectors_0.32.4 BiocGenerics_0.40.0
loaded via a namespace (and not attached):
[1] colorspace_2.0-3 rjson_0.2.21
[3] deldir_1.0-6 ellipsis_0.3.2
[5] biovizBase_1.42.0 htmlTable_2.4.1
[7] base64enc_0.1-3 dichromat_2.0-0.1
[9] rstudioapi_0.14 DT_0.26
[11] bit64_4.0.5 fansi_1.0.3
[13] xml2_1.3.3 splines_4.1.3
[15] cachem_1.0.6 knitr_1.41
[17] Formula_1.2-4 Rsamtools_2.10.0
[19] cluster_2.1.4 dbplyr_2.2.1
[21] png_0.1-8 BiocManager_1.30.19
[23] compiler_4.1.3 httr_1.4.4
[25] backports_1.4.1 assertthat_0.2.1
[27] Matrix_1.5-3 fastmap_1.1.0
[29] lazyeval_0.2.2 cli_3.5.0
[31] org.Mm.eg.db_3.14.0 htmltools_0.5.4
[33] prettyunits_1.1.1 tools_4.1.3
[35] gtable_0.3.1 glue_1.6.2
[37] GenomeInfoDbData_1.2.7 dplyr_1.0.10
[39] rappdirs_0.3.3 Rcpp_1.0.9
[41] vctrs_0.5.1 rtracklayer_1.54.0
[43] xfun_0.35 stringr_1.5.0
[45] lifecycle_1.0.3 restfulr_0.0.15
[47] XML_3.99-0.13 org.Hs.eg.db_3.14.0
[49] zlibbioc_1.40.0 RCircos_1.2.2
[51] scales_1.2.1 BiocStyle_2.22.0
[53] BSgenome_1.62.0 VariantAnnotation_1.40.0
[55] hms_1.1.2 MatrixGenerics_1.6.0
[57] ProtGenerics_1.26.0 parallel_4.1.3
[59] SummarizedExperiment_1.24.0 RColorBrewer_1.1-3
[61] yaml_2.3.6 curl_4.3.3
[63] memoise_2.0.1 gridExtra_2.3
[65] ggplot2_3.4.0 biomaRt_2.50.0
[67] rpart_4.1.19 latticeExtra_0.6-30
[69] stringi_1.7.8 RSQLite_2.2.19
[71] BiocIO_1.4.0 checkmate_2.1.0
[73] filelock_1.0.2 BiocParallel_1.28.3
[75] rlang_1.0.6 pkgconfig_2.0.3
[77] matrixStats_0.63.0 bitops_1.0-7
[79] evaluate_0.19 lattice_0.20-45
[81] GenomicAlignments_1.30.0 htmlwidgets_1.6.0
[83] bit_4.0.5 tidyselect_1.2.0
[85] plyr_1.8.8 magrittr_2.0.3
[87] R6_2.5.1 generics_0.1.3
[89] Hmisc_4.7-2 DelayedArray_0.20.0
[91] DBI_1.1.3 pillar_1.8.1
[93] foreign_0.8-84 survival_3.4-0
[95] KEGGREST_1.34.0 RCurl_1.98-1.9
[97] nnet_7.3-18 tibble_3.1.8
[99] crayon_1.5.2 interp_1.1-3
[101] utf8_1.2.2 BiocFileCache_2.2.0
[103] rmarkdown_2.19 jpeg_0.1-10
[105] progress_1.2.2 blob_1.2.3
[107] digest_0.6.31 munsell_0.5.0
Hello,
I am trying to visualize gene fusion data from FusionCatcher. However, it seems that my data is in a different format from what the chimeraviz package uses. I looked at the sample data included with the tutorial and it is different.
Are there preliminary steps needed to make the data compatible, so that it looks the same as the data in the tutorial?
Thank you for your interest in chimeraviz. To make it easier for me to help you, please provide this information in your submitted issue:
Here's what my data looks like..
See below for details about the software version and OS version I am using for the analysis.
The output of sessionInfo()
.
N/A
Which fusion-finder tool you are using, and its version.
Software version: fusioncatcher.py 1.00
Which operating system you are using, and its version.
MacOS Catalina
Version 10.15.3
Example code leading to the error (if you're experiencing an error).
N/A
Hi,
In the latest version, the importStarfusion function does not fill in the ensemblId slot of the fusion partners. It would be nice to have. This is using output from STAR-Fusion 1.2.0 (run on CentOS 7, 3.10.0-693.11.6.el7.x86_64), analyzed on Mac OS X (Darwin PMC-GEN003 15.6.0 Darwin Kernel Version 15.6.0: Tue Jan 9 20:12:05 PST 2018; root:xnu-3248.73.5~1/RELEASE_X86_64 x86_64 i386 MacBookPro12).
The values in the LeftGene and RightGene columns of the star-fusion.fusion_predictions.abridged.tsv file look like MT-ATP6^ENSG00000198899.2
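To illustrate the request, here is a minimal base-R sketch of how such `SYMBOL^ENSEMBL_ID.VERSION` values could be split into a gene name and an Ensembl ID. The helper name `split_starfusion_gene` is hypothetical and not part of chimeraviz, and the second input value is invented to show what happens when no ID is present.

```r
# Hypothetical helper (not a chimeraviz function): split STAR-Fusion
# "SYMBOL^ENSEMBL_ID.VERSION" strings into gene name and Ensembl ID.
split_starfusion_gene <- function(x) {
  # "^" is a regex metacharacter, so split on it as a fixed string
  parts <- strsplit(x, "^", fixed = TRUE)
  gene_name <- vapply(parts, function(p) p[1], character(1))
  versioned_id <- vapply(
    parts,
    function(p) if (length(p) >= 2) p[2] else NA_character_,
    character(1)
  )
  data.frame(
    gene_name = gene_name,
    # Drop the trailing ".<version>" suffix; NA stays NA
    ensembl_id = sub("\\.[0-9]+$", "", versioned_id),
    stringsAsFactors = FALSE
  )
}

# First value is taken from the issue; "GENEX" is a made-up value without an ID
left_gene <- c("MT-ATP6^ENSG00000198899.2", "GENEX")
parsed <- split_starfusion_gene(left_gene)
print(parsed)
```

Whether the version suffix should be kept or dropped depends on the annotation the IDs are matched against; dropping it here is an assumption.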
> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: OS X El Capitan 10.11.6
Matrix products: default
BLAS: /opt/local/Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.dylib
LAPACK: /opt/local/Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] C
attached base packages:
[1] grid stats4 parallel stats datasets graphics grDevices
[8] utils methods base
other attached packages:
[1] chimeraviz_1.4.1 ensembldb_2.2.0 AnnotationFilter_1.3.1
[4] GenomicFeatures_1.30.3 AnnotationDbi_1.40.0 Biobase_2.38.0
[7] Gviz_1.22.2 GenomicRanges_1.30.1 GenomeInfoDb_1.14.0
[10] Biostrings_2.46.0 XVector_0.18.0 IRanges_2.12.0
[13] S4Vectors_0.16.0 BiocGenerics_0.24.0 uuutils_1.48
[16] gplots_3.0.1
loaded via a namespace (and not attached):
[1] ProtGenerics_1.10.0 bitops_1.0-6
[3] matrixStats_0.53.1 devtools_1.13.4
[5] bit64_0.9-7 RColorBrewer_1.1-2
[7] progress_1.1.2 httr_1.3.1
[9] rprojroot_1.3-2 tools_3.4.3
[11] backports_1.1.2 DT_0.4
[13] R6_2.2.2 rpart_4.1-11
[15] KernSmooth_2.23-15 Hmisc_4.1-1
[17] DBI_0.7-15 lazyeval_0.2.1
[19] colorspace_1.3-2 nnet_7.3-12
[21] withr_2.1.1 gridExtra_2.3
[23] prettyunits_1.0.2 RMySQL_0.10.13
[25] bit_1.1-12 curl_3.1
[27] compiler_3.4.3 git2r_0.21.0
[29] htmlTable_1.11.2 DelayedArray_0.4.1
[31] rtracklayer_1.38.3 caTools_1.17.1
[33] scales_0.5.0 checkmate_1.8.5
[35] readr_1.1.1 RCircos_1.2.0
[37] stringr_1.2.0 digest_0.6.15
[39] Rsamtools_1.30.0 foreign_0.8-69
[41] rmarkdown_1.8 pkgconfig_2.0.1
[43] base64enc_0.1-3 dichromat_2.0-0
[45] htmltools_0.3.6 BSgenome_1.46.0
[47] htmlwidgets_1.0 rlang_0.1.6
[49] rstudioapi_0.7 RSQLite_2.0
[51] BiocInstaller_1.28.0 shiny_1.0.5
[53] BiocParallel_1.12.0 gtools_3.5.0
[55] acepack_1.4.1 VariantAnnotation_1.24.5
[57] RCurl_1.95-4.10 magrittr_1.5
[59] GenomeInfoDbData_1.0.0 Formula_1.2-2
[61] Matrix_1.2-12 Rcpp_0.12.15
[63] munsell_0.4.3 stringi_1.1.6
[65] yaml_2.1.16 SummarizedExperiment_1.8.1
[67] zlibbioc_1.24.0 org.Hs.eg.db_3.5.0
[69] plyr_1.8.4 AnnotationHub_2.10.1
[71] blob_1.1.0 gdata_2.18.0
[73] lattice_0.20-35 splines_3.4.3
[75] hms_0.4.1 knitr_1.19
[77] pillar_1.1.0 biomaRt_2.34.2
[79] XML_3.98-1.9 evaluate_0.10.1
[81] biovizBase_1.26.0 latticeExtra_0.6-28
[83] data.table_1.10.4-3 httpuv_1.3.5
[85] gtable_0.2.0 assertthat_0.2.0
[87] ggplot2_2.2.1 mime_0.5
[89] xtable_1.8-2 ArgumentCheck_0.10.2
[91] survival_2.41-3 tibble_1.4.2
[93] GenomicAlignments_1.14.1 memoise_1.1.0
[95] cluster_2.0.6 interactiveDisplayBase_1.16.0
[97] BiocStyle_2.6.1
have a look at
there are lots of
in the document. Let me know if you need more information.
Hello,
I noticed that the line widths in the plot_circle function do not correspond to the number of reads. If you look at your own example using
  soapfuse833ke <- system.file(
    "extdata",
    "soapfuse_833ke_final.Fusion.specific.for.genes",
    package = "chimeraviz")
  fusions <- import_soapfuse(soapfuse833ke, "hg38", 10)
you will notice that the link which should be the widest is the one between gene EEF1A1P5 and gene PABPC1 (chr8 and chr9).
The issue is in the call to RCircos.Link.Plot: the lineWidth parameter expects a numeric vector ordered by chromosome:position.
  RCircos.Link.Plot(
    link.data = link_data,
    track.num = track_num,
    by.chromosome = TRUE,
    start.pos = NULL,
    genomic.columns = 3,
    is.sorted = FALSE,
    lineWidth = link_data$link_width)
Here is the solution I used:
Create a reorder function that uses the mixedsort function from the gtools library. (I did not write this function; I took it from https://stackoverflow.com/questions/20396582/order-a-mixed-vector-numbers-with-letters)
  multi.mixedorder <- function(..., na.last = TRUE, decreasing = FALSE) {
    do.call(order, c(
      lapply(list(...), function(l) {
        if (is.character(l)) {
          # Turn character keys into factors whose levels are in natural order
          factor(l, levels = gtools::mixedsort(unique(l)))
        } else {
          l
        }
      }),
      list(na.last = na.last, decreasing = decreasing)
    ))
  }
Then apply that function in the plot_circle function:
  ordered_link_width <- link_data[
    multi.mixedorder(as.character(link_data$chromosome), link_data$chrom_start),
  ]$link_width
Finally, return the RCircos call with the ordered link width data:
  return(RCircos::RCircos.Link.Plot(
    link.data = link_data,
    track.num = track_num,
    by.chromosome = TRUE,
    start.pos = NULL,
    genomic.columns = 3,
    is.sorted = FALSE,
    lineWidth = ordered_link_width))
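The reordering idea can be tried on toy data without RCircos or gtools. The sketch below is only an illustration: the data values are made up, the column names follow the snippets above, and `chromosome_rank` is a hypothetical base-R stand-in for the natural ordering that `gtools::mixedsort` provides.

```r
# Toy link data; values are invented for illustration
link_data <- data.frame(
  chromosome  = c("chr10", "chr2", "chrX", "chr2"),
  chrom_start = c(500L, 900L, 100L, 300L),
  link_width  = c(1, 4, 2, 3),
  stringsAsFactors = FALSE
)

# Hypothetical helper: rank chromosomes naturally, numbered chromosomes
# first (2 before 10), then the remaining labels alphabetically.
chromosome_rank <- function(chromosome) {
  label <- sub("^chr", "", chromosome)
  number <- suppressWarnings(as.integer(label))
  non_numeric <- sort(unique(label[is.na(number)]))
  offset <- max(c(0L, number), na.rm = TRUE)
  ifelse(is.na(number), offset + match(label, non_numeric), number)
}

# Order rows by chromosome, then by start position, and reorder the widths
row_order <- order(chromosome_rank(link_data$chromosome), link_data$chrom_start)
ordered_link_width <- link_data$link_width[row_order]
print(ordered_link_width)
```

A plain `order()` on the character column would place "chr10" before "chr2", which is why some form of natural ordering is needed before the widths are passed on.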
Hello Stian,
I got the output from STAR-Fusion (release v1.2.0) and installed chimeraviz from Bioconductor.
After preparation, I ran the following command:
fusions <- importStarfusion("Sample1_star_fusion_outdir/star-fusion.fusion_predictions.abridged.tsv", "hg19")
Reading filename caused a warning:
Error in vector("list", dim(report)[1]) : 'length' is not corrent
Warning message:
The following named parsers don't match the column names: J_FFPM, S_FFPM
So I compared the columns of the example file in chimeraviz with the output of STAR-Fusion:
#FusionName JunctionReadCount SpanningFragCount SpliceType LeftGene LeftBreakpoint RightGene RightBreakpoint LargeAnchorSupport LeftBreakDinuc LeftBreakEntropy RightBreakDinuc RightBreakEntropy J_FFPM S_FFPM
#FusionName JunctionReadCount SpanningFragCount SpliceType LeftGene LeftBreakpoint RightGene RightBreakpoint LargeAnchorSupport LeftBreakDinuc LeftBreakEntropy RightBreakDinuc RightBreakEntropy FFPM
In the STAR-Fusion documentation, I saw FFPM in the example format, while J_FFPM and S_FFPM are mentioned in this passage:
Since the number of fusion-supporting reads depends on both the expression of the fusion transcript and the number of reads sequenced, we provide normalized measures of the split reads and spanning fragments as FFPM (fusion fragments per million total reads) measures: J_FFPM for the junction/split reads and S_FFPM for the spanning fragments. If you sum them (a column not yet included but will be soon), you can filter based on this value to remove many lowly expressed and likely artifact fusions. A filter of 0.1 sum FFPM tends to be very effective.
I am confused about this now, could you please give some suggestions?
Thank you:)
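The relationship described in the quoted STAR-Fusion text can be sketched in base R: sum J_FFPM and S_FFPM and filter on the result. The table below is invented, and both the name of the summed column and the use of ">=" for the 0.1 threshold are assumptions; the quoted text does not say how the single FFPM column in other output versions is computed.

```r
# Made-up fusion table with the two normalized measures from the quote
fusion_table <- data.frame(
  FusionName = c("GENEA--GENEB", "GENEC--GENED", "GENEE--GENEF"),
  J_FFPM = c(0.08, 0.02, 0.50),  # junction/split reads
  S_FFPM = c(0.05, 0.03, 0.25),  # spanning fragments
  stringsAsFactors = FALSE
)

# Sum the two measures (column name chosen here for illustration)
fusion_table$sum_FFPM <- fusion_table$J_FFPM + fusion_table$S_FFPM

# Keep fusions at or above the suggested threshold of 0.1
min_sum_ffpm <- 0.1
kept <- fusion_table[fusion_table$sum_FFPM >= min_sum_ffpm, ]
print(kept)
```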
Hi,
I would like to know whether chimeraviz can be used to visualize chimeric transcripts from non-human organisms. I tried to import the fusion data with "fusions <- importStarfusion(defuse833ke, "mm10")", but it failed because of the genome version.
Thanks!
Best,
Wenyu Zhang
The package vignette has a few warnings as a result of the plot_fusion_transcript_with_protein_domain call:
## Warning in `[<-.data.table`(`*tmp*`, i, , value =
## structure(list(Transcript_id = "ENST00000370031", : Coerced integer
## RHS to character to match the type of the target column (column 8 named
## 'plot_start'). If the target column's type character is correct, it's best
## for efficiency to avoid the coercion and create the RHS as type character.
## To achieve that consider R's type postfix: typeof(0L) vs typeof(0), and
## typeof(NA) vs typeof(NA_integer_) vs typeof(NA_real_). You can wrap the RHS
## with as.character() to avoid this warning, but that will still perform the
## coercion. If the target column's type is not correct, it's best to revisit
## where the DT was created and fix the column type there; e.g., by using
## colClasses= in fread(). Otherwise, you can change the column type now by
## plonking a new column (of the desired type) over the top of it; e.g. DT[,
## `plot_start`:=as.integer(`plot_start`)]. If the RHS of := has nrow(DT)
## elements then the assignment is called a column plonk and is the way to
## change a column's type. Column types can be observed with sapply(DT,typeof).
## Warning in `[<-.data.table`(`*tmp*`, i, , value =
## structure(list(Transcript_id = "ENST00000370031", : Coerced integer
## RHS to character to match the type of the target column (column 9 named
## 'plot_end'). If the target column's type character is correct, it's best
## for efficiency to avoid the coercion and create the RHS as type character.
## To achieve that consider R's type postfix: typeof(0L) vs typeof(0), and
## typeof(NA) vs typeof(NA_integer_) vs typeof(NA_real_). You can wrap the RHS
## with as.character() to avoid this warning, but that will still perform the
## coercion. If the target column's type is not correct, it's best to revisit
## where the DT was created and fix the column type there; e.g., by using
## colClasses= in fread(). Otherwise, you can change the column type now by
## plonking a new column (of the desired type) over the top of it; e.g. DT[,
## `plot_end`:=as.integer(`plot_end`)]. If the RHS of := has nrow(DT) elements
## then the assignment is called a column plonk and is the way to change a
## column's type. Column types can be observed with sapply(DT,typeof).
Figure out what's wrong and fix it.
The filter classes will move from the ensembldb package to a new package, AnnotationFilter (https://github.com/Bioconductor/AnnotationFilter). This package is not yet in Bioconductor, but once it is (hopefully soon) I will remove the filter classes from the ensembldb package.
I checked your package, and all you will have to do seems to be to import GeneIdFilter (renamed from GeneidFilter) from AnnotationFilter instead.
It will also be possible to do the filtering with filter = ~ gene_id == "BCL2" instead of explicitly creating a filter object.
Let me know if you run into problems or need help. I'll let you know once I've removed the filter classes from ensembldb (possibly as early as this week).
Hi Stian,
I was trying to create the overview plot, but I am a little confused as to what the link widths mean here:
Initially I thought the width represents how many times the event was observed. But the link for EDA-RFWD3 is wider than the one for EEF1A1P5-PABPC1, even though EEF1A1P5-PABPC1 is seen 4 times and EDA-RFWD3 is seen only once. Can you please explain?
Thank you,
Komal
When I visualize the vignette the section 3.2.1.5 is not shown in the TOC:
You should set the depth of the TOC in the YAML header if you want it to be formatted as such:
---
title: "chimeraviz"
author: "Stian Lågstad"
date: "`r Sys.Date()`"
output:
  BiocStyle::html_document2:
    toc_depth: 5
vignette: >
  %\VignetteIndexEntry{chimeraviz}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
Hi there,
I am trying to use plot_fusion, but I am not able to produce the plot with coverage data (I do not get an error, I just do not get the coverage subplot). I looked into your code and I think I found an explanation at lines 617, 679 and 743 in plot_fusion.R: they all involve ' exists("al_track") '.
Isn't it supposed to be ' exists("alignment_track") ' instead of ' exists("al_track") '?
It makes sense in plot_fusion_together, since you define 'al_track' there, but it does not in plot_fusion_separate, because there you define 'alignment_track'.
Please let me know what you think, thank you!!
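The failure mode described above, where a mismatched name makes the coverage track disappear without an error, can be reproduced in a few lines of base R. The two functions below are hypothetical stand-ins, not chimeraviz code, and `inherits = FALSE` is added only so the demonstration does not depend on what happens to exist in the global environment.

```r
# Stand-in for a function that defines 'alignment_track' but tests 'al_track'
plot_with_wrong_name <- function() {
  alignment_track <- "coverage"  # placeholder for a real coverage track
  tracks <- "transcripts"
  # 'al_track' is never defined here, so exists() is FALSE and the
  # coverage track is skipped silently: no error, no warning.
  if (exists("al_track", inherits = FALSE)) {
    tracks <- c(tracks, al_track)
  }
  tracks
}

# Same function with the tested name matching the defined variable
plot_with_matching_name <- function() {
  alignment_track <- "coverage"
  tracks <- "transcripts"
  if (exists("alignment_track", inherits = FALSE)) {
    tracks <- c(tracks, alignment_track)
  }
  tracks
}

print(plot_with_wrong_name())
print(plot_with_matching_name())
```

Because `exists()` takes the variable name as a string, a mismatch like this is not caught when the function is defined or run; it only shows up as missing output.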
Hi, I'm wondering if it is possible to plot without a BAM file. I ran FusionCatcher, and one of the fusions I'm interested in is this:
[1] "Fusion object"
[1] "id: 4"
[1] "Fusion tool: fusioncatcher"
[1] "Genome version: hg19"
[1] "Gene names: COL1A1-SERPING1"
[1] "Chromosomes: chr17-chr11"
[1] "Strands: -,+"
[1] "In-frame?: TRUE"
I only have a BAM file generated previously from a STAR alignment. Since I don't care about coverage, can I just have the plot drawn without the BAM file? When I tried to run it with the STAR BAM file,
plot_fusion(
fusion = draw.fuse, # the fusion above of interest
edb = edb,
bamfile = bamfile, # the star bam file
reduce_transcripts = TRUE
)
I get an error,
Error in .Call2("solve_user_SEW0", start, end, width, PACKAGE = "IRanges") :
solving row 1: negative widths are not allowed
Also, apart from this error, I often get other fusions that just refuse to plot, I think because the BAM file somehow does not contain the reads?
thanks!
Hi Stian,
I am facing a lot of graphics issues, so for simplicity I am just using your test code, which makes it easier for you to reproduce the problem:
library(chimeraviz)
# Get reference to results file from deFuse
defuse833ke <- system.file(
"extdata",
"defuse_833ke_results.filtered.tsv",
package="chimeraviz")
# Load the results file into a list of fusion objects
fusions <- importDefuse(defuse833ke, "hg19", 1)
length(fusions)
fusion <- getFusionById(fusions, 5267)
fastq1 <- system.file(
"extdata",
"reads_supporting_defuse_fusion_5267.1.fq",
package = "chimeraviz")
fastq2 <- system.file(
"extdata",
"reads_supporting_defuse_fusion_5267.2.fq",
package = "chimeraviz")
referenceFilename <- "reference.fa"
writeFusionReference(fusion = fusion, filename = referenceFilename)
source(system.file(
"scripts",
"rsubread.R",
package="chimeraviz"))
# Then create index
rsubreadIndex(referenceFasta = referenceFilename)
rsubreadAlign(
referenceName = referenceFilename,
fastq1 = fastq1,
fastq2 = fastq2,
outputBamFilename = "fusionAlignment")
if(!exists("bamfile5267")) {
bamfile5267 <- system.file(
"extdata",
"5267readsAligned.bam",
package="chimeraviz")
}
fusion <- addFusionReadsAlignment(fusion, bamfile5267)
plotFusionReads(fusion = fusion)
# another bam file
if(!exists("bamfile5267")) {
bamfile5267 <- system.file(
"extdata",
"5267readsAligned.bam",
package="chimeraviz")
}
fusion <- addFusionReadsAlignment(fusion, bamfile5267)
plotFusionReads(fusion = fusion, showAllNucleotides = TRUE)
At this point, this is the plot I am getting -
Why is it not showing the ATGC and showing colored bars instead?
Another plot that is not showing correctly:
if(!exists("defuse833ke"))
defuse833ke <- system.file(
"extdata",
"defuse_833ke_results.filtered.tsv",
package = "chimeraviz")
# Then load the fusion events
fusions <- importDefuse(defuse833ke, "hg19", 1)
fusion <- getFusionById(fusions, 5267)
if(!exists("edbSqliteFile"))
edbSqliteFile <- system.file(
"extdata",
"Homo_sapiens.GRCh37.74.sqlite",
package="chimeraviz")
# Then load it
edb <- ensembldb::EnsDb(edbSqliteFile)
if(!exists("fusion5267and11759reads"))
fusion5267and11759reads <- system.file(
"extdata",
"fusion5267and11759reads.bam",
package = "chimeraviz")
plotFusion(
fusion = fusion,
bamfile = bamfile5267,
edb = edb,
nonUCSC = TRUE,
reduceTranscripts = TRUE)
Here the red arc that should connect the two exons of RCC1 and HENMT1 is not showing properly.
Here is my sessionInfo:
> sessionInfo()
R version 3.4.2 (2017-09-28)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: OS X El Capitan 10.11.6
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8
attached base packages:
[1] grid stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] chimeraviz_1.2.2 ensembldb_2.0.4 AnnotationFilter_1.0.0 GenomicFeatures_1.28.5 AnnotationDbi_1.38.2
[6] Biobase_2.36.2 Gviz_1.20.0 GenomicRanges_1.28.6 GenomeInfoDb_1.12.3 Biostrings_2.44.2
[11] XVector_0.16.0 IRanges_2.10.5 S4Vectors_0.14.7 BiocGenerics_0.22.1 BiocInstaller_1.26.1
loaded via a namespace (and not attached):
[1] ProtGenerics_1.8.0 bitops_1.0-6 matrixStats_0.52.2
[4] bit64_0.9-7 RColorBrewer_1.1-2 httr_1.3.1
[7] rprojroot_1.2 tools_3.4.2 backports_1.1.1
[10] R6_2.2.2 DT_0.2 rpart_4.1-11
[13] Hmisc_4.0-3 DBI_0.7 lazyeval_0.2.0
[16] colorspace_1.3-2 nnet_7.3-12 gridExtra_2.3
[19] bit_1.1-12 curl_3.0 compiler_3.4.2
[22] htmlTable_1.9 DelayedArray_0.2.7 rtracklayer_1.36.6
[25] scales_0.5.0 checkmate_1.8.4 readr_1.1.1
[28] RCircos_1.2.0 stringr_1.2.0 digest_0.6.12
[31] Rsamtools_1.28.0 foreign_0.8-69 rmarkdown_1.6
[34] Rsubread_1.26.1 base64enc_0.1-3 dichromat_2.0-0
[37] pkgconfig_2.0.1 htmltools_0.3.6 BSgenome_1.44.2
[40] htmlwidgets_0.9 rlang_0.1.2 RSQLite_2.0
[43] shiny_1.0.5 BiocParallel_1.10.1 acepack_1.4.1
[46] VariantAnnotation_1.22.3 RCurl_1.95-4.8 magrittr_1.5
[49] GenomeInfoDbData_0.99.0 Formula_1.2-2 Matrix_1.2-11
[52] Rcpp_0.12.13 munsell_0.4.3 stringi_1.1.5
[55] yaml_2.1.14 SummarizedExperiment_1.6.5 zlibbioc_1.22.0
[58] org.Hs.eg.db_3.4.1 plyr_1.8.4 AnnotationHub_2.8.2
[61] blob_1.1.0 lattice_0.20-35 splines_3.4.2
[64] hms_0.3 knitr_1.17 biomaRt_2.32.1
[67] XML_3.98-1.9 evaluate_0.10.1 biovizBase_1.24.0
[70] latticeExtra_0.6-28 data.table_1.10.4-1 httpuv_1.3.5
[73] gtable_0.2.0 ggplot2_2.2.1 mime_0.5
[76] xtable_1.8-2 ArgumentCheck_0.10.2 survival_2.41-3
[79] tibble_1.3.4 GenomicAlignments_1.12.2 memoise_1.1.0
[82] cluster_2.0.6 interactiveDisplayBase_1.14.0 BiocStyle_2.4.1
Hello,
Here is the issue I am facing:
When I try to run "plotFusion" or "plotFusionTranscriptWithProteinDomain" without providing a BAM file, I get an error.
Here is the reproducible code and the error:
`
if(!exists("defuse833ke"))
defuse833ke <- system.file(
"extdata",
"defuse_833ke_results.filtered.tsv",
package = "chimeraviz")
fusions <- importDefuse(defuse833ke, "hg19", 1)
fusion <- getFusionById(fusions, 5267)
if(!exists("edbSqliteFile"))
edbSqliteFile <- system.file(
"extdata",
"Homo_sapiens.GRCh37.74.sqlite",
package="chimeraviz")
edb <- ensembldb::EnsDb(edbSqliteFile)
plotFusion(
fusion = fusion,
edb = edb,
nonUCSC = TRUE)
Error:
.validatePlotFusionParams(fusion, edb, bamfile, whichTranscripts,
ylim, nonUCSC, reduceTranscripts, bedgraphfile)
1: Either 'bamfile' or 'bedgraphfile' must be given
`
Is this normal? I saw in your vignette (and your code) that the coverage seems to be optional. Am I mistaken?
fusion-finder:
FusionCatcher 0.99.7c beta
`
sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.1 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
locale:
[1] LC_CTYPE=en_CA.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_CA.UTF-8 LC_COLLATE=en_CA.UTF-8
[5] LC_MONETARY=en_CA.UTF-8 LC_MESSAGES=en_CA.UTF-8
[7] LC_PAPER=en_CA.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] grid stats4 parallel stats graphics grDevices utils
[8] datasets methods base
other attached packages:
[1] chimeraviz_1.4.3 ensembldb_2.2.2 AnnotationFilter_1.2.0
[4] GenomicFeatures_1.30.3 AnnotationDbi_1.40.0 Biobase_2.38.0
[7] Gviz_1.22.3 GenomicRanges_1.30.3 GenomeInfoDb_1.14.0
[10] Biostrings_2.46.0 XVector_0.18.0 IRanges_2.12.0
[13] S4Vectors_0.16.0 BiocGenerics_0.24.0
loaded via a namespace (and not attached):
[1] ProtGenerics_1.10.0 bitops_1.0-6
[3] matrixStats_0.54.0 bit64_0.9-7
[5] RColorBrewer_1.1-2 progress_1.2.0
[7] httr_1.3.1 rprojroot_1.3-2
[9] tools_3.4.4 backports_1.1.2
[11] DT_0.5 R6_2.3.0
[13] rpart_4.1-13 Hmisc_4.1-1
[15] DBI_1.0.0 lazyeval_0.2.1
[17] colorspace_1.3-2 nnet_7.3-12
[19] tidyselect_0.2.5 gridExtra_2.3
[21] prettyunits_1.0.2 RMySQL_0.10.15
[23] curl_3.2 bit_1.1-14
[25] compiler_3.4.4 htmlTable_1.12
[27] DelayedArray_0.4.1 rtracklayer_1.38.3
[29] scales_1.0.0 checkmate_1.8.5
[31] readr_1.1.1 RCircos_1.2.0
[33] stringr_1.3.1 digest_0.6.18
[35] Rsamtools_1.30.0 foreign_0.8-71
[37] rmarkdown_1.10 base64enc_0.1-3
[39] dichromat_2.0-0 pkgconfig_2.0.2
[41] htmltools_0.3.6 BSgenome_1.46.0
[43] htmlwidgets_1.3 rlang_0.3.0.1
[45] rstudioapi_0.8 RSQLite_2.1.1
[47] BiocInstaller_1.28.0 shiny_1.2.0
[49] bindr_0.1.1 BiocParallel_1.12.0
[51] acepack_1.4.1 dplyr_0.7.8
[53] VariantAnnotation_1.24.5 RCurl_1.95-4.11
[55] magrittr_1.5 GenomeInfoDbData_1.0.0
[57] Formula_1.2-3 Matrix_1.2-15
[59] Rcpp_1.0.0 munsell_0.5.0
[61] stringi_1.2.4 yaml_2.2.0
[63] SummarizedExperiment_1.8.1 zlibbioc_1.24.0
[65] org.Hs.eg.db_3.5.0 plyr_1.8.4
[67] AnnotationHub_2.10.1 blob_1.1.1
[69] promises_1.0.1 crayon_1.3.4
[71] lattice_0.20-38 splines_3.4.4
[73] hms_0.4.2 knitr_1.20
[75] pillar_1.3.0 biomaRt_2.34.2
[77] XML_3.98-1.16 glue_1.3.0
[79] evaluate_0.12 biovizBase_1.26.0
[81] latticeExtra_0.6-28 data.table_1.11.8
[83] httpuv_1.4.5 gtable_0.2.0
[85] purrr_0.2.5 assertthat_0.2.0
[87] ggplot2_3.1.0 mime_0.6
[89] xtable_1.8-3 later_0.7.5
[91] ArgumentCheck_0.10.2 survival_2.43-1
[93] tibble_1.4.2 GenomicAlignments_1.14.2
[95] memoise_1.1.0 bindrcpp_0.2.2
[97] cluster_2.0.7-1 interactiveDisplayBase_1.16.0
[99] BiocStyle_2.6.1
`
OS :
Ubuntu 18.04.1 LTS (Bionic Beaver)
Francois
PS. Thank you for chimeraviz
Ref bcgsc/pavfinder#6.
Hi, is there a way to change the font for the circos plots? They look good, but it would be great if there was a way to change the font size. Thanks.
Hi,
I am trying to import the results of STAR-Fusion (STAR version = STAR_2.5.3a) and am facing some issues:
library(chimeraviz)
# Load the STAR-Fusion results file into a list of fusion objects
sf <- importStarfusion(filename = 'star-fusion.fusion_predictions.tsv', "hg19", 10)
# Select the fusion of interest by gene name, then by id
fusion <- getFusionByGeneName(sf, geneName = 'KANSL1')
fusion <- getFusionById(fusion, id = '3')
# read fastq files with reads for that fusion
fastq1 <- 'SRR1559088_fusion_R1.fq'
fastq2 <- 'SRR1559088_fusion_R2.fq'
# extract the fusion junction sequence
referenceFilename <- "reference.fa"
writeFusionReference(fusion = fusion, filename = referenceFilename)
But the reference file is empty and there are no errors associated with it. Can you have a look at this when you get a chance?
Here is my sessionInfo:
> sessionInfo()
R version 3.4.2 (2017-09-28)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: OS X El Capitan 10.11.6
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8
attached base packages:
[1] grid stats4 parallel stats graphics grDevices utils datasets
[9] methods base
other attached packages:
[1] chimeraviz_1.5.4 ensembldb_2.2.2 AnnotationFilter_1.2.0
[4] GenomicFeatures_1.30.3 AnnotationDbi_1.40.0 Biobase_2.38.0
[7] Gviz_1.22.3 GenomicRanges_1.30.3 GenomeInfoDb_1.14.0
[10] Biostrings_2.46.0 XVector_0.18.0 IRanges_2.12.0
[13] S4Vectors_0.16.0 BiocGenerics_0.24.0 BiocInstaller_1.28.0
loaded via a namespace (and not attached):
[1] ProtGenerics_1.10.0 bitops_1.0-6
[3] matrixStats_0.53.0 devtools_1.13.4
[5] bit64_0.9-7 RColorBrewer_1.1-2
[7] progress_1.1.2 httr_1.3.1
[9] rprojroot_1.3-2 tools_3.4.2
[11] backports_1.1.2 utf8_1.1.3
[13] DT_0.4.4 R6_2.2.2
[15] rpart_4.1-12 Hmisc_4.1-1
[17] DBI_0.7 lazyeval_0.2.1
[19] colorspace_1.3-2 nnet_7.3-12
[21] withr_2.1.1.9000 gridExtra_2.3
[23] prettyunits_1.0.2 RMySQL_0.10.13
[25] git2r_0.21.0 bit_1.1-12
[27] curl_3.1 compiler_3.4.2
[29] cli_1.0.0 htmlTable_1.11.2
[31] DelayedArray_0.4.1 rtracklayer_1.38.3
[33] scales_0.5.0.9000 checkmate_1.8.5
[35] readr_1.1.1 RCircos_1.2.0
[37] stringr_1.2.0 digest_0.6.15
[39] Rsamtools_1.30.0 foreign_0.8-69
[41] rmarkdown_1.8 Rsubread_1.28.1
[43] pkgconfig_2.0.1 base64enc_0.1-3
[45] dichromat_2.0-0 htmltools_0.3.6
[47] highr_0.6 BSgenome_1.46.0
[49] htmlwidgets_1.0 rlang_0.1.6.9003
[51] rstudioapi_0.7 RSQLite_2.0
[53] shiny_1.0.5 BiocParallel_1.12.0
[55] acepack_1.4.1 VariantAnnotation_1.24.5
[57] RCurl_1.95-4.10 magrittr_1.5
[59] GenomeInfoDbData_1.0.0 Formula_1.2-2
[61] Matrix_1.2-12 Rcpp_0.12.15
[63] munsell_0.4.3 stringi_1.1.6
[65] yaml_2.1.16 SummarizedExperiment_1.8.1
[67] zlibbioc_1.24.0 org.Hs.eg.db_3.5.0
[69] plyr_1.8.4 AnnotationHub_2.10.1
[71] blob_1.1.0 crayon_1.3.4
[73] lattice_0.20-35 splines_3.4.2
[75] hms_0.4.1 knitr_1.19
[77] pillar_1.1.0 biomaRt_2.34.2
[79] XML_3.98-1.9 evaluate_0.10.1
[81] biovizBase_1.26.0 latticeExtra_0.6-28
[83] data.table_1.10.4-3 httpuv_1.3.5
[85] org.Mm.eg.db_3.5.0 gtable_0.2.0
[87] assertthat_0.2.0 ggplot2_2.2.1.9000
[89] mime_0.5 xtable_1.8-2
[91] ArgumentCheck_0.10.2 survival_2.41-3
[93] tibble_1.4.2 GenomicAlignments_1.14.1
[95] memoise_1.1.0 cluster_2.0.6
[97] interactiveDisplayBase_1.16.0 BiocStyle_2.6.1
STAR output files (there are four .tsv files from STAR output, you can check all four), fastq files and reference output (empty) are attached here:
https://drive.google.com/drive/u/0/folders/1NTk0PZRMlH1KFD3xHg8_UXu5m1lmGyKj
Thanks!
Hey Stian,
Below is a description of the issue I am getting.
I am using your suggested numbered list of items to include but starting with the code (4):
fusions<-import_soapfuse("https://github.com/stianlagstad/chimeraviz/files/4112483/tcga_annot_soapfuse_format.txt", "hg38")
edb <- EnsDb.Hsapiens.v86
plot_fusion(
fusion = get_fusion_by_id(fusions, 15),
edb = edb)
The fusion with ID 15 in "fusions" is:
[1] "Fusion object"
[1] "id: 15"
[1] "Fusion tool: soapfuse"
[1] "Genome version: hg38"
[1] "Gene names: SFPQ-TFE3"
[1] "Chromosomes: chr1-chrX"
[1] "Strands: -,-"
[1] "In-frame?: NA"
"fusions" has 36 other fusions and most of them run fine.
The ones that cause an error are those that have "SFPQ" or "CADM2" as one of the genes.
I am getting the following error:
Fetching transcripts for gene partners..
'select()' returned 1:many mapping between keys and columns
..transcripts fetched.
Fusion is interchromosomal. Plot separate!
Fetching transcripts for gene partners..
..transcripts fetched.
Error in select_transcript(fusion@gene_upstream, which_transcripts) :
genePartner has no transcripts. See get_transcripts_ensembl_db()
In addition: There were 14 warnings (use warnings() to see them)
warnings()
Warning messages:
1: In if (S4Vectors::mcols(gr)$gene_id[[1]] == fusion@gene_upstream@ensembl_id) { ... :
the condition has length > 1 and only the first element will be used
2: In if (S4Vectors::mcols(gr)$gene_id[[1]] == fusion@gene_upstream@ensembl_id) { ... :
the condition has length > 1 and only the first element will be used
3: In if (S4Vectors::mcols(gr)$gene_id[[1]] == fusion@gene_upstream@ensembl_id) { ... :
the condition has length > 1 and only the first element will be used
4: In if (S4Vectors::mcols(gr)$gene_id[[1]] == fusion@gene_upstream@ensembl_id) { ... :
the condition has length > 1 and only the first element will be used
5: In if (S4Vectors::mcols(gr)$gene_id[[1]] == fusion@gene_upstream@ensembl_id) { ... :
the condition has length > 1 and only the first element will be used
6: In get_transcripts_ensembl_db(fusion, edb) :
No transcripts available for the upstream gene SFPQ available.
7: In get_transcripts_ensembl_db(fusion, edb) :
No transcripts available for the downstream gene TFE3 available.
8: In if (S4Vectors::mcols(gr)$gene_id[[1]] == fusion@gene_upstream@ensembl_id) { ... :
the condition has length > 1 and only the first element will be used
9: In if (S4Vectors::mcols(gr)$gene_id[[1]] == fusion@gene_upstream@ensembl_id) { ... :
the condition has length > 1 and only the first element will be used
10: In if (S4Vectors::mcols(gr)$gene_id[[1]] == fusion@gene_upstream@ensembl_id) { ... :
the condition has length > 1 and only the first element will be used
11: In if (S4Vectors::mcols(gr)$gene_id[[1]] == fusion@gene_upstream@ensembl_id) { ... :
the condition has length > 1 and only the first element will be used
12: In if (S4Vectors::mcols(gr)$gene_id[[1]] == fusion@gene_upstream@ensembl_id) { ... :
the condition has length > 1 and only the first element will be used
13: In get_transcripts_ensembl_db(fusion, edb) :
No transcripts available for the upstream gene SFPQ available.
14: In get_transcripts_ensembl_db(fusion, edb) :
No transcripts available for the downstream gene TFE3 available.
I also tried filtering the edb object or supplying the transcripts directly, but that doesn't work either.
I also checked the edb object and it does have information for "SFPQ" and "CADM2", so I can't tell why the function fails only for these 2 genes.
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252 LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C LC_TIME=English_United Kingdom.1252
attached base packages:
[1] grid stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] AnnotationHub_2.18.0 BiocFileCache_1.10.2 dbplyr_1.4.2 EnsDb.Hsapiens.v86_2.99.0 chimeraviz_1.12.0
[6] ensembldb_2.10.2 AnnotationFilter_1.10.0 GenomicFeatures_1.38.0 Gviz_1.30.0 biomaRt_2.42.0
[11] dendsort_0.3.3 metaseqR_1.26.0 qvalue_2.18.0 limma_3.42.0 DESeq_1.38.0
[16] locfit_1.5-9.1 EDASeq_2.20.0 ShortRead_1.44.1 GenomicAlignments_1.22.1 SummarizedExperiment_1.16.1
[21] DelayedArray_0.12.1 matrixStats_0.55.0 Rsamtools_2.2.1 GenomicRanges_1.38.0 GenomeInfoDb_1.22.0
[26] Biostrings_2.54.0 XVector_0.26.0 BiocParallel_1.20.1 reshape2_1.4.3 Hmisc_4.3-0
[31] Formula_1.2-3 lattice_0.20-38 viridis_0.5.1 viridisLite_0.3.0 RColorBrewer_1.1-2
[36] pheatmap_1.0.12 psych_1.9.12 survminer_0.4.6 ggpubr_0.2.4 magrittr_1.5
[41] survival_3.1-8 table1_1.1 msigdbr_7.0.1 GSVA_1.34.0 GSEABase_1.48.0
[46] graph_1.64.0 annotate_1.64.0 XML_3.98-1.20 AnnotationDbi_1.48.0 IRanges_2.20.1
[51] S4Vectors_0.24.1 Biobase_2.46.0 BiocGenerics_0.32.0 broom_0.5.3 ggrepel_0.8.1
[56] gmodels_2.18.1 BH_1.72.0-2 data.table_1.12.8 forcats_0.4.0 stringr_1.4.0
[61] purrr_0.3.3 readr_1.3.1 tidyr_1.0.0 tibble_2.1.3 ggplot2_3.2.1
[66] tidyverse_1.3.0 dplyr_0.8.3
loaded via a namespace (and not attached):
[1] rappdirs_0.3.1 rtracklayer_1.46.0 R.methodsS3_1.7.1 acepack_1.4.1
[5] bit64_0.9-7 knitr_1.26 aroma.light_3.16.0 R.utils_2.9.2
[9] rpart_4.1-15 hwriter_1.3.2 RCurl_1.95-4.12 generics_0.0.2
[13] org.Mm.eg.db_3.10.0 preprocessCore_1.48.0 RSQLite_2.1.5 bit_1.1-14
[17] BiocStyle_2.14.2 xml2_1.2.2 lubridate_1.7.4 httpuv_1.5.2
[21] assertthat_0.2.1 xfun_0.11 hms_0.5.2 evaluate_0.14
[25] promises_1.1.0 fansi_0.4.0 progress_1.2.2 caTools_1.17.1.3
[29] readxl_1.3.1 km.ci_0.5-2 DBI_1.1.0 geneplotter_1.64.0
[33] htmlwidgets_1.5.1 corrplot_0.84 backports_1.1.5 vctrs_0.2.1
[37] abind_1.4-5 log4r_0.3.1 withr_2.1.2 BSgenome_1.54.0
[41] checkmate_1.9.4 prettyunits_1.0.2 mnormt_1.5-5 cluster_2.1.0
[45] NBPSeq_0.3.0 lazyeval_0.2.2 crayon_1.3.4 genefilter_1.68.0
[49] edgeR_3.28.0 pkgconfig_2.0.3 nlme_3.1-143 ProtGenerics_1.18.0
[53] nnet_7.3-12 rlang_0.4.2 lifecycle_0.1.0 affyio_1.56.0
[57] modelr_0.1.5 dichromat_2.0-0 cellranger_1.1.0 Matrix_1.2-18
[61] KMsurv_0.1-5 zoo_1.8-6 reprex_0.3.0 base64enc_0.1-3
[65] png_0.1-7 rjson_0.2.20 bitops_1.0-6 NOISeq_2.30.0
[69] R.oo_1.23.0 KernSmooth_2.23-16 blob_1.2.0 brew_1.0-6
[73] jpeg_0.1-8.1 ggsignif_0.6.0 scales_1.1.0 memoise_1.1.0
[77] plyr_1.8.5 gplots_3.0.1.1 gdata_2.18.0 zlibbioc_1.32.0
[81] compiler_3.6.1 ArgumentCheck_0.10.2 cli_2.0.0 affy_1.64.0
[85] htmlTable_1.13.3 MASS_7.3-51.5 tidyselect_0.2.5 vsn_3.54.0
[89] stringi_1.4.3 yaml_2.2.0 askpass_1.1 latticeExtra_0.6-29
[93] survMisc_0.5.5 VariantAnnotation_1.32.0 tools_3.6.1 rstudioapi_0.10
[97] foreign_0.8-74 gridExtra_2.3 digest_0.6.23 BiocManager_1.30.10
[101] shiny_1.4.0 Rcpp_1.0.3 BiocVersion_3.10.1 later_1.0.0
[105] org.Hs.eg.db_3.10.0 httr_1.4.1 RCircos_1.2.1 biovizBase_1.34.1
[109] colorspace_1.4-1 rvest_0.3.5 fs_1.3.1 splines_3.6.1
[113] shinythemes_1.1.2 xtable_1.8-4 jsonlite_1.6 baySeq_2.20.0
[117] zeallot_0.1.0 R6_2.4.1 pillar_1.4.3 htmltools_0.4.0
[121] mime_0.8 glue_1.3.1 fastmap_1.0.1 DT_0.11
[125] interactiveDisplayBase_1.24.0 utf8_1.1.4 curl_4.3 gtools_3.8.1
[129] openssl_1.4.1 rmarkdown_2.0 munsell_0.5.0 GenomeInfoDbData_1.2.2
[133] haven_2.2.0 gtable_0.3.0
Fusion caller:
I am using some custom fusions derived from a multi-caller that I have reformatted to a soapfuse format.
OS:
Windows 10
This is just an FYI. You are currently exporting the rather odd RCircos.Env
object to the global environment to make your package work: https://github.com/stianlagstad/chimeraviz/blob/master/R/plot_circle.R#L249.
This is not particularly nice, since users now have this object floating around their global environment, and it makes R CMD check give you a NOTE. Here's a StackOverflow question with a simple alternative solution to the identical issue: https://stackoverflow.com/questions/56875962/r-package-transferring-environment-from-imported-package/56894153#56894153.
Hope this helps!
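For illustration, the general pattern of keeping state in a package-private environment instead of the global one can be sketched in base R as below. The names `.pkg_state`, `set_plot_state` and `get_plot_state` are hypothetical and are not part of chimeraviz or RCircos. Whether this alone is enough here depends on how RCircos looks up `RCircos.Env`, which the sketch does not address.

```r
# Hypothetical package-private environment. In a real package this would sit
# at the top level of a file under R/ and would not be exported.
.pkg_state <- new.env(parent = emptyenv())

# Store a value in the private environment instead of globalenv().
set_plot_state <- function(name, value) {
  assign(name, value, envir = .pkg_state)
  invisible(value)
}

# Retrieve a value; returns `default` if it was never set.
get_plot_state <- function(name, default = NULL) {
  if (exists(name, envir = .pkg_state, inherits = FALSE)) {
    get(name, envir = .pkg_state, inherits = FALSE)
  } else {
    default
  }
}

set_plot_state("params", list(char.width = 500))
get_plot_state("params")$char.width

# Nothing named "params" was created in the user's workspace:
exists("params", envir = globalenv(), inherits = FALSE)
```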
Hi, @stianlagstad !
I would like to know if there is a way to build a custom genome database, for example for plant genomes, and then pass it to import_jaffa through a new parameter.
I saw that you have added mm10 in addition to the initial human genomes. I think that with a genome/transcript FASTA and the corresponding GFF annotation file, we could make a database for any species.
It would also be great if you could give me some suggestions on where to start and which functions I should modify. I may then try it myself.
Hi Stian,
I am facing a different issue with another sample:
library(chimeraviz)
fc <- importFusioncatcher(filename = 'sample_out/final-list_candidate-fusion-genes.txt', genomeVersion = "hg38", limit = 100)
# circos plot
createFusionReport(fusions = fc, outputFilename = "sample_FusionCatcher_output.html")
# fusion of interest
fusion <- getFusionByGeneName(fc, geneName = 'RCC1')
fusion <- getFusionById(fc, id = '91')
# pull out reads from fusion catcher-STAR alignment
# split into fq1 and fq2
fastq1 <- 'sample_R1.fq'
fastq2 <- 'sample_R2.fq'
referenceFilename <- "sample_reference.fa"
writeFusionReference(fusion = fusion, filename = referenceFilename)
# First load the rsubread functions
source(system.file(
"scripts",
"rsubread.R",
package="chimeraviz"))
# Then create index
rsubreadIndex(referenceFasta = referenceFilename)
# And align
rsubreadAlign(
referenceName = referenceFilename,
fastq1 = fastq1,
fastq2 = fastq2,
outputBamFilename = "sample_fusionAlignment")
# plot reads
bamfile <- 'sample_fusionAlignment.bam'
fusion <- addFusionReadsAlignment(fusion, bamfile)
# getting an error on this step
plotFusionReads(fusion = fusion, showAllNucleotides = TRUE)
The plotFusionReads function is giving me this error:
Error in data.frame(x1 = start(pairGaps) - 1, y1 = gy, x2 = end(pairGaps) + :
arguments imply differing number of rows: 0, 1
I am attaching all required data and script here: https://drive.google.com/open?id=0B-8gQV1WZcYdNnJyWmROclI0UFE
Please look at this whenever you get some time.
Thanks for all your help!
Hi, I was going through your code for importing ericscript output and I think you may be naming the counts wrongly.
In the ericscript documentation it is stated:
| Column | Description |
|---|---|
| crossingreads | the number of paired end discordant reads. |
| spanningreads | the number of paired end reads spanning the junction. |
In your code, you are defining the variables like this:
# Number of supporting reads
split_reads_count <- report[[i, "crossingreads"]]
spanning_reads_count <- report[[i, "spanningreads"]]
Shouldn't it be reversed? As I understand it, a discordant read pair is a form of spanning read, where the two reads map to two different genes, while a split read is a read that covers the junction.
Or am I understanding it wrong?
Thank you very much :)
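If the reading above is right, the swap would look roughly like this. This is only a sketch: the one-row `report` data frame and its values are made up for illustration, and only the two column names and the two variable names come from the snippets quoted above.

```r
# Toy stand-in for one row of an EricScript results table (made-up values).
report <- data.frame(
  crossingreads = 12,  # discordant read pairs, per the EricScript documentation
  spanningreads = 5    # read pairs spanning the junction
)
i <- 1

# Mapping as proposed in this issue, i.e. the reverse of the current code:
spanning_reads_count <- report[[i, "crossingreads"]]
split_reads_count <- report[[i, "spanningreads"]]
```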
Hello,
I am trying to run chimeraviz (version 1.4.3) on some tsv files generated by deFuse (mouse RNA-seq data, mm10 genome version), but I am getting an error: Invalid genome version.
For some reason I am not able to update chimeraviz to the latest version. The problem seems to be that I cannot update R (currently version 3.4.4) and Bioconductor on this version of Ubuntu.
Any ideas on how to get around these problems?
Below is the requested information.
Thank you
sessionInfo()
.sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.4 LTS
Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] grid stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] chimeraviz_1.4.3 ensembldb_2.2.2 AnnotationFilter_1.2.0 GenomicFeatures_1.30.3 AnnotationDbi_1.40.0
[6] Biobase_2.38.0 Gviz_1.22.3 GenomicRanges_1.30.3 GenomeInfoDb_1.14.0 Biostrings_2.46.0
[11] XVector_0.18.0 IRanges_2.12.0 S4Vectors_0.16.0 BiocGenerics_0.24.0
loaded via a namespace (and not attached):
[1] ProtGenerics_1.10.0 bitops_1.0-6 matrixStats_0.53.1 bit64_0.9-7
[5] RColorBrewer_1.1-2 progress_1.2.0 httr_1.3.1 rprojroot_1.3-2
[9] tools_3.4.4 backports_1.1.2 DT_0.4 R6_2.2.2
[13] rpart_4.1-13 Hmisc_4.1-1 DBI_1.0.0 lazyeval_0.2.1
[17] colorspace_1.3-2 nnet_7.3-12 gridExtra_2.3 prettyunits_1.0.2
[21] RMySQL_0.10.15 bit_1.1-14 curl_3.2 compiler_3.4.4
[25] htmlTable_1.12 DelayedArray_0.4.1 rtracklayer_1.38.3 scales_0.5.0
[29] checkmate_1.8.5 readr_1.1.1 RCircos_1.2.0 stringr_1.3.1
[33] digest_0.6.15 Rsamtools_1.30.0 foreign_0.8-70 rmarkdown_1.10
[37] base64enc_0.1-3 dichromat_2.0-0 pkgconfig_2.0.1 htmltools_0.3.6
[41] BSgenome_1.46.0 htmlwidgets_1.2 rlang_0.2.1 rstudioapi_0.7
[45] RSQLite_2.1.1 BiocInstaller_1.28.0 shiny_1.1.0 BiocParallel_1.12.0
[49] acepack_1.4.1 VariantAnnotation_1.24.5 RCurl_1.95-4.10 magrittr_1.5
[53] GenomeInfoDbData_1.0.0 Formula_1.2-3 Matrix_1.2-14 Rcpp_0.12.17
[57] munsell_0.5.0 stringi_1.2.3 yaml_2.1.19 SummarizedExperiment_1.8.1
[61] zlibbioc_1.24.0 org.Hs.eg.db_3.5.0 plyr_1.8.4 AnnotationHub_2.10.1
[65] blob_1.1.1 promises_1.0.1 crayon_1.3.4 lattice_0.20-35
[69] splines_3.4.4 hms_0.4.2 knitr_1.20 pillar_1.2.3
[73] biomaRt_2.34.2 XML_3.98-1.11 evaluate_0.10.1 biovizBase_1.26.0
[77] latticeExtra_0.6-28 data.table_1.11.4 httpuv_1.4.4.1 gtable_0.2.0
[81] assertthat_0.2.0 ggplot2_2.2.1 mime_0.5 xtable_1.8-2
[85] later_0.7.3 survival_2.42-3 tibble_1.4.2 GenomicAlignments_1.14.2
[89] memoise_1.1.0 cluster_2.0.7-1 interactiveDisplayBase_1.16.0 BiocStyle_2.6.1
fusions <- importDefuse("results.filtered.tsv", "mm10", 100)
Error in importDefuse("results.filtered.tsv", "mm10", 100) :
Invalid genome version given
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5209911/
https://github.com/Chimera-tools/ChimPipe
The output of ChimPipe is described here: https://chimpipe.readthedocs.io/en/latest/manual.html#output
Hi Stian,
Is it possible to combine the results of different fusion tools in chimeraviz and then generate just one fusion report?
I.e., if I ran two fusion tools (EricScript and FusionCatcher), I can import the results using importEricscript() and importFusioncatcher(), but I would like to combine these two objects first and then create the HTML fusion report.
Thanks,
Keyur
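As a rough sketch of what combining could look like, assuming both import functions return ordinary R lists of fusion objects: plain lists can be concatenated with `c()`. The toy lists below only stand in for real import results, and whether the report function accepts a list that mixes tools has not been verified here.

```r
# Stand-ins for the results of importEricscript() and importFusioncatcher().
# Each element mimics one fusion with an id and the tool that reported it.
ericscript_fusions <- list(
  list(id = "1", tool = "ericscript"),
  list(id = "2", tool = "ericscript")
)
fusioncatcher_fusions <- list(
  list(id = "1", tool = "fusioncatcher")
)

# Plain lists can be concatenated with c().
combined <- c(ericscript_fusions, fusioncatcher_fusions)

# Ids assigned by different tools can collide, so check before relying on them.
ids <- vapply(combined, function(f) f$id, character(1))
length(combined)
anyDuplicated(ids) > 0
```

If ids do collide, looking up a fusion by id in the combined list would be ambiguous, so they would probably need to be made unique first.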
Hi,
As of v0.8.0 of STAR-Fusion, it would seem that the column headers of the 'star-fusion.fusion_candidates.final.abridged' file have changed, which causes an error when trying to import the file using importStarfusion.R:
Reading filename caused a warning:
Error in vector("list", dim(report)[1]) : invalid 'length' argument
In addition: Warning message:
The following named parsers don't match the column names: #fusion_name, JunctionReads, SpanningFrags, Splice_type
It looks like in this version of STAR-Fusion, the column headers and data are as follows:
#FusionName JunctionReadCount SpanningFragCount SpliceType LeftGene LeftBreakpoint RightGene RightBreakpoint LargeAnchorSupport LeftBreakDinuc LeftBreakEntropy RightBreakDinuc RightBreakEntropy
FGFR3--AC016773.1 5051 0 INCL_NON_REF_SPLICE FGFR3^ENSG00000068078.13 chr4:1808661:+ AC016773.1^ENSG00000218422.2 chr4:1741429:+ YES_LDAS GT 1.8892 AG 1.7819
FGFR3--TACC3 4014 0 ONLY_REF_SPLICE FGFR3^ENSG00000068078.13 chr4:1808661:+ TACC3^ENSG00000013810.14 chr4:1741429:+ YES_LDAS GT 1.8892 AG 1.7819
FGFR3--TACC3 1033 0 INCL_NON_REF_SPLICE FGFR3^ENSG00000068078.13 chr4:1808663:+ TACC3^ENSG00000013810.14 chr4:1741432:+ YES_LDAS GA 1.8892 TA 1.5219
TBL1XR1--PIK3CA 8 0 ONLY_REF_SPLICE TBL1XR1^ENSG00000177565.11 chr3:176914909:- PIK3CA^ENSG00000121879.3 chr3:178916538:+ YES_LDAS GT 1.8892 AG 1.6895
BHLHE40--MT-CYB 3 0 INCL_NON_REF_SPLICE BHLHE40^ENSG00000134107.4 chr3:5025126:+ MT-CYB^ENSG00000198727.2 chrM:14815:+ YES_LDAS CT 1.7465 AC 1.4566
I manually edited the file and verified that import is successful if I change the column headers to match what the function is expecting.
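The manual edit can be scripted. The sketch below rewrites only the header line, mapping the four new names to the four names listed in the warning above. The helper name `rename_starfusion_header` is hypothetical, and the sketch assumes the remaining column names are already what the importer expects.

```r
# New STAR-Fusion column names (left) mapped to the names that the importer's
# warning says it expects (right).
header_map <- c(
  "#FusionName"       = "#fusion_name",
  "JunctionReadCount" = "JunctionReads",
  "SpanningFragCount" = "SpanningFrags",
  "SpliceType"        = "Splice_type"
)

# Rewrite only the header line of a tab-separated file; data rows are untouched.
rename_starfusion_header <- function(infile, outfile) {
  lines <- readLines(infile)
  header <- strsplit(lines[1], "\t", fixed = TRUE)[[1]]
  hit <- header %in% names(header_map)
  header[hit] <- header_map[header[hit]]
  lines[1] <- paste(header, collapse = "\t")
  writeLines(lines, outfile)
  invisible(outfile)
}

# Small demonstration on a temporary file.
infile <- tempfile(fileext = ".txt")
outfile <- tempfile(fileext = ".txt")
writeLines(c(
  "#FusionName\tJunctionReadCount\tSpanningFragCount\tSpliceType\tLeftGene",
  "FGFR3--TACC3\t4014\t0\tONLY_REF_SPLICE\tFGFR3^ENSG00000068078.13"
), infile)
rename_starfusion_header(infile, outfile)
readLines(outfile)[1]
```

The rewritten file could then be passed to the import function in place of the original.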
The coverage isn't showing in the fusion plot when `reduceTranscripts = TRUE`. See the plot here.
My R Command:
if (!require("BiocManager", quietly = TRUE))
install.packages("BiocManager")
# The following initializes usage of Bioc devel
BiocManager::install(version='devel')
BiocManager::install("chimeraviz")
library(chimeraviz)
library(RCircos)
starfusionData <- system.file(
"extdata",
"star-fusion.fusion_candidates.final.abridged.txt",
package = "chimeraviz")
fusions <- import_starfusion(starfusionData, "hg19", 3)
length(fusions)
plot_fusion(
fusions,
edb = NULL,
bamfile = NULL,
which_transcripts = "exonBoundary",
ylim = c(0, 1000),
non_ucsc = TRUE,
reduce_transcripts = FALSE,
bedgraphfile = NULL
)
plot_circle(fusions)
create_fusion_report(fusions, "output.html")
This is the output I got:
> plot_fusion(
+ fusions,
+ edb = NULL,
+ bamfile = NULL,
+ which_transcripts = "exonBoundary",
+ ylim = c(0, 1000),
+ non_ucsc = TRUE,
+ reduce_transcripts = FALSE,
+ bedgraphfile = NULL
+ )
Error in h(simpleError(msg, call)) :
error in evaluating the argument 'x' in selecting a method for function 'isEmpty': trying to get slot "gene_upstream" from an object of a basic class ("list") with no slots
>
> plot_circle(fusions)
RCircos.Core.Components initialized.
Type ?RCircos.Reset.Plot.Parameters to see how to modify the core components.
Not all labels will be plotted.
Type RCircos.Get.Gene.Name.Plot.Parameters()
to see the number of labels for each chromosome.
Not all labels will be plotted.
Type RCircos.Get.Gene.Name.Plot.Parameters()
to see the number of labels for each chromosome.
>
> create_fusion_report(fusions, "output.html")
RCircos.Core.Components initialized.
Type ?RCircos.Reset.Plot.Parameters to see how to modify the core components.
Not all labels will be plotted.
Type RCircos.Get.Gene.Name.Plot.Parameters()
to see the number of labels for each chromosome.
Not all labels will be plotted.
Type RCircos.Get.Gene.Name.Plot.Parameters()
to see the number of labels for each chromosome.
Although I got the figure, I have 2 questions:
question1:
Error in h(simpleError(msg, call)) :
error in evaluating the argument 'x' in selecting a method for function 'isEmpty': trying to get slot "gene_upstream" from an object of a basic class ("list") with no slots
How to solve that?
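The error text suggests that a whole list reached code that expects a single fusion object. The base R sketch below reproduces that kind of failure with a toy S4 class; `ToyFusion` is hypothetical and is not the chimeraviz class. In chimeraviz the analogous step would presumably be to pass one fusion, for example one picked with get_fusion_by_id(), but that has not been tested here.

```r
library(methods)

# Toy S4 class standing in for a fusion object; not the chimeraviz class.
setClass("ToyFusion", representation(gene_upstream = "character"))
toy_fusions <- list(
  new("ToyFusion", gene_upstream = "GENE_A"),
  new("ToyFusion", gene_upstream = "GENE_B")
)

# Accessing a slot on the list itself raises an error
# (the exact wording varies between R versions).
msg <- tryCatch(
  toy_fusions@gene_upstream,
  error = function(e) conditionMessage(e)
)
msg

# Accessing the slot on a single element works.
toy_fusions[[1]]@gene_upstream
```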
question2:
I have 300 fusions and I want all of them plotted in the figure, but
it says "Not all labels will be plotted.".
I tried
params <- RCircos.Get.Plot.Parameters() #$char.width
params$char.width <- 100 #default 500
RCircos.Reset.Plot.Parameters(params)
#the maxLabels are updated accordingly
RCircos.Get.Gene.Name.Plot.Parameters()
but the results do not change.
How to solve that?
Thanks so much!
yubau
ArgumentCheck has been deprecated, which means that chimeraviz cannot be installed using BiocManager anymore:
https://cran.rstudio.com/web/packages/ArgumentCheck/index.html
Migration instructions are available here: https://github.com/cran/ArgumentCheck
Hi!
Thanks for the convenient package to visualize gene fusions!
I've come across an error whilst trying to visualize protein domains. The downstream gene seems to work fine but for the upstream gene only lines with start coordinates as label are displayed instead of proper protein domain blocks.
Looking at the code of plot_fusion_transcript_with_protein_domain.R it is quite easily seen where the bug is, if you compare upstream/downstream gene code, because start coordinates are used instead of end coordinates and instead of the description label. (lines 778 and 788)
Could you fix this?
Best regards,
Ianthe van Belzen
R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin18.7.0 (64-bit)
Running under: macOS Mojave 10.14.6
chimeraviz_1.16.0
Hi,
trying out the examples in: http://www.bioconductor.org/packages/release/bioc/vignettes/chimeraviz/inst/doc/chimeraviz-vignette.html
And when I get to the example at 4.2 The fusion report, the output.html file never generates...
if(!exists("soapfuse833ke"))
soapfuse833ke <- system.file(
"extdata",
"soapfuse_833ke_final.Fusion.specific.for.genes",
package = "chimeraviz")
fusions <- import_soapfuse(soapfuse833ke, "hg38", 10)
# Create report!
create_fusion_report(fusions, "output.html")
No file in the getwd() directory....
Also tried the example in https://www.bioconductor.org/packages/devel/bioc/manuals/chimeraviz/man/chimeraviz.pdf
defuse833ke <- system.file("extdata","defuse_833ke_results.filtered.tsv",package="chimeraviz")
fusions <- import_defuse(defuse833ke, "hg19", 3)
# Temporary file to store the report
output_filename <- tempfile(pattern = "fusionReport",fileext = ".html",tmpdir = tempdir())
# Create report
create_fusion_report(fusions, output_filename)
Nothing in the "output_filename" temp directory.
The rest of the examples work.
Any tips to start to debug this ?
Thanks
B.
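One generic way to start narrowing this down is to wrap the call so that errors and warnings are captured and the output path is checked explicitly. The helper below is a base R sketch; the name `check_report_written` is hypothetical, and the two demo calls use stand-in functions, not chimeraviz.

```r
# Run a function that should write a file at `path`, then report whether the
# file exists along with any error or warnings raised on the way.
check_report_written <- function(write_fn, path) {
  warnings_seen <- character(0)
  error_seen <- NULL
  withCallingHandlers(
    tryCatch(
      write_fn(path),
      error = function(e) error_seen <<- conditionMessage(e)
    ),
    warning = function(w) {
      warnings_seen <<- c(warnings_seen, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  list(
    path = normalizePath(path, mustWork = FALSE),
    exists = file.exists(path),
    error = error_seen,
    warnings = warnings_seen
  )
}

# Demo with stand-in writers: one succeeds, one fails with a simulated error.
ok <- check_report_written(
  function(p) writeLines("<html></html>", p),
  tempfile(fileext = ".html")
)
bad <- check_report_written(
  function(p) stop("simulated failure"),
  tempfile(fileext = ".html")
)
ok$exists
bad$error
```

With the example above, the call would be wrapped as `check_report_written(function(p) create_fusion_report(fusions, p), "output.html")`, and the returned `path` shows where the file was expected.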
=====
> sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.3
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
attached base packages:
[1] grid stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] chimeraviz_1.8.5 data.table_1.12.0 ensembldb_2.6.7 AnnotationFilter_1.6.0 GenomicFeatures_1.34.7
[6] AnnotationDbi_1.44.0 Biobase_2.42.0 Gviz_1.26.5 GenomicRanges_1.34.0 GenomeInfoDb_1.18.2
[11] Biostrings_2.50.2 XVector_0.22.0 IRanges_2.16.0 S4Vectors_0.20.1 BiocGenerics_0.28.0
loaded via a namespace (and not attached):
[1] ProtGenerics_1.14.0 bitops_1.0-6 matrixStats_0.54.0 bit64_0.9-7
[5] RColorBrewer_1.1-2 progress_1.2.0 httr_1.4.0 Rgraphviz_2.26.0
[9] tools_3.5.3 backports_1.1.3 DT_0.5 R6_2.4.0
[13] rpart_4.1-13 Hmisc_4.2-0 DBI_1.0.0 lazyeval_0.2.2
[17] colorspace_1.4-1 nnet_7.3-12 tidyselect_0.2.5 gridExtra_2.3
[21] prettyunits_1.0.2 bit_1.1-14 curl_3.3 compiler_3.5.3
[25] graph_1.60.0 htmlTable_1.13.1 DelayedArray_0.8.0 rtracklayer_1.42.2
[29] scales_1.0.0 checkmate_1.9.1 RCircos_1.2.1 stringr_1.4.0
[33] digest_0.6.18 Rsamtools_1.34.1 foreign_0.8-71 rmarkdown_1.12
[37] base64enc_0.1-3 dichromat_2.0-0 pkgconfig_2.0.2 htmltools_0.3.6
[41] BSgenome_1.50.0 htmlwidgets_1.3 rlang_0.3.2 rstudioapi_0.10
[45] RSQLite_2.1.1 shiny_1.2.0 jsonlite_1.6 crosstalk_1.0.0
[49] gtools_3.8.1 BiocParallel_1.16.6 acepack_1.4.1 dplyr_0.8.0.1
[53] VariantAnnotation_1.28.13 RCurl_1.95-4.12 magrittr_1.5 GenomeInfoDbData_1.2.0
[57] Formula_1.2-3 Matrix_1.2-15 Rcpp_1.0.1 munsell_0.5.0
[61] stringi_1.4.3 yaml_2.2.0 SummarizedExperiment_1.12.0 zlibbioc_1.28.0
[65] org.Hs.eg.db_3.7.0 plyr_1.8.4 blob_1.1.1 promises_1.0.1
[69] crayon_1.3.4 lattice_0.20-38 splines_3.5.3 hms_0.4.2
[73] knitr_1.22 pillar_1.3.1 biomaRt_2.38.0 XML_3.98-1.19
[77] glue_1.3.1 evaluate_0.13 biovizBase_1.30.1 latticeExtra_0.6-28
[81] BiocManager_1.30.4 httpuv_1.5.0 org.Mm.eg.db_3.7.0 gtable_0.3.0
[85] purrr_0.3.2 assertthat_0.2.1 ggplot2_3.1.0 xfun_0.5
[89] mime_0.6 xtable_1.8-3 later_0.8.0 ArgumentCheck_0.10.2
[93] survival_2.43-3 tibble_2.1.1 GenomicAlignments_1.18.1 memoise_1.1.0
[97] cluster_2.0.7-1 BiocStyle_2.10.0
>
Hello Stian,
I am trying to run your plot_fusion_transcript_with_protein_domain without a BAM file but I am getting an error.
These are the code and data I am using:
plot_fusion_transcript_with_protein_domain(
fusion = get_fusion_by_id(fusions, 14),
edb = edb,
bedfile = "full_enst_to_pfam_map_format.txt",
gene_upstream_transcript = "ENST00000377604",
gene_downstream_transcript = "ENST00000315869",
plot_downstream_protein_domains_if_fusion_is_out_of_frame = TRUE)
And this is the error I am getting:
Error in file.exists(bamfile) : invalid 'file' argument
Is there a way to run this if I do not want to plot the coverage?
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252 LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C LC_TIME=English_United Kingdom.1252
attached base packages:
[1] grid stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] AnnotationHub_2.18.0 BiocFileCache_1.10.2 dbplyr_1.4.2 EnsDb.Hsapiens.v86_2.99.0 chimeraviz_1.12.0
[6] ensembldb_2.10.2 AnnotationFilter_1.10.0 GenomicFeatures_1.38.0 Gviz_1.30.0 biomaRt_2.42.0
[11] dendsort_0.3.3 metaseqR_1.26.0 qvalue_2.18.0 limma_3.42.0 DESeq_1.38.0
[16] locfit_1.5-9.1 EDASeq_2.20.0 ShortRead_1.44.1 GenomicAlignments_1.22.1 SummarizedExperiment_1.16.1
[21] DelayedArray_0.12.1 matrixStats_0.55.0 Rsamtools_2.2.1 GenomicRanges_1.38.0 GenomeInfoDb_1.22.0
[26] Biostrings_2.54.0 XVector_0.26.0 BiocParallel_1.20.1 reshape2_1.4.3 Hmisc_4.3-0
[31] Formula_1.2-3 lattice_0.20-38 viridis_0.5.1 viridisLite_0.3.0 RColorBrewer_1.1-2
[36] pheatmap_1.0.12 psych_1.9.12 survminer_0.4.6 ggpubr_0.2.4 magrittr_1.5
[41] survival_3.1-8 table1_1.1 msigdbr_7.0.1 GSVA_1.34.0 GSEABase_1.48.0
[46] graph_1.64.0 annotate_1.64.0 XML_3.98-1.20 AnnotationDbi_1.48.0 IRanges_2.20.1
[51] S4Vectors_0.24.1 Biobase_2.46.0 BiocGenerics_0.32.0 broom_0.5.3 ggrepel_0.8.1
[56] gmodels_2.18.1 BH_1.72.0-2 data.table_1.12.8 forcats_0.4.0 stringr_1.4.0
[61] purrr_0.3.3 readr_1.3.1 tidyr_1.0.0 tibble_2.1.3 ggplot2_3.2.1
[66] tidyverse_1.3.0 dplyr_0.8.3
loaded via a namespace (and not attached):
[1] rappdirs_0.3.1 rtracklayer_1.46.0 R.methodsS3_1.7.1 acepack_1.4.1
[5] bit64_0.9-7 knitr_1.26 aroma.light_3.16.0 R.utils_2.9.2
[9] rpart_4.1-15 hwriter_1.3.2 RCurl_1.95-4.12 generics_0.0.2
[13] org.Mm.eg.db_3.10.0 preprocessCore_1.48.0 RSQLite_2.1.5 bit_1.1-14
[17] BiocStyle_2.14.2 xml2_1.2.2 lubridate_1.7.4 httpuv_1.5.2
[21] assertthat_0.2.1 xfun_0.11 hms_0.5.2 evaluate_0.14
[25] promises_1.1.0 fansi_0.4.0 progress_1.2.2 caTools_1.17.1.3
[29] readxl_1.3.1 Rgraphviz_2.30.0 km.ci_0.5-2 DBI_1.1.0
[33] geneplotter_1.64.0 htmlwidgets_1.5.1 corrplot_0.84 backports_1.1.5
[37] vctrs_0.2.1 abind_1.4-5 log4r_0.3.1 withr_2.1.2
[41] BSgenome_1.54.0 checkmate_1.9.4 prettyunits_1.0.2 mnormt_1.5-5
[45] cluster_2.1.0 NBPSeq_0.3.0 lazyeval_0.2.2 crayon_1.3.4
[49] genefilter_1.68.0 edgeR_3.28.0 pkgconfig_2.0.3 nlme_3.1-143
[53] ProtGenerics_1.18.0 nnet_7.3-12 rlang_0.4.2 lifecycle_0.1.0
[57] affyio_1.56.0 modelr_0.1.5 dichromat_2.0-0 cellranger_1.1.0
[61] Matrix_1.2-18 KMsurv_0.1-5 zoo_1.8-6 reprex_0.3.0
[65] base64enc_0.1-3 png_0.1-7 rjson_0.2.20 bitops_1.0-6
[69] NOISeq_2.30.0 R.oo_1.23.0 KernSmooth_2.23-16 blob_1.2.0
[73] brew_1.0-6 jpeg_0.1-8.1 ggsignif_0.6.0 scales_1.1.0
[77] memoise_1.1.0 plyr_1.8.5 gplots_3.0.1.1 gdata_2.18.0
[81] zlibbioc_1.32.0 compiler_3.6.1 ArgumentCheck_0.10.2 cli_2.0.0
[85] affy_1.64.0 htmlTable_1.13.3 MASS_7.3-51.5 tidyselect_0.2.5
[89] vsn_3.54.0 stringi_1.4.3 yaml_2.2.0 askpass_1.1
[93] latticeExtra_0.6-29 survMisc_0.5.5 VariantAnnotation_1.32.0 tools_3.6.1
[97] rstudioapi_0.10 foreign_0.8-74 gridExtra_2.3 digest_0.6.23
[101] BiocManager_1.30.10 shiny_1.4.0 Rcpp_1.0.3 BiocVersion_3.10.1
[105] later_1.0.0 org.Hs.eg.db_3.10.0 httr_1.4.1 RCircos_1.2.1
[109] biovizBase_1.34.1 colorspace_1.4-1 rvest_0.3.5 fs_1.3.1
[113] splines_3.6.1 shinythemes_1.1.2 xtable_1.8-4 jsonlite_1.6
[117] baySeq_2.20.0 zeallot_0.1.0 R6_2.4.1 pillar_1.4.3
[121] htmltools_0.4.0 mime_0.8 glue_1.3.1 fastmap_1.0.1
[125] DT_0.11 interactiveDisplayBase_1.24.0 utf8_1.1.4 curl_4.3
[129] gtools_3.8.1 openssl_1.4.1 rmarkdown_2.0 munsell_0.5.0
[133] GenomeInfoDbData_1.2.2 haven_2.2.0 gtable_0.3.0
Fusion caller:
I am using some custom fusions derived from a multi-caller that I have reformatted to a soapfuse format.
OS:
Windows 10
We are using STAR-SEQR for fusion detection and I am trying to visualize the data using chimeraviz. Is there a good way to import STAR-SEQR output? Below is some information about the data and sessionInfo():
samples NAME
1 002-002 CCDC91--CD47
2 002-002 PGD--GK
3 002-002 DSTN--PCSK2
4 002-002 MANBAL--RRBP1
5 002-002 IGKV4-1--IGKJ4
6 002-002 KDM4C--HERC3
NREAD_SPANS NREAD_JXNLEFT NREAD_JXNRIGHT FUSION_CLASS
1 0 0 2 TRANSLOCATION
2 0 0 3 TRANSLOCATION
3 1 0 2 READ_THROUGH
4 0 2 0 INTERCHROM_INTERSTRAND
5 9 3 3 READ_THROUGH
6 0 0 2 TRANSLOCATION
SPLICE_TYPE BRKPT_LEFT BRKPT_RIGHT LEFT_SYMBOL
1 CANONICAL_SPLICING chr12:28515447:+ chr3:107779698:- CCDC91
2 CANONICAL_SPLICING chr1:10460628:+ chrX:30742217:+ PGD
3 CANONICAL_SPLICING chr20:17550855:+ chr20:17389907:+ DSTN
4 CANONICAL_SPLICING chr20:35929815:+ chr20:17641172:- MANBAL
5 NON-CANONICAL_SPLICING chr2:89185666:+ chr2:89160432:- IGKV4-1
6 CANONICAL_SPLICING chr9:6893231:+ chr4:89607886:+ KDM4C
RIGHT_SYMBOL ANNOT_FORMAT
1 CD47 Symbol:Transcript:Strand:Exon_No:Dist_to_Exon:Frame:CDS_Length
2 GK Symbol:Transcript:Strand:Exon_No:Dist_to_Exon:Frame:CDS_Length
3 PCSK2 Symbol:Transcript:Strand:Exon_No:Dist_to_Exon:Frame:CDS_Length
4 RRBP1 Symbol:Transcript:Strand:Exon_No:Dist_to_Exon:Frame:CDS_Length
5 IGKJ4 Symbol:Transcript:Strand:Exon_No:Dist_to_Exon:Frame:CDS_Length
6 HERC3 Symbol:Transcript:Strand:Exon_No:Dist_to_Exon:Frame:CDS_Length
LEFT_ANNOT
1 CCDC91:ENST00000381259.5_1:+:6:0:0:291958,CCDC91:ENST00000539107.5_2:+:7:0:0:291958,CCDC91:ENST00000545336.5_1:+:10:0:0:291958,CCDC91:ENST00000545737.5_1:+:6:0:0:195425,CCDC91:ENST00000536442.5_1:+:7:0:0:195421,CCDC91:ENST00000543809.5_1:+:7:0:0:155580,CCDC91:ENST00000535520.5_1:+:9:0:-1:47469,CCDC91:ENST00000539904.1_1:+:6:0:-1:0,CCDC91:ENST00000540401.5_1:+:6:0:-1:0,CCDC91:ENST00000540794.5_1:+:NA:NA:NA:268730,CCDC91:ENST00000536154.5_1:+:NA:NA:NA:33932
2 PGD:ENST00000270776.13_2:+:3:0:0:20632,PGD:ENST00000460189.1_1:+:2:0:0:13620,PGD:ENST00000491493.5_3:+:3:0:0:11893,PGD:ENST00000465632.5_1:+:2:0:0:8481,PGD:ENST00000477958.5_1:+:3:0:0:1366,PGD:ENST00000483936.5_1:+:NA:NA:NA:18392
3 DSTN:ENST00000246069.12_2:+:1:0:0:36938,DSTN:ENST00000449141.2_1:+:1:0:0:34921,DSTN:ENST00000474024.5_1:+:1:0:-1:6361
4 MANBAL:ENST00000373605.7_1:+:3:0:0:15152,MANBAL:ENST00000373606.7_1:+:2:0:0:15152,MANBAL:ENST00000397151.1_1:+:3:0:0:15152,MANBAL:ENST00000397152.7_1:+:4:0:0:15152,MANBAL:ENST00000397150.5_1:+:2:0:0:1037
5 IGKV4-1:ENST00000390243.2_2:+:2:2:1:582
6 KDM4C:ENST00000381309.7_3:+:8:0:0:381741,KDM4C:ENST00000381306.7_3:+:8:0:0:377052,KDM4C:ENST00000536108.5_2:+:8:0:0:355503,KDM4C:ENST00000543771.5_2:+:8:0:0:283463,KDM4C:ENST00000438023.5_2:+:8:0:0:188860,KDM4C:ENST00000496464.1_1:+:1:0:-1:0,KDM4C:ENST00000489243.5_1:+:8:400:-1:0
RIGHT_ANNOT
1 CD47:ENST00000355354.13_3:-:4:0:1:43625,CD47:ENST00000361309.5_2:-:4:0:1:43621,CD47:ENST00000644850.1_1:-:4:0:1:20799
2 GK:ENST00000378943.7_2:+:18:0:1:75205,GK:ENST00000378945.7_2:+:18:0:1:75205,GK:ENST00000378946.7_1:+:19:0:1:75205,GK:ENST00000427190.5_2:+:19:0:1:75205,GK:ENST00000481024.5_1:+:20:0:-1:20814,GK-AS1:ENST00000464659.1_1:-:1:74:-1:0
3 PCSK2:ENST00000262545.7_2:+:6:0:0:254765,PCSK2:ENST00000536609.1_1:+:5:0:0:254765,PCSK2:ENST00000377899.5_1:+:7:0:0:254708,PCSK2:ENST00000470007.1_1:+:6:0:-1:0
4 RRBP1:ENST00000360807.8_3:-:2:0:0:46326,RRBP1:ENST00000377807.6_3:-:3:0:0:46326,RRBP1:ENST00000377813.5_3:-:3:0:0:46326,RRBP1:ENST00000398782.2_5:-:2:0:0:902,RRBP1:ENST00000455029.3_1:-:NA:NA:NA:28881
5 IGKJ4:ENST00000390239.2_3:-:1:1:0:37,AC244205.1:ENST00000624935.3_2:-:NA:NA:NA:0
6 HERC3:ENST00000264345.7_1:+:20:0:2:101137,HERC3:ENST00000402738.6_3:+:22:0:2:101137,HERC3:ENST00000512194.1_1:+:6:0:2:17161
DISTANCE
1 NA
2 NA
3 160950
4 18288643
5 25234
6 NA
ASSEMBLED_CONTIGS
1 GGATCTATATTTAAGTGCTTATATTCATCCACAATAATGCTGAGGGCTTCG
2 GGGCAAGCTGTGGATGATTTCATCGAGAAATTGAAAGTGAAATTCGTTATT,AAGTGGTATTCCATAAAACCTACCAACTCATGGATTCCCAAGATGTGAGCT
3 CCTGCGACCGCCGCGGCGAAGATGAATGCCGAAGCAAGTTACGACTTCAGCAGCAACGACCCCTATCCTTACCCTCG
4 GGACTCTTCCTGGGAGCCATCTTCCAGCTCATCTGTGTGCTGGCCATCATC
5 GGCCTCTCTGGGATAGAAGTTATTCAGCAGGCACACAACAGAGGCAGTTCCAGATTTCAACTGCTCATCAGATGGCGGGAAGATGAAGACAGATGGTGCAGCCACAGTTCGTTTGATCTCCACCTTGGTCCCTCCGCCGAAAGTGAGAGTATTATAATATTGCTGACAGTAATAAACTGCCACATCTTCAGCCTGCAGGCTGCTGATGGTGAGAG
6 GATTGACTATGGAAAAGTTGCCAAATTGGAGTCTCCAAGAGCTTTTAGAT,ATCTGCCGAGAAAGCTATGGAGTGATTGAACAGAAGAAGCTGATACCTGGG
ASSEMBLY_CROSS_JXN PRIMERS
1 TRUE AGGAAAGCTGGTCACGAAGC,CCTGGGACGAAAAGAATGGC
2 TRUE TGGGCAAGCTGTGGATGATT,AGCTCACATCTTGGGAATCCA
3 TRUE GAGGACGGTCTGCATACTCG,TCGTTGCTGCTGAAGTCGTA
4 FALSE ACTCTTCCTGGGAGCCATCT,AAAGACCACAACCCCCAAGG
5 TRUE TCACTCTCACCATCAGCAGC,TGATCTCCACCTTGGTCCCT
6 TRUE CAAGATAACCCAGGAGGCTGG,AGTCTCCTCCACATCCTCCC
ID SPAN_CROSSHOM_SCORE JXN_CROSSHOM_SCORE
1 chr12:28515449:+:chr3:107779700:-:4:0 0 0
2 chr1:10460630:+:chrX:30742217:+:1:0 0 0
3 chr20:17550857:+:chr20:17389907:+:1:0 0 0
4 chr20:35929817:+:chr20:17641174:-:1:4 0 0
5 chr2:89185668:+:chr2:89160434:-:1:0 0 0
6 chr9:6893233:+:chr4:89607886:+:1:1 0 0
OVERHANG_DIVERSITY MINFRAG20 MINFRAG35 OVERHANG_MEANBQ SPAN_MEANBQ JXN_MEANBQ
1 1 1 0 38.50000 NA 38.75000
2 2 2 0 37.33333 NA 38.00000
3 2 2 0 37.50000 38.50000 36.75000
4 2 2 0 39.00000 NA 39.25000
5 3 3 1 31.66667 37.22222 29.83333
6 1 1 0 36.00000 NA 38.25000
OVERHANG_BQ15 SPAN_BQ15 JXN_BQ15 OVERHANG_MM SPAN_MM JXN_MM
1 2 0 4 0 NA 0.0000000
2 3 0 6 0 NA 0.3333333
3 2 2 4 0 0.5000000 0.0000000
4 2 0 4 0 NA 0.0000000
5 6 18 12 1 0.4444444 0.0000000
6 2 0 4 0 NA 0.0000000
OVERHANG_MEANLEN SPAN_MEANLEN JXN_MEANLEN TPM_FUSION TPM_LEFT TPM_RIGHT
1 34.00000 NA 34.00000 7.333199 6.400550 58.733329
2 32.66667 NA 33.83333 4.518474 14.731858 14.385095
3 23.50000 41.50000 39.25000 8.130358 248.322240 20.247429
4 33.00000 NA 34.00000 111.009156 31.226601 6.077102
5 32.33333 49.33333 30.33333 21.386369 13.404606 110.877316
6 28.00000 NA 36.75000 28.077132 8.035374 12.548261
MAX_TRX_FUSION DISPOSITION
1 ENST00000381259.5_1--ENST00000644850.1_1|670 PASS
2 ENST00000270776.13_2--ENST00000378945.7_2|318 PASS
3 ENST00000246069.12_2--ENST00000470007.1_1|137 PASS
4 ENST00000397150.5_1--ENST00000398782.2_5|244 PASS
5 ENST00000390243.2_2--ENST00000390239.2_3|536 PASS
6 ENST00000543771.5_2--ENST00000512194.1_1|999 PASS
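The LEFT_ANNOT and RIGHT_ANNOT columns above pack several transcripts into one comma-separated string, with each entry following the Symbol:Transcript:Strand:Exon_No:Dist_to_Exon:Frame:CDS_Length layout given in the format column. A minimal base R sketch for unpacking one such value follows. `parse_annot` is a hypothetical helper name, not part of chimeraviz or of the tool that produced this table, and the input is the DSTN row copied from the output above.

```r
# Field names taken from the format column of the table above.
annot_fields <- c("Symbol", "Transcript", "Strand", "Exon_No",
                  "Dist_to_Exon", "Frame", "CDS_Length")

# Hypothetical helper: split one annotation string into a data frame,
# one row per transcript entry.
parse_annot <- function(x) {
  entries <- strsplit(x, ",", fixed = TRUE)[[1]]   # one element per transcript
  parts <- strsplit(entries, ":", fixed = TRUE)    # seven fields per entry
  out <- as.data.frame(do.call(rbind, parts), stringsAsFactors = FALSE)
  names(out) <- annot_fields
  # CDS_Length is always numeric in the rows shown; other fields may be "NA".
  out$CDS_Length <- as.integer(out$CDS_Length)
  out
}

dstn <- paste0(
  "DSTN:ENST00000246069.12_2:+:1:0:0:36938,",
  "DSTN:ENST00000449141.2_1:+:1:0:0:34921,",
  "DSTN:ENST00000474024.5_1:+:1:0:-1:6361"
)
res <- parse_annot(dstn)
print(res)
```

All columns except CDS_Length are left as character, because fields such as Exon_No and Frame contain the literal string NA in some rows.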
R version 4.0.3 (2020-10-10)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu precise (12.04.5 LTS)
Matrix products: default
BLAS/LAPACK: /mounts/isilon/data/eahome/u1072932/anaconda3/envs/r-4.0.3/lib/libopenblasp-r0.3.15.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] writexl_1.3.1 readxl_1.3.1 forcats_0.5.0 stringr_1.4.0
[5] dplyr_1.0.7 purrr_0.3.4 readr_1.4.0 tidyr_1.1.2
[9] tibble_3.1.2 ggplot2_3.3.5 tidyverse_1.3.0
loaded via a namespace (and not attached):
[1] Rcpp_1.0.6 cellranger_1.1.0 pillar_1.6.1 compiler_4.0.3
[5] dbplyr_2.1.1 tools_4.0.3 jsonlite_1.7.2 lubridate_1.7.9.2
[9] lifecycle_1.0.0 gtable_0.3.0 pkgconfig_2.0.3 rlang_0.4.11
[13] reprex_0.3.0 cli_3.0.0 rstudioapi_0.13 DBI_1.1.1
[17] haven_2.3.1 withr_2.4.2 xml2_1.3.2 httr_1.4.2
[21] fs_1.5.0 generics_0.1.0 vctrs_0.3.8 hms_1.1.0
[25] grid_4.0.3 tidyselect_1.1.1 glue_1.4.2 R6_2.5.0
[29] fansi_0.5.0 modelr_0.1.8 magrittr_2.0.1 backports_1.2.1
[33] scales_1.1.1 ellipsis_0.3.2 rvest_0.3.6 assertthat_0.2.1
[37] colorspace_2.0-2 utf8_1.2.1 stringi_1.6.2 munsell_0.5.0
[41] broom_0.7.5 crayon_1.4.1
Thank you for the nice tool!
However, I'm not able to get the coverage plotted with the plotFusionTranscript command, although it works for the other plots. I only get the blue/green exons and the axis.
I'm using bedGraph files rather than bam files, with chimeraviz_1.4.3 installed from Bioconductor.
Also, is it possible to change the axis of only one of the genes in "plotFusion"? I have a very highly expressed 5' partner, so if you change the axis it's difficult to see the changes in the less highly expressed 3' partner.
I also have a suggestion for a new plot. I'm not sure if it's possible, but it would be nice to have something like "plotTranscripts" that only shows the exons, as in "plotFusionTranscript". Then it would be easier to see if there is a change in coverage at the breakpoint.
/ Jakob
Hi,
I'm using STAR-Fusion v1.3.2 and chimeraviz 1.6.0. I was able to import the tsv file from STAR-Fusion and generate the circos plot. However, when I tried
plot_transcripts(
fusion = get_fusion_by_id(star.fusion, 1) ,
edb = edb,
reduce_transcripts = T)
I get this error.
Fetching transcripts for gene partners..
Error in get_transcripts_ensembl_db(fusion, edb) :
No transcripts available for the genes ZNF384 and TAF15.
When the same fusion and sample are imported with soap, the code runs perfectly.
I included a test tsv from STAR-Fusion (the file name ends in .txt so that I can upload the example).
Thanks!
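One thing that might be worth checking here (an assumption; nothing in this report confirms it is the cause) is whether the identifiers in the STAR-Fusion file carry version suffixes that the annotation database may not use. Identifiers elsewhere on this page, such as ENST00000381259.5_1, carry both a version and a trailing _N tag. A base R sketch for stripping these is below; `strip_version` is a hypothetical helper and not a chimeraviz function.

```r
# Hypothetical helper: drop a trailing ".<version>" and optional "_<n>" tag
# from Ensembl-style identifiers. Unversioned identifiers pass through unchanged.
strip_version <- function(ids) {
  sub("\\.[0-9]+(_[0-9]+)?$", "", ids)
}

# Transcript identifiers copied from output shown elsewhere on this page,
# plus one that is already unversioned.
ids <- c("ENST00000381259.5_1", "ENST00000644850.1_1", "ENST00000381259")
stripped <- strip_version(ids)
print(stripped)
```

The same pattern would apply to gene identifiers, if the imported fusion object turns out to hold versioned ones.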
Hi,
I'm using version chimeraviz_1.13.8.
I'm getting a strange error where the chromosomes are not labeled correctly.
For example, here is a fusion:
[1] "Fusion object"
[1] "id: 4"
[1] "Fusion tool: starfusion"
[1] "Genome version: hg38"
[1] "Gene names: NSD1-RHOG"
[1] "Chromosomes: chr5-chr11"
[1] "Strands: +,-"
[1] "In-frame?: FALSE"
So here it should be labeled as chr5 and chr11, but this is what I get instead when I try to plot it:
plot_fusion(
fusion = draw.this,
edb = edb,
reduce_transcripts = T
)
attached is an rds for the draw.this object I used to plot.
https://www.dropbox.com/s/ihuyz57oz9sx2oq/test.rds?dl=1
R version 3.6.2 (2019-12-12)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 10 (buster)
Matrix products: default
BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.3.5.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=C LC_PAPER=en_US.UTF-8
[8] LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] grid splines stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] chimeraviz_1.13.8 GSVA_1.34.0 corrplot_0.84 parallelDist_0.2.4 mrfDepth_1.0.11 DrInsight_0.1.1
[7] qusage_2.20.0 igraph_1.2.5 ensembldb_2.10.2 AnnotationFilter_1.10.0 GenomicFeatures_1.38.2 Gviz_1.30.3
[13] Biostrings_2.54.0 XVector_0.26.0 univOutl_0.1-5 Hmisc_4.4-0 Formula_1.2-3 robustbase_0.93-6
[19] moments_0.14 org.Hs.eg.db_3.10.0 AnnotationDbi_1.48.0 heatmap3_1.1.7 ggrepel_0.8.2 PerformanceAnalytics_2.0.4
[25] xts_0.12-0 zoo_1.8-7 d3heatmap_0.6.1.2 dendextend_1.13.4 wesanderson_0.3.6 vegan_2.5-6
[31] lattice_0.20-38 permute_0.9-5 FactoMineR_2.3 factoextra_1.0.6 edgeR_3.28.1 gtable_0.3.0
[37] forestplot_1.9 checkmate_2.0.0 data.table_1.12.8 ggbeeswarm_0.6.0 survminer_0.4.6.999 survival_3.1-11
[43] karyoploteR_1.12.4 regioneR_1.18.1 gtools_3.8.1 garnett_0.2.11 shiny_1.4.0.2 monocle3_0.2.1
[49] SingleCellExperiment_1.8.0 SummarizedExperiment_1.16.1 DelayedArray_0.12.2 BiocParallel_1.20.1 matrixStats_0.56.0 GenomicRanges_1.38.0
[55] GenomeInfoDb_1.22.0 IRanges_2.20.2 S4Vectors_0.24.3 leidenbase_0.1.0 gridExtra_2.3 fgsea_1.12.0
[61] Rcpp_1.0.4 ComplexHeatmap_2.2.0 reshape2_1.4.3 feather_0.3.5 patchwork_1.0.0 monocle_2.14.0
[67] DDRTree_0.1.5 irlba_2.3.3 VGAM_1.1-2 Biobase_2.46.0 BiocGenerics_0.32.0 ggridges_0.5.2
[73] openxlsx_4.1.4 limma_3.42.2 ggpubr_0.2.5 magrittr_1.5 forcats_0.5.0 stringr_1.4.0
[79] purrr_0.3.3 readr_1.3.1 tidyr_1.0.2 tibble_2.1.3 ggplot2_3.3.0 tidyverse_1.3.0
[85] kableExtra_1.1.0 knitr_1.28 biomaRt_2.42.1 cowplot_1.0.0 RColorBrewer_1.1-2 Matrix_1.2-18
[91] dplyr_0.8.5 Seurat_3.1.4 BiocManager_1.30.10
loaded via a namespace (and not attached):
[1] pbapply_1.4-2 haven_2.2.0 vctrs_0.2.4 usethis_1.5.1 fastICA_1.2-2 mgcv_1.8-31 blob_1.2.1
[8] later_1.0.0 DBI_1.1.0 rappdirs_0.3.1 uwot_0.1.8 jpeg_0.1-8.1 zlibbioc_1.32.0 htmlwidgets_1.5.1
[15] mvtnorm_1.1-0 GlobalOptions_0.1.1 future_1.16.0 leaps_3.1 leiden_0.3.3 DEoptimR_1.0-8 KernSmooth_2.23-16
[22] DT_0.13 promises_1.1.0 gdata_2.18.0 pkgload_1.0.2 graph_1.64.0 RcppParallel_5.0.0 fs_1.3.2
[29] fastmatch_1.1-0 mnormt_1.5-6 digest_0.6.25 png_0.1-7 qlcMatrix_0.9.7 sctransform_0.2.1 pkgconfig_2.0.3
[36] docopt_0.6.1 estimability_1.3 reticulate_1.14 circlize_0.4.8 beeswarm_0.2.3 GetoptLong_0.1.8 xfun_0.12
[43] tidyselect_1.0.0 ica_1.0-2 viridisLite_0.3.0 rtracklayer_1.46.0 pkgbuild_1.0.6 rlang_0.4.5 glue_1.3.2
[50] metap_1.3 modelr_0.1.6 emmeans_1.4.5 ggsignif_0.6.0 labeling_0.3 gbRd_0.4-11 mutoss_0.1-12
[57] httpuv_1.5.2 Rttf2pt1_1.3.8 TH.data_1.0-10 annotate_1.64.0 webshot_0.5.2 jsonlite_1.6.1 bit_1.1-15.2
[64] mime_0.9 gplots_3.0.3 Rsamtools_2.2.3 BiocStyle_2.14.4 stringi_1.4.6 processx_3.4.2 quadprog_1.5-8
[71] bitops_1.0-6 cli_2.0.2 Rdpack_0.11-1 RSQLite_2.2.0 pheatmap_1.0.12 rstudioapi_0.11 org.Mm.eg.db_3.10.0
[78] GenomicAlignments_1.22.1 nlme_3.1-142 fastcluster_1.1.25 locfit_1.5-9.4 VariantAnnotation_1.32.0 listenv_0.8.0 survMisc_0.5.5
[85] dbplyr_1.4.2 sessioninfo_1.1.1 readxl_1.3.1 lifecycle_0.2.0 munsell_0.5.0 cellranger_1.1.0 caTools_1.18.0
[92] codetools_0.2-16 coda_0.19-3 magic_1.5-9 vipor_0.4.5 lmtest_0.9-37 htmlTable_1.13.3 lsei_1.2-0
[99] xtable_1.8-4 ROCR_1.0-7 flashClust_1.01-2 scatterplot3d_0.3-41 abind_1.4-5 farver_2.0.3 FNN_1.1.3
[106] km.ci_0.5-2 RANN_2.6.1 askpass_1.1 biovizBase_1.34.1 sparsesvd_0.2 bibtex_0.4.2.2 RcppAnnoy_0.0.16
[113] shinythemes_1.1.2 dichromat_2.0-0 cluster_2.1.0 future.apply_1.4.0 extrafontdb_1.0 ellipsis_0.3.0 prettyunits_1.1.1
[120] lubridate_1.7.4 reprex_0.3.0 multtest_2.42.0 remotes_2.1.1 slam_0.1-47 TFisher_0.2.0 testthat_2.3.2
[127] geometry_0.4.5 htmltools_0.4.0 BiocFileCache_1.10.2 yaml_2.2.1 plotly_4.9.2 XML_3.99-0.3 foreign_0.8-72
[134] withr_2.1.2 fitdistrplus_1.0-14 bit64_0.9-7 multcomp_1.4-12 ProtGenerics_1.18.0 combinat_0.0-8 rsvd_1.0.3
[141] devtools_2.2.2 waffle_0.7.0 bamsignals_1.18.0 memoise_1.1.0 evaluate_0.14 callr_3.4.3 geneplotter_1.64.0
[148] extrafont_0.17 ps_1.3.2 curl_4.3 fansi_0.4.1 highr_0.8 GSEABase_1.48.0 acepack_1.4.1
[155] desc_1.2.0 npsurv_0.4-0 rjson_0.2.20 rprojroot_1.3-2 clue_0.3-57 tools_3.6.2 sandwich_2.5-1
[162] RCurl_1.98-1.1 ape_5.3 bezier_1.1.2 xml2_1.2.5 httr_1.4.1 assertthat_0.2.1 rmarkdown_2.1
[169] globals_0.12.5 R6_2.4.1 nnet_7.3-12 progress_1.2.2 shape_1.4.4 colorspace_1.4-1 RCircos_1.2.1
[176] generics_0.0.2 base64enc_0.1-3 pillar_1.4.3 sn_1.6-0 HSMMSingleCell_1.6.0 GenomeInfoDbData_1.2.2 plyr_1.8.6
[183] rvest_0.3.5 zip_2.0.4 latticeExtra_0.6-29 fastmap_1.0.1 broom_0.5.5 openssl_1.4.1 BSgenome_1.54.0
[190] scales_1.1.0 backports_1.1.5 plotrix_3.7-7 densityClust_0.3 ArgumentCheck_0.10.2 hms_0.5.3 Rtsne_0.15
[197] KMsurv_0.1-5 numDeriv_2016.8-1.1 lazyeval_0.2.2 tsne_0.1-3 crayon_1.3.4 MASS_7.3-51.4 viridis_0.5.1
[204] rpart_4.1-15 compiler_3.6.2
Love the program. I know this issue was brought up before (almost 1 year ago), but I can't see whether there is a solution yet. I can make the fusion plot just fine, but no coverage is shown. I am using STAR-Fusion 1.4.0 to call fusions and running R 3.5.2 on macOS Mojave 10.14.2. I know you said previously that the tool does not yet support it, but I was hoping that by now it would. Can you please let me know?
Here is my sessioninfo:
R version 3.5.2 (2018-12-20)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.2
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] grid stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] Rsamtools_1.34.1 BiocInstaller_1.32.1 chimeraviz_1.8.1 data.table_1.12.0 ensembldb_2.6.6 AnnotationFilter_1.6.0 GenomicFeatures_1.34.3 AnnotationDbi_1.44.0 Biobase_2.42.0 Gviz_1.26.4 GenomicRanges_1.34.0 GenomeInfoDb_1.18.2 Biostrings_2.50.2
[14] XVector_0.22.0 IRanges_2.16.0 S4Vectors_0.20.1 BiocGenerics_0.28.0
loaded via a namespace (and not attached):
[1] ProtGenerics_1.14.0 bitops_1.0-6 matrixStats_0.54.0 bit64_0.9-7 RColorBrewer_1.1-2 progress_1.2.0 httr_1.4.0 tools_3.5.2 backports_1.1.3 DT_0.5
[11] R6_2.4.0 rpart_4.1-13 Hmisc_4.2-0 DBI_1.0.0 lazyeval_0.2.1 colorspace_1.4-0 nnet_7.3-12 tidyselect_0.2.5 gridExtra_2.3 prettyunits_1.0.2
[21] DESeq2_1.22.2 bit_1.1-14 curl_3.3 compiler_3.5.2 htmlTable_1.13.1 DelayedArray_0.8.0 rtracklayer_1.42.1 scales_1.0.0 checkmate_1.9.1 genefilter_1.64.0
[31] RCircos_1.2.0 stringr_1.4.0 digest_0.6.18 foreign_0.8-71 rmarkdown_1.11 base64enc_0.1-3 dichromat_2.0-0 pkgconfig_2.0.2 htmltools_0.3.6 BSgenome_1.50.0
[41] htmlwidgets_1.3 rlang_0.3.1 rstudioapi_0.9.0 RSQLite_2.1.1 gtools_3.8.1 BiocParallel_1.16.6 acepack_1.4.1 dplyr_0.8.0.1 VariantAnnotation_1.28.11 RCurl_1.95-4.11
[51] magrittr_1.5 GenomeInfoDbData_1.2.0 Formula_1.2-3 Matrix_1.2-15 Rcpp_1.0.0 munsell_0.5.0 yaml_2.2.0 stringi_1.3.1 SummarizedExperiment_1.12.0 zlibbioc_1.28.0
[61] org.Hs.eg.db_3.7.0 plyr_1.8.4 blob_1.1.1 crayon_1.3.4 lattice_0.20-38 splines_3.5.2 annotate_1.60.0 hms_0.4.2 locfit_1.5-9.1 knitr_1.21
[71] pillar_1.3.1 geneplotter_1.60.0 biomaRt_2.38.0 XML_3.98-1.17 glue_1.3.0 evaluate_0.13 biovizBase_1.30.1 latticeExtra_0.6-28 BiocManager_1.30.4 org.Mm.eg.db_3.7.0
[81] gtable_0.2.0 purrr_0.3.0 assertthat_0.2.0 ggplot2_3.1.0 xfun_0.5 xtable_1.8-3 ArgumentCheck_0.10.2 survival_2.43-3 tibble_2.0.1 GenomicAlignments_1.18.1
[91] memoise_1.1.0 cluster_2.0.7-1 BiocStyle_2.10.0
Hi,
I have successfully been able to follow all steps from the chimeraviz manual until plotFusionTranscript. When I run it, it gives me the following error:
> plotFusionTranscript(fusion = fusion,
+ edb = edb, bamfile = bamfile)
Fetching transcripts for gene partners..
..transcripts fetched.
Selecting transcripts for RCC1..
..found transcripts of type exonBoundary
Selecting transcripts for HENMT1..
..found transcripts of type exonBoundary
Error in .io_bam(.scan_bamfile, file, reverseComplement, yieldSize(file), :
seqlevels(param) not in BAM header:
seqlevels: '11'
file: fusionAlignment.bam
index: fusionAlignment.bam.bai
These are the steps I followed:
> fusion
[1] "Fusion object"
[1] "id: 22"
[1] "Fusion tool: fusioncatcher"
[1] "Genome version: hg19"
[1] "Gene names: RCC1-HENMT1"
[1] "Chromosomes: chr11-chr11"
[1] "Strands: -,-"
[1] "In-frame?: FALSE"
referenceFilename <- "reference.fa"
writeFusionReference(fusion = fusion, filename = referenceFilename)
rsubreadIndex(referenceFasta = referenceFilename)
This is what the reference fasta referenceFilename looks like:
# not sure why is it called chrNA
>chrNA
TGTATCACCTCGACTGCTTCGCCTGCCAGCTCTGCAACCAGAGGAAAATTGGGCCGATTTCCACCTATGATGCATCATCA
CCAGGC
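A quick way to confirm what a reference fasta like the one above actually contains is to list its sequence names and lengths. The base R sketch below does that for the lines shown in this report; `fasta_lengths` is a hypothetical helper, not a chimeraviz function, and it assumes the first non-empty line is a header.

```r
# Hypothetical helper: return sequence lengths, named by FASTA header,
# from the lines of a FASTA file. Assumes the first non-empty line is a header.
fasta_lengths <- function(lines) {
  lines <- lines[nzchar(lines)]
  is_header <- startsWith(lines, ">")
  record <- cumsum(is_header)                  # record index for every line
  lens <- tapply(nchar(lines[!is_header]), record[!is_header], sum)
  headers <- sub("^>", "", lines[is_header])
  setNames(as.integer(lens), headers[as.integer(names(lens))])
}

# The reference shown above, copied line by line.
fasta <- c(
  ">chrNA",
  "TGTATCACCTCGACTGCTTCGCCTGCCAGCTCTGCAACCAGAGGAAAATTGGGCCGATTTCCACCTATGATGCATCATCA",
  "CCAGGC"
)
ref_info <- fasta_lengths(fasta)
print(ref_info)
```

On a real file the input would be `readLines("reference.fa")`. For the lines above, the single sequence is named chrNA and is 86 bases long.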
These are the first few lines of the fastq files:
$ head sample_R1.fq
@NB501069:24:HY2VNBGXX:1:11101:20482:1153 1:N:0:CGATGT
CTGCAATCCTGACAGGGTCCTGCCACTTACCTGTCCCCACCACCCTCCCAACTTCTCTCAGGCTTGAGTGAGGCCTTCTGAAGTTGAAGGGCTTTCTGC
+
A/AAAEEEAE6EEEEEEEEAEEEAEEEEEEAEEEEEAA/A<EAEEEA<EEEAEEEE<EEEEEEEAEE/EEAA<6EEAAE<EEEA<EEAA<EAE<E/EAA
@NB501069:24:HY2VNBGXX:1:11101:24825:1900 1:N:0:CGATGT
GCCACATCGTCCTAACCTGGTAGAGTCAGCCCCCAGGTGATGCCCTAAACCTCCAGACATGGAGGCCCCTTCTAGGTCCTCAAGGGGTGAATCCCCAGGA
+
AAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEAEEEEEEEEEEEEEEEEEE
@NB501069:24:HY2VNBGXX:1:11101:5270:2388 1:N:0:CGATGT
CCCAAATCCTACGGCAAGCTTTGACAATGTAACATCTTTATCTTGTGGTTAGGAAAATGGACCTAGAGAGATTATGTGGTTTGCTCAAAATCACATAGCT
$ head sample_R2.fq
@NB501069:24:HY2VNBGXX:1:11101:20482:1153 2:N:0:CGATGT
CCTGGCCCTTAAACAACTGCAGAAAGCCCTTCAACTTCAGAAGGCCTCACTCAAGCCTGAGAGAAGTTGGGAGGGTGGTGGGGACAGGTAAGTGGCAGGAC
+
AAAAAEEEEEEEEE6EEEEEAEEE6/EA/EEEE/EEEEEEEEEEEEEEAEEE//EEEEEEE<E</AEEAE<EEEEEEEEEEEEE/AEE/EAEAE<//EE/<
@NB501069:24:HY2VNBGXX:1:11101:24825:1900 2:N:0:CGATGT
GCTGCAGTGAGCTGTGATTGCGCCACTGTACTCCAGCTTGGGTGCCAGAGCAAGACCCTGTCTCAAAAAAGAAAAGAATGTTCCTGGGGATTCACCCCTTG
+
AAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEAAEEEEEEEEEEEAEEEEEEEEEEEEE<<EEE<EEEEEEE6EEEE<
@NB501069:24:HY2VNBGXX:1:11101:5270:2388 2:N:0:CGATGT
GTTGGTCTTATAGAAAGCACTACTGCACTTAGTAGCTATGTGATTTTGAGCAAACCACATAATCTCTCTAGGTCCATTTTCCTAACCACAAGATAAAGATG
Then I created the bamfile using rsubreadAlign:
rsubreadAlign(
referenceName = referenceFilename,
fastq1 = fastq1,
fastq2 = fastq2,
outputBamFilename = "fusionAlignment")
And the bam file looks like this. Again, I'm not sure what chrNA is doing in the header or in the alignments:
@HD VN:1.0 SO:coordinate
@SQ SN:chrNA LN:86
@PG ID:subread PN:subread VN:Rsubread 1.26.1
NB501069:24:HY2VNBGXX:1:13107:9045:19192 163 chrNA 1 40 43M42S = 1 -85 TGTATCACCTCGACTGCTTCGCCTGCCAGCTCTGCAACCAGAGATTTTGTGTGGGAGACAAATTCTTCCTGAAGAACAACATGAT AAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEEAEEE HI:i:1 NH:i:1 NM:i:0
NB501069:24:HY2VNBGXX:1:13107:9045:19192 83 chrNA 1 40 43M42S = 1 85 TGTATCACCTCGACTGCTTCGCCTGCCAGCTCTGCAACCAGAGATTTTGTGTGGGAGACAAATTCTTCCTGAAGAACAACATGAT EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAAAA HI:i:1 NH:i:1 NM:i:0
NB501069:24:HY2VNBGXX:2:13105:4497:1842 163 chrNA 2 40 42M58S = 26 125 GTATCACCTCGACTGCTTCGCCTGCCAGCTCTGCAACCAGAGATTTTGTGTGGAGACAAATTCTTCCTGAAGAACAACATGATCTTGTGTCAGATGGACT AAAAAEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE6EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAEAEEE<E HI:i:1 NH:i:1 NM:i:0
Any help would be much appreciated. Thanks!!
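The error above says that the plot asked for seqlevel '11', while the BAM header shown only lists chrNA, so no reference with the requested name exists in the file. A small base R sketch of that comparison, using the header lines from this report, is below. `sq_names` is a hypothetical helper, and in practice the header would be read from the BAM file itself (for example with `samtools view -H`) rather than from a hard-coded vector.

```r
# Hypothetical helper: pull reference sequence names (SN: fields)
# out of the @SQ lines of a SAM/BAM header.
sq_names <- function(header_lines) {
  sq <- grep("^@SQ", header_lines, value = TRUE)
  sub("^.*SN:([^[:space:]]+).*$", "\\1", sq)
}

# Header lines copied from the report above (tab-separated fields).
header <- c(
  "@HD\tVN:1.0\tSO:coordinate",
  "@SQ\tSN:chrNA\tLN:86",
  "@PG\tID:subread\tPN:subread\tVN:Rsubread 1.26.1"
)

refs <- sq_names(header)
requested <- "11"   # the seqlevel named in the error message

# Check the requested name both as-is and with a "chr" prefix.
found <- c(requested, paste0("chr", requested)) %in% refs
print(refs)
print(found)
```

Neither "11" nor "chr11" matches the only reference name in the header, which is consistent with the error message. Why the reference was written as chrNA in the first place is not something this check can answer.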