immunomind / immunarch

🧬 Immunarch: an R Package for Fast and Painless Exploration of Single-cell and Bulk T-cell/Antibody Immune Repertoires

Home Page: https://immunarch.com

License: Apache License 2.0

R 97.96% CSS 1.35% C++ 0.53% Dockerfile 0.16%
immunology tcr tcr-repertoire immunoinformatics immune-repertoire rep-seq bcr-repertoire bcr ig ig-repertoire

immunarch's Issues

mixcr clna file loading fails

hello,
loading data from mixcr generates a bunch of errors. I attach the mixcr report which looks quite normal.

> library("immunarch")
> results <- "mixcr"
> rep <- repLoad(results, .format = "mixcr")
== Step 1/3: loading repertoire files... ==
Processing "mixcr" ...
  -- Parsing "mixcr/99753.clna" -- mixcr
Error: can't find a column with V genes
Error: can't find a column with J genes
Error: can't find a column with D genes
|=================================================================| 100%  481 MB
Warning: 14752614 parsing failures.
row            col  expected        actual                                                                        file
  1 NA             2 columns 1 columns     'mixcr/99753.clna'
  2 NA             2 columns 1 columns     'mixcr/99753.clna'
  3 MiXCR.CLNA.V04           embedded null 'mixcr/99753.clna'
  3 NA             2 columns 1 columns     'mixcr/99753.clna'
  4 MiXCR.CLNA.V04           embedded null 'mixcr/99753.clna'
... .............. ......... ............. ...........................................................................
See problems(...) for more details.

Error in `[[.tbl_df`(head(df), .vgenes) : object '.vgenes' not found
In addition: Warning messages:
1: In read.table(.filename, sep = .sep, skip = 0, nrows = 1, stringsAsFactors = F,  :
  line 1 appears to contain embedded nulls
2: Missing column names filled in: 'X2' [2]

99753.report.txt
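
The "embedded null" messages above suggest that a binary file is being read as a text table. A .clna file is MiXCR's binary container, so the parser most likely needs a text table written by MiXCR's exportClones step instead. The sketch below is one way to check a file before loading it; `has_embedded_nul` is a hypothetical helper, not part of immunarch, and the two temporary files are stand-ins for real data.

```r
# Hypothetical helper: TRUE if the first `n` bytes of a file contain a NUL byte,
# which is a strong hint that the file is binary rather than a text table.
has_embedded_nul <- function(path, n = 4096L) {
  bytes <- readBin(path, what = "raw", n = n)
  any(bytes == as.raw(0))
}

# Stand-ins for a binary .clna file and an exported text table.
bin_file <- tempfile(fileext = ".clna")
txt_file <- tempfile(fileext = ".txt")
writeBin(as.raw(c(0x4d, 0x00, 0x58)), bin_file)
writeLines("cloneId\tcloneCount\n0\t10", txt_file)

has_embedded_nul(bin_file)
has_embedded_nul(txt_file)
```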

rarefaction - seq.default error

Hello,

When I try to run rarefaction on my data, it returns the following error. Can you please help?

imm_raref = repDiversity(mydata$data, "raref", .extrapolation = 200000, .verbose = F)
Error in seq.default(tail(seq(.step, sum(.data[[i]]), .step), 1) + .step, :
wrong sign in 'by' argument

I am not sure whether the error is caused by the size of the datasets.
mydata$data is a list of 3 data frames.

dim(mydata[["data"]][["R_1"]])
[1] 143866 11
dim(mydata[["data"]][["R_2"]])
[1] 823220 11
dim(mydata[["data"]][["R_3"]])
[1] 980159 11

Thanks!
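
The message itself comes from base R's seq(): it is raised when `from` is greater than `to` while `by` is positive. One possible explanation, offered as a guess rather than a diagnosis, is that the total clone count of a sample already exceeds `.extrapolation = 200000`, so the extrapolation range starts past its end. The sketch below reproduces only the base R behaviour; all numbers are made up, and treating `.extrapolation` as the upper bound is an assumption.

```r
# Made-up numbers: a sample whose total clone count exceeds the extrapolation limit.
total_clones  <- 250000
step          <- 10000
extrapolation <- 200000   # assumed to act as the upper bound of the range

# Last rarefaction point inside the observed data, plus one step.
start <- tail(seq(step, total_clones, step), 1) + step

# seq() refuses to count upwards from a start that is already past the end.
msg <- tryCatch(
  {
    seq(start, extrapolation, step)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

If this is the cause, a larger `.extrapolation` value than the biggest sample total would be the first thing to try.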

Is there a way to order my samples in a specific order?

For example, if I visualize my data with vis(.by = cluster, .meta=immdata.opc.tcr$meta), the groups are ordered alphabetically. However, I would like to put the groups in a specific order (Immune Rich, Mixture, Immune Desert). Is there a way to do this in R? I have tried rearranging the order of the factor levels in immdata.opc.tcr$meta, but it doesn't seem to work.

image

Thanks!
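
In base R, the display order of a grouping variable usually follows its factor levels, which default to alphabetical order. The sketch below sets explicit levels on a toy metadata table; the `cluster` values come from the question, everything else is made up. Whether vis() honours these levels in a given immunarch version is not confirmed here.

```r
# Toy metadata; the `cluster` values are taken from the question above.
meta <- data.frame(
  Sample  = c("S1", "S2", "S3"),
  cluster = c("Mixture", "Immune Desert", "Immune Rich"),
  stringsAsFactors = FALSE
)

# Default factor levels are sorted alphabetically.
levels(factor(meta$cluster))

# Explicit levels fix the order that most plotting code will use.
meta$cluster <- factor(
  meta$cluster,
  levels = c("Immune Rich", "Mixture", "Immune Desert")
)
levels(meta$cluster)
```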

Wrong formula for Jaccard index?

🐛 Bug

When using the repOverlap function with the Jaccard index, I think it returns wrong values. Looking at the source code in overlap.R, I saw that the Jaccard index is calculated as follows:

jaccard_index.default <- function(.x, .y) {
  .x = collect(.x, n = Inf)
  .y = collect(.y, n = Inf)
  intersection = nrow(dplyr::intersect(.x, .y))
  intersection/(nrow(.x) + nrow(.y) + intersection)
}

However, doesn't the intersection need to be subtracted:
intersection/(nrow(.x) + nrow(.y) - intersection)
?

I'm using Immunarch v. 0.5.5.

Thank you.
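
For reference, the Jaccard index of two sets is the size of their intersection divided by the size of their union, and the union has |A| + |B| - |A and B| elements. The sketch below is a plain base R version on character vectors, not the package's data-frame method; `jaccard` and the example sequences are made up for illustration.

```r
# Jaccard index of two sets: size of intersection / size of union,
# where the union has length(a) + length(b) - intersection elements.
jaccard <- function(a, b) {
  a <- unique(a)
  b <- unique(b)
  inter <- length(intersect(a, b))
  inter / (length(a) + length(b) - inter)
}

x <- c("CASSL", "CASSF", "CASSQ")
y <- c("CASSF", "CASSQ", "CASRG", "CASTT")

jaccard(x, y)      # 2 shared sequences out of 5 distinct ones
2 / (3 + 4 + 2)    # what the formula with "+" would give instead
```

With the subtraction, two identical sets give exactly 1, which is a quick sanity check for any implementation.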

Install issue

After running the following code from documentation:

> devtools::install_local("C:/immunarch/immunarch.tar.gz", dependencies = T)

I'm receiving the following error:

Skipping 1 packages not available: MonetDBLite
Installing 40 packages: airr, circlize, config, cowplot, dbplyr, dendextend, diptest, dtplyr, ellipse, factoextra, FactoMineR, fastcluster, flashClust, flexmix, forge, fpc, generics, ggpubr, ggrepel, ggsci, ggsignif, GlobalOptions, gridBase, heatmap3, kernlab, leaps, mclust, modeltools, MonetDBLite, polynom, prabclus, r2d3, Rtsne, scatterplot3d, shape, shinythemes, sparklyr, treemap, trimcluster, viridis
Installing packages into ‘C:/Users/rgorsuch/Documents/R/win-library/3.5’
(as ‘lib’ is unspecified)
Error: (converted from warning) package ‘MonetDBLite’ is not available (for R version 3.5.1)

immunarch is not showing up in my Packages library and the repLoad function is not present when trying to import data.

Thanks to anyone who can provide some insight!

Ryne

10x parser wrongly uses UMI as clones / proportion

🐛 Bug

As titled, the 10x parser wrongly uses the UMI column as the clone count. However, in 10x data the number of barcodes is the count of cells, while the UMI count is the number of transcripts.

To Reproduce

Steps to reproduce the behavior:

  1. Read the 10x consensus annotation.csv with repLoad:
    immdata <- repLoad("/path/to/consensus_annotation.csv", .format = "10x")

  2. View the data in immdata:
    head(immdata$data)

Expected behavior

Count the number of barcodes with the same VDJ (perhaps using just the CDR3 at the amino acid level) and use that as the count.

Additional context

Since consensus annotation.csv contains no barcode information, filtered_contig_annotation.csv probably needs to be used instead.
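
The difference between the two counts can be sketched on a toy contig table. The column names `barcode`, `cdr3` and `umis` are assumed to match the 10x contig annotation files, and the rows are invented; this is an illustration of the requested behaviour, not the parser's code.

```r
# Toy stand-in for a contig table; column names are assumptions.
contigs <- data.frame(
  barcode = c("AAAC-1", "AAAC-1", "AAAG-1", "AAAT-1", "AACC-1"),
  cdr3    = c("CASSL",  "CASSL",  "CASSL",  "CASSF",  "CASSF"),
  umis    = c(12, 3, 7, 20, 1),
  stringsAsFactors = FALSE
)

# Cells per clonotype: number of distinct barcodes carrying each CDR3.
cells <- tapply(contigs$barcode, contigs$cdr3, function(b) length(unique(b)))

# Transcripts per clonotype: summed UMI counts, which is a different quantity.
umis <- tapply(contigs$umis, contigs$cdr3, sum)

data.frame(
  cdr3  = names(cells),
  cells = as.vector(cells),
  umis  = as.vector(umis[names(cells)])
)
```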

Error installing on windows

Hi

After running the following code from documentation in Windows10:

install.packages("devtools", dependencies = T)
devtools::install_local("path/to/your/folder/with/immunarch.tar.gz", dependencies=T)

I'm receiving the following error:

The downloaded binary packages are in
C:\Users\João Drama\AppData\Local\Temp\Rtmp8a5Pdw\downloaded_packages

√ checking for file 'C:\Users\João Drama\AppData\Local\Temp\Rtmp8a5Pdw\remotes1bc85014173\immunarch/DESCRIPTION' (1.9s)

  • preparing 'immunarch': (2.8s)
√ checking DESCRIPTION meta-information

  • cleaning src

    checking vignette meta-information ...

√ checking vignette meta-information

  • excluding invalid files (1.1s)

    Subdirectory 'inst/doc' contains invalid file names:
    '1_introduction.Rmd' '2_data.Rmd' '3_basic_analysis.Rmd'
    '4_overlap.Rmd' '5_gene_usage.Rmd' '6_diversity.Rmd' '7_fixvis.Rmd'
    '1_introduction.html' '2_data.html' '3_basic_analysis.html'
    '4_overlap.html' '5_gene_usage.html' '6_diversity.html'
    '7_fixvis.html'

Warning in .write_description(db, ldpath) :
Unknown encoding with non-ASCII data: converting to ASCII

  • checking for LF line-endings in source and make files and shell scripts

  • checking for empty or unneeded directories

  • looking to see if a 'data/datalist' file should be added

  • building 'immunarch_0.3.2.tar.gz'

Installing package into ‘C:/Users/João Drama/Documents/R/win-library/3.5’
(as ‘lib’ is unspecified)

  • installing source package 'immunarch' ...
    Warning in file(file, if (append) "a" else "w") :
    cannot open file 'C:/Users/Joco Drama/Documents/R/win-library/3.5/immunarch/DESCRIPTION': No such file or directory
    Error in file(file, if (append) "a" else "w") :
    não é possível abrir a conexão
    ERROR: installing package DESCRIPTION failed for package 'immunarch'
  • removing 'C:/Users/João Drama/Documents/R/win-library/3.5/immunarch'
    In R CMD INSTALL
    Error in i.p(...) :
    (convertido do aviso) installation of package ‘C:/Users/JOODRA~1/AppData/Local/Temp/Rtmp8a5Pdw/file1bc81dbd528c/immunarch_0.3.2.tar.gz’ had non-zero exit status


Erro: unexpected input in "l´"

Thanks to anyone who can provide some insight!

Joao

NA group appears in group drawing

Hi
I checked the group information and no data is missing, but an NA group still appears in the plot. I can't solve this problem; can you help me?
My code:
vis(imm_tail, .by="Type3", .meta=BJclinical)
Rplot

repExplore .col="aa" and .col="nt"

When using repExplore, I am unable to select the amino acid or nucleotide CDR3 columns; the output is always based on the "Sequence" column (total rows in the dataset).

So, for example, these three output the same number of clonotypes:
exp_vol <- repExplore(immdata$data, .method = "volume")
exp_vol <- repExplore(immdata$data, .method = "volume", .col = "nt")
exp_vol <- repExplore(immdata$data, .method = "volume", .col = "aa")

Thanks in advance for the assistance!

Ordering clonotypes per sample, and colour scheme using trackClonotypes

Hi!

I just had a couple of questions regarding the latest version, and the function 'trackClonotypes'.

  1. If I run trackClonotypes on more than 1 sample, I get the stacked bar plots, with the same clonotypes across different samples, coloured the same.

When I tried to order the input data for trackClonotypes, the order of the clonotypes did not change (I want them in ascending or descending order of count). Instead I get this:
Warning:
In melt.data.table(.data) :
To be consistent with reshape2's melt, id.vars and measure.vars are internally guessed when both are 'NULL'. All non-numeric/integer/logical type columns are considered id.vars, which in this case are columns [CDR3.aa]. Consider providing at least one of 'id' or 'measure' vars in future.

Is there any way I can change this?

  2. Also, is there any way to change the colour scheme of the output for trackClonotypes?

Thanks a lot!

The picture does not match the data

🐛 Bug

I think the .by argument in vis is not fully working at least when a vector is provided (Version : immunarch_0.3.3.9005)

To Reproduce

This is what I get when looking at exp_vol

exp_vol$Volume
[1] 213 5647 1333 2658 528 1326
by_vec = factor(c("D","S","D","S","L","N"))
by_vec
[1] D S D S L N
Levels: D L N S
p = vis(exp_vol, .by = by_vec)
Warning: Ignoring unknown aesthetics: y
Warning: Ignoring unknown aesthetics: xmin, xmax, annotations, y_position
p

Screenshot 2020-04-08 at 15 01 53

Expected behavior

As one can see, the first D value is 213 and the second D value is 1333, not 5647 as in the picture.

Thank you for your help!

Some questions about the visualization by group

Hi,

Thanks for this cool tool!

I have two questions about the visualization of .by group data.

Here is my data and metadata.

> names(tcr$data)
 [1] "0619_LN1" "0619_LN2" "0619_LN3" "0619_LN4" "0619_LN5" "0619_LN6" "0619_N1"  "0619_N2"  "0619_N3"  "0619_N4"  "0619_N5" 
[12] "0619_P1"

> tcr$meta
# A tibble: 12 x 4
   Sample   patient source tissue  
   <chr>    <chr>   <chr>  <chr>   
 1 0619_LN1 S0619   LN1    LN      
 2 0619_LN2 S0619   LN2    LN      
 3 0619_LN3 S0619   LN3    LN      
 4 0619_LN4 S0619   LN4    LN      
 5 0619_LN5 S0619   LN5    LN      
 6 0619_LN6 S0619   LN6    LN      
 7 0619_N1  S0619   N1     normal  
 8 0619_N2  S0619   N2     adjacent
 9 0619_N3  S0619   N3     tumor   
10 0619_N4  S0619   N4     tumor   
11 0619_N5  S0619   N5     tumor   
12 0619_P1  S0619   P1     PBMC  

First, I visualized the clonality by proportion.

tcr.imm_pr = repClonality(tcr$data, .method = "clonal.prop")
vis(tcr.imm_pr)

image

vis(tcr.imm_pr, .by = 'tissue', .meta = tcr$meta, .test = F)

image

In the individual samples, "0619_N3", "0619_N4" and "0619_N5" (from tumor) all have higher values than "0619_P1" (from PBMC). Why does PBMC have a higher value than tumor after using .by = 'tissue'?

Second, I visualized the clonal space homeostasis.

tcr.imm_hom = repClonality(tcr$data, .method = "homeo", .clone.types = c(Small = .0001, Medium = .001, Large = .01, Hyperexpanded = 1))
vis(tcr.imm_hom)

image

vis(tcr.imm_hom, .by = c('tissue'), .meta = tcr$meta, .test = F)

image

In the plot of individual samples, "0619_N2" has no "Small" clones while all the others do. But in the group view, "adjacent" (which consists only of "0619_N2") has over 20% "Small" clones. Also, nearly all samples have a high proportion of "Medium" clones, but only the "LN" tissue shows them in the group view. How should I interpret this?

I am looking forward to hearing from you.

Best,
Yiwei Niu

repDiversity .method="dXX" not working

❓ Questions and Help

The following code gives an error:
div_d10 = repDiversity(.data = coding(immdata$data), .method = "dXX", .perc = 10)
Error in FUN(X[[i]], ...) :
You entered the wrong method! Please, try again.

Thank you in advance for your assistance!
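
Independently of which method name the installed version accepts, the quantity itself is easy to compute by hand: the smallest number of clonotypes whose counts cover at least a given percentage of all clones. The sketch below is a hand-rolled base R version under that definition, not immunarch's implementation; `dxx` and the counts are made up.

```r
# Smallest number of clonotypes whose counts cover at least `perc` percent
# of all clones (a hand-rolled sketch, not immunarch's implementation).
dxx <- function(clones, perc) {
  share <- cumsum(sort(clones, decreasing = TRUE)) / sum(clones)
  which(share >= perc / 100)[1]
}

clones <- c(50, 20, 10, 10, 5, 5)
dxx(clones, 10)   # the largest clonotype alone already covers 10%
dxx(clones, 75)   # the three largest clonotypes are needed for 75%
```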

Issues loading 10x data

❓ Questions and Help

Hello, I am trying to load 10x data into R using repLoad and am getting the following errors. Am I using the correct 10x file? Any ideas for how to resolve my issue?

> immdata <- repLoad("filtered_contig_annotations.csv",.format = "10x")
== Step 1/3: loading repertoire files... ==
Processing "<initial>" ...
  -- Parsing "filtered_contig_annotations.csv" -- 10x
unknown format, skipping

== Step 2/3: checking metadata files and merging... ==
Processing "<initial>" ...
  -- Metadata file not found; creating a dummy metadata...
Dropping  1 column(s) from .metadata.txt. Do you have spaces or tabs after the name of the last column? Remove them to ensure everything works correctly.

== Step 3/3: splitting data by barcodes and chain types... ==
Done!
> str(immdata)
List of 2
 $ data: Named list()
 $ meta: tibble [0 × 0] (S3: tbl_df/tbl/data.frame)
 Named list()
> 

Installation issues

Hello,

I'm getting an error when installing the package using:

devtools::install_url("https://github.com/immunomind/immunarch/raw/master/immunarch.tar.gz")

Here's the error:

Error: (converted from warning) package 'dtplyr' was built under R version 3.6.2
Execution halted
ERROR: lazy loading failed for package 'immunarch'
Error: Failed to install 'unknown package' from URL:
(converted from warning) installation of package ‘C:/Users/X/AppData/Local/Temp/RtmpQ1Pgsx/file3b747cab2090/immunarch_0.5.4.tar.gz’ had non-zero exit status

I also tried installing it locally and get the same message. How can I fix this?

Thanks

RepLoad issue with MiXCR files

Hello,
I just started using Immunarch today, but when I try to parse a folder with several .txt clonoset files that I obtained using mixcr, it gives me the following error:

TCRAs <- repLoad("Users/abolivar1/Documents/TCRseq Project/Bioinformatics/TCRAs/", .format = "mixcr")
== Step 1/3: loading repertoire files... ==
Error in if (file.info(path)$isdir) { :
missing value where TRUE/FALSE needed

I am not sure what would be wrong with the files, as I am using them straight from Mixcr.

Any help would be appreciated

Thanks,
Ana
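
This particular error text is what base R produces when an `if` condition is NA, and file.info() returns NA for a path that does not exist. One possible cause, stated as a guess, is that the path in the call has no leading "/" and is therefore resolved relative to the working directory. The sketch below shows the base R behaviour with a made-up path.

```r
# A made-up relative path that does not exist (note the missing leading "/").
path <- "Users/nobody/this/folder/does/not/exist/"

file.exists(path)
file.info(path)$isdir   # NA rather than FALSE for a missing path

# `if (NA)` is what raises "missing value where TRUE/FALSE needed".
msg <- tryCatch(
  {
    if (file.info(path)$isdir) "directory" else "file"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# isTRUE() is a defensive way to treat NA as "not a directory".
isTRUE(file.info(path)$isdir)
```

Checking file.exists() on the exact string passed to repLoad is a quick way to rule this out.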

Error with installing and loading IMGT output

Hi,

I’m trying to install immunarch in R with:

install.packages("immunarch")

But I get the following error:
Warning message:
package ‘immunarch’ is not available (for R version 3.6.1)

I also tried installing it in R version 3.2.3 and it gave the same error.

Because this didn’t work I installed the pre-release version and tried to load IMGT output data. But then I also get an error:

install.packages("devtools")
devtools::install_url("https://github.com/immunomind/immunarch/raw/master/immunarch.tar.gz")
imgtdata=repLoad("path_to_file", .format="imgt")
== Step 1/3: loading repertoire files... ==
Processing "path_to_file" ...
-- Parsing "path_to_file/1_Summary.txt" -- imgt
Warning: 2 parsing failures.
row col expected actual file
1 -- 34 columns 4 columns 'path_to_file/1_Summary.txt'
1 -- 34 columns 4 columns 'path_to_file/1_Summary.txt'

Warning: 12330 parsing failures.
row col expected actual file
1 -- 34 columns 4 columns 'path_to_file/1_Summary.txt'
2 -- 34 columns 4 columns 'path_to_file/1_Summary.txt'
3 -- 34 columns 4 columns 'path_to_file/1_Summary.txt'
4 -- 34 columns 4 columns 'path_to_file/1_Summary.txt'
5 -- 34 columns 4 columns 'path_to_file/1_Summary.txt'
... ... .......... ......... .............................................................................
See problems(...) for more details.

Error in if (any(str_detect(.name[i], c("TCRA", "TRAV", "TCRG", "TRGV", :
missing value where TRUE/FALSE needed
In addition: Warning messages:
1: Missing column names filled in: 'X34' [34]
2: Missing column names filled in: 'X34' [34]

Visualisation issues (vis)

🐛 Bug

Attempts to visualise several analyses result either in warning messages with no output, or in no output at all. I'm guessing it may be due to the structure of my data, so I'll include my repLoad procedure below.

To Reproduce

  1. Data is the filtered annotation files for two 10x samples. The samples seemed to read in OK, although I got a warning message (The following named parsers don't match the column names: barcode...). Read in using:

immdata <- repLoad("./immunarch")

metadata.txt content:

Sample
wt_UT
wt_aCT

  2. Example of vis command with warning and no output:
> exp_vol = repExplore(immdata$data, .method = "volume")
> vis(exp_vol, .by = c("Sample"), .meta = immdata$meta)
Warning: Ignoring unknown aesthetics: y
Warning: Ignoring unknown aesthetics: xmin, xmax, annotations, y_position
  3. Example of vis command with no output (or warning):
> exp_len = repExplore(immdata$data, .method = "len", .col = "aa")
> vis(exp_len)

Expected behavior

Graphical output of data

Additional comments

Several other vis inputs result in the same issue; I'm happy to list them if that's of use, and also happy to provide a sample dataset. I suspect the issue may lie in the "Source" column of the metadata, which is just my sample names in triplicate (see below) and which may confuse vis.

> immdata[["meta"]][["Source"]]
[1] "wt_aCT" "wt_aCT" "wt_aCT" "wt_UT"  "wt_UT"  "wt_UT" 

geneUsage and Diversity function

Hi, I would like to see the frequency of gene usage. Is there an option to tell .quant to use the column "proportions"?
Moreover, I am interested in the Shannon-Wiener index for a cloneset (list), but the entropy function only allows choosing 1 column from 1 data.frame. Is there a way around that?

Thanks a lot!
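
For the second question, one generic workaround is to apply a per-sample function over the list of data frames. The sketch below computes the Shannon entropy (natural log) per repertoire in base R; the `Clones` column name and the toy list standing in for immdata$data are assumptions, and `shannon` is a hand-written helper rather than the package's entropy function.

```r
# Shannon entropy (natural log) of a vector of clone counts.
shannon <- function(counts) {
  p <- counts / sum(counts)
  p <- p[p > 0]            # 0 * log(0) is treated as 0
  -sum(p * log(p))
}

# Toy stand-in for immdata$data: a named list of data frames
# with an assumed `Clones` column.
reps <- list(
  A = data.frame(Clones = c(10, 10, 10, 10)),
  B = data.frame(Clones = c(37, 1, 1, 1))
)

# One entropy value per repertoire.
sapply(reps, function(df) shannon(df$Clones))
```

An evenly distributed repertoire such as `A` reaches the maximum, log(4), while the skewed repertoire `B` scores lower.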

fail to reload the data

I followed the instructions and created a metadata.txt in the folder containing the vdjtools output (TCR clonotype data), but when I try to reload the data with immdata <- repLoad("data_vdjtool","vdjtools"), the following error comes up:
Parsing data_vdjtool/c.1.txt -- vdjtools
Error in $<-.data.frame(*tmp*, "Proportion", value = numeric(0)) :
replacement has 0 rows, data has 143053.
But when I use the tcR package, immdata <- parse.folder(), it works.

Difficulties parsing .txt files from MiXCR using repLoad

Hi Vadim,

I have used Immunarch previously, but can't get it to load data from my current study.

I have used MiXCR to align, assemble and export clones and alignments on bulk RNA-seq data, and have output files as .txt. Unfortunately repLoad will not parse my data either as the clones (from exportClones) or alignments (exportAlignments).

When defining the data format (.format = "mixcr") I receive the following error message: "Error in strsplit(df[[.dalignments]], "|", T, F, T) : non-character argument".

When leaving the data format undefined I receive this message: "-- unsupported format, skipping".

For simplicity, I haven't included my metadata file in the folder of data; and have limited the contents of the folder to a single text file each time (either clones, or alignments). I have attached an example of these data in a separate email.

Your help would be much appreciated.

Thanks, Michael

Problem generating mds plot after repOverlap

🐛 Bug

Hi there, I am running into an error while trying to plot with vis after repOverlap, exactly as detailed in the manual.

To Reproduce

Steps to reproduce the behavior:

imm_ov1 = repOverlap(immdata.opc$data, .method = "public", .verbose = T)
vis(repOverlapAnalysis(imm_ov1, "mds+kmeans"))

it gives an error
.by="cluster", .meta=immdata.opc$meta

However, when I run
vis(repOverlapAnalysis(imm_ov1, "tsne"))

there is no problem.

This is what the beginning of print(imm_ov1) looks like

image

I think it might be due to the NAs in the middle. Thoughts?

Thanks in advance for your help

2x2 matrix output in repOverlap when there are only two samples to analyse

🚀 Feature

Important request from emails.

Motivation

I am using your immunarch package’s repOverlap function.

However, when there are only two samples in the input directory, the 2 x 2 matrix for repOverlap is not generated correctly: the expected matrix is dropped to a single numeric value. As a result, the heatmap is not drawn correctly; it does not have a 2 x 2 layout, and the axis tick labels are wrong.

Could you please fix the issue and produce the 2 x 2 matrix correctly when there are only 2 samples as input? In my opinion, consistency is important for users when they have many different sample sets to be

Gene usage visualisation grouped by metadata

Hi,
I am trying to graph gene usage in samples as grouped by status. I am using my own dataset and I can't see any grouping happening. I have also tried using the test dataset and I also don't see that the samples get grouped. This is the code I am using, as provided in the Quick Start:

data(immdata) 
gu = geneUsage(immdata$data)
vis(gu, .by="Status", .meta=immdata$meta) 

I don't see any difference by using the .by="Status" argument or not.

Number of clonotypes (or clones) plot y axis labels

I have 3434 cells (clones) in my sample, but the clonotype plot obtained by

exp_vol = repExplore(immdata$data, .method = "volume")
vis(exp_vol)

shows well over 5000 clonotypes.
The clone plot shows over 25000 clones.

See also https://immunarch.com/articles/3_basic_analysis.html where the clonotype numbers are similar.

I believe there is a bug with y axis labels: the ticks are labelled with 2000, 4000 etc instead of the correct numbers.

See also plots of diversity estimate (e.g. Chao1) where the ticks are also labelled with 2000, 4000 etc.

Problems with metadata upload

Hi,
I have been trying for a long time now to import metadata with my data.
The data is read fine, but the metadata is not. I always get the following error:

metadata.txt
"[!] Samples found in the dataset, but not in the metadata: 18_5 22_5 24_5 Did you add all the necessary samples to the metadata file with correct names?"

Please find attached the metadata that was parsed using the following command, together with the files 18_5, 22_5, 24_5:

immdata = repLoad("/Users/anner/AnneDoc/Results_2019/TCRa_Sequencing/Piotr_sequencing /bfx1062/files_grouped_forTcR/Group2_Teff")

Could you please have a look and let me know what is wrong?

Thanks a lot !!!
annecar

Trouble loading an entire directory

Hi,
When I use repLoad("/path/to/mixcrclonesoutputfile.txt"), it works without any problems.

However, when I use repLoad("/path/to/foldercontainingseveralmixcrclones.txtfiles/"), I get this error:
Error in strsplit(df[[.dalignments]], "|", T, F, T) :
non-character argument

The folder contains a metadata.txt file with 3 tab-separated columns:
Sample Column1 Column2

Does anyone know what the issue might be?
Thanks.
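
The "non-character argument" message comes from base R's strsplit(), which only accepts character input. One way this can arise, offered as a possibility rather than a confirmed cause, is a column that is entirely NA in one of the files (for example, no D alignments) and is therefore read as logical. The sketch below shows the base R behaviour on a made-up vector.

```r
# A column that is entirely NA is typically read as logical, not character.
d_alignments <- c(NA, NA, NA)
class(d_alignments)

# strsplit() only accepts character input.
msg <- tryCatch(
  {
    strsplit(d_alignments, "|", fixed = TRUE)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Coercing to character first avoids the error; NA stays NA.
strsplit(as.character(d_alignments), "|", fixed = TRUE)
```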

Test issue #2.

🐛 Bug

To Reproduce

Steps to reproduce the behavior:

Expected behavior

Additional context

test.txt
photo_2019-11-01_14-23-24

Is it possible to compare immunarch objects with repOverlap?

Hello,

Thanks a lot for the great tool.
As the title says, Is it possible to compare immunarch objects with repOverlap?
I created several immunarch objects from a 10x dataset using the "filter_barcode" function based on clusters I got from UMAP clustering of the cells by Seurat.

I can now look at the repertoire diversity per cluster, but I want to check the overlap between clusters too. Is there a way to do this?

Thanks a lot.

Test Issues

🐛 Bug

To Reproduce

Steps to reproduce the behavior:

Expected behavior

Additional context

The problem of multiple chain types

Hi,

I have a question about the chain type and the clonotype definition in the tool.

It seems that each chain (whether TCR α or β) in the same data file is treated as its own clonotype, rather than clonotypes being defined by a combination of chains. Is that right?

Does that mean that different kinds of chains should not be put in the same input file?

Is there any reference for this?

Many thanks,
Meng

Error with repLoad function

Hi,

When using repLoad function I am getting the following error:

immdata <- repLoad("MIXCR/ExportedClones")
Parsing MIXCR/ExportedClones ...
Parsing MIXCR/ExportedClones/metadata.txt -- metadata
Parsing MIXCR/ExportedClones/SLX-17310.i701_i502_tcr.clonotypes.ALL.txt -- mixcr
Error in [[<-.data.frame(*tmp*, .aa.seq, value = list()) :
replacement has 0 rows, data has 218
I am using immunarch version 0.3.3
Output files are from mixcr in .txt format.

I would appreciate it if you could help me to resolve this issue.

Kind regards
Pani

GitlabExodus

Task - GitlabExodus

Transfer Immunarch code from Gitlab to Github

Suggestions: metadata for each cell and integrate with Seurat in single-cell VDJ scenario

Hi immunarch developers,

Great thanks for this cool tool!

I would like to suggest two features which I think would improve the usability of immunarch when the input data comes from single-cell experiments.

  1. Add metadata for each cell.

Currently, immunarch reads data per sample, and the metadata is per sample. In single-cell VDJ, if users want to compare clones between subclusters of cells, they have to create another set of input files. If each cell had its own metadata, it would be much easier to compare different groups of cells.

  2. Provide a small vignette or a set of functions to integrate VDJ data into a Seurat object.

Nowadays, lots of researchers combine single-cell VDJ and single-cell RNA-seq analysis in their studies, and tools like Seurat are popular in single-cell RNA-seq data analysis. It is useful to map VDJ data to the cell clusters defined by the transcriptome. There are already attempts at this, such as this and this. Since immunarch has high-level interfaces to manipulate VDJ data from several different platforms, it would be great if immunarch could provide a more elegant way to do this.

Please ignore me if you think it is trivial.

Install failed with "Error: object ‘tbl_dt’ is not exported by 'namespace:dtplyr'"

🐛 Bug

Dear all,

Installing Immunarch always fails with this error:

Error: object ‘tbl_dt’ is not exported by 'namespace:dtplyr'
Execution halted
ERROR: lazy loading failed for package ‘immunarch’

I tried installing automatically:
devtools::install_url("https://github.com/immunomind/immunarch/raw/master/immunarch.tar.gz")
or manually:

install.packages(c("BiocManager", "covr", "dbscan", "doFuture", "DT", "future", "glmnet", "hdf5r", "hexbin", "Hmisc", "plotly", "prodlim", "R.oo", "RcppAnnoy", "RcppArmadillo", "RcppParallel", "RJSONIO", "rvest", "Seurat", "slam"))
devtools::install_local("~/Downloads/immunarch.tar.gz", dependencies=T)

I tried remove.packages(dtplyr) and reinstalling it, but it didn't help.
Here's the traceback():

8: stop(remote_install_error(remotes[[i]], e))
7: value[[3L]](cond)
6: tryCatchOne(expr, names, parentenv, handlers[[1L]])
5: tryCatchList(expr, classes, parentenv, handlers)
4: tryCatch(res[[i]] <- install_remote(remotes[[i]], ...), error = function(e) {
       stop(remote_install_error(remotes[[i]], e))
   })
3: install_remotes(remotes, dependencies = dependencies, upgrade = upgrade, 
       force = force, quiet = quiet, build = build, build_opts = build_opts, 
       build_manual = build_manual, build_vignettes = build_vignettes, 
       repos = repos, type = type, ...)
2: pkgbuild::with_build_tools({
       ellipsis::check_dots_used(action = getOption("devtools.ellipsis_action", 
           rlang::warn))
       {
           remotes <- lapply(path, local_remote, subdir = subdir)
           install_remotes(remotes, dependencies = dependencies, 
               upgrade = upgrade, force = force, quiet = quiet, 
               build = build, build_opts = build_opts, build_manual = build_manual, 
               build_vignettes = build_vignettes, repos = repos, 
               type = type, ...)
       }
   }, required = FALSE)
1: devtools::install_local("~/Downloads/immunarch.tar.gz", dependencies = T)

here's sessionInfo():

R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_GB.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.3        rstudioapi_0.10   magrittr_1.5      usethis_1.5.1     devtools_2.2.1    pkgload_1.0.2    
 [7] R6_2.4.1          rlang_0.4.1       tools_3.6.1       pkgbuild_1.0.6    sessioninfo_1.1.1 cli_1.1.0        
[13] withr_2.1.2       ellipsis_0.3.0    remotes_2.1.0     assertthat_0.2.1  digest_0.6.22     rprojroot_1.3-2  
[19] crayon_1.3.4      processx_3.4.1    callr_3.3.2       fs_1.3.1          ps_1.3.0          testthat_2.3.0   
[25] memoise_1.1.0     glue_1.3.1        compiler_3.6.1    desc_1.2.0        backports_1.1.5   prettyunits_1.0.2 

I think this should not be a big issue but I cannot make it through. Hope anyone would help me. Thanks a lot to the community!

Datatable error when dividing public repertoire

Hi, I get the following error when running the code below; it complains about the ":=" syntax. I'm using R 3.6.3, immunarch 0.5.5 and dplyr 0.8.5 on Windows 10. I assume it is a bug related to the package versions in use.

library(immunarch)
data("immdata")
immdata <- immdata 

pr = pubRep(immdata$data, "aa", .coding = T, .verbose = F)
pr1 = pubRepFilter(pr, immdata$meta, c(Status = "C"))
pr2 = pubRepFilter(pr, immdata$meta, c(Status = "MS"))
pr3 = pubRepApply(pr1, pr2,) 

Error in :=(Samples.y, NULL) :
Check that is.data.table(DT) == TRUE. Otherwise, := and :=(...) are defined for use in j, once only and in particular ways. See help(":=").

I also get a warning when running pubRepFilter that:

You are using a dplyr method on a raw data.table, which will call the data frame
implementation, and is likely to be inefficient.

To suppress this message, either generate a data.table translation with lazy_dt()
or convert to a data frame or tibble with as.data.frame()/as_tibble().

Error with geneUsage on IGH

🐛 Bug

I am trying to run geneUsage on a MiXCR dataset that I imported consisting only of IGH chains. I can run it to generate a histogram of v gene usage (using "hs.ighv") but when I try to look at other genes, it only can pull the ighv genes.

To Reproduce

The following will produce a histogram of v genes.
v_usage = geneUsage(BCRdata.coding$data, "hs.ighv", .norm = TRUE, .ambig = "mag")
vis(v_usage, .plot = "hist")

However, the following attempts to generate d and j gene histograms just return the same v gene histogram.
d_usage = geneUsage(BCRdata.coding$data, "hs.ighd", .norm = TRUE, .ambig = "mag")
vis(d_usage, .plot = "hist")
or
j_usage = geneUsage(BCRdata.coding$data, "hs.ighj", .norm = TRUE, .ambig = "mag")
vis(j_usage, .plot = "hist")

Looking at d_usage or j_usage themselves, they are actually just the same as v_usage, a table of the v genes in my dataset. So it seems like geneUsage is having trouble pulling the correct genes from my dataset.

Additional context

sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.3 LTS

other attached packages:
[1] cowplot_1.0.0 openxlsx_4.1.3 immunarch_0.5.2 gridExtra_2.3
[5] data.table_1.12.6 dtplyr_1.0.0 dplyr_0.8.3 ggplot2_3.2.1

repLoad error for 10X

I encountered an error when trying to repLoad a 10X VDJ library. Do you know why?

> repLoad(
+   "Fresh1_Tcell_vdj/outs/", 
+   .format = "10x")
Parsing Fresh1_Tcell_vdj/outs/ ...
Parsing Fresh1_Tcell_vdj/outs//all_contig_annotations.bed -- 10x
Error in `[[<-.data.frame`(`*tmp*`, .nuc.seq, value = character(0)) : 
  replacement has 0 rows, data has 7715
In addition: Warning messages:
1: The following named parsers don't match the column names: AGTAGTCTCGTTTAGG-1_contig_1, 22, 356, TRBV13-1_L-REGION+V-REGION 
2: In .which_recomb_type(df[[.vgenes]]) :
  Can't determine the type of V(D)J recombination. No insertions will be presented in the resulting data table.

I used cellranger v3.0.2 to produce the input repertoire files.

Thanks in advance!
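The log above shows the loader picking up all_contig_annotations.bed, a headerless BED file, rather than a CSV table. One thing that may be worth trying is pointing repLoad at a contig annotation CSV directly instead of at the whole folder. The sketch below only locates such files; the helper name find_contig_csvs and the demo folder are hypothetical, and whether repLoad then parses the CSV correctly in this immunarch version is an assumption.

```r
# Hypothetical helper: list contig annotation CSVs in a Cell Ranger output folder.
find_contig_csvs <- function(outs_dir) {
  list.files(outs_dir, pattern = "contig_annotations\\.csv$", full.names = TRUE)
}

# Demo with a temporary folder that mimics the file names of an outs/ directory.
outs_dir <- file.path(tempdir(), "outs_demo")
dir.create(outs_dir, showWarnings = FALSE)
invisible(file.create(file.path(outs_dir, c(
  "all_contig_annotations.bed",
  "all_contig_annotations.csv",
  "filtered_contig_annotations.csv"
))))

# Only the CSV tables are kept; the BED file is skipped.
csv_files <- find_contig_csvs(outs_dir)
print(basename(csv_files))

# A selected path could then be passed on (untested assumption), e.g.:
# repLoad(csv_files[grepl("filtered", csv_files)], .format = "10x")
```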

Can't upload immunarch formatted table.

🐛 Bug

Files in immunarch format are not being imported.

To Reproduce

Steps to reproduce the behavior:

1. immdata = repLoad(.path = "AF.tsv", .format = 'immunarch')

Error in `[[<-.data.frame`(`*tmp*`, IMMCOL$cdr3nt, value = logical(0)) : 
  replacement has 0 rows, data has 3

AF.tsv.zip

Expected behavior

I should be able to import it into an immdata object

Empty R data frames after parsing MiXCR files

🐛 Bug

Important issue from our support email:

Email 1

I managed to repLoad my files but with these warning:

Warning messages:
1: In readLines(f, 1) : line 1 appears to contain an embedded nul
2: In readLines(f, 1) : line 1 appears to contain an embedded nul
3: In readLines(f, 1) : line 1 appears to contain an embedded nul

And when I try to load the data (immdata2) I get this:

> immdata2
$data
named list()

$meta
# A tibble: 0 x 0

So I wonder if I used the right file from MiXCR (clna)?

Email 2

I used the "all" file and it works!!!
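The "embedded nul" warnings in Email 1 suggest that the .clna file is a binary container rather than a text table, which would explain the empty result. A minimal sketch of a pre-check in base R follows; looks_binary is a hypothetical helper, not part of immunarch, and the two temporary files only stand in for a text export and a binary file.

```r
# Hypothetical helper: TRUE if the first `n` bytes of `path` contain a NUL byte,
# which a plain-text table should never have.
looks_binary <- function(path, n = 4096L) {
  bytes <- readBin(path, what = "raw", n = n)
  any(bytes == as.raw(0))
}

# Demo files: a small tab-separated table and a file with embedded NUL bytes.
txt_file <- tempfile(fileext = ".txt")
bin_file <- tempfile(fileext = ".clna")
writeLines(c("cloneId\tcloneCount", "0\t42"), txt_file)
writeBin(as.raw(c(0x4d, 0x69, 0x00, 0x58, 0x00)), bin_file)

print(looks_binary(txt_file))
print(looks_binary(bin_file))
```

Running such a check before repLoad would flag files that need to be exported to text first.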

Error using repDiversity module

Hey, I got an error message when using the repDiversity module and I don't know how to solve the problem. The test data is a result from MiXCR. Is there some way to get past it? Many thanks.
image

Including BCR constant region information

🚀 Feature

Most BCR analysis programs (e.g. MiXCR and Immcantation) also output a field with information about the constant region (e.g. IGHG1, IGHA1). It would be nice to load that information into immunarch as well.

Motivation

The BCR constant region is an important characteristic of the repertoire; for example, the proportion of clones in IGHM vs. IGHA tells us a lot about which stage of affinity maturation/somatic hypermutation the clones are in. So far, whenever I have needed this kind of analysis, I have written a custom script to reload the data from MiXCR.
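As an illustration of the kind of custom script meant here, the following is a minimal sketch in base R. It assumes a MiXCR text export with cloneCount and allCHitsWithScore columns; both column names and the toy values are assumptions, so adjust them to the actual export.

```r
# Toy table standing in for a MiXCR text export (column names are assumptions).
clones <- data.frame(
  cloneCount = c(50, 30, 15, 5),
  allCHitsWithScore = c("IGHM*00(120)", "IGHG1*00(98),IGHG2*00(95)",
                        "IGHA1*00(77)", "IGHM*00(64)"),
  stringsAsFactors = FALSE
)

# Keep the first (best) hit, then strip allele and score: "IGHM*00(120)" -> "IGHM".
best_hit <- sub(",.*$", "", clones$allCHitsWithScore)
clones$c_gene <- sub("\\*.*$", "", best_hit)

# Share of clone counts per constant-region gene.
counts <- tapply(clones$cloneCount, clones$c_gene, sum)
isotype_share <- counts / sum(counts)
print(isotype_share)
```

A dedicated constant-region column in the loaded repertoire would make the string handling above unnecessary.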

Pitch

Add an extra field for CREGION

Alternatives

N/A
