velocyto-team / velocyto-notebooks Goto Github PK

HTML 40.10% Jupyter Notebook 59.90%

velocyto-notebooks's Issues

Reg: "TSNE1", "TSNE2", ClusterName, Clusters

Hi,

Where do I get "TSNE1", "TSNE2", ClusterName, and Clusters? I understand that these may be precomputed for a dataset. Can you give an example of the structure and contents of these variables?

Many thanks!
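
For readers with the same question, here is a minimal sketch of what such per-cell (column) attributes typically look like: each one is a one-dimensional array with exactly one entry per cell, all in the same cell order. The values are invented for illustration, and plain Python lists stand in for the arrays stored in a loom file.

```python
# Hypothetical per-cell annotations for a toy dataset of 4 cells.
# Every attribute has one entry per cell, in the same cell order.
column_attrs = {
    "TSNE1": [-12.4, 3.1, 8.7, -0.5],   # x coordinate of each cell in a precomputed t-SNE
    "TSNE2": [5.0, -7.2, 1.3, 9.9],     # y coordinate of each cell
    "Clusters": [0, 0, 1, 2],           # integer cluster label per cell
    "ClusterName": ["Radial glia", "Radial glia", "Neuroblast", "Granule"],  # readable label
}

n_cells = len(column_attrs["TSNE1"])
# All attributes must have the same length (one value per cell).
assert all(len(values) == n_cells for values in column_attrs.values())

# The embedding used for plotting is the two coordinates paired up per cell.
embedding = list(zip(column_attrs["TSNE1"], column_attrs["TSNE2"]))
```

The coordinates and cluster labels themselves would come from whatever dimensionality reduction and clustering was run on the data beforehand.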

Can I get a loom file from the gene expression matrix?

Dear,
Can loom files be obtained only from the BAM file after alignment? Can I get a loom file from the gene expression matrix, which is easier to download? If an experiment has 100 FASTQ files, is the only way to align them one by one and then combine the resulting loom files into a single loom file?
For example, how did you obtain the loom file for GSE104323 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE104323)?
Thank you for your attention!

Typos and running jupyter notebooks

Hello,
Looking at the Jupyter notebook code, I got errors in both notebooks when calling the following line:

vlm.set_clusters(vlm.ca["ClusterName"], cluter_colors_dict=colors_dict)
Changing "cluter_colors_dict" to "cluster_colors_dict" solved this issue.

However, running through the rest of the jupyter notebooks I am unable to proceed after calling:
vlm.perform_PCA()

The problem seems to be that the kernel dies and attempts to restart.
I am not sure what I haven't set up correctly, but help would be appreciated.

Typo in R Notebook

I think this line in the Mouse BM notebook:
length(intersect(rownames(emat),rownames(emat)))
Should be:
length(intersect(rownames(emat),rownames(nmat)))

This doesn't affect the results of the analysis; it's just a minor typo.
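
To see why the original line cannot serve as a check, here is a small Python sketch with toy gene names (not the notebook's data): intersecting a set with itself always returns the whole set, whereas intersecting the spliced and unspliced gene lists shows how many genes the two matrices actually share.

```python
# Toy gene names standing in for rownames(emat) and rownames(nmat).
emat_genes = {"Actb", "Gapdh", "Sox2", "Pax6"}
nmat_genes = {"Actb", "Sox2", "Neurod1"}

self_overlap = len(emat_genes & emat_genes)   # always equals len(emat_genes)
shared_genes = len(emat_genes & nmat_genes)   # genes present in both matrices
```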

Can I run velocyto with RNAseq done with Illumina Hiseq

Dear all,

I'm studying germline cells. I isolated single nuclei from single germ cells and sequenced their content on an Illumina HiSeq paired-end platform. The nuclei are huge and gave me enough RNA to make the libraries. I have 16 such samples/datasets with ~8 Gb of reads each.
Can I run velocyto on my data (perhaps with the run command, "Run on any technique")? I can use aligned and sorted BAM files as input. Do you have any general suggestions, or suggestions for a read alignment tool? Are exon-junction-aware tools preferred?

Thanks!

Wanpeng

cell cycle genes

Hi,

I'm trying to map velocity vectors onto my existing UMAP/t-SNE embeddings. These embeddings were generated without the effect of cell cycle genes (I tried a few suboptimal ways to do this), so I'd like to filter out cell cycle genes from the velocyto analysis as well. In the Haber et al. notebook, a set of cell cycle genes is downloaded and filtered out:

urlretrieve("http://pklab.med.harvard.edu/velocyto/Haber_et_al/goatools_cellcycle_genes.txt",
            "data/goatools_cellcycle_genes.txt")

How exactly were the cell cycle genes defined in the first place? From the file name, I guess the list came from goatools. It would be great if the author could share the script that produced the list of cell cycle genes. Many thanks!

pyvelocyto on data from different platforms

Hi,

I have aligned BAM files from different platforms (Smart-seq2 and inDrop). Is it possible to run pyvelocyto on a population of cells coming from different platforms? I already have the individual BAM files.
Is it just a matter of running the following command?

velocyto run -c -U BAMFILE... GTFFILE

Thanks,
Goutham A

Analysis Pipeline : calculate_grid_arrows()

When I run the analysis for the dentate gyrus notebook, calculate_grid_arrows() gives me flow_grid (np.ndarray), the grid points. Why does this array not contain a grid point at [0, 0]?
Is there an origin in the final graph?
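
One plausible reason is sketched below, under the assumption that the grid is laid out evenly between the minimum and maximum embedding coordinates. The helper `linspace` and the toy coordinates are illustrative, not velocyto's code. A grid built this way contains [0, 0] only if zero happens to coincide with a grid step in both dimensions.

```python
from itertools import product

def linspace(start, stop, steps):
    """Return `steps` evenly spaced values from start to stop, inclusive."""
    width = (stop - start) / (steps - 1)
    return [start + i * width for i in range(steps)]

# Toy 2-D embedding coordinates (for example t-SNE positions of 4 cells).
embedding = [(-3.0, -1.0), (2.0, 4.0), (7.0, 0.5), (1.0, 9.0)]

xs = [x for x, _ in embedding]
ys = [y for _, y in embedding]

# The grid spans the data range, so its position depends on the data,
# not on where the origin is.
grid_x = linspace(min(xs), max(xs), 5)   # [-3.0, -0.5, 2.0, 4.5, 7.0]
grid_y = linspace(min(ys), max(ys), 5)   # [-1.0, 1.5, 4.0, 6.5, 9.0]
grid_points = list(product(grid_x, grid_y))

has_origin = (0.0, 0.0) in grid_points   # False for this toy data
```

Under this assumption the plot has no special origin: the arrows are anchored at grid points placed relative to the cells, and the axes of a t-SNE or UMAP embedding carry no absolute meaning.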

Multiple samples

Dear,
Does velocyto support multiple samples? I have nine 10x samples (nine .bam files) at different developmental stages. Could I run them together with velocyto?

Help with different vectorfield figures in R and python

Hi,

I am writing to seek your help with an issue I have with generating RNA velocity figures in R and Python: the directions of the arrows don't seem to be consistent between R and Python. Can you please let me know if this is expected, and whether there's a private link where I can share my code and data?

The values for parameters I am using in python and R are:

python:

vlm.score_cluster_expression(min_avg_U=0.05, min_avg_S=0.5)
k=20

R:
fit.quantile <- 0.02;

kCells = 20
emat <- filter.genes.by.cluster.expression(emat,cell.colors,min.max.cluster.average = 0.5)
nmat <- filter.genes.by.cluster.expression(nmat,cell.colors,min.max.cluster.average = 0.05)

Appreciate all the help.

Thanks
Sharvari

knn_imputation: no attribute 'U_sz'

Hi,
running knn_imputation with this line

vlm.knn_imputation(n_pca_dims=9, k=k, balanced=True, b_sight=k*6, b_maxl=int(k*3.5), n_jobs=16)

I get this error

AttributeError: 'VelocytoLoom' object has no attribute 'U_sz'

What am I missing?

Thanks a lot
Andrea

Qi: I am unable to read the trial loom file, as shown below. Any suggestions? Thank you very much.

Dear velocyto-team:

I am unable to read the trial loom file, as shown below. Any suggestions? Thank you very much.

The error message:

vlm = vcy.VelocytoLoom("/data/DentateGyrus.loom")
Traceback (most recent call last):
  File "", line 1, in
  File "/opt/apps/bio/velocyto-0.17.17/lib/python3.7/site-packages/velocyto/analysis.py", line 58, in __init__
    ds = loompy.connect(self.loom_filepath)
  File "/opt/apps/bio/velocyto-0.17.17/lib/python3.7/site-packages/loompy/loompy.py", line 1389, in connect
    return LoomConnection(filename, mode, validate=validate)
  File "/opt/apps/bio/velocyto-0.17.17/lib/python3.7/site-packages/loompy/loompy.py", line 84, in __init__
    self._file = h5py.File(filename, mode)
  File "/opt/apps/bio/velocyto-0.17.17/lib/python3.7/site-packages/h5py/_hl/files.py", line 408, in __init__
    swmr=swmr)
  File "/opt/apps/bio/velocyto-0.17.17/lib/python3.7/site-packages/h5py/_hl/files.py", line 175, in make_fid
    fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 88, in h5py.h5f.open
OSError: Unable to open file (unable to lock file, errno = 11, error message = 'Resource temporarily unavailable')
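
A workaround that is often suggested for this kind of HDF5 locking error is to disable HDF5 file locking through the `HDF5_USE_FILE_LOCKING` environment variable. The error can arise, for example, when the file sits on a network filesystem or another process already has it open. Whether this applies to the setup above is an assumption; the sketch only sets the variable and leaves the original call commented out.

```python
import os

# Set this before opening the file. Doing it before h5py (and therefore
# loompy / velocyto) is imported is the safest order.
os.environ["HDF5_USE_FILE_LOCKING"] = "FALSE"

# The original call would follow unchanged (not executed in this sketch):
# import velocyto as vcy
# vlm = vcy.VelocytoLoom("/data/DentateGyrus.loom")
```

Disabling locking removes a safety check, so it is only reasonable when no other process is writing to the same file.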

Skip samtools sort

Hello,

I already have a sorted bamfile and I am using the velocyto run command. Is there a way to skip the built-in sorting step other than renaming the input bamfile?

Thanks!

No velocity arrows showing up

Hi, I am trying to run velocyto on 10x data using the following set of commands.

vlm = vcy.VelocytoLoom("mouse.loom")
colors_dict = {0: np.array([ 0.95,  0.6,  0.1]), 1: np.array([ 0.85,  0.3,  0.1]), 2: np.array([ 0.8,  0.02,  0.1]),
              3: np.array([ 0.81,  0.43,  0.72352941]), 4: np.array([ 0.61,  0.13,  0.72352941]), 5: np.array([ 0.9,  0.8 ,  0.3])}
vlm.set_clusters(vlm.ca["Clusters"], cluster_colors_dict=colors_dict)
vlm.normalize("S", size=True, log=True)
vlm.normalize("U", size=True, log=True)
vlm.default_filter_and_norm()
vlm.default_fit_preparation()
vlm.fit_gammas()
vlm.predict_U()
vlm.calculate_velocity()
vlm.calculate_shift(assumption="constant_velocity")
vlm.extrapolate_cell_at_t(delta_t=1)
vlm.perform_TSNE()
vlm.estimate_transition_prob(hidim="Sx_sz", embed="ts", transform="linear")
vlm.calculate_embedding_shift(sigma_corr = 0.05)
vlm.calculate_grid_arrows(smooth=0.8, steps=(20, 20), n_neighbors=50)
vlm.plot_grid_arrows(scatter_kwargs_dict={"alpha":0.35, "lw":0.35, "edgecolor":"0.4", "s":38, "rasterized":True}, min_mass=24, angles='xy', scale_units='xy', headaxislength=2.75, headlength=5, headwidth=4.8, quiver_scale=0.47)

Note: I had to change the transform parameter of vlm.estimate_transition_prob to "linear" since "sqrt" was giving an error.

I am also attaching the log of the run, which shows some divide-by-zero warnings.
In the end I just get the PCA plot of the cells colored by cluster, without any flow (arrows). What am I missing?
log.txt

Thanks again!

Few arrows or chaotic arrows. What are the key parameters when analyzing loom files?

Dear velocyto team,
The spliced and unspliced model is really excellent! I have been testing your method for three weeks, but I can't produce good results.
Our project is about the development of an organ. Our data consists of 2000+ cells with 5,000 genes and 100,000 UMIs per cell on average. Spliced and unspliced molecules make up 80% and 17% of the total, respectively.
I followed the analysis pipeline but got few arrows. If I set 'quiver_scale' and 'min_mass' to really small numbers, I get chaotic arrows.
vlm.plot_grid_arrows(scatter_kwargs_dict={"alpha":0.5, "lw":1, "edgecolor":"0.4", "s":80, "rasterized":True},
min_mass=0.1, angles='xy', scale_units='xy',
headaxislength=2.75, headlength=5, headwidth=4.8, quiver_scale=0.005, scale_type="absolute")

If I use the unspliced prediction, I get no arrows. Could you please give me some advice?

When I use Monocle2, I have to set the start state. But I am not sure about the start state of my project, I think RNA velocity is the right method for this project. I really need your professional guidance!!!
Please help me!!!

DEBUG - The file did not specify the _Valid column attribute

Hi,
I used the velocyto run10x command to create the .loom file from 10x data, but when I try to read the loom file in an IPython notebook I get the above message. Any pointers, please?
I used the following command in IPython:

vlm = vcy.VelocytoLoom("mouse.loom")

Thanks in advance.

Running velocyto on bulk RNA

Hello,
I would like to run velocyto on a time series of bulk RNA-seq. The time points are 24 hours apart. I did the alignment with STAR and, as it is a first-strand library, I reversed the mapping strand in the BAM file (see at the end) to be able to use the default logic.
When I looked at the arrows on the PCA, they were not really intuitive. Did I do something wrong? I tried to check with the circadian dataset used in the publication, but as it is SOLiD data I could not map it...
Many thanks,
Lucille
To reverse the strand I did:
samtools view -h $b | awk -v OFS="\t" '{if($1~/^@/){print}else{n=$2;d=16;q=(n-n%d)/d+(n<0);if(q%2==1){$2-=d}else{$2+=d};print}}' | samtools view -b - > ${b}_reversed.bam
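
For reference, the awk arithmetic above appears to toggle the reverse-strand bit (0x10, decimal 16) of the SAM FLAG field: it subtracts 16 when the bit is set and adds 16 otherwise. A minimal Python sketch of the same operation on bare flag values (the function name is illustrative):

```python
REVERSE_STRAND = 0x10  # SAM FLAG bit: read is mapped to the reverse strand

def flip_strand(flag: int) -> int:
    """Toggle the reverse-strand bit of a SAM FLAG value, leaving other bits untouched."""
    return flag ^ REVERSE_STRAND

# Unpaired forward, unpaired reverse, and two typical paired-end flags.
flipped = [flip_strand(f) for f in (0, 16, 99, 147)]
```

Only the read's own strand bit changes here. The mate-strand bit (0x20) of paired-end reads is left as it was, both in this sketch and in the awk command.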

How can I get keys such as 'TSNE1' or 'ClusterName' in vlm.ca?

Thanks for your excellent script. When I run my own data, for which I have generated a '.loom' file, I always get the tracebacks below. I notice that the sample you used has these keys in the dict and does not raise the error. What should I do to solve this and continue with the program?

vlm.ca["TSNE1"]
Traceback (most recent call last):
File "", line 1, in
KeyError: 'TSNE1'

vlm.ca["ClusterName"]
Traceback (most recent call last):
File "", line 1, in
KeyError: 'ClusterName'
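
A small sketch of how one might check which annotations are missing before using them, assuming vlm.ca behaves like a dictionary of per-cell arrays. A plain dict stands in for it here, and all keys and values are toy examples.

```python
# A plain dict stands in for vlm.ca; the key shown is a hypothetical example
# of what a freshly generated loom file might carry.
ca = {"CellID": ["cell_1", "cell_2", "cell_3"]}

required = ["TSNE1", "TSNE2", "ClusterName"]
missing = [key for key in required if key not in ca]

# Annotations computed elsewhere (toy values) can be added, one entry per cell.
if "ClusterName" in missing:
    ca["ClusterName"] = ["A", "A", "B"]

still_missing = [key for key in required if key not in ca]
```

Any key that is still missing has to be computed first, for example by running a clustering or t-SNE step on the data, before code that reads it can run.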
