velocyto-notebooks's Issues
Reg: "TSNE1", "TSNE2", ClusterName, Clusters
Hi,
Where do I get "TSNE1", "TSNE2", ClusterName and Clusters? I understand that these may be precomputed for a dataset. Can you give an example of the structure and contents of these variables?
Many thanks!
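As a rough illustration of what such per-cell annotations could look like, here is a minimal sketch. It assumes that "TSNE1", "TSNE2", "ClusterName" and "Clusters" are column attributes holding one entry per cell; every value, the "CellID" key and the helper name check_column_attrs are made up for illustration.

```python
# Hypothetical column attributes for five cells; every value is invented.
column_attrs = {
    "CellID": ["cell_0", "cell_1", "cell_2", "cell_3", "cell_4"],
    "TSNE1": [-12.3, -11.8, 4.1, 5.0, 20.7],   # x coordinate of each cell
    "TSNE2": [3.2, 2.9, -15.4, -14.8, 8.8],    # y coordinate of each cell
    "ClusterName": ["Radial glia", "Radial glia", "Neuroblast", "Neuroblast", "Granule"],
    "Clusters": [0, 0, 1, 1, 2],               # integer cluster label per cell
}


def check_column_attrs(attrs):
    """Return the number of cells if every attribute has one entry per cell."""
    lengths = {name: len(values) for name, values in attrs.items()}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"attributes differ in length: {lengths}")
    return next(iter(lengths.values()))


n_cells = check_column_attrs(column_attrs)
print(n_cells)
```

The point of the check is that each attribute is a flat vector aligned to the same cell order, so the i-th entry of every vector describes the same cell.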
Can I get a loom file from the gene expression matrix ?
Dear,
Can loom files be obtained only from the bam files after alignment? Can I get a loom file from the gene expression matrix, which is easier to download? If there are 100 fastq files in an experiment, is the only way to align them one by one and then assemble the resulting loom files into a single loom?
For example, how did you obtain the loom file for GSE104323? (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE104323)
Thank you for your attention!
Typos and running jupyter notebooks
Hello,
Looking at the jupyter notebook code, I had errors with both notebooks when calling the following line:
vlm.set_clusters(vlm.ca["ClusterName"], cluter_colors_dict=colors_dict)
Changing "cluter_colors_dict" to "cluster_colors_dict" solved this issue.
However, running through the rest of the jupyter notebooks, I am unable to proceed after calling:
vlm.perform_PCA()
The problem seems to be that the kernel dies and attempts to restart.
I am not really sure what I haven't set up correctly, but help would be appreciated.
Typo in R Notebook
I think this line in the Mouse BM notebook:
length(intersect(rownames(emat),rownames(emat)))
Should be:
length(intersect(rownames(emat),rownames(nmat)))
It doesn't affect the results of the analysis; it's just a minor typo.
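To see why the original line is a vacuous check, here is a small sketch in Python (the notebook itself is in R). The gene names are hypothetical stand-ins for the row names of emat and nmat.

```python
emat_genes = ["Actb", "Gapdh", "Sox2", "Pax6"]   # hypothetical spliced-matrix rows
nmat_genes = ["Gapdh", "Sox2", "Neurod1"]        # hypothetical unspliced-matrix rows

# Intersecting a set with itself always returns the whole set,
# so this count says nothing about the overlap between the matrices.
self_overlap = len(set(emat_genes) & set(emat_genes))

# Intersecting the two different sets gives the number of shared genes.
shared = len(set(emat_genes) & set(nmat_genes))

print(self_overlap, shared)
```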
Can I run velocyto with RNAseq done with Illumina Hiseq
Dear all,
I'm studying germline cells. I isolated single nuclei from single germ cells and sequenced the content on an Illumina HiSeq paired-end platform. The nuclei are huge and gave me enough RNA to make the libraries. I have 16 such samples/datasets with ~8 Gb of reads for each sample.
Can I run velocyto with my data (maybe with "run", the "Run on any technique" subcommand)? I can use aligned and sorted bam files as input. Do you have any general suggestions, or suggestions on the read alignment tool? Are exon-junction-aware tools preferred?
Thanks!
Wanpeng
cell cycle genes
Hi,
I'm trying to map velocity vectors onto my existing UMAP/t-SNE embeddings. These embeddings were generated without the effect of cell cycle genes (I tried a few suboptimal ways to do this), so I'd like to filter out cell cycle genes from the velocyto analysis as well. In the Haber et al notebook, a set of cell cycle genes is downloaded and filtered out:
urlretrieve("http://pklab.med.harvard.edu/velocyto/Haber_et_al/goatools_cellcycle_genes.txt",
"data/goatools_cellcycle_genes.txt")
How exactly were the cell cycle genes defined in the first place? From the file name, I guess the list came from goatools. It would be great if the author could share the script for producing the list of cell cycle genes. Many thanks!
pyvelocyto on data from different platforms
Hi,
I have aligned bam files from different platforms (Smartseq2 and inDrop methods). Is it possible to run pyvelocyto on a population of cells coming from different platforms? I already have the individual bam files.
Is it just a matter of running the following command?
velocyto run -c -U BAMFILE... GTFFILE
Thanks,
Goutham A
Analysis Pipeline : calculate_grid_arrows()
When I run the analysis for the dentategyrus notebook, calling calculate_grid_arrows() gives me flow_grid (np.ndarray), the gridpoints. Why does this array not contain a grid point at [0,0]?
Is there an origin in the final graph?
multi samples
Dear,
Does velocyto support multiple samples? I have nine 10x samples (nine .bam files) at different developmental stages. Could I run them together with velocyto?
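One common approach is to process each sample separately and merge the resulting tables afterwards; loompy provides a combine function for merging loom files, though whether it suits this case is not confirmed in this thread. The sketch below only illustrates the merging idea in plain Python. The function name combine_samples, the sample names and the barcodes are hypothetical.

```python
def combine_samples(samples):
    """Merge per-sample count tables into one table.

    `samples` maps a sample name to (cell_ids, counts), where `counts` maps
    a gene name to a list with one count per cell. Genes missing from a
    sample are filled with zeros, and cell IDs are prefixed with the sample
    name so that identical barcodes from different samples stay distinct.
    """
    all_genes = sorted({gene for _, counts in samples.values() for gene in counts})
    merged_cells = []
    merged = {gene: [] for gene in all_genes}
    for name, (cell_ids, counts) in samples.items():
        merged_cells.extend(f"{name}:{cid}" for cid in cell_ids)
        for gene in all_genes:
            merged[gene].extend(counts.get(gene, [0] * len(cell_ids)))
    return merged_cells, merged


# Two hypothetical stages with overlapping barcodes and gene sets.
samples = {
    "E12": (["AAAC", "AAAG"], {"Sox2": [3, 0], "Pax6": [1, 2]}),
    "E14": (["AAAC"], {"Sox2": [5]}),
}
cells, table = combine_samples(samples)
print(cells)
print(table)
```

Prefixing the cell IDs matters because the same barcode can occur in several 10x runs.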
Help with different vectorfield figures in R and python
Hi,
I am writing to seek your help with an issue I have with generating RNA velocity figures in R and python: the directions of the arrows don't seem to be consistent between R and python. Can you please let me know if this is expected, and if there's a private link to share my code and data?
The values for parameters I am using in python and R are:
python:
vlm.score_cluster_expression(min_avg_U=0.05, min_avg_S=0.5)
k=20
R:
fit.quantile <- 0.02;
kCells = 20
emat <- filter.genes.by.cluster.expression(emat,cell.colors,min.max.cluster.average = 0.5)
nmat <- filter.genes.by.cluster.expression(nmat,cell.colors,min.max.cluster.average = 0.05)
Appreciate all the help.
Thanks
Sharvari
knn_imputation: no attribute 'U_sz'
Hi,
running knn_imputation with this line
vlm.knn_imputation(n_pca_dims=9, k=k, balanced=True, b_sight=k*6, b_maxl=int(k*3.5), n_jobs=16)
I get this error
AttributeError: 'VelocytoLoom' object has no attribute 'U_sz'
What am I missing?
Thanks a lot
Andrea
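The attribute name suggests a size-normalized version of the unspliced matrix, so one possible cause (an assumption, not a confirmed diagnosis) is that a normalization step has not been run before knn_imputation. The sketch below only illustrates what size normalization means conceptually; size_normalize is a hypothetical helper, not the library's implementation.

```python
from statistics import median


def size_normalize(counts):
    """Scale each cell (column) so its total equals the median cell total.

    `counts` is a list of gene rows, each a list with one count per cell.
    Cells with a total of zero are left as zeros.
    """
    n_cells = len(counts[0])
    totals = [sum(row[j] for row in counts) for j in range(n_cells)]
    target = median(totals)
    return [
        [row[j] * target / totals[j] if totals[j] else 0.0 for j in range(n_cells)]
        for row in counts
    ]


# Two genes by three cells, with invented unspliced counts.
U = [[2, 4, 0],
     [2, 4, 6]]
U_sz = size_normalize(U)
print(U_sz)
```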
Qi: Unable to read in the trial loom file, any suggestions?
Dear velocyto-team:
I am unable to read in the trial loom file, as shown below. Any suggestions? Thank you very much.
The error message:
vlm = vcy.VelocytoLoom("/data/DentateGyrus.loom")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/apps/bio/velocyto-0.17.17/lib/python3.7/site-packages/velocyto/analysis.py", line 58, in __init__
    ds = loompy.connect(self.loom_filepath)
  File "/opt/apps/bio/velocyto-0.17.17/lib/python3.7/site-packages/loompy/loompy.py", line 1389, in connect
    return LoomConnection(filename, mode, validate=validate)
  File "/opt/apps/bio/velocyto-0.17.17/lib/python3.7/site-packages/loompy/loompy.py", line 84, in __init__
    self._file = h5py.File(filename, mode)
  File "/opt/apps/bio/velocyto-0.17.17/lib/python3.7/site-packages/h5py/_hl/files.py", line 408, in __init__
    swmr=swmr)
  File "/opt/apps/bio/velocyto-0.17.17/lib/python3.7/site-packages/h5py/_hl/files.py", line 175, in make_fid
    fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 88, in h5py.h5f.open
OSError: Unable to open file (unable to lock file, errno = 11, error message = 'Resource temporarily unavailable')
Qi
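An "unable to lock file" error from HDF5 often appears on network or shared filesystems that do not support file locking, or when another process holds the file open. One commonly used workaround, sketched here under the assumption that no other process is writing to the file, is to disable HDF5 file locking through the HDF5_USE_FILE_LOCKING environment variable. Copying the loom file to a local disk is another option.

```python
import os

# Set the variable before the file is opened; doing so before importing
# h5py (and therefore loompy and velocyto) is the safest ordering.
os.environ["HDF5_USE_FILE_LOCKING"] = "FALSE"

# Only afterwards import the library and open the file, for example:
# import velocyto as vcy
# vlm = vcy.VelocytoLoom("/data/DentateGyrus.loom")
```

Disabling locking removes a safety check, so it is only reasonable when a single process accesses the file.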
Skip samtools sort
Hello,
I already have a sorted bamfile and I am using the velocyto run command. Is there a way to skip the built-in sorting step other than renaming the input bamfile?
Thanks!
No velocity arrows showing up
Hi, I am trying to run velocyto on 10x data using the following set of commands.
vlm = vcy.VelocytoLoom("mouse.loom")
colors_dict = {0: np.array([ 0.95, 0.6, 0.1]), 1: np.array([ 0.85, 0.3, 0.1]), 2: np.array([ 0.8, 0.02, 0.1]),
3: np.array([ 0.81, 0.43, 0.72352941]), 4: np.array([ 0.61, 0.13, 0.72352941]), 5: np.array([ 0.9, 0.8 , 0.3])}
vlm.set_clusters(vlm.ca["Clusters"], cluster_colors_dict=colors_dict)
vlm.normalize("S", size=True, log=True)
vlm.normalize("U", size=True, log=True)
vlm.default_filter_and_norm()
vlm.default_fit_preparation()
vlm.fit_gammas()
vlm.predict_U()
vlm.calculate_velocity()
vlm.calculate_shift(assumption="constant_velocity")
vlm.extrapolate_cell_at_t(delta_t=1)
vlm.perform_TSNE()
vlm.estimate_transition_prob(hidim="Sx_sz", embed="ts", transform="linear")
vlm.calculate_embedding_shift(sigma_corr = 0.05)
vlm.calculate_grid_arrows(smooth=0.8, steps=(20, 20), n_neighbors=50)
vlm.plot_grid_arrows(scatter_kwargs_dict={"alpha":0.35, "lw":0.35, "edgecolor":"0.4", "s":38, "rasterized":True}, min_mass=24, angles='xy', scale_units='xy', headaxislength=2.75, headlength=5, headwidth=4.8, quiver_scale=0.47)
Note: I had to change the transform parameter of the function vlm.estimate_transition_prob to linear, since sqrt was giving an error.
I am also attaching the log of the run, which gives some divide-by-0 warnings.
At the end I just get the PCA colored plot of the cells without any flow (arrows). What am I missing?
log.txt
Thanks again!
Few arrows or chaotic arrows. What are the key parameters when analyzing loom files?
Dear velocyto team,
The spliced and unspliced model is really excellent! I have been testing your method for three weeks, but can't produce good results.
Our project is about the development of an organ. Our data consists of 2000+ cells with 5000 genes and 100,000 UMIs on average. The fractions of spliced and unspliced molecules are 80% and 17%.
I followed the analysis pipeline but got few arrows. If I set 'quiver_scale' and 'min_mass' to really small numbers, I get chaotic arrows.
vlm.plot_grid_arrows(scatter_kwargs_dict={"alpha":0.5, "lw":1, "edgecolor":"0.4", "s":80, "rasterized":True},
min_mass=0.1, angles='xy', scale_units='xy',
headaxislength=2.75, headlength=5, headwidth=4.8, quiver_scale=0.005, scale_type="absolute")
If I use the unspliced prediction, I get no arrows. Could you please give me some advice?
When I use Monocle2, I have to set the start state, but I am not sure about the start state in my project, so I think RNA velocity is the right method for it. I really need your professional guidance!!!
Please help me!!!
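To make the interplay of the two plotting parameters more concrete, here is a minimal sketch. It assumes that min_mass acts as a threshold on a per-gridpoint density value and that quiver_scale behaves like matplotlib's quiver scale with scale_units='xy', where a smaller scale draws longer arrows. The function visible_arrows, the ">=" comparison and the grid values are hypothetical.

```python
def visible_arrows(grid, min_mass, quiver_scale):
    """Return (x, y, dx, dy) for grid points that pass the mass filter.

    `grid` is a list of (x, y, u, v, mass) tuples. The drawn arrow
    components are the velocity components divided by `quiver_scale`.
    """
    return [
        (x, y, u / quiver_scale, v / quiver_scale)
        for x, y, u, v, mass in grid
        if mass >= min_mass
    ]


# Three invented grid points: dense, sparse and nearly empty.
grid = [
    (0, 0, 1.0, 0.0, 30.0),
    (1, 0, 0.5, 0.5, 5.0),
    (2, 0, 0.0, 1.5, 0.2),
]

strict = visible_arrows(grid, min_mass=24, quiver_scale=0.5)
loose = visible_arrows(grid, min_mass=0.1, quiver_scale=0.5)
print(len(strict), len(loose))
```

Under these assumptions, a high min_mass leaves few arrows, while a very low one also draws arrows in sparse regions where the estimate rests on very few cells, which may look chaotic.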
DEBUG - The file did not specify the _Valid column attribute
Hi,
I used the velocyto run10x command to create the .loom file on 10x data, but when I tried reading the loom file in an ipython notebook I got the above error. Any pointers, please?
I used the following command in ipython:
vlm = vcy.VelocytoLoom("mouse.loom")
Thanks in advance.
Running velocyto on bulk RNA
Hello,
I would like to run velocyto on a time series of bulk RNA-seq. Each time point is 24 hours apart. I did the alignment with STAR and, as it is a first-strand library, I reversed the strand of the mappings in the bam (see the end of this message) to be able to use the default logic.
When I looked at the arrows on the PCA, they were not really intuitive. Did I do something wrong? I tried to check with the circadian dataset used in the publication, but as it is SOLiD data I could not map the reads...
Many thanks,
Lucille
To reverse the strand I did:
samtools view -h $b | awk -v OFS="\t" '{if($1~/^@/){print}else{n=$2;d=16;q=(n-n%d)/d+(n<0);if(q%2==1){$2-=d}else{$2+=d};print}}' | samtools view -b - > ${b}_reversed.bam
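The awk expression above toggles the 0x10 bit of the SAM FLAG field (read reverse strand) and leaves header lines untouched. The same operation can be sketched in Python as follows; the function names are hypothetical, and like the awk command the sketch changes only that one bit.

```python
REVERSE_STRAND = 0x10  # SAM FLAG bit: read is mapped to the reverse strand


def flip_strand_flag(flag):
    """Toggle the reverse-strand bit of a SAM FLAG value."""
    return flag ^ REVERSE_STRAND


def flip_sam_line(line):
    """Flip the strand bit of one SAM record; header lines pass through."""
    if line.startswith("@"):
        return line
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    fields[1] = str(flip_strand_flag(int(fields[1])))
    return "\t".join(fields) + newline


print(flip_sam_line("r1\t83\tchr1\t100\n"), end="")
```

It could be used as a filter between `samtools view -h` and `samtools view -b -`, reading records from standard input and writing them to standard output.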
How can I get keys such as 'TSNE1' or 'ClusterName' in vlm.ca?
Thanks for your excellent script. When I run it on my own data, for which I have generated a '.loom' file, I always get the traceback below. I notice that the sample data you used has those keys in the dict and does not raise the error. What should I do to solve this and continue with the program?
vlm.ca["TSNE1"]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'TSNE1'
vlm.ca["ClusterName"]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'ClusterName'
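A KeyError like this usually means the loom file simply carries no such annotations, which would be consistent with them being precomputed for the example data. If an embedding and clusters exist in a separate table, they first have to be put into the same cell order as the loom file. A minimal sketch, assuming a CSV with the hypothetical columns CellID, TSNE1, TSNE2 and ClusterName:

```python
import csv
import io


def align_annotations(cell_ids, csv_text):
    """Return per-cell annotation lists ordered like `cell_ids`."""
    rows = {row["CellID"]: row for row in csv.DictReader(io.StringIO(csv_text))}
    missing = [cid for cid in cell_ids if cid not in rows]
    if missing:
        raise KeyError(f"no annotation for cells: {missing[:5]}")
    return {
        "TSNE1": [float(rows[cid]["TSNE1"]) for cid in cell_ids],
        "TSNE2": [float(rows[cid]["TSNE2"]) for cid in cell_ids],
        "ClusterName": [rows[cid]["ClusterName"] for cid in cell_ids],
    }


# Invented annotation table, deliberately in a different order than the cells.
csv_text = """CellID,TSNE1,TSNE2,ClusterName
cell_b,4.5,-1.0,Neuroblast
cell_a,-3.0,2.5,Radial glia
"""
annotations = align_annotations(["cell_a", "cell_b"], csv_text)
print(annotations)
```

The aligned lists could then presumably be converted to arrays and stored under the matching keys of vlm.ca before running the rest of the notebook.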