
embryotimecourse2018's Introduction

EmbryoTimecourse2018

This repo contains data and analysis scripts for our Embryo Timecourse paper. There are two types of content here: scripts to get the processed data (download), and scripts that we have used to do our analyses (analysis_scripts).

To use the download script, please ensure you have curl installed, then run the script in the download folder. You can choose which of our datasets to download while the script runs.

Please note that you can now use our Bioconductor package MouseGastrulationData to access these data more efficiently, with processed and raw count matrices delivered directly into your R session. Download times should also be much shorter if you use the package, and you can use e.g. loomr to save the data to an easily readable format if you want to analyse the data with Python.
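A minimal sketch of fetching the processed atlas through the package; the sample numbers are illustrative, and the full interface is described in the package vignette:

# install if needed: BiocManager::install("MouseGastrulationData")
library(MouseGastrulationData)

# fetch two samples of the processed atlas as a SingleCellExperiment
sce <- EmbryoAtlasData(type = "processed", samples = c(1, 2))
sce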

embryotimecourse2018's People

Contributors

jonathangriffiths


embryotimecourse2018's Issues

ygenes.tab

Where can I find the ygenes.tab file that is required by the getHVGs() function?
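A possible way to regenerate a Y-chromosome gene table with biomaRt, assuming ygenes.tab simply lists mouse chromosome-Y gene IDs (an assumption; the exact columns expected by getHVGs() may differ):

library(biomaRt)

# query Ensembl for all mouse genes annotated on chromosome Y
mart <- useEnsembl(biomart = "ensembl", dataset = "mmusculus_gene_ensembl")
ygenes <- getBM(
  attributes = c("ensembl_gene_id", "external_gene_name"),
  filters = "chromosome_name",
  values = "Y",
  mart = mart
)
write.table(ygenes, "ygenes.tab", sep = "\t", quote = FALSE, row.names = FALSE)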

How to get the files in the analysis script?

Hi @jonathangriffiths ,
I am trying to reuse your analysis pipeline. When I started the preprocessing, I found that some files were missing.

In the atlas folder:

Cell calling step:
exp = read.table("/nfs/research1/marioni/jonny/embryos/raw/meta/exp_cells.tab", sep = "\t", header = TRUE)

QC step:
exp_design = read.table("/nfs/research1/marioni/jonny/embryos/raw/meta/sample_stage_map.csv", sep = ",", header = TRUE)

There are similar files referenced in two other folders. Could you please tell me how I can obtain or generate these files?
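A minimal sketch of redirecting the hard-coded absolute paths to a local copy of the metadata; the root directory below is hypothetical, and the relative layout simply mirrors the paths used in the scripts, so the files themselves still need to be obtained (e.g. via the download script or from the authors):

root_dir <- "~/data/embryos"  # hypothetical local root

# cell calling step
exp <- read.table(file.path(root_dir, "raw/meta/exp_cells.tab"),
                  sep = "\t", header = TRUE)

# qc step
exp_design <- read.table(file.path(root_dir, "raw/meta/sample_stage_map.csv"),
                         sep = ",", header = TRUE)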

Question about defining clusters

Hi Jonny @jonathangriffiths ,

I am a bit confused about why different numbers of clusters appear in the analysis.

Your tutorial produces 19 clusters when constructing the atlas, but the published paper reports 37 cell types. I also see that there are 37 samples, and I wonder whether you found markers and assigned cell types based on samples rather than clusters. Could you tell me how the 37 cell types were generated?

Another question: from your tutorial, it seems that you first assign a cell type to each cluster and then find markers based on cell types. Is it correct to say that you first generate 37 clusters, then assign cell types, and finally find markers?

Thanks bro

Huang
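A minimal sketch, not the authors' exact pipeline, of how a coarse graph-based clustering can be refined into finer cell-type labels with scran and igraph; the reduced-dimension name "pca.corrected" and the choice of k are assumptions:

library(scran)
library(igraph)

# cluster on the batch-corrected principal components
g <- buildSNNGraph(sce, use.dimred = "pca.corrected", k = 20)
sce$cluster <- factor(cluster_louvain(g)$membership)   # coarse clusters

# finer labels usually come from sub-clustering each coarse cluster and
# annotating with known marker genes, which is why the number of published
# cell types can exceed the number of top-level clusters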

How to create a loom file from this data in R?

Hi there! I'm a brand-new student in bioinformatics trying to create a loom file for a project in Python, but I am not sure whether I have created one correctly. I have gone through the whole process of obtaining your data through Bioconductor in R, but I am unsure how to generate a proper loom file from the data.

Is there a tutorial on how to do that in your files, or do you know of a place where I can learn how?

Thank you!
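A sketch of writing a .loom file from a SingleCellExperiment obtained via MouseGastrulationData, using the Bioconductor LoomExperiment package (loomR, mentioned in the README, is an alternative); the single sample chosen here is just an example:

library(MouseGastrulationData)
library(LoomExperiment)

sce <- EmbryoAtlasData(type = "processed", samples = 21)

# coerce to a loom-backed representation and write it out
scle <- as(sce, "SingleCellLoomExperiment")
export(scle, "atlas_sample21.loom")

The resulting file can then be read in Python with loompy or scanpy.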

Gene Marker Lists

Hello

I was wondering whether a file with the output of findMarkers() exists?

Thanks!
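In case no precomputed marker file is available, a minimal sketch for regenerating marker lists with scran::findMarkers, assuming sce is the processed atlas and that its colData contains a "celltype" column:

library(scran)
library(scuttle)

sce <- logNormCounts(sce)                        # in case logcounts are missing
markers <- findMarkers(sce, groups = sce$celltype, direction = "up")

# one DataFrame per cell type; write each out as a CSV
for (ct in names(markers)) {
  write.csv(as.data.frame(markers[[ct]]),
            file = paste0("markers_", gsub("[^A-Za-z0-9]", "_", ct), ".csv"))
}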

Error while fetching processed data

reference <- MouseGastrulationData::EmbryoAtlasData(type = 'processed')
  |===================================================================================| 100%

snapshotDate(): 2022-10-31
see ?MouseGastrulationData and browseVignettes('MouseGastrulationData') for documentation
downloading 1 resources
retrieving 1 resource
  |===================================================================================| 100%

Error: failed to load resource
  name: EH2701
  title: Atlas processed counts (sample 1)
  reason: 1 resources failed to download
In addition: Warning messages:
1: download failed
  web resource path: ‘https://experimenthub.bioconductor.org/fetch/2717’
  local file path: ‘/Users/administrateur/Library/Caches/org.R-project.R/R/ExperimentHub/80365376fa9_2717’
  reason: Internal Server Error (HTTP 500). 
2: bfcadd() failed; resource removed
  rid: BFC5
  fpath: ‘https://experimenthub.bioconductor.org/fetch/2717’
  reason: download failed 
3: download failed
  hub path: ‘https://experimenthub.bioconductor.org/fetch/2717’
  cache resource: ‘EH2701 : 2717’
  reason: bfcadd() failed; see warnings() 
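HTTP 500 is a server-side error, so simply retrying later often resolves it. If a partially downloaded file is left behind, clearing the local ExperimentHub cache before retrying can also help; this is a general remedy, not specific to this dataset:

library(ExperimentHub)
removeCache(ExperimentHub(), ask = FALSE)        # wipe the local hub cache

reference <- MouseGastrulationData::EmbryoAtlasData(type = "processed")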

Metadata on cluster information

Hi,

Thank you so much for sharing data and analysis protocol.

As I begin to explore the atlas dataset, I'm wondering whether the only way to get an atlas with cell type annotation is to run the entire posted analysis pipeline, or whether there is a metadata file that already contains the annotation.

Really appreciate all your help!
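As far as I can tell, the processed objects returned by MouseGastrulationData already carry the published annotation in their colData; the column names below ("stage", "celltype") are from my reading of the package and the chosen sample is illustrative, so check colData(sce) in your installed version:

library(MouseGastrulationData)

sce <- EmbryoAtlasData(type = "processed", samples = 21)
head(colData(sce)[, c("stage", "celltype")])
table(sce$celltype)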

cannot download from ArrayExpress

Hi Jonny,
When I tried to download the following data from ArrayExpress (Atlas: E-MTAB-6967; Smart-seq2 endothelial cells: E-MTAB-6970; Tal1−/− chimaeras: E-MTAB-7325; wild-type chimaeras: E-MTAB-7324), I encountered an error. However, downloading other data using ascp works fine. I'm wondering whether there is an issue on your side?
[screenshot of the error attached]

load_data() function runtime

Hi,

I am trying to follow your (excellent!) method to re-run the analysis, mostly to learn more about single-cell analysis, but also to explore the dataset further. After making a few minor changes to the scripts (mostly removing hard-coded paths), I have tried to load the data using the load_data() function. I am currently running the code from step 7_define_clusters using RStudio Server on a 16-core, 256 GB RAM virtual machine.
I seem to be still stuck at the load_data step:

load_data(remove_doublets = TRUE, remove_stripped = TRUE, load_corrected = TRUE, root_dir = "~/data/atlas")

The R process has now been running for more than 3 hours, and peak memory usage was around 72 GB (htop). I can see that the function must have progressed, as different cores have been active during these three hours.

Is this expected behaviour?

Thanks in advance!
