
embryotimecourse2018's Introduction

EmbryoTimecourse2018

This repo contains data and analysis scripts for our Embryo Timecourse paper. There are two types of content here: scripts to get the processed data (download), and scripts that we have used to do our analyses (analysis_scripts).

To use the download script, please ensure you have curl installed, then run the script in the download folder. You can choose which of our datasets to download while the script runs.

Please note that you can now use our Bioconductor package MouseGastrulationData to access these data more efficiently, with processed and raw count matrices delivered directly into your R session. Download times should also be much shorter if you use the package, and you can use e.g. loomr to save the data to an easily readable format if you want to analyse the data with Python.
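A minimal sketch of fetching the processed atlas through the package; the sample numbers are illustrative, and the full interface is described in the package vignette:

# install if needed: BiocManager::install("MouseGastrulationData")
library(MouseGastrulationData)

# fetch two samples of the processed atlas as a SingleCellExperiment
sce <- EmbryoAtlasData(type = "processed", samples = c(1, 2))
sce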

embryotimecourse2018's People

Contributors

jonathangriffiths


embryotimecourse2018's Issues

ygenes.tab

Where can I find the ygenes.tab file that is required by the getHVGs() function?
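A possible way to regenerate a Y-chromosome gene table with biomaRt, assuming ygenes.tab simply lists mouse chromosome-Y gene IDs (an assumption; the exact columns expected by getHVGs() may differ):

library(biomaRt)

# query Ensembl for all mouse genes annotated on chromosome Y
mart <- useEnsembl(biomart = "ensembl", dataset = "mmusculus_gene_ensembl")
ygenes <- getBM(
  attributes = c("ensembl_gene_id", "external_gene_name"),
  filters = "chromosome_name",
  values = "Y",
  mart = mart
)
write.table(ygenes, "ygenes.tab", sep = "\t", quote = FALSE, row.names = FALSE)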

How to get the files in the analysis script?

Hi @jonathangriffiths ,
I am trying to reuse your analysis pipeline. When I started the preprocessing, I found that some files were missing.

In the atlas folder:

Cell calling step:
exp = read.table("/nfs/research1/marioni/jonny/embryos/raw/meta/exp_cells.tab", sep = "\t", header = TRUE)

QC step:
exp_design = read.table("/nfs/research1/marioni/jonny/embryos/raw/meta/sample_stage_map.csv", sep = ",", header = TRUE)

There are similar files referenced in two other folders. Could you please tell me how I can obtain or generate these files?
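A minimal sketch of redirecting the hard-coded absolute paths to a local copy of the metadata; the root directory below is hypothetical, and the relative layout simply mirrors the paths used in the scripts, so the files themselves still need to be obtained (e.g. via the download script or from the authors):

root_dir <- "~/data/embryos"  # hypothetical local root

# cell calling step
exp <- read.table(file.path(root_dir, "raw/meta/exp_cells.tab"),
                  sep = "\t", header = TRUE)

# qc step
exp_design <- read.table(file.path(root_dir, "raw/meta/sample_stage_map.csv"),
                         sep = ",", header = TRUE)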

Question about defining clusters

Hi Jonny @jonathangriffiths ,

I am a bit confused about why different numbers of clusters appear in the analysis.

Your tutorial produces 19 clusters when constructing the atlas, but the published paper reports 37 cell types. I also see that there are 37 samples, and I wonder whether you found markers and assigned cell types based on samples rather than clusters. Could you tell me how the 37 cell types were generated?

Another question: from your tutorial, it seems that you first assign a cell type to each cluster and then find markers based on cell types. Is it correct to say that you first generate 37 clusters, then assign cell types, and finally find markers?

Thanks bro

Huang
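A minimal sketch, not the authors' exact pipeline, of how a coarse graph-based clustering can be refined into finer cell-type labels with scran and igraph; the reduced-dimension name "pca.corrected" and the choice of k are assumptions:

library(scran)
library(igraph)

# cluster on the batch-corrected principal components
g <- buildSNNGraph(sce, use.dimred = "pca.corrected", k = 20)
sce$cluster <- factor(cluster_louvain(g)$membership)   # coarse clusters

# finer labels usually come from sub-clustering each coarse cluster and
# annotating with known marker genes, which is why the number of published
# cell types can exceed the number of top-level clusters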

How to create a loom file from this data in R?

Hi there! I'm a brand-new student in bioinformatics trying to create a loom file for a project in Python, but I am not sure whether I have created one correctly. I have gone through the whole process of obtaining your data through Bioconductor in R, but I am unsure how to generate a proper loom file from the data.

Is there a tutorial on how to do that in your files, or do you know of a place where I can learn how?

Thank you!
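A sketch of writing a .loom file from a SingleCellExperiment obtained via MouseGastrulationData, using the Bioconductor LoomExperiment package (loomR, mentioned in the README, is an alternative); the single sample chosen here is just an example:

library(MouseGastrulationData)
library(LoomExperiment)

sce <- EmbryoAtlasData(type = "processed", samples = 21)

# coerce to a loom-backed representation and write it out
scle <- as(sce, "SingleCellLoomExperiment")
export(scle, "atlas_sample21.loom")

The resulting file can then be read in Python with loompy or scanpy.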

Gene Marker Lists

Hello

I was wondering whether a file with the output of findMarkers() exists?

Thanks!
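In case no precomputed marker file is available, a minimal sketch for regenerating marker lists with scran::findMarkers, assuming sce is the processed atlas and that its colData contains a "celltype" column:

library(scran)
library(scuttle)

sce <- logNormCounts(sce)                        # in case logcounts are missing
markers <- findMarkers(sce, groups = sce$celltype, direction = "up")

# one DataFrame per cell type; write each out as a CSV
for (ct in names(markers)) {
  write.csv(as.data.frame(markers[[ct]]),
            file = paste0("markers_", gsub("[^A-Za-z0-9]", "_", ct), ".csv"))
}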

Error while fetching processed data

reference <- MouseGastrulationData::EmbryoAtlasData(type = 'processed')
  |===================================================================================| 100%

snapshotDate(): 2022-10-31
see ?MouseGastrulationData and browseVignettes('MouseGastrulationData') for documentation
downloading 1 resources
retrieving 1 resource
  |===================================================================================| 100%

Error: failed to load resource
  name: EH2701
  title: Atlas processed counts (sample 1)
  reason: 1 resources failed to download
In addition: Warning messages:
1: download failed
  web resource path: ‘https://experimenthub.bioconductor.org/fetch/2717’
  local file path: ‘/Users/administrateur/Library/Caches/org.R-project.R/R/ExperimentHub/80365376fa9_2717’
  reason: Internal Server Error (HTTP 500). 
2: bfcadd() failed; resource removed
  rid: BFC5
  fpath: ‘https://experimenthub.bioconductor.org/fetch/2717’
  reason: download failed 
3: download failed
  hub path: ‘https://experimenthub.bioconductor.org/fetch/2717’
  cache resource: ‘EH2701 : 2717’
  reason: bfcadd() failed; see warnings() 
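HTTP 500 is a server-side error, so simply retrying later often resolves it. If a partially downloaded file is left behind, clearing the local ExperimentHub cache before retrying can also help; this is a general remedy, not specific to this dataset:

library(ExperimentHub)
removeCache(ExperimentHub(), ask = FALSE)        # wipe the local hub cache

reference <- MouseGastrulationData::EmbryoAtlasData(type = "processed")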

Metadata on cluster information

Hi,

Thank you so much for sharing data and analysis protocol.

As I begin to explore the atlas dataset, I'm wondering whether the only way to get an atlas with cell type annotation is to run the entire posted analysis pipeline, or whether there is a metadata file that already contains the annotation.

Really appreciate all your help!
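As far as I can tell, the processed objects returned by MouseGastrulationData already carry the published annotation in their colData; the column names below ("stage", "celltype") are from my reading of the package and the chosen sample is illustrative, so check colData(sce) in your installed version:

library(MouseGastrulationData)

sce <- EmbryoAtlasData(type = "processed", samples = 21)
head(colData(sce)[, c("stage", "celltype")])
table(sce$celltype)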

cannot download from ArrayExpress

Hi Jonny,
When I tried to download the following data from ArrayExpress (Atlas: E-MTAB-6967; Smart-seq2 endothelial cells: E-MTAB-6970; Tal1−/− chimaeras: E-MTAB-7325; wild-type chimaeras: E-MTAB-7324), I encountered an error. However, downloading other data using ascp works fine. I'm wondering whether there is an issue on your side?
[screenshot of the error attached]

load_data() function runtime

Hi,

I am trying to follow your (excellent!) method to re-run the analysis, mostly to learn more about single-cell analysis, but also to explore the dataset further. After making a few minor changes to the scripts (mostly removing hard-coded paths), I have tried to load the data using the load_data() function. I am currently running the code from step 7_define_clusters using RStudio Server on a 16-core, 256 GB RAM virtual machine.
I seem to be still stuck at the load_data step:

load_data(remove_doublets = TRUE, remove_stripped = TRUE, load_corrected = TRUE, root_dir = "~/data/atlas")

The R process has now been running for more than 3 hours, and peak memory usage was around 72 GB (htop). I can see that the function must have progressed, as different cores have been active during these three hours.

Is this expected behaviour?

Thanks in advance!
