hemberg-lab / scrna.seq.datasets

Collection of public scRNA-Seq datasets used by our group

Home Page: https://hemberg-lab.github.io/scRNA.seq.datasets/

License: GNU General Public License v3.0

Shell 27.89% R 66.50% Perl 4.23% Dockerfile 1.37%
aws dataset docker jenkins mkdocs openstack s3-storage single-cell

scrna.seq.datasets's People

Contributors

gkild, tallulandrews, wikiselev

scrna.seq.datasets's Issues

Yan dataset

Hello,

Bash links appear to be broken:

scrnaseq]$ wget 'https://s3.amazonaws.com/scrnaseq-public-datasets/manual-data/yan/nsmb.2660-S2.csv'
--2024-03-21 14:26:38-- https://s3.amazonaws.com/scrnaseq-public-datasets/manual-data/yan/nsmb.2660-S2.csv
Resolving s3.amazonaws.com (s3.amazonaws.com)... 52.217.200.232, 52.217.103.102, 52.217.117.24, ...
Connecting to s3.amazonaws.com (s3.amazonaws.com)|52.217.200.232|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2024-03-21 14:26:39 ERROR 404: Not Found.

Would it be possible to update it?

Datasets on Amazon not found.

Thanks for your work! I am running your code, but I found that several datasets which should be downloaded from the Amazon server are gone, for example the Pollen dataset, fetched by ./bash/pollen.sh. Is there any way to download these datasets or otherwise solve this problem? Thanks!
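
A quick way to see which download links are affected is to pull every URL out of the download scripts and probe each one without fetching the file. The sketch below assumes the scripts live in ./bash/ (as ./bash/pollen.sh suggests) and that curl is installed; the function names extract_urls and check_url are illustrative and not part of the repository.

```shell
#!/usr/bin/env bash
# Sketch: list the URLs referenced by download scripts and report their HTTP status.
# Assumptions: scripts are under ./bash/; function names are hypothetical.

# Print every http(s) URL found in the given files, one per line, de-duplicated.
extract_urls() {
    grep -hoE "https?://[^[:space:]'\"]+" "$@" | sort -u
}

# Print the HTTP status code for a URL using a HEAD request (no body is downloaded).
check_url() {
    curl -s -o /dev/null -I -L -w '%{http_code}' "$1"
}

# Usage (requires network access):
#   extract_urls ./bash/*.sh | while read -r url; do
#       printf '%s %s\n' "$(check_url "$url")" "$url"
#   done
```

Lines starting with 404 would then point at the files that need to be re-hosted. Some servers answer HEAD requests differently from GET, so a non-200 code is a hint to retry with wget rather than proof that the file is gone.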

Cell annotations for the Biase dataset

Hi,

It appears the cell annotations for the Biase dataset were on Amazon and the file is no longer accessible. Would it be possible to share the cell annotation file?

Thanks,
Atif

R scripts don't produce same RDS files

I've noticed that downloading the raw data and running the provided R scripts to generate the RDS files produces different results from the files on the website. Specifically, the logcounts don't match everywhere.

I discovered this while looking through the scmap paper and going through the data sets. This problem only occurs for the data sets with CPM normalization (in Supplementary Table 2 from scmap) namely: Goolam, Li, Kolodziejczyk, Baron, Segerstolpe, Klein, Zeisel, Shekhar and Macosko.

I've gone through how the logcounts are computed in create_sce.R, but recomputing them that way doesn't reproduce the published values. For example, for the Li data set:

> log2(calculateCPM(sceset, use_size_factors = FALSE) + 1)[1:5, 1:2]
         RHA015__A549__turquoise RHA016__A549__turquoise
TSPAN6                  9.009166               9.2430517
TNMD                    0.000000               0.0000000
DPM1                    4.888420               6.9251494
SCYL3                   0.000000               6.7888400
C1orf112                0.000000               0.6565277

while loading li.rds available at https://hemberg-lab.github.io/scRNA.seq.datasets/human/tissues/ gives

> logcounts(li)[1:5, 1:2]
         RHA015__A549__turquoise RHA016__A549__turquoise
TSPAN6                 10.352333               10.586473
TNMD                    0.000000                0.000000
DPM1                    6.203446                8.262799
SCYL3                   0.000000                8.125773
C1orf112                0.000000                1.300885

How were these logcounts computed?
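
One way to narrow this down is to undo the log2(x + 1) transform on both sets of values and compare them on the CPM scale. The sketch below does this with awk for the non-zero entries printed above; the function name cpm_ratio is hypothetical. It is a diagnostic only and does not establish how the published files were built.

```shell
#!/usr/bin/env bash
# cpm_ratio A B: back-transform two log2(x + 1) values and print (2^B - 1) / (2^A - 1).
# A = value recomputed with calculateCPM, B = value stored in li.rds.
cpm_ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.3f\n", (2^b - 1) / (2^a - 1) }'
}

# Columns: gene, cell, recomputed logcount, published logcount (copied from the output above).
while read -r gene cell recomputed published; do
    printf '%-9s %-24s %s\n' "$gene" "$cell" "$(cpm_ratio "$recomputed" "$published")"
done <<'EOF'
TSPAN6   RHA015__A549__turquoise 9.009166  10.352333
DPM1     RHA015__A549__turquoise 4.888420  6.203446
TSPAN6   RHA016__A549__turquoise 9.2430517 10.586473
DPM1     RHA016__A549__turquoise 6.9251494 8.262799
SCYL3    RHA016__A549__turquoise 6.7888400 8.125773
C1orf112 RHA016__A549__turquoise 0.6565277 1.300885
EOF
```

For these six pairs the ratio comes out at about 2.5 in both cells. A ratio that stays constant across genes and cells would be consistent with the same counts being multiplied by a fixed factor before the log transform (for instance a different library-size definition), rather than with different underlying counts.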

Citation of this data collection

I was wondering how to formally cite this collection of scRNA-seq datasets for academic publication.

Thank you for providing this comprehensive and easy-to-use collection. It saved me a lot of time.

goolam.R does not run without error

When running goolam.R to convert the data into a SingleCellExperiment, line 17

sceset <- create_sce_from_counts(d, ann)

causes the following error:

Error in .calculate_cpm(assay(x, exprs_values), ...) : unused argument (use.size.factors = FALSE)

Traceback information:

Error in .calculate_cpm(assay(x, exprs_values), ...) :
unused argument (use.size.factors = FALSE)
9. .local(x, ...)
8. .nextMethod(x = x, size_factors = size_factors, ...)
7. eval(call, callEnv)
6. eval(call, callEnv)
5. callNextMethod(x = x, size_factors = size_factors, ...)
4. .local(x, ...)
3. calculateCPM(sceset, use.size.factors = FALSE)
2. calculateCPM(sceset, use.size.factors = FALSE) at create_sce_mirror.R#14
1. create_sce_from_counts(d, ann)

R version 3.6.1. This happens under both Windows 10 and Ubuntu 18.04.

Treutlein bash script no longer works

When running treutlein.sh to download the data, I get the following error message:

HTTP request sent, awaiting response... 404 Not Found
2020-04-14 16:23:20 ERROR 404: Not Found.

Opening the page manually also leads to a 404 error.
