lincs's Issues

Download modzs.gctx

I can't figure out where to get modzs.gctx, which is needed to construct the signature dataframe sig_expr_df in consensi.ipynb. From here, you say:

The z-score signature vectors are retrieved from the /xchip/cogs/data/build/a2y13q1/modzs.gctx file on the C3 cloud.

But this was written 2 years ago and the link no longer works. I'm also not sure what this file is or how it was generated.

I appreciate your help in advance!

Level 4 replicates don't match level 5 signature

I am trying to plot some genes using data level 4 for my compound (BRD-K55591206) on HepG2 cells.

There are two signatures with HepG2 cells at level 5:
LJP008_HEPG2_24H:J01
POL001_HEPG2_24H:J09
To make sure I was using the same data as these level 5 signatures, I checked the level 4 replicates of each signature above. The average of the two LJP008 experiments (distil_ids: LJP008_HEPG2_24H_X2_B20:J01|LJP008_HEPG2_24H_X3_B20:J01) matches the level 5 signature value for each gene. Perfect.

However, the level 4 data for signature POL001_HEPG2 (distil_ids: POL001_HEPG2_24H_X1.L2_B23:J07|POL001_HEPG2_24H_X2.L2_B23:J07|POL001_HEPG2_24H_X3.L2_B23:J07) does not match level 5.

If we use the NAT2 gene as an example, we have the following level 5 value: 0.004413

On the other hand, the values for the level 4 replicates are:
POL001_HEPG2_24H_X1.L2_B23:J07 = -0.386299998
POL001_HEPG2_24H_X2.L2_B23:J07 = 0.110600002
POL001_HEPG2_24H_X3.L2_B23:J07 = 0.38409999
The average, 0.036133, does not match the level 5 value, 0.004413.

The compound is BRD-K55591206, 10 µM, 24 h.

Why don’t they match?

I am using cmapR to retrieve the data from these files:
https://clue.io/releases/data-dashboard
https://s3.amazonaws.com/macchiato.clue.io/builds/LINCS2020/level5/level5_beta_trt_cp_n720216x12328.gctx
https://s3.amazonaws.com/macchiato.clue.io/builds/LINCS2020/level4/level4_beta_all_n3026460x12328.gctx
https://s3.amazonaws.com/macchiato.clue.io/builds/LINCS2020/siginfo_beta.txt
https://s3.amazonaws.com/macchiato.clue.io/builds/LINCS2020/instinfo_beta.txt
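
The NAT2 arithmetic above can be reproduced with a short Python sketch. It also illustrates one possible explanation, stated here as an assumption rather than a confirmed answer: if level 5 values are a weighted combination of the level 4 replicates instead of a plain mean, the two numbers can differ whenever the weights are unequal. With only two replicates, any symmetric weighting reduces to the plain mean, which would be consistent with the LJP008 signature matching. The helper `weighted_average` and the uneven weights below are hypothetical and are not taken from the LINCS pipeline.

```python
from statistics import fmean

# Level 4 replicate z-scores for NAT2, copied from the issue text.
replicates = {
    "POL001_HEPG2_24H_X1.L2_B23:J07": -0.386299998,
    "POL001_HEPG2_24H_X2.L2_B23:J07": 0.110600002,
    "POL001_HEPG2_24H_X3.L2_B23:J07": 0.38409999,
}
level5_value = 0.004413  # reported level 5 value for NAT2


def weighted_average(values, weights):
    """Return the weighted mean of values; weights are normalized to sum to 1."""
    total = sum(weights)
    return sum(v * w / total for v, w in zip(values, weights))


values = list(replicates.values())

# Plain mean of the three replicates, as computed in the issue.
simple_mean = fmean(values)

# Equal weights reproduce the plain mean.
equal_weight_mean = weighted_average(values, [1, 1, 1])

# Hypothetical, illustrative weights only (an assumption, not real pipeline output).
uneven_weight_mean = weighted_average(values, [0.4, 0.3, 0.3])

print(f"simple mean:        {simple_mean:.6f}")
print(f"equal-weight mean:  {equal_weight_mean:.6f}")
print(f"uneven-weight mean: {uneven_weight_mean:.6f}")
print(f"level 5 value:      {level5_value:.6f}")
```

The sketch does not recover the actual weights, which would require the full replicate profiles; it only shows that a mismatch with the plain mean is what a non-uniform weighting would produce.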

Effect of over- and underexpression of a gene on itself

Hi Daniel,

Thank you very much for sharing this work. As a computational biologist, I find this data very useful for checking hypotheses derived from another dataset against wet-lab data. Great!

I had a look at the datasets you kindly provided in https://github.com/dhimmel/lincs/tree/gh-pages/data/consensi and checked the effect of overexpression/underexpression of a gene as perturbagen on itself:

About a third of the genes showed nominally significant underexpression (z-score <= -1.96) when the gene itself was the repressing perturbagen. For overexpression, about 10 percent of genes showed overexpression when they were themselves the overexpressed perturbagen.

My first question is: while this is clearly an enrichment in the right direction, is this rather low efficiency of a gene as a perturbagen on itself expected?

My second question is: do you suggest filtering for genes that show an effect on themselves as perturbagens, as a quality control step?

To illustrate this issue, here is a histogram of z-scores showing the effect of a perturbagen on its own gene vs. its effect on other genes:
[Figure: s309_1_distribution_zscores_over_under_itselves_effect]
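
The self-effect fractions described above could be computed along these lines. This is a minimal sketch, not the analysis actually run: the function name `fraction_significant` and the example z-scores are hypothetical, and the symmetric cutoff of z >= 1.96 for overexpression is an assumption, since only the -1.96 cutoff is stated for underexpression.

```python
def fraction_significant(z_scores, threshold=1.96, direction="down"):
    """Return the fraction of z-scores passing a nominal significance cutoff.

    direction="down" counts z <= -threshold (underexpression);
    direction="up" counts z >= threshold (overexpression, assumed cutoff).
    """
    if not z_scores:
        raise ValueError("z_scores must not be empty")
    if direction == "down":
        hits = sum(1 for z in z_scores if z <= -threshold)
    elif direction == "up":
        hits = sum(1 for z in z_scores if z >= threshold)
    else:
        raise ValueError("direction must be 'down' or 'up'")
    return hits / len(z_scores)


# Hypothetical z-scores of each gene under its own perturbation (made-up values).
knockdown_self = [-3.1, -0.4, -2.2, 0.3, -1.0, -0.8]
overexpression_self = [2.5, 0.1, -0.3, 1.2, 0.9, 0.4, 1.7, -1.1, 0.0, 0.6]

print(fraction_significant(knockdown_self, direction="down"))
print(fraction_significant(overexpression_self, direction="up"))
```

Applying the same function to the z-scores of all other genes gives the background rate against which the self-effect fraction can be compared.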

Thanks and best, Holger

Which set of GSEA data is equivalent to modzs.gctx

Hello Daniel,

I am in the process of creating auto-update scripts for all the nodes with hetio, and in order to do that, I will need a copy of the most up-to-date modzs.gctx file. I know that GSEA has some LINCS datasets in there, and I was curious which files best correspond to the modzs file that you used in this repository. Any input or feedback would be greatly appreciated. Thank you!

Best,
Krish

README for the repo

Hey @dhimmel,

Thank you for the amazing work putting together the scripts to process and analyze the LINCS dataset. If it is not too much trouble, could you add a README to the repo to guide us through the process?

Thank You.

Regards,
Yojana Gadiya

Code to generate l1000.db or the downloadable l1000.db

As shown in database.ipynb, there is a large l1000.db file containing the LINCS gene expression profiles and metadata. Could you publish the code that produces l1000.db, and/or the l1000.db file itself? Thank you.
