Comments (16)
Yeah, that's a lovely idea. Do you think that we should expose another "add_de_res_for_pertpy" function that does the same thing but this time loads the DE results into adata.varm?
Either that, or allow passing a data frame directly to the plotting function:
def plot_something(
    adata: AnnData,
    de_res: pd.DataFrame | str,
    p_col: str = "FDR",
    fc_col: str = "log2FC",
):
    """
    de_res:
        This may be either
         * the key under which the results are stored in `adata.varm`, or
         * a pandas data frame with DE results. The data frame must use gene
           identifiers from `adata.var_names` as index.
    """
But I think that we've got it figured out! Do you want to tackle any of this? If not, I'll find someone as mentioned.
If you had someone it would be awesome! I'm happy to review if required.
from pertpy.
@grst what else do you think we should add? Feel free to edit my issue (hope you have the rights now - if not please ping me).
from pertpy.
Tools
Personally, I have never had good experiences with rpy2, which is one reason why I stopped using these functions in more recent projects. Instead I performed DE in R and connected it through a workflow manager.
Are you sure you want to support that? It would also entail quite complicated CI setups if we were to test it.
Maybe we start with pydeseq2, t-test, wilcoxon test?
Plots
Another nice one I forgot, which is especially useful when comparing between multiple conditions:
Visualization frameworks
My plotting functions are a mix of altair and matplotlib. Do you want to stick to one of them?
from pertpy.
Personally, I have never had good experiences with rpy2, which is one reason why I stopped using these functions in more recent projects. Instead I performed DE in R and connected it through a workflow manager. Are you sure you want to support that? It would also entail quite complicated CI setups if we were to test it.
rpy2 is currently an optional dependency of pertpy because milopy can also work with edgeR. I'd stick to this for supporting the R options.
Maybe we start with pydeseq2, t-test, wilcoxon test?
Yup. We'd have to discuss what the interface should look like. Would you base it on the current scanpy t-test/wilcoxon test or would you redesign it completely? If yes, what should it look like?
My plotting functions are a mix of altair and matplotlib. Do you want to stick to one of them?
Preferably matplotlib. We replaced altair code in other tools with matplotlib equivalents.
from pertpy.
Would you base it on the current scanpy t-test/wilcoxon test or would you redesign it completely? If yes, what should it look like?
I would probably use the scanpy implementation but redesign the output. IMO all tools should return a dataframe with one row per var with p-value and logFC. This dataframe could optionally be saved in adata.varm.
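To make the proposed output concrete, here is a rough sketch of a test that returns one row per variable. It is not the scanpy implementation: it uses `scipy.stats.ttest_ind` (Welch variant) on two dense cells-by-genes arrays, and the function names, the column names `log2FC`, `p_value` and `FDR`, and the pseudocount are assumptions for illustration. It also assumes non-negative, non-log-transformed expression values when computing the fold change.

```python
import numpy as np
import pandas as pd
from scipy import stats


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values."""
    pvals = np.asarray(pvals, dtype=float)
    n = pvals.size
    order = np.argsort(pvals)
    scaled = pvals[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, starting from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted


def t_test_de(x_ref, x_test, var_names, pseudocount=1e-9):
    """Welch t-test per variable; inputs are (cells x genes) arrays."""
    _, pvals = stats.ttest_ind(x_test, x_ref, axis=0, equal_var=False)
    log2fc = np.log2(x_test.mean(axis=0) + pseudocount) - np.log2(
        x_ref.mean(axis=0) + pseudocount
    )
    return pd.DataFrame(
        {"log2FC": log2fc, "p_value": pvals, "FDR": benjamini_hochberg(pvals)},
        index=pd.Index(var_names, name="var_name"),
    )
```

Because the index holds the variable names, a frame of this shape can be aligned with `adata.var_names` and stored under a key in `adata.varm` if desired.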
I also have a wrapper for a statsmodels linear model which could be relevant here:
https://github.com/icbi-lab/luca/blob/89c4e6109bc723f6958cae7af791398b28e8e422/lib/scanpy_helper_submodule/scanpy_helpers/compare_groups/lm.py#L24-L151
Preferably matplotlib. We replaced altair code in other tools with matplotlib equivalents.
Fair enough, shouldn't be too hard to rewrite it. Maybe an excuse to play with the new seaborn API.
Another point to consider is what the interface would look like. groupby is nice for a simple pairwise or groupwise comparison, but what about more complicated cases? Do we allow specifying patsy model strings?
from pertpy.
I would probably use the scanpy implementation but redesign the output. IMO all tools should return a dataframe with one row per var with p-value and logFC. This dataframe could optionally be saved in adata.varm.
Yes, that makes a lot of sense.
I also have a wrapper for a statsmodels linear model which could be relevant here: https://github.com/icbi-lab/luca/blob/89c4e6109bc723f6958cae7af791398b28e8e422/lib/scanpy_helper_submodule/scanpy_helpers/compare_groups/lm.py#L24-L151
Yeah, why not.
Preferably matplotlib. We replaced altair code in other tools with matplotlib equivalents.
Fair enough, shouldn't be too hard to rewrite it. Maybe an excuse to play with the new seaborn API.
Yup, but we also have manpower for this if you don't want to do this yourself.
Another point to consider is what the interface would look like. groupby is nice for a simple pairwise or groupwise comparison, but what about more complicated cases? Do we allow specifying patsy model strings?
I wouldn't really need them for a quick t-test or Wilcoxon test, but when doing serious work with edgeR/DESeq2 for complex setups, yes, I think that we need to support them.
from pertpy.
Yup, but we also have manpower for this if you don't want to do this yourself.
ok, even better. So implementation-wise, what would you need from my side?
from pertpy.
Yup, but we also have manpower for this if you don't want to do this yourself.
ok, even better. So implementation-wise, what would you need from my side?
If you explicitly designed an interface (parameters that you want and output DF) it would certainly help and guide one of my students. I am confident though that I could also do that if you're busy.
Could you please edit my post and add links to the current implementations to all of your plots?
You are of course welcome to contribute to pertpy, but if you're busy one of my students will help out.
from pertpy.
Updated the comment and added some draft specs for further discussion.
from pertpy.
Excellent, thank you very much.
So would de_analysis have an additional parameter called "method" to which this is passed? Or would you have separate public functions per test?
from pertpy.
Good point. I would have said the former. Updated the specs accordingly.
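A rough sketch of what "one function with a method parameter" could look like. Only the name `de_analysis` and the idea of a `method` argument come from the discussion; the registry, the signature, and the toy `mean_diff` method (which works on a plain data frame rather than an AnnData object) are assumptions made purely for illustration.

```python
import pandas as pd

# Hypothetical registry mapping method names to callables.
_DE_METHODS = {}


def register_de_method(name):
    """Decorator that registers a DE method under `name`."""
    def decorator(func):
        _DE_METHODS[name] = func
        return func
    return decorator


@register_de_method("mean_diff")
def _mean_diff(df, groupby, reference, value_cols):
    """Toy stand-in for a real test: difference of group means per variable."""
    is_ref = df[groupby] == reference
    diff = df.loc[~is_ref, value_cols].mean() - df.loc[is_ref, value_cols].mean()
    return pd.DataFrame({"mean_diff": diff})


def de_analysis(df, groupby, reference, value_cols, method="mean_diff"):
    """Single public entry point; `method` selects the registered test."""
    try:
        func = _DE_METHODS[method]
    except KeyError:
        raise ValueError(
            f"Unknown method {method!r}; choose from {sorted(_DE_METHODS)}"
        ) from None
    return func(df, groupby, reference, value_cols)
```

The appeal of this layout is that every method returns the same kind of data frame, so adding pydeseq2 or a Wilcoxon test would only mean registering another callable.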
from pertpy.
What do we do with https://scanpy.readthedocs.io/en/stable/generated/scanpy.pl.rank_genes_groups.html#scanpy.pl.rank_genes_groups and all of its siblings? Think it expects the old format?
from pertpy.
tbh, I'm not a big fan of how scanpy currently stores the DE results. So I wouldn't mind if the scanpy functions were rewritten. But that would be either a breaking change or require a bunch of code handling legacy formats, so not sure if this is going to happen in a reasonable timeframe.
Alternatively, these functions could be mirrored in pertpy, together with the other plotting functions planned. I have a code snippet for adding a data frame to anndata such that it works with the scanpy plotting functions here.
from pertpy.
tbh, I'm not a big fan of how scanpy currently stores the DE results. So I wouldn't mind if the scanpy functions were rewritten. But that would be either a breaking change or require a bunch of code handling legacy formats, so not sure if this is going to happen in a reasonable timeframe.
Unlikely.
Alternatively, these functions could be mirrored in pertpy, together with the other plotting functions planned. I have a code snippet for adding a data frame to anndata such that it works with the scanpy plotting functions here.
So the mirroring would work as follows: the user calls
pt.pl.rank_genes_groups(adata, key="MYDERESULTSDF")
Under the hood we call your function and then call the scanpy plotting function?
Or how would you design it?
from pertpy.
Maybe by using a context manager that temporarily adds the converted data frame to adata.uns?
def rank_genes_groups(adata, key, *args, **kwargs):
    with _add_de_res_for_scanpy(adata, key) as adata:
        sc.pl.rank_genes_groups(adata, *args, **kwargs)
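The "temporarily add, then clean up" part of such a context manager could be sketched as below. `temporary_uns_entry` is a hypothetical name, and the conversion of the data frame into the layout that the scanpy plotting functions expect is deliberately left out: `value` stands for the already converted results. The sketch only assumes that `adata.uns` behaves like a dict.

```python
from contextlib import contextmanager

# Sentinel to tell "key was absent" apart from "key was set to None".
_MISSING = object()


@contextmanager
def temporary_uns_entry(adata, key, value):
    """Set `adata.uns[key]` for the duration of the block, then restore it."""
    previous = adata.uns.get(key, _MISSING)
    adata.uns[key] = value
    try:
        yield adata
    finally:
        # Runs even if plotting raises, so the object is never left modified.
        if previous is _MISSING:
            adata.uns.pop(key, None)
        else:
            adata.uns[key] = previous
```

Restoring in `finally` means that an existing entry under the same key, for example results from an earlier scanpy run, is put back once plotting finishes.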
from pertpy.
Yeah, that's a lovely idea. Do you think that we should expose another "add_de_res_for_pertpy" function that does the same thing but this time loads the DE results into adata.varm? We know where to store such dataframes, but newbies are always confused by the slots, and varm is rarely used.
But I think that we've got it figured out! Do you want to tackle any of this? If not, I'll find someone as mentioned.
from pertpy.