Comments (13)
I think the error comes from the slight difference in input to the train_tree function. The train_tree function uses the X matrix as input instead of the AnnData object (so source_adata.X instead of source_adata).
Hope this helps! If this doesn't solve the problem, could you provide the complete trace-back and your input, so it's easier to debug?
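To make the difference concrete, here is a minimal sketch of a guard that passes on a plain matrix rather than the wrapping object. `FakeAnnData` and `as_matrix` are hypothetical names used only for illustration and are not part of scarches or anndata. The only thing taken from the comment above is that the matrix lives in the `.X` attribute.

```python
class FakeAnnData:
    """Hypothetical stand-in for an AnnData object: it only carries .X."""

    def __init__(self, X):
        self.X = X


def as_matrix(data):
    """Return data.X when data looks like an AnnData object, else data itself."""
    return data.X if hasattr(data, "X") else data


source_adata = FakeAnnData([[0.1, 0.2], [0.3, 0.4]])
matrix = as_matrix(source_adata)         # unwraps to the underlying matrix
same_matrix = as_matrix(source_adata.X)  # already a matrix, returned unchanged
```

Either call yields the same matrix, which is what a function expecting source_adata.X would need.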
from scarches.
Hi, thanks for replying.
I found that the previous problem came from my annotation. I annotated each level at a different clustering resolution, so some cells that share a cluster at a lower level of the hierarchy ended up in different clusters at a higher level. This made the tree 'dirty'. After re-annotating, the problem was solved.
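A quick way to spot this kind of inconsistency before training is to check that every fine label maps to exactly one coarse label. The sketch below is only an illustration: the function name `find_ambiguous_children` and the example labels are hypothetical, and it assumes the two annotation levels are available as per-cell lists of equal length.

```python
from collections import defaultdict


def find_ambiguous_children(fine_labels, coarse_labels):
    """Map each fine label to its coarse parents; keep those with more than one."""
    parents = defaultdict(set)
    for fine, coarse in zip(fine_labels, coarse_labels):
        parents[fine].add(coarse)
    return {fine: sorted(p) for fine, p in parents.items() if len(p) > 1}


# Hypothetical per-cell annotations at two levels of the hierarchy.
level_2 = ["CD4 T", "CD8 T", "CD4 T", "B naive"]
level_1 = ["T cell", "T cell", "NK cell", "B cell"]

# "CD4 T" appears under two different parents, so it is reported.
ambiguous = find_ambiguous_children(level_2, level_1)
print(ambiguous)
```

An empty result means every fine label has a single parent, so the levels form a proper tree.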
However, a new problem has come up: when I continued to the next step, updating the hierarchy with a new dataset, the following error appeared:
I tried different datasets and the same problem appeared.
And here is the full report, thank you again!
Hmm, interesting. The error occurs when the tree trained on data_2 (the query data) is used to predict the labels of data_1 (the reference data). When doing pca.transform(test_data), it seems that there are no cells in the test data, which causes the error. Is the query_latent the combined latent space of the reference and query? And if so, are the labels of the reference exactly called 'reference'? You can have a look at this notebook to see an example of how to concatenate the reference and query data.
Yes, I walked through this notebook previously, and it worked well with a "one-level annotation" (as shown below) on exactly the same datasets.
However, when I switched to a "multi-level annotation tree" (shown below, following this notebook: https://github.com/lcmmichielsen/treeArches-reproducibility/blob/main/Figure2-HLCA%20healthy/Figure2%2C%20S9-S13.ipynb), the problem appeared.
What exactly do you mean by "this problem"? Is it related to the problem you mentioned before about the zero samples in the test data? Or is that solved, and is your problem now related to the figure you attached?
The "0 sample (0,30)" problem concerns the "multi-level annotation tree", constructed as in the picture here:
https://user-images.githubusercontent.com/118878017/270230240-2586f2dd-dd4e-4f54-bf13-fd5cf05231d8.png
What I mean to say is that when I abandon the multi-level annotation structure above and use only the lowest level of the hierarchy, the model trains perfectly, so I guess the problem may come from the hierarchy structure?
Good to know. Did you check these two things I mentioned earlier:
- Is the query_latent the combined latent space of the reference and query?
- And if so, are the labels of reference exactly called 'reference'?
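Why these two checks matter can be sketched as follows, assuming (as the questions above suggest) that the update step picks out the reference cells by their batch label. `split_by_batch` and the toy vectors are hypothetical and only illustrate the selection logic, not the actual scarches code.

```python
def split_by_batch(latent_rows, batch_labels, reference_name="reference"):
    """Split rows of a latent space into reference and query parts by batch label."""
    ref = [row for row, b in zip(latent_rows, batch_labels) if b == reference_name]
    query = [row for row, b in zip(latent_rows, batch_labels) if b != reference_name]
    return ref, query


# Hypothetical 2-dimensional latent vectors.
query_only = [[0.1, 0.2], [0.3, 0.4]]
combined = [[0.9, 0.8]] + query_only

# Query-only input: no row carries the reference label.
ref_a, query_a = split_by_batch(query_only, ["query", "query"])
# Combined input: the reference row is found.
ref_b, query_b = split_by_batch(combined, ["reference", "query", "query"])

print(len(ref_a), len(ref_b))
```

With a query-only latent space, or a reference label spelled differently, the reference part is empty, which would be consistent with an error about zero samples in the test data.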
Oh, actually not: the query_latent contains only the query dataset, as I followed the GitHub reproducibility notebook. The reference label is 'reference', though.
I will try full_latent first and then report the result. Thank you for your patient and generous help!
Okay, let me know whether this helps!
Btw, in codeblock 14 of the notebook you mentioned (https://github.com/lcmmichielsen/treeArches-reproducibility/blob/main/Figure2-HLCA%20healthy/Figure2%2C%20S9-S13.ipynb), we also merge the reference (LCA) and query (emb_M) into one object before updating the hierarchy. So there you could see another example of how you could implement it for your dataset.
Hi, here is an update on my issue: I moved the Jupyter file into VS Code and ran the package's learn.py as a subprocess. Here is the debugging result:
The variables data_1, data_2 and trees seem fine, but the problem is still there:
I know it may be complicated to figure out what is going on inside, since the datasets vary, so if it is too much trouble, just ignore my issue and close it. Thank you again!
Hmm, this is weird. Now your code also crashes at another spot, right? It used to be at labels_1_pred = predict_labels(data_1, tree_2, threshold=rej_threshold), but now it's a step earlier, during tree = train_tree(data_1, labels_1, tree, classifier, dimred, useRE, FN, n_neighbors, dynamic_neighbors, distkNN), right?
Do the labels you input to the learn_tree function still correspond to the labels that were already in the hierarchy?
It's quite difficult to debug, so without proper error traceback for this new problem and your input code, I am afraid I cannot help you.
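The label check asked about above can be sketched like this. The `Node` class is a hypothetical minimal tree, not the tree type used by scarches, and the labels are made up. The idea is simply to compare the set of input labels with the set of node names in the existing hierarchy.

```python
class Node:
    """Hypothetical minimal tree node: a name plus child nodes."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def node_names(node):
    """Collect the names of a node and all of its descendants."""
    names = {node.name}
    for child in node.children:
        names |= node_names(child)
    return names


def labels_missing_from_tree(labels, root):
    """Return the labels that do not appear as a node name in the tree."""
    return sorted(set(labels) - node_names(root))


root = Node("root", [Node("T cell", [Node("CD4 T"), Node("CD8 T")]), Node("B cell")])

# "b cell" differs from "B cell" only in case, so it is reported as missing.
missing = labels_missing_from_tree(["CD4 T", "CD8 T", "B cell", "b cell"], root)
print(missing)
```

Any label in the result would need to be renamed, or added to the hierarchy, before retraining on the existing tree.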
Thank you again! I will use the demo dataset instead.