rvinas / hyfa
Hypergraph Factorisation
License: MIT License
Hi, I have a couple of questions about the normalized bulk transcriptomics data, as described in your paper and the GTEx pipeline, as below:
I would greatly appreciate any clarification you could provide on these matters:
Warm regards,
Mian
Hi, when running the baseline comparison code in evaluate_GTEx_v8_normalised.ipynb on my own data, I hit an AssertionError at line 142, as shown below:
AssertionError Traceback (most recent call last)
Cell In[32], line 142
140 y_test_pred = out['px_rate'].cpu().numpy() # torch.distributions.normal.Normal(loc=out['px_rate'], scale=out['px_r']).mean.cpu().numpy()
141 y_test_ = d.x_target.cpu().numpy()
--> 142 assert np.allclose(y_test_, y_test)
144 sample_scores = score_fn(y_test, y_test_pred, sample_corr=sample_corr)
146 # Append results
AssertionError:
The target array of aux_test_dataset is no longer consistent with the target array of the corresponding batch after aux_test_dataset is converted into a HypergraphDataset.
After comparing d.target_dynamic['Participant ID'] with aux_test_dataset.adata_target.obs['Participant ID'], and also their expression arrays, I found that the order of the "Participant ID" values and of their expression data has changed. It seems that d.target_dynamic['Participant ID'] is sorted lexicographically instead of keeping the same order as aux_test_dataset.adata_target.obs['Participant ID']. For example:
print(aux_test_dataset.adata_target.obs['Participant ID'].values)
I got:
['GS12' 'GW133' 'GZ137' 'GW142' 'CT146' 'LBJ18' 'XQN39' 'SLG43' 'QG44' 'XQN75' 'ZGJ176' 'XN9063' 'PZ140']
After the HypergraphDataset conversion and the DataLoader:
aux_test_dataset = HypergraphDataset(adata[test_mask], obs_source={'Tissue': source_tissues}, obs_target={'Tissue': [tt]})
aux_test_loader = DataLoader(aux_test_dataset, batch_size=len(aux_test_dataset), collate_fn=collate_fn, shuffle=False)
d = next(iter(aux_test_loader))
print(d.target_dynamic['Participant ID'])
The result changed into:
['CT146' 'GS12' 'GW133' 'GW142' 'GZ137' 'LBJ18' 'PZ140' 'QG44' 'SLG43' 'XN9063' 'XQN39' 'XQN75' 'ZGJ176']
The same is true of expression matrices (i.e. aux_test_dataset.adata_target.layers['x'].toarray() and d.x_target.cpu().numpy()).
How could I fix this error?
Thanks in advance!
Mian
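One possible workaround for the mismatch described above, offered as a minimal sketch rather than the repository's own fix: reorder the reference array by Participant ID so that its rows follow the batch order before comparing. The helper name `align_to_reference` and the toy expression matrix are hypothetical; only the two ID lists are taken from the issue. It assumes each participant appears exactly once in the target set. Whether HypergraphDataset sorts participants internally is inferred from the printed output, not confirmed from the library code.

```python
import numpy as np


def align_to_reference(reference_ids, other_ids, other_values):
    """Reorder the rows of other_values so they follow reference_ids.

    Hypothetical helper: assumes both ID lists hold the same unique IDs.
    """
    reference_ids = np.asarray(reference_ids)
    other_ids = np.asarray(other_ids)
    other_values = np.asarray(other_values)
    if len(set(other_ids)) != len(other_ids):
        raise ValueError("participant IDs must be unique")
    if set(reference_ids) != set(other_ids):
        raise ValueError("the two ID lists contain different participants")
    # Map each ID to its row index in other_values, then look the rows up
    # in the order given by reference_ids.
    position = {pid: i for i, pid in enumerate(other_ids)}
    order = np.array([position[pid] for pid in reference_ids])
    return other_values[order]


# Participant IDs exactly as printed in the issue.
adata_ids = np.array(['GS12', 'GW133', 'GZ137', 'GW142', 'CT146', 'LBJ18',
                      'XQN39', 'SLG43', 'QG44', 'XQN75', 'ZGJ176', 'XN9063',
                      'PZ140'])
loader_ids = np.array(['CT146', 'GS12', 'GW133', 'GW142', 'GZ137', 'LBJ18',
                       'PZ140', 'QG44', 'SLG43', 'XN9063', 'XQN39', 'XQN75',
                       'ZGJ176'])

# Toy expression matrix (one row per participant, in adata order).
y_test = np.arange(len(adata_ids) * 2, dtype=float).reshape(len(adata_ids), 2)

# Rows now follow the order produced by the data loader.
y_test_aligned = align_to_reference(loader_ids, adata_ids, y_test)
```

In the notebook, the same idea would mean passing d.target_dynamic['Participant ID'] as the reference order and aux_test_dataset.adata_target.obs['Participant ID'].values together with y_test as the array to reorder, then running the np.allclose check on the aligned array.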
I'm having trouble installing packages due to version conflicts. Can you suggest the right Python version for compatibility with the required packages?
Hi, I found the GTEx bulk RNA-seq donors were divided into three parts (training, validation, and testing donors). I can grasp the purposes of the training and validation subsets in relation to the Hypergraph model's training and accuracy validation respectively, but I cannot fully comprehend the role of the testing dataset.
Could anyone elaborate on the specific purpose of the testing dataset and how it differs from the validation dataset? Can I just split the data into training and validation, and treat the validation dataset as the testing dataset?
Thanks in advance!
Mian
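As general background to the question above: a validation set is normally used during training for early stopping and hyperparameter selection, so its score ends up optimistically biased, while a test set is kept untouched until the end to give an unbiased estimate. The sketch below shows a three-way split at the donor level, so that no donor contributes samples to more than one subset. The function name `split_donors`, the fractions, the seed, and the toy donor IDs are assumptions for illustration, not the split used in the paper.

```python
import numpy as np


def split_donors(donor_ids, val_frac=0.1, test_frac=0.1, seed=0):
    """Split unique donors into disjoint train/validation/test groups.

    Hypothetical helper: the fractions and seed are illustrative defaults.
    """
    donors = np.unique(donor_ids)
    rng = np.random.default_rng(seed)
    rng.shuffle(donors)  # in-place shuffle of the unique donor IDs
    n_test = int(round(test_frac * len(donors)))
    n_val = int(round(val_frac * len(donors)))
    test = donors[:n_test]
    val = donors[n_test:n_test + n_val]
    train = donors[n_test + n_val:]
    return train, val, test


# Toy example: 20 donors, several samples (tissues) per donor.
sample_donor_ids = np.repeat([f"D{i:02d}" for i in range(20)], 3)
train_donors, val_donors, test_donors = split_donors(
    sample_donor_ids, val_frac=0.1, test_frac=0.2)

# Boolean masks over samples, one per subset.
train_mask = np.isin(sample_donor_ids, train_donors)
val_mask = np.isin(sample_donor_ids, val_donors)
test_mask = np.isin(sample_donor_ids, test_donors)
```

Merging validation and test into one subset is possible, but any score reported on data that also guided model selection should then be read as optimistic.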