tencentailabhealthcare / scbert
License: GNU General Public License v3.0
Hello, I want to use scBERT, but the pretraining code has not been provided. How can I train a new model from scratch? Is fine-tuning the only option?
Thank you for sharing the source code of the project!
I wish extending and modifying it were allowed, which would make it easier to build on the idea you presented in the paper.
I ran the scBERT fine-tuning to try to reproduce the results from the paper. I had to use --batch_size 2 to fit in memory with the GUI running, and torchrun --nproc_per_node=1 to run standalone. The model overfit and terminated early after 26 epochs and about 12 days of processing.
Here is my command, explicitly defining all the parameters as they appear in finetune.py:
torchrun --nproc_per_node=1 finetune.py \
--local_rank 0 \
--bin_num 5 \
--gene_num 16906 \
--epoch 100 \
--seed 2021 \
--batch_size 2 \
--learning_rate 0.0001 \
--grad_acc 60 \
--valid_every 1 \
--pos_embed True \
--data_path scBERT/data/Zheng68K.h5ad \
--model_path scBERT/data/panglao_pretrain.pth \
--ckpt_dir ./ckpts/ \
--model_name finetune
I'm very interested to hear whether anyone has been able to reproduce the results of the paper. I could shut down my GUI and run with batch_size 3 (the default), but given that my RTX 2080 Ti needed 12 days for 26 epochs, I'm loath to start that. I'm hoping for some confirmation here that it would be worthwhile to dedicate my machine to this single task for that long.
Thanks
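For reference, the figures quoted in this report imply the following effective batch size and per-epoch cost. This is only arithmetic on the numbers above (batch_size 2, grad_acc 60, one process, 26 epochs in about 12 days). It assumes that --grad_acc is the number of gradient-accumulation steps per optimizer update and that all epochs take roughly the same time; neither is confirmed in this thread.

```python
# Figures quoted in the report above.
batch_size = 2
grad_acc = 60          # assumption: accumulation steps per optimizer update
nproc_per_node = 1
epochs_done = 26
days_elapsed = 12

# Samples contributing to one optimizer update, under the assumption above.
effective_batch = batch_size * grad_acc * nproc_per_node

# Average wall-clock cost per epoch, and a linear extrapolation to --epoch 100.
hours_per_epoch = days_elapsed * 24 / epochs_done
days_for_100_epochs = 100 * days_elapsed / epochs_done

print(f"effective batch size: {effective_batch}")
print(f"hours per epoch: {hours_per_epoch:.1f}")
print(f"days for 100 epochs: {days_for_100_epochs:.1f}")
```

Under these assumptions, moving from batch_size 2 to 3 changes the effective batch size but does not by itself reduce the number of samples processed per epoch.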
Hi, I want to use scBERT on mouse single-cell data.
In issue #8, @TencentAILabHealthcare said: "The scBERT could support other species, by replacing the data for pretraining and fine-tuning. The gene identities should also be changed to those of other species."
Could you be more specific?
I would greatly appreciate your assistance and guidance.
Thank you for your time and consideration.
Best,
Alexis
Hello, I am a graduate student currently working on foundation models for biomedical sequences. I would really appreciate it if you could provide the gene2vec_16906.npy file, which would greatly aid my research. I plan to test scBERT on my own dataset for comparative analysis.
Hello, I cannot access the pretrained model data on the Weixin drive. Could you share the pretrained model weights some other way? Thank you for considering my request.
Could you release the source code for evaluation of novel cell type detection, or a complete tutorial/demo for novel cell type detection and evaluation?
Thanks a lot !
As in the title.
After checking a previous answer, there is still an error after installing 'local-attention': ModuleNotFoundError: No module named 'fast_transformers'.
It seems that there is a missing file "local_attention.py" in the performer_pytorch folder.
Hi, I am trying to reproduce the pretraining process and would like to know what resources it required: how many GPUs were used and how long pretraining took. Thanks
Hi, I have a question about the finetune.py code: why do we concatenate full_seq with torch.tensor([0])?
full_seq = torch.cat((full_seq, torch.tensor([0]))).to(device)
Thanks
Hi, could you please upload your model checkpoint to Google Drive? The current download page is in Chinese and requires logging in to download the model.
Hi,
I am trying to fine-tune the pretrained scBERT on my own data, but I get the error: No such file or directory: '../data/gene2vec_16906.npy'. Could you provide this file as well?
Best,
Yuge
Hello, I am wondering how did you mask the expression levels for pre-training? I'm assuming a vector of 200 zeros would work, but that it would be indistinguishable from actual zeros in the data, so maybe something like a vector of 200 -1's would be better/more explicit? What worked for you?
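The thread does not record how the authors masked expression values, so the following is only a generic sketch of the idea raised in the question: mark masked positions with a reserved token id that can never occur as a real expression bin, so that "masked" stays distinguishable from "zero expression". The function name, `mask_prob`, and the choice of `mask_id` are all hypothetical, and the sketch works on binned integer tokens rather than 200-dimensional vectors.

```python
import random

def mask_tokens(tokens, mask_id, mask_prob=0.15, seed=0):
    """Replace a random subset of tokens with a reserved mask id.

    Returns (masked, labels): labels holds the original token at masked
    positions and None elsewhere, so a loss can be computed only there.
    """
    if mask_id in tokens:
        # The mask id must not collide with any real expression bin.
        raise ValueError("mask_id collides with a real token value")
    rng = random.Random(seed)
    masked = list(tokens)
    labels = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            labels[i] = tok
            masked[i] = mask_id
    return masked, labels

# Hypothetical binned expression values 0..5, with 6 reserved as the mask id.
tokens = [0, 1, 2, 3, 4, 5] * 20
masked, labels = mask_tokens(tokens, mask_id=6)
print(sum(label is not None for label in labels), "of", len(tokens), "positions masked")
```

With a reserved id, a real zero stays a zero in the input, and only the positions recorded in `labels` are treated as prediction targets.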
Line 232 in 1e384a3
Hello,
Thank you for providing the tool.
However, I encountered an error with python -m torch.distributed.launch finetune.py; the log is below.
The input h5ad file was processed with your preprocess.py without errors.
/usr/local/lib/python3.8/dist-packages/torch/distributed/launch.py:180: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use_env is set by default in torchrun.
If your script expects `--local_rank` argument to be set, please
change it to read from `os.environ['LOCAL_RANK']` instead. See
https://pytorch.org/docs/stable/distributed.html#launch-utility for
further instructions
warnings.warn(
/content/performer_pytorch.py:115: UserWarning: torch.qr is deprecated in favor of torch.linalg.qr and will be removed in a future PyTorch release.
The boolean parameter 'some' has been replaced with a string parameter 'mode'.
Q, R = torch.qr(A, some)
should be replaced with
Q, R = torch.linalg.qr(A, 'reduced' if some else 'complete') (Triggered internally at ../aten/src/ATen/native/BatchLinearAlgebra.cpp:2349.)
q, r = torch.qr(unstructured_block.cpu(), some = True)
== Epoch: 1 | Training Loss: 3.652004 | Accuracy: 0.7500% ==
== Epoch: 1 | Validation Loss: 3.654254 | F1 Score: 0.000274 ==
[[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
...
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]]
Traceback (most recent call last):
File "finetune.py", line 262, in <module>
print(classification_report(truths, predictions, target_names=label_dict.tolist(), digits=4))
File "/usr/local/lib/python3.8/dist-packages/sklearn/metrics/_classification.py", line 2132, in classification_report
raise ValueError(
ValueError: Number of classes, 37, does not match size of target_names, 38. Try specifying the labels parameter
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 25554) of binary: /usr/bin/python3
Traceback (most recent call last):
File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/usr/local/lib/python3.8/dist-packages/torch/distributed/launch.py", line 195, in <module>
main()
File "/usr/local/lib/python3.8/dist-packages/torch/distributed/launch.py", line 191, in main
launch(args)
File "/usr/local/lib/python3.8/dist-packages/torch/distributed/launch.py", line 176, in launch
run(args)
File "/usr/local/lib/python3.8/dist-packages/torch/distributed/run.py", line 753, in run
elastic_launch(
File "/usr/local/lib/python3.8/dist-packages/torch/distributed/launcher/api.py", line 132, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/usr/local/lib/python3.8/dist-packages/torch/distributed/launcher/api.py", line 246, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:finetune.py FAILED
Failures:
<NO_OTHER_FAILURES>
Root Cause (first observed failure):
[0]:
time : 2023-01-09_08:36:05
host : d3c781782424
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 25554)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
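The ValueError in this log says that 37 distinct classes appear in the validation truths and predictions while `target_names` has 38 entries, and scikit-learn's message suggests passing the `labels` parameter. A minimal sketch of that workaround follows. It assumes class ids are the integers 0..n-1 that index `label_dict`, which the call `label_dict.tolist()` suggests but does not prove; the helper name and toy data are hypothetical.

```python
def report_label_args(truths, predictions, target_names):
    """Build an explicit label list and find classes absent from this split.

    Assumption: class ids are 0..len(target_names)-1, in target_names order.
    """
    labels = list(range(len(target_names)))
    seen = set(truths) | set(predictions)
    missing = [target_names[i] for i in labels if i not in seen]
    return labels, missing

# Toy stand-in for the validation split: "NK cell" never appears.
names = ["B cell", "T cell", "NK cell"]
truths = [0, 1, 1, 0]
predictions = [0, 1, 0, 0]

labels, missing = report_label_args(truths, predictions, names)
print("classes absent from this split:", missing)

# With scikit-learn installed, the explicit list would be passed as:
#   classification_report(truths, predictions, labels=labels,
#                         target_names=names, digits=4)
```

Passing `labels` keeps the report aligned with `target_names` even when a rare cell type is missing from the validation split; the absent class is then reported with zero support.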
Thank you for sharing this useful tool. Would it be nontrivial to extend this methodology to 10x spatial data, with the use of spots instead of cells? Would appreciate your thoughts.
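Hello, I would like to use your model for cell type annotation, but my GPU resources are limited, and fine-tuning from the checkpoint you provided would take more than a week. Could you share the model parameters obtained after fine-tuning on Zheng68K?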
Hello,
Thank you for providing the tool.
However, I encountered an error with the finetune procedure:
python3 -m torch.distributed.launch --data_path "./data/Zheng68K.h5ad" --model_path "/data/scbert/data/panglao_pretrained.pth" finetune.py
the error is as follows:
usage: launch.py [-h] [--nnodes NNODES] [--node_rank NODE_RANK] [--nproc_per_node NPROC_PER_NODE]
[--master_addr MASTER_ADDR] [--master_port MASTER_PORT] [--use_env] [-m] [--no_python]
[--logdir LOGDIR]
training_script ...
launch.py: error: unrecognized arguments: --data_path
However, when I set the data path and model path as the default parameter values inside the finetune code and then run python3 -m torch.distributed.launch finetune.py without any arguments, the error disappears.
I have read about similar cases from other users. Some suggest using torchrun instead of the (deprecated?) torch.distributed.launch. However, that requires upgrading torch to a version inconsistent with requirements.txt. I did not upgrade because I do not know whether upgrading torch and using torchrun would cause other compatibility issues.
So could you please fix the problem?
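The usage string in this error lists `training_script ...` last, which means launcher options go before the script name and everything after the script name is passed through to the script. In the failing command, --data_path and --model_path come before finetune.py, so the launcher tries to parse them itself. The sketch below reproduces that behaviour with a simplified stand-in parser; it is not the real torch.distributed.launch parser, only an illustration of the same argparse pattern.

```python
import argparse

def build_launcher_parser():
    # Simplified stand-in for a launcher CLI (NOT the real torch parser):
    # a few launcher options, then the script, then the script's own args.
    parser = argparse.ArgumentParser(prog="launch.py")
    parser.add_argument("--nproc_per_node", type=int, default=1)
    parser.add_argument("training_script")
    parser.add_argument("training_script_args", nargs=argparse.REMAINDER)
    return parser

parser = build_launcher_parser()

# Script arguments placed BEFORE the script: the launcher does not know them.
bad = ["--data_path", "./data/Zheng68K.h5ad", "finetune.py"]
_, bad_extras = parser.parse_known_args(bad)
print("unrecognized by launcher:", bad_extras)

# Script arguments placed AFTER the script: passed through untouched.
good = ["finetune.py", "--data_path", "./data/Zheng68K.h5ad"]
good_ns, good_extras = parser.parse_known_args(good)
print("script:", good_ns.training_script)
print("forwarded to script:", good_ns.training_script_args)
```

By the same logic, reordering the original command so that finetune.py precedes --data_path and --model_path should avoid this particular error without editing the defaults in the script.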
Traceback (most recent call last):
File "/home/yzhu/.conda/envs/scBERT/lib/python3.6/runpy.py", line 183, in _run_module_as_main
mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
File "/home/yzhu/.conda/envs/scBERT/lib/python3.6/runpy.py", line 109, in _get_module_details
__import__(pkg_name)
File "/mnt/efs/yzhu/scBERT3/predict.py", line 124, in <module>
pred_list = label_dict[pred_finals].tolist()
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
What does 'RuntimeError: mat1 dim 1 must match mat2 dim 0' mean? I get it when fine-tuning on my own data.
I used scanpy version 1.9.3 and Python 3.9.2. This should be OK, because I ran the tutorial smoothly with Zheng68K.h5ad as the fine-tuning data.
I have also made sure there is an adata.obs['celltype'] column in my data. Any suggestions on how to troubleshoot this? Thanks!
$ python -m torch.distributed.launch finetune.py --data_path "./data/ref9bgInteg.h5ad" --model_path "./data/panglao_pretrained.pth" > log0730_refine_bnt9BGinteg.txt
Traceback (most recent call last):
File "/mnt/efs/yzhu/scbert/finetune.py", line 204, in <module>
logits = model(data)
File "/home/yzhu/.conda/envs/scBERT2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/yzhu/.conda/envs/scBERT2/lib/python3.9/site-packages/torch/nn/parallel/distributed.py", line 705, in forward
output = self.module(*inputs[0], **kwargs[0])
File "/home/yzhu/.conda/envs/scBERT2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/mnt/efs/yzhu/scbert/performer_pytorch/performer_pytorch.py", line 636, in forward
x = self.to_out(x)
File "/home/yzhu/.conda/envs/scBERT2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/mnt/efs/yzhu/scbert/finetune.py", line 114, in forward
x = self.fc1(x)
File "/home/yzhu/.conda/envs/scBERT2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/yzhu/.conda/envs/scBERT2/lib/python3.9/site-packages/torch/nn/modules/linear.py", line 94, in forward
return F.linear(input, self.weight, self.bias)
File "/home/yzhu/.conda/envs/scBERT2/lib/python3.9/site-packages/torch/nn/functional.py", line 1753, in linear
return torch._C._nn.linear(input, weight, bias)
RuntimeError: mat1 dim 1 must match mat2 dim 0
Traceback (most recent call last):
File "/home/yzhu/.conda/envs/scBERT2/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/yzhu/.conda/envs/scBERT2/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/yzhu/.conda/envs/scBERT2/lib/python3.9/site-packages/torch/distributed/launch.py", line 340, in <module>
main()
File "/home/yzhu/.conda/envs/scBERT2/lib/python3.9/site-packages/torch/distributed/launch.py", line 326, in main
sigkill_handler(signal.SIGTERM, None) # not coming back
File "/home/yzhu/.conda/envs/scBERT2/lib/python3.9/site-packages/torch/distributed/launch.py", line 301, in sigkill_handler
raise subprocess.CalledProcessError(returncode=last_return_code, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/yzhu/.conda/envs/scBERT2/bin/python', '-u', 'finetune.py', '--local_rank=0', '--data_path', './data/ref9bgInteg.h5ad', '--model_path', './data/panglao_pretrained.pth']' returned non-zero exit status 1.
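The traceback ends in `self.fc1(x)`, so the error means that the last dimension of the tensor reaching that linear layer differs from the layer's input size. The sketch below only illustrates that shape rule. The specific numbers are assumptions: 16906 is the --gene_num value used elsewhere in this thread, the extra 1 is the element appended to full_seq, 512 is a hypothetical output width, and 2000 is a hypothetical gene count for a custom dataset. How finetune.py actually sizes fc1 is not shown here.

```python
def linear_output_shape(input_shape, in_features, out_features):
    """Shape rule for a linear layer: (..., in_features) -> (..., out_features)."""
    if input_shape[-1] != in_features:
        raise ValueError(
            f"input has {input_shape[-1]} features, layer expects {in_features}"
        )
    return (*input_shape[:-1], out_features)

GENE_NUM = 16906             # from the --gene_num flag quoted in this thread
IN_FEATURES = GENE_NUM + 1   # assumption: one extra appended element
OUT_FEATURES = 512           # hypothetical layer width

# Data with the expected number of genes passes the check.
print(linear_output_shape((1, GENE_NUM + 1), IN_FEATURES, OUT_FEATURES))

# Hypothetical custom dataset with 2000 genes: the same kind of mismatch.
try:
    linear_output_shape((1, 2000 + 1), IN_FEATURES, OUT_FEATURES)
except ValueError as err:
    print("shape mismatch:", err)
```

If this is the cause, comparing the number of genes in the custom h5ad file with the --gene_num value passed to finetune.py is a reasonable first check.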
Can scBERT run on macOS with an M1 chip?
Hi all, is there a way to extract only the gene embeddings or the cell embeddings? Thanks a lot.
Thanks for your excellent work!
How can I fine-tune on multiple GPUs?
Hello, I cannot access the pretrained model data. Is there any chance you could store it on Google Drive along with the other file?
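Has anyone else been unable to load the model file produced by fine-tuning? I suspect a torch incompatibility, but I have already set up the environment according to requirements.txt. @yfzon @TencentAILabHealthcare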
Hello,
I tried the link provided and the QR code, but I am unable to access the panglao_pretrain.pth file.
Could you please help?
Thank you for your help and time.
Thank you for your great work.
However, the paper does not make clear which embedding was used to draw the UMAP.
Is it the expression embedding (cells × genes × 200 dimensions) or the average of the expression embedding over genes (cells × 200 dimensions)?
I would be grateful if you could answer this question. Thank you very much!
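To make the second option in the question concrete, the sketch below averages a per-gene embedding over the gene axis, turning a cells × genes × dim array into a cells × dim array that a 2-D method such as UMAP can take as input. It does not say which embedding the authors used; the function name and the tiny 2-dimensional vectors are hypothetical stand-ins for the 200-dimensional embeddings.

```python
def mean_over_genes(cell_embedding):
    """Average a genes x dim list of vectors into a single dim-length vector."""
    n_genes = len(cell_embedding)
    dim = len(cell_embedding[0])
    return [sum(vec[d] for vec in cell_embedding) / n_genes for d in range(dim)]

# Toy data: 2 cells, 3 genes each, 2-dimensional embeddings.
cells = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    [[0.0, 0.0], [2.0, 2.0], [4.0, 4.0]],
]

# cells x genes x dim  ->  cells x dim
pooled = [mean_over_genes(cell) for cell in cells]
print(pooled)
```

Without pooling, each cell would have to be flattened to genes × dim values before dimensionality reduction, which is far larger for 16906 genes.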
Line 74 in a6998d3
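Hello, I would like to use your model for cell type annotation. Your paper states that the Human Cell Atlas dataset you used contains 27 cell types, which differs from the number of cell types published in the original AHCA paper. Did you merge some cell types? If convenient, could you share the data file used for fine-tuning? Thank you very much.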
Hi,
Would you mind sharing the cell label file for the Zheng68K dataset mentioned in your paper? I found one, but it seems inconsistent with the statistics reported in your paper.
https://github.com/theislab/scanpy_usage/blob/38030605c74c755167302caa431cf3ff62de646f/170503_zheng17/data/zheng17_bulk_lables.txt
Thanks a lot!
Zhao
What is the output of the pretrained model, and how does it capture information from the dataset? Is the output still a kind of expression matrix that reflects relationships between genes? I don't really understand what the pretraining step does.
Also, if the labels of the dataset we want to annotate are unknown, so that there is no definite or good reference dataset, which dataset could be used for fine-tuning?
I apologize for asking questions about the code so presumptuously.
The code does not appear to provide complete execution steps. Therefore, I hope to get help from the author or others regarding how to run the code and get results.
Thank you very much!
Hello, I would like to obtain the gene names and gene order for gene2vec_16906.npy. How can I get them?
I want to train from scratch, not just fine-tune the pretrained model. How can I do that? I can't find the code for it.
Hello, could you check the link to the pretrained data?
If possible, could you post an alternative one?
Hi! Congrats for the great work! In addition to the data, could you please also provide a downloadable link for the pretrained weights? The current link requires an additional verification step to download. Thank you!
I am trying to download the pre-trained model but I am unable to download using WeChat. Please provide me with some other way. Thanks!
Hi,
Does scBERT support other species such as mouse, fly, worm, and so on?
I did an ablation study to test whether pretraining benefits downstream tasks, by fine-tuning without loading the state dict from the checkpoint. All other settings were kept the same. The fine-tuning process stopped at epoch 34. All reported metrics are comparable to the paper's results. Metrics for some rare cell types, and the overall metrics, are even better.
How can the benefit of pretraining be demonstrated?
Thanks a lot for the great work and paper. I was wondering if there is any code to annotate and plot cells from your pretrained model, in a similar way as Figure 2 of your paper (https://www.nature.com/articles/s42256-022-00534-z/figures/2)?
Thanks a lot in advance!
Hi, I wonder if there is any reference code we can use for fine-tuning.
Also, can I use my own dataset with 1,000 HVGs for the analysis? I always receive an error:
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
terminate called after throwing an instance of 'c10::CUDAError'
Hi,
Has the code for pretraining and prediction been released? I wonder how to train a model from scratch.
Thx!
Thank you for sharing this useful tool. How are zero-valued vectors handled in the pretraining model, and can zero-valued vectors be trained?
Prediction with "python predict.py" resulted in the following error message:
TypeError: can't convert cuda:0 device type to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
Replacing "pred_finals.append(pred_final)" with "pred_finals.append(torch.IntTensor.item(pred_final))" in line 123 of predict.py could fix this problem, but the output of inference had only one cell type, like this:
['CD8+ Cytotoxic T','CD8+ Cytotoxic T','CD8+ Cytotoxic T'...]
Is this due to the incorrect modification of the code, or the insufficient fine-tuning of the model?
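One way to separate the two possibilities is to tally the predicted class ids before mapping them to names: if the ids themselves are all identical, the name lookup is not the cause. The sketch below does that with plain integers. On a PyTorch tensor holding a single value, `pred_final.item()` returns the same kind of Python number as the `torch.IntTensor.item(pred_final)` call quoted above. The function name, label list, and toy predictions are hypothetical.

```python
from collections import Counter

def summarize_predictions(pred_ids, label_names):
    """Map integer class ids to names and count how often each is predicted."""
    names = [label_names[int(p)] for p in pred_ids]
    return names, Counter(names)

# Toy stand-ins: three label names and four predicted class ids.
label_names = ["CD14+ Monocyte", "CD19+ B", "CD8+ Cytotoxic T"]
pred_ids = [2, 2, 2, 0]

names, counts = summarize_predictions(pred_ids, label_names)
print(counts.most_common())
```

A tally dominated by a single class on data known to be mixed would point toward the model or its fine-tuning rather than the one-line change in predict.py.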
Hello, I have a question about the scBERT code.
First, I downloaded the data and the checkpoint from the links given on your GitHub page.