scbert's People

Contributors

tencentailabhealthcare

scbert's Issues

How to train a new model?

Hello, I want to use scBERT, but you haven't provided the corresponding training code. How can I train a new model? Is fine-tuning the only option?

stringent copyright

Thank you for sharing the source code of the project!

I wish extending and modifying it was allowed, to allow easier building on top of the idea you presented in the paper.

Has anyone been able to reproduce the finetune from the paper?

I ran the scBERT fine-tuning to try to reproduce the results from the paper. I had to use:

  • batch_size 2, to fit in memory with the GUI running
  • torchrun --nproc_per_node=1, to run standalone

The model overfit and terminated early after 26 epochs and about 12 days of processing.
A graph from the data output: [image]

Here's my command explicitly defining all the parameters as they appear in finetune.py:

torchrun --nproc_per_node=1 finetune.py \
--local_rank 0 \
--bin_num 5 \
--gene_num 16906 \
--epoch 100 \
--seed 2021 \
--batch_size 2 \
--learning_rate 0.0001 \
--grad_acc 60 \
--valid_every 1 \
--pos_embed True \
--data_path scBERT/data/Zheng68K.h5ad \
--model_path scBERT/data/panglao_pretrain.pth \
--ckpt_dir ./ckpts/ \
--model_name finetune

I'm very interested to hear whether anyone has been able to reproduce the results of the paper. I could shut down my GUI and run with batch_size 3 (the default), but given that 26 epochs took 12 days on my RTX 2080 Ti, I'm loath to start that. I'm hoping to get some confirmation here that it would be worthwhile to dedicate my machine to this single task for that long.

Thanks
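
The command above passes --local_rank 0 by hand. torchrun exports the local rank to each worker process as the LOCAL_RANK environment variable, so a script can read it from there and keep the flag only as a fallback. A minimal sketch, assuming a --local_rank argument like the one in the command above; the helper name resolve_local_rank is hypothetical and this is not the repository's code:

```python
import argparse
import os


def resolve_local_rank(argv=None, environ=None):
    """Return the local rank, preferring LOCAL_RANK from the environment.

    torchrun sets LOCAL_RANK for every worker process; the --local_rank flag
    is kept only as a fallback for the older torch.distributed.launch style.
    """
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser()
    parser.add_argument("--local_rank", type=int, default=-1)
    # parse_known_args ignores the script's other flags (--data_path, ...).
    args, _unknown = parser.parse_known_args(argv)
    if "LOCAL_RANK" in environ:
        return int(environ["LOCAL_RANK"])
    return args.local_rank
```

With this pattern the same script can be started by either launcher: the environment variable wins when it is present, and the explicit flag is used otherwise.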

scBERT on Mouse data

Hi,

I want to use scBERT on mouse single-cell data.
In issue #8, @TencentAILabHealthcare said: "The scBERT could support other species, by replacing the data for pretraining and fine-tuning. The gene identities should also be changed to those of other species."

Could you be more specific?

  • Where can I find these data?
  • How do I run the pretraining and fine-tuning on them?
  • "The gene identities should also be changed to those of other species." How can that be done?

I would greatly appreciate your assistance and guidance.

Thank you for your time and consideration.

Best,

Alexis

Request for Access to Gene embedding: gene2vec_16906.npy

Hello, I am a graduate student currently working on foundation models for biomedical sequences. I would really appreciate it if you could provide the gene2vec_16906.npy file, which would greatly aid my research. I plan to test scBERT on my own dataset for comparative analysis.

Access to the pre-trained model weights

Hello, I cannot access the pre-trained model data on the Weixin drive. Could I request the pre-trained model weights through some other channel? Thank you for considering my request.

About pretraining process

Hi, I am trying to reproduce the pretraining process and would like to know what resources it required: the number of GPUs and the pretraining time. Thanks

questions in finetune code

Hi, I have a question about the finetune.py code: why do we concatenate full_seq with torch.tensor([0])?
full_seq = torch.cat((full_seq, torch.tensor([0]))).to(device)
Thanks

where is gene2vec_16906.npy?

Hi,

I am trying to use pretrained scBERT on my data to finetune it, but an error message said No such file or directory: '../data/gene2vec_16906.npy'. Could you provide this file as well?

Best,

Yuge

How to represent the [mask] token?

Hello, I am wondering how you masked the expression levels for pre-training. I'm assuming a vector of 200 zeros would work, but it would be indistinguishable from actual zeros in the data, so maybe something like a vector of 200 -1's would be better or more explicit? What worked for you?
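
One common way to keep masked positions distinguishable from genuine zeros is to mask at the token level rather than at the embedding level: masked positions receive a reserved token id, and the embedding table learns a separate vector for that id. A minimal sketch, assuming expression values have already been discretized into integer bins 0..BIN_NUM (as the --bin_num option of finetune.py suggests); MASK_TOKEN_ID, mask_prob and the function name are illustrative assumptions, not taken from the scBERT pretraining code:

```python
import random

BIN_NUM = 5                  # as in --bin_num 5; tokens 0..BIN_NUM are real expression bins
MASK_TOKEN_ID = BIN_NUM + 1  # hypothetical reserved id, outside the range of real bins
IGNORE_LABEL = -100          # default ignore_index of PyTorch's CrossEntropyLoss


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Return (masked_tokens, labels) for masked-token pre-training.

    Masked positions get MASK_TOKEN_ID as input and keep their original bin as
    the label; all other positions are labelled IGNORE_LABEL so that they can
    be skipped when the loss is computed.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK_TOKEN_ID)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(IGNORE_LABEL)
    return masked, labels


tokens = [0, 0, 3, 1, 0, 5, 2, 0]
masked, labels = mask_tokens(tokens, mask_prob=0.5)
print(masked)
print(labels)
```

Under this assumption the embedding table needs BIN_NUM + 2 rows, so a masked zero and a real zero map to different learned vectors.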

error with finetune.py

Hello,

Thank you for providing the tool.

However, I encountered an error when running python -m torch.distributed.launch finetune.py.
The input h5ad file had been processed with your preprocess.py without errors.

/usr/local/lib/python3.8/dist-packages/torch/distributed/launch.py:180: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use_env is set by default in torchrun.
If your script expects --local_rank argument to be set, please
change it to read from os.environ['LOCAL_RANK'] instead. See
https://pytorch.org/docs/stable/distributed.html#launch-utility for
further instructions

warnings.warn(
/content/performer_pytorch.py:115: UserWarning: torch.qr is deprecated in favor of torch.linalg.qr and will be removed in a future PyTorch release.
The boolean parameter 'some' has been replaced with a string parameter 'mode'.
Q, R = torch.qr(A, some)
should be replaced with
Q, R = torch.linalg.qr(A, 'reduced' if some else 'complete') (Triggered internally at ../aten/src/ATen/native/BatchLinearAlgebra.cpp:2349.)
q, r = torch.qr(unstructured_block.cpu(), some = True)
== Epoch: 1 | Training Loss: 3.652004 | Accuracy: 0.7500% ==
== Epoch: 1 | Validation Loss: 3.654254 | F1 Score: 0.000274 ==
[[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
...
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]]
Traceback (most recent call last):
File "finetune.py", line 262, in <module>
print(classification_report(truths, predictions, target_names=label_dict.tolist(), digits=4))
File "/usr/local/lib/python3.8/dist-packages/sklearn/metrics/_classification.py", line 2132, in classification_report
raise ValueError(
ValueError: Number of classes, 37, does not match size of target_names, 38. Try specifying the labels parameter
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 25554) of binary: /usr/bin/python3
Traceback (most recent call last):
File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/usr/local/lib/python3.8/dist-packages/torch/distributed/launch.py", line 195, in <module>
main()
File "/usr/local/lib/python3.8/dist-packages/torch/distributed/launch.py", line 191, in main
launch(args)
File "/usr/local/lib/python3.8/dist-packages/torch/distributed/launch.py", line 176, in launch
run(args)
File "/usr/local/lib/python3.8/dist-packages/torch/distributed/run.py", line 753, in run
elastic_launch(
File "/usr/local/lib/python3.8/dist-packages/torch/distributed/launcher/api.py", line 132, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/usr/local/lib/python3.8/dist-packages/torch/distributed/launcher/api.py", line 246, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

finetune.py FAILED

Failures:
<NO_OTHER_FAILURES>

Root Cause (first observed failure):
[0]:
time : 2023-01-09_08:36:05
host : d3c781782424
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 25554)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
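
The ValueError in the log above is raised by scikit-learn when target_names has more entries (38) than there are distinct classes in truths and predictions (37), which can happen when one cell type never appears in the validation split. classification_report accepts a labels argument for this case. A minimal sketch in plain Python of building matching labels and target_names; the helper name and the toy data are hypothetical:

```python
def report_labels(truths, predictions, label_names):
    """Return (labels, target_names) restricted to the classes actually present.

    truths and predictions are sequences of integer class indices;
    label_names[i] is the display name of class i.
    """
    present = sorted(set(truths) | set(predictions))
    return present, [label_names[i] for i in present]


# Toy data: class 2 ("NK") never occurs in the validation split.
label_names = ["B cell", "T cell", "NK", "Monocyte"]
truths = [0, 1, 1, 3]
predictions = [0, 1, 3, 3]

labels, target_names = report_labels(truths, predictions, label_names)
print(labels)        # [0, 1, 3]
print(target_names)  # ['B cell', 'T cell', 'Monocyte']

# With scikit-learn installed, the two lists can then be passed together:
# classification_report(truths, predictions, labels=labels,
#                       target_names=target_names, digits=4)
```

An alternative is to pass labels covering every class index together with the full list of names, which keeps a row for the missing class with zero support.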

Spatial transcriptomics?

Thank you for sharing this useful tool. Would it be nontrivial to extend this methodology to 10x spatial data, with the use of spots instead of cells? Would appreciate your thoughts.

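Could you provide the model parameters after fine-tuning?

Hello, I would like to use your model for cell type annotation, but my GPU resources are limited, and fine-tuning again from the checkpoint you provide would take more than a week. Could you provide the model parameters obtained after fine-tuning on Zheng68K?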

launch.py: error: unrecognized arguments: --data_path

Hello,

Thank you for providing the tool.

However, I encountered an error with the finetune procedure:
python3 -m torch.distributed.launch --data_path "./data/Zheng68K.h5ad" --model_path "/data/scbert/data/panglao_pretrained.pth" finetune.py

the error is as follows:

usage: launch.py [-h] [--nnodes NNODES] [--node_rank NODE_RANK] [--nproc_per_node NPROC_PER_NODE]
[--master_addr MASTER_ADDR] [--master_port MASTER_PORT] [--use_env] [-m] [--no_python]
[--logdir LOGDIR]
training_script ...
launch.py: error: unrecognized arguments: --data_path


However, when I put the parameters (the data path and model path) into the fine-tuning code as default values and then run python3 -m torch.distributed.launch finetune.py without any arguments, the error disappears.

I read about similar cases from others. Some people suggest using torchrun instead of the (deprecated?) torch.distributed.launch. However, that requires upgrading torch to a version that is inconsistent with requirements.txt. I didn't upgrade because I didn't know whether upgrading torch and using torchrun would cause other compatibility issues.

So could you please fix the problem?
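
The usage line in the error shows that the launcher's own options come before training_script and that everything after the script name is forwarded to the script. In the failing command, --data_path is placed before finetune.py, so the launcher tries to parse it itself. A minimal stand-in parser built with argparse (not the real launch.py, and with only one launcher option) illustrates the behaviour:

```python
import argparse


def build_launcher_parser():
    """Stand-in for a launcher CLI: [launcher options] training_script [script args...]."""
    parser = argparse.ArgumentParser(prog="launch.py")
    parser.add_argument("--nproc_per_node", type=int, default=1)
    parser.add_argument("training_script")
    # Everything after the script name is collected untouched for the script.
    parser.add_argument("training_script_args", nargs=argparse.REMAINDER)
    return parser


parser = build_launcher_parser()

# Script flags placed before the script name: the launcher does not know them.
bad = ["--data_path", "./data/Zheng68K.h5ad", "finetune.py"]
_, unrecognized = parser.parse_known_args(bad)
print("unrecognized by the launcher:", unrecognized)

# Script flags placed after the script name: they are forwarded as-is.
good = ["--nproc_per_node", "1", "finetune.py", "--data_path", "./data/Zheng68K.h5ad"]
args = parser.parse_args(good)
print(args.training_script, args.training_script_args)
```

Following the same layout, the original command would put finetune.py directly after torch.distributed.launch and move --data_path and --model_path behind it.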

Prediction always fails. Why, and how can I fix this? Message: "IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices"

Traceback (most recent call last):
File "/home/yzhu/.conda/envs/scBERT/lib/python3.6/runpy.py", line 183, in _run_module_as_main
mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
File "/home/yzhu/.conda/envs/scBERT/lib/python3.6/runpy.py", line 109, in _get_module_details
__import__(pkg_name)
File "/mnt/efs/yzhu/scBERT3/predict.py", line 124, in <module>
pred_list = label_dict[pred_finals].tolist()
IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices
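
NumPy raises this IndexError when an array is indexed with something other than integers, for example a list whose elements are not plain integers. One possible cause here, stated as an assumption because the rest of predict.py is not shown, is that pred_finals holds non-integer objects. A minimal sketch that converts every prediction to int before the lookup; the function name and the toy labels are hypothetical:

```python
def indices_to_labels(label_names, pred_indices):
    """Map predicted class indices to label names, coercing each index to int.

    int() also accepts NumPy integer scalars and single-element tensors, so the
    lookup no longer depends on how the predictions were collected.
    """
    return [label_names[int(idx)] for idx in pred_indices]


label_names = ["CD8+ Cytotoxic T", "CD19+ B", "CD14+ Monocyte"]
pred_indices = [0, 2.0, 1]  # 2.0 stands in for a non-integer prediction object
print(indices_to_labels(label_names, pred_indices))
```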

What does 'RuntimeError: mat1 dim 1 must match mat2 dim 0' mean when fine-tuning on my own data?

What does this error mean when I use my own data for fine-tuning: 'RuntimeError: mat1 dim 1 must match mat2 dim 0'?

I used scanpy 1.9.3 and Python 3.9.2. This should be OK, because the tutorial ran smoothly with Zheng68K.h5ad as the fine-tuning data.
I have also made sure there is an adata.obs['celltype'] column in my data. Any suggestions on how to go about troubleshooting? Thanks!

$ python -m torch.distributed.launch finetune.py --data_path "./data/ref9bgInteg.h5ad" --model_path "./data/panglao_pretrained.pth" > log0730_refine_bnt9BGinteg.txt
Traceback (most recent call last):
File "/mnt/efs/yzhu/scbert/finetune.py", line 204, in <module>
logits = model(data)
File "/home/yzhu/.conda/envs/scBERT2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/yzhu/.conda/envs/scBERT2/lib/python3.9/site-packages/torch/nn/parallel/distributed.py", line 705, in forward
output = self.module(*inputs[0], **kwargs[0])
File "/home/yzhu/.conda/envs/scBERT2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/mnt/efs/yzhu/scbert/performer_pytorch/performer_pytorch.py", line 636, in forward
x = self.to_out(x)
File "/home/yzhu/.conda/envs/scBERT2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/mnt/efs/yzhu/scbert/finetune.py", line 114, in forward
x = self.fc1(x)
File "/home/yzhu/.conda/envs/scBERT2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/yzhu/.conda/envs/scBERT2/lib/python3.9/site-packages/torch/nn/modules/linear.py", line 94, in forward
return F.linear(input, self.weight, self.bias)
File "/home/yzhu/.conda/envs/scBERT2/lib/python3.9/site-packages/torch/nn/functional.py", line 1753, in linear
return torch._C._nn.linear(input, weight, bias)
RuntimeError: mat1 dim 1 must match mat2 dim 0
Traceback (most recent call last):
File "/home/yzhu/.conda/envs/scBERT2/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/yzhu/.conda/envs/scBERT2/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/yzhu/.conda/envs/scBERT2/lib/python3.9/site-packages/torch/distributed/launch.py", line 340, in <module>
main()
File "/home/yzhu/.conda/envs/scBERT2/lib/python3.9/site-packages/torch/distributed/launch.py", line 326, in main
sigkill_handler(signal.SIGTERM, None) # not coming back
File "/home/yzhu/.conda/envs/scBERT2/lib/python3.9/site-packages/torch/distributed/launch.py", line 301, in sigkill_handler
raise subprocess.CalledProcessError(returncode=last_return_code, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/yzhu/.conda/envs/scBERT2/bin/python', '-u', 'finetune.py', '--local_rank=0', '--data_path', './data/ref9bgInteg.h5ad', '--model_path', './data/panglao_pretrained.pth']' returned non-zero exit status 1.
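
The failing call is a linear layer (self.fc1 in finetune.py) whose input width is fixed when the model is built. One possible cause, stated here as an assumption, is that the new dataset has a different number of genes than the model expects (the fine-tuning command in an earlier issue on this page uses --gene_num 16906). A minimal sketch of aligning one cell's expression values to a fixed reference gene list, filling genes absent from the dataset with zeros; the gene names and the function name are hypothetical:

```python
def align_to_reference(expression, genes, reference_genes):
    """Reorder one cell's expression values to follow reference_genes.

    Genes missing from the dataset get 0.0; genes not in the reference are dropped.
    """
    value_by_gene = dict(zip(genes, expression))
    return [float(value_by_gene.get(gene, 0.0)) for gene in reference_genes]


reference_genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]  # stands in for the 16906-gene list
genes = ["GENE_C", "GENE_X", "GENE_A"]                      # genes measured in the new dataset
expression = [2.0, 7.0, 1.0]

aligned = align_to_reference(expression, genes, reference_genes)
print(aligned)  # [1.0, 0.0, 2.0, 0.0]
```

After alignment every cell has the same length as the reference list, which is what a fixed-width layer requires. Comparing the number of columns in the new h5ad file with the value passed as --gene_num is a quick first check.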

How to get the cell embedding for UMAP?

Thank you for your great work.

However, the paper does not make clear which embedding is used to draw the UMAP.
Is it the expression embedding (cells × genes × 200-dimensional) or the average of the expression embedding (cells × 200-dimensional)?
I would be grateful if you could answer this question. Thank you very much!

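Could you share the Human Cell Atlas data file?

Hello, I would like to use your model for cell type annotation. Your paper states that the Human Cell Atlas dataset you used contains 27 cell types, which differs from the number of cell types published in the original AHCA paper. Did you merge some cell types? If convenient, could you share the data file used for fine-tuning? Thank you very much.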

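The results I get are not very good

I used the pretrained model provided by the authors and ran it for quite a long time on the authors' example, Zheng68K. The problem is that the validation loss keeps fluctuating. Why is that? Has anyone run other datasets, and were the results better?
[image]
Hello authors, I would also like to ask whether there are other ways to tune your model. I just set batch_size=3 arbitrarily; could that be the problem?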

What is the output of the pre-trained model, and how does it reflect the information in the dataset?

What is the output of the pre-trained model, and how does it reflect the information in the dataset? Is the output still a kind of expression matrix that captures the relationships between genes? I don't really understand what the pre-training step does.

Besides, what if we don't know the labels of the dataset we want to annotate, meaning there is no definite or good reference dataset? Which dataset could be used for fine-tuning?

How to run code

I apologize for asking questions about the code so presumptuously.
The code does not appear to provide complete execution steps. Therefore, I hope to get help from the author or others regarding how to run the code and get results.
Thank you very much!

How to train from scratch

I want to train from scratch, not just fine-tune the pre-trained model. How can I do that? I can't find the code for it.

Link failed

Hello, can you check the link to the pre-trained data?
If possible, could you post an alternative one?

Downloadable link for the pretrained weights

Hi! Congrats on the great work! In addition to the data, could you please also provide a downloadable link for the pretrained weights? The current link requires an additional verification step to download. Thank you!

data download

I am trying to download the pre-trained model but I am unable to download using WeChat. Please provide me with some other way. Thanks!

Other species

Hi,
Does scBERT support other species, such as mouse, fly, and worm?

performance without pretraining

I did an ablation study on whether pretraining benefits downstream tasks by fine-tuning without loading the state dict from the checkpoint. All other settings were kept the same. The fine-tuning process stopped at epoch 34. All reported metrics are comparable to the results in the paper. The metrics for some rare cell types, and overall, are even better.
How can the benefit of pretraining be demonstrated?

How to reproduce the pre-training process

Hi, I wonder if there is any reference code we can use for fine-tuning.

Moreover, can I use my own dataset with 1000 HVGs (highly variable genes) for the analysis? I always receive an error:
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
terminate called after throwing an instance of 'c10::CUDAError'

Error in prediction

Prediction with "python predict.py" resulted in the following error message:

    TypeError: can't convert cuda:0 device type to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

Replacing "pred_finals.append(pred_final)" with "pred_finals.append(torch.IntTensor.item(pred_final))" in line 123 of predict.py could fix this problem, but the output of inference had only one cell type, like this:

    ['CD8+ Cytotoxic T','CD8+ Cytotoxic T','CD8+ Cytotoxic T'...]

Is this due to the incorrect modification of the code, or the insufficient fine-tuning of the model?

question

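Hello, I have some questions about the scBERT code.
First, I downloaded the data and the checkpoint via the links given in your GitHub repository.

  1. Which file does raw_data.h5ad refer to in data = sc.read_h5ad('./data/raw_data.h5ad') in preprocess.py? Is it panglao_human.h5ad?
  2. finetune.py contains from performer_pytorch_v2 import PerformerLM, but there is no performer_pytorch_v2 in your code.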
