modelscope / adaseq

AdaSeq: An All-in-One Library for Developing State-of-the-Art Sequence Understanding Models

License: Apache License 2.0

Python 97.51% Shell 0.08% Perl 2.41%

entity-typing named-entity-recognition natural-language-processing natural-language-understanding nlp pytorch sequence-labeling word-segmentation ner relation-extraction

adaseq's Introduction

Introduction

ModelScope is built upon the notion of “Model-as-a-Service” (MaaS). It seeks to bring together the most advanced machine learning models from the AI community and to streamline the process of leveraging AI models in real-world applications. The core ModelScope library open-sourced in this repository provides the interfaces and implementations that allow developers to perform model inference, training, and evaluation.

In particular, with rich layers of API abstraction, the ModelScope library offers a unified experience for exploring state-of-the-art models across domains such as CV, NLP, Speech, Multi-Modality, and Scientific Computation. Model contributors from different areas can integrate models into the ModelScope ecosystem through the layered APIs, allowing easy and unified access to their models. Once integrated, model inference, fine-tuning, and evaluation can be done with only a few lines of code. At the same time, the library remains flexible, so that different components of a model application can be customized wherever necessary.

Apart from harboring implementations of a wide range of models, the ModelScope library also enables the necessary interactions with ModelScope backend services, particularly the Model-Hub and Dataset-Hub. These interactions allow entities (models and datasets) to be managed seamlessly under the hood, including entity lookup, version control, cache management, and more.

Models and Online Accessibility

Hundreds of models are publicly available on ModelScope (700+ and counting), covering the latest developments in areas such as NLP, CV, Audio, Multi-modality, and AI for Science. Many of these models represent the SOTA in their specific fields and made their open-source debut on ModelScope. Users can visit ModelScope (modelscope.cn) and try these models online with just a few clicks. An immediate developer experience is also possible through the ModelScope Notebook, which is backed by a ready-to-use CPU/GPU development environment in the cloud, only one click away on ModelScope.

Note: Most models on ModelScope are public and can be downloaded without registering an account on the ModelScope website (www.modelscope.cn). Please refer to the instructions for model download to download models either with the API provided by the modelscope library or with git.

QuickTour

We provide a unified interface for inference using pipeline, and for fine-tuning and evaluation using Trainer, across different tasks.

For any given task with any type of input (image, text, audio, video...), an inference pipeline can be implemented with only a few lines of code. The pipeline automatically loads the underlying model and returns the inference result, as exemplified below:

>>> from modelscope.pipelines import pipeline
>>> word_segmentation = pipeline('word-segmentation',model='damo/nlp_structbert_word-segmentation_chinese-base')
>>> word_segmentation('今天天气不错，适合出去游玩')
{'output': '今天 天气 不错 ， 适合 出去 游玩'}

Given an image, portrait matting (aka. background-removal) can be accomplished with the following code snippet:

>>> import cv2
>>> from modelscope.pipelines import pipeline

>>> portrait_matting = pipeline('portrait-matting')
>>> result = portrait_matting('https://modelscope.oss-cn-beijing.aliyuncs.com/test/images/image_matting.png')
>>> cv2.imwrite('result.png', result['output_img'])

The output image, with the background removed, is saved as result.png.

Fine-tuning and evaluation can also be done with a few more lines of code to set up the training dataset and trainer. The heavy lifting of training and evaluating a model is encapsulated in the implementation of the trainer.train() and trainer.evaluate() interfaces.

For example, the GPT-3 base model (1.3B) can be fine-tuned with the chinese-poetry dataset, resulting in a model that can be used for Chinese poetry generation.

>>> from modelscope.metainfo import Trainers
>>> from modelscope.msdatasets import MsDataset
>>> from modelscope.trainers import build_trainer

>>> train_dataset = MsDataset.load('chinese-poetry-collection', split='train').remap_columns({'text1': 'src_txt'})
>>> eval_dataset = MsDataset.load('chinese-poetry-collection', split='test').remap_columns({'text1': 'src_txt'})
>>> max_epochs = 10
>>> tmp_dir = './gpt3_poetry'

>>> kwargs = dict(
     model='damo/nlp_gpt3_text-generation_1.3B',
     train_dataset=train_dataset,
     eval_dataset=eval_dataset,
     max_epochs=max_epochs,
     work_dir=tmp_dir)

>>> trainer = build_trainer(name=Trainers.gpt3_trainer, default_args=kwargs)
>>> trainer.train()

Why should I use the ModelScope library

A unified and concise user interface is abstracted for different tasks and different models. Model inference and training can be implemented in as few as 3 and 10 lines of code, respectively. It is convenient for users to explore models in different fields in the ModelScope community. All models integrated into ModelScope are ready to use, which makes it easy to get started with AI in both educational and industrial settings.
ModelScope offers a model-centric development and application experience. It streamlines support for model training, inference, export and deployment, and helps users build their own MLOps on top of the ModelScope ecosystem.
For the model inference and training process, a modular design is in place and a wealth of functional module implementations are provided, which makes it convenient for users to customize their own model inference, training and other processes.
For distributed model training, especially for large models, the library provides rich training strategy support, including data parallel, model parallel, hybrid parallel and so on.

Installation

Docker

ModelScope Library currently supports popular deep learning frameworks for model training and inference, including PyTorch, TensorFlow and ONNX. All releases are tested and run on Python 3.7+, PyTorch 1.8+, TensorFlow 1.15 or TensorFlow 2.0+.

To allow out-of-the-box usage of all the models on ModelScope, official docker images are provided for all releases. With the docker image, developers can skip environment installation and configuration entirely and use the library directly. Currently, the latest versions of the CPU and GPU images can be obtained from:

CPU docker image

# py37
registry.cn-hangzhou.aliyuncs.com/modelscope-repo/modelscope:ubuntu20.04-py37-torch1.11.0-tf1.15.5-1.6.1

# py38
registry.cn-hangzhou.aliyuncs.com/modelscope-repo/modelscope:ubuntu20.04-py38-torch2.0.1-tf2.13.0-1.9.5

GPU docker image

# py37
registry.cn-hangzhou.aliyuncs.com/modelscope-repo/modelscope:ubuntu20.04-cuda11.3.0-py37-torch1.11.0-tf1.15.5-1.6.1

# py38
registry.cn-hangzhou.aliyuncs.com/modelscope-repo/modelscope:ubuntu20.04-cuda11.8.0-py38-torch2.0.1-tf2.13.0-1.9.5

Setup Local Python Environment

One can also set up a local ModelScope environment using pip and conda. ModelScope supports Python 3.7 and above. We suggest anaconda for creating a local Python environment:

conda create -n modelscope python=3.8
conda activate modelscope

PyTorch or TensorFlow can be installed separately according to each model's requirements.

See the official PyTorch and TensorFlow installation documentation for details.

After installing the necessary machine-learning framework, you can install the modelscope library as follows:

If you only want to play around with the modelscope framework, or try out model/dataset download, you can install the core modelscope components:

pip install modelscope

If you want to use multi-modal models:

pip install modelscope[multi-modal]

If you want to use nlp models:

pip install modelscope[nlp] -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html

If you want to use cv models:

pip install modelscope[cv] -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html

If you want to use audio models:

pip install modelscope[audio] -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html

If you want to use science models:

pip install modelscope[science] -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html

Notes:

Currently, some audio-task models only support Linux environments with Python 3.7 and TensorFlow 1.15.4. Most other models can be installed and used on Windows and Mac (x86).
Some models in the audio field use the third-party library SoundFile for wav file processing. On Linux, users need to manually install libsndfile, which SoundFile depends on. On Windows and macOS, it is installed automatically without user action. For example, on Ubuntu, you can use the following commands:
```
sudo apt-get update
sudo apt-get install libsndfile1
```
Some models in computer vision need mmcv-full. You can refer to the mmcv installation guide; a minimal installation is as follows:
```
pip uninstall mmcv # if you have installed mmcv, uninstall it
pip install -U openmim
mim install mmcv-full
```

License

This project is licensed under the Apache License (Version 2.0).

Citation

@Misc{modelscope,
  title = {ModelScope: bring the notion of Model-as-a-Service to life.},
  author = {The ModelScope Team},
  howpublished = {\url{https://github.com/modelscope/modelscope}},
  year = {2023}
}

adaseq's Issues

[Question] Error loading a model for inference after fine-tuning

What is your question?

Question: An error occurs when loading the model for inference after fine-tuning.

RuntimeError: SequenceLabelingPipeline: SequenceLabelingModel: Error(s) in loading state_dict for SequenceLabelingModel:
Missing key(s) in state_dict: "embedder.transformer_model.embeddings.position_ids".

What have you tried?

I followed the official fine-tuning tutorial exactly, without any additional code modifications, and the automatically saved fine-tuned model cannot be loaded.

Code (if necessary)

Code:

```python
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

# Build an NER pipeline from the fine-tuned checkpoint directory.
p = pipeline(
    Tasks.named_entity_recognition,
    '/root/experiments/experiments/CMeEE-cmeee/20240227_093521/output_best'
)
result = p('对儿童SARST细胞亚群的研究表明，与成人SARS相比，儿童细胞下降不明显，证明上述推测成立。')
print(result)
```

What's your environment?

AdaSeq Version (e.g., 1.0 or master): 0.6.6
ModelScope Version (e.g., 1.0 or master): 1.12.0
PyTorch Version (e.g., 1.12.1): 2.2.1
OS (e.g., Ubuntu 20.04): Ubuntu 20.04
Python version: 3.11
CUDA/cuDNN version:
GPU models and configuration: RTX3090
Any other relevant information:

Code of Conduct

I agree to follow this project's Code of Conduct

FileNotFoundError of try

What is your question?

When I run "python scripts/train.py -c examples/bert_crf/configs/resume.yaml",
it fails with: "FileNotFoundError: [Errno 2] No such file or directory: 'C:/Users/XXX/.cache/huggingface/datasets/named_entity_recognition_dataset_builder/default-ef6495d2da4494b4/0.0.0/db737b9bb893f20fb03d04403a30bf7c033256c212b7e9f0ebc6e9c958535c51.incomplete/named_entity_recognition_dataset_builder-test-00000-00000-of-NNNNN.arrow'".

What have you tried?

No response

Code (if necessary)

(base) PS C:\Users\XXX\Desktop\Graphic\AdaSeq> python scripts/train.py -c examples/bert_crf/configs/resume.yaml
2023-09-13 19:25:22,907 - modelscope - INFO - PyTorch version 2.0.1 Found.
2023-09-13 19:25:22,912 - modelscope - INFO - Loading ast index from C:\Users\XXX\.cache\modelscope\ast_indexer
2023-09-13 19:25:23,304 - modelscope - INFO - Loading done! Current index file version is 1.9.0, with md5 f8489c2bf624f5caf45aa5b79ca58350 and a total number of 921 components indexed
2023-09-13 19:25:27,292 - modelscope - WARNING - The reference has been Deprecated in modelscope v1.4.0+, please use from modelscope.msdatasets.dataset_cls.custom_datasets import TorchCustomDataset
2023-09-13 19:25:27,356 - INFO - adaseq.data.dataset_manager - Will use a custom loading script: D:\conda\Lib\site-packages\adaseq\data\dataset_builders\named_entity_recognition_dataset_builder.py
Downloading and preparing dataset named_entity_recognition_dataset_builder/default to C:/Users/XXX/.cache/huggingface/datasets/named_entity_recognition_dataset_builder/default-ef6495d2da4494b4/0.0.0/db737b9bb893f20fb03d04403a30bf7c033256c212b7e9f0ebc6e9c958535c51...
Traceback (most recent call last):
File "D:\conda\Lib\site-packages\datasets\builder.py", line 1618, in _prepare_split_single
writer = writer_class(
^^^^^^^^^^^^^
File "D:\conda\Lib\site-packages\datasets\arrow_writer.py", line 334, in __init__
self.stream = self._fs.open(fs_token_paths[2][0], "wb")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\conda\Lib\site-packages\fsspec\spec.py", line 1151, in open
f = self._open(
^^^^^^^^^^^
File "D:\conda\Lib\site-packages\fsspec\implementations\local.py", line 183, in _open
return LocalFileOpener(path, mode, fs=self, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\conda\Lib\site-packages\fsspec\implementations\local.py", line 285, in __init__
self._open()
File "D:\conda\Lib\site-packages\fsspec\implementations\local.py", line 290, in _open
self.f = open(self.path, mode=self.mode)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'C:/Users/XXX/.cache/huggingface/datasets/named_entity_recognition_dataset_builder/default-ef6495d2da4494b4/0.0.0/db737b9bb893f20fb03d04403a30bf7c033256c212b7e9f0ebc6e9c958535c51.incomplete/named_entity_recognition_dataset_builder-test-00000-00000-of-NNNNN.arrow'
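
One possible cause worth ruling out (an assumption, not something confirmed in this report) is the legacy Windows limit of 260 characters for a full path: the cache file named in the traceback is very long even with the placeholder user name. The sketch below only checks a path against that limit; the function name `exceeds_windows_max_path` is hypothetical and not part of AdaSeq or datasets.

```python
from pathlib import PureWindowsPath

# Legacy Windows API limit for a full path, counting the terminating NUL.
WINDOWS_MAX_PATH = 260


def exceeds_windows_max_path(path: str, limit: int = WINDOWS_MAX_PATH) -> bool:
    """Return True if `path` is too long for the legacy Windows path limit."""
    # Normalise separators so '/' and '\\' are treated the same way.
    normalised = str(PureWindowsPath(path))
    # The limit includes the trailing NUL, so the usable length is limit - 1.
    return len(normalised) > limit - 1


# Illustrative path only; substitute the path from your own traceback.
example = "C:/Users/someone/.cache/" + "x" * 250 + ".arrow"
print(exceeds_windows_max_path(example))
```

If the check returns True for the real path, pointing the Hugging Face datasets cache at a shorter directory (for example through the `HF_DATASETS_CACHE` environment variable) or enabling long-path support in Windows may be worth trying.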

What's your environment?

AdaSeq Version (e.g., 1.0 or master):
ModelScope Version (e.g., 1.0 or master):
PyTorch Version (e.g., 1.12.1):
OS (e.g., Ubuntu 20.04):
Python version:
CUDA/cuDNN version:
GPU models and configuration:
Any other relevant information:

Code of Conduct

I agree to follow this project's Code of Conduct

[Bug] error in typing_metric.py

Checklist before your report.

I have verified that the issue exists against the master branch of AdaSeq.
I have read the relevant section in the contribution guide on reporting bugs.
I have checked the issues list for similar or identical bug reports.
I have checked the pull requests list for existing proposed fixes.
I have checked the commit log to find out if the bug was already fixed in the master branch.

What happened?

An error occurred during the evaluation phase of the training script for entity typing.

Python traceback

Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/averie/name-entity-recognition/experiments/adaseq/scripts/train.py", line 38, in <module>
    train_model_from_args(args)
  File "/home/averie/name-entity-recognition/experiments/adaseq/adaseq/commands/train.py", line 84, in train_model_from_args
    train_model(
  File "/home/averie/name-entity-recognition/experiments/adaseq/adaseq/commands/train.py", line 164, in train_model
    trainer.train(checkpoint_path)
  File "/home/averie/name-entity-recognition/experiments/adaseq/adaseq/training/default_trainer.py", line 146, in train
    return super().train(checkpoint_path=checkpoint_path, *args, **kwargs)
  File "/home/averie/name-entity-recognition/experiments/adaseq/env/lib/python3.10/site-packages/modelscope/trainers/trainer.py", line 689, in train
    self.train_loop(self.train_dataloader)
  File "/home/averie/name-entity-recognition/experiments/adaseq/env/lib/python3.10/site-packages/modelscope/trainers/trainer.py", line 1220, in train_loop
    self.invoke_hook(TrainerStages.after_train_epoch)
  File "/home/averie/name-entity-recognition/experiments/adaseq/env/lib/python3.10/site-packages/modelscope/trainers/trainer.py", line 1372, in invoke_hook
    getattr(hook, fn_name)(self)
  File "/home/averie/name-entity-recognition/experiments/adaseq/env/lib/python3.10/site-packages/modelscope/trainers/hooks/evaluation_hook.py", line 54, in after_train_epoch
    self.do_evaluate(trainer)
  File "/home/averie/name-entity-recognition/experiments/adaseq/env/lib/python3.10/site-packages/modelscope/trainers/hooks/evaluation_hook.py", line 67, in do_evaluate
    eval_res = trainer.evaluate()
  File "/home/averie/name-entity-recognition/experiments/adaseq/env/lib/python3.10/site-packages/modelscope/trainers/trainer.py", line 778, in evaluate
    metric_values = self.evaluation_loop(self.eval_dataloader,
  File "/home/averie/name-entity-recognition/experiments/adaseq/env/lib/python3.10/site-packages/modelscope/trainers/trainer.py", line 1272, in evaluation_loop
    metric_values = single_gpu_test(
  File "/home/averie/name-entity-recognition/experiments/adaseq/env/lib/python3.10/site-packages/modelscope/trainers/utils/inference.py", line 56, in single_gpu_test
    evaluate_batch(trainer, data, metric_classes, vis_closure)
  File "/home/averie/name-entity-recognition/experiments/adaseq/env/lib/python3.10/site-packages/modelscope/trainers/utils/inference.py", line 183, in evaluate_batch
    metric_cls.add(batch_result, data)
  File "/home/averie/name-entity-recognition/experiments/adaseq/adaseq/metrics/typing_metric.py", line 128, in add
    pred_results.append(one_hot_to_list(predicts[i][j]))
  File "/home/averie/name-entity-recognition/experiments/adaseq/adaseq/metrics/typing_metric.py", line 123, in one_hot_to_list
    id_list = set((np.where(in_tensor.detach().cpu() == 1)[0]))
AttributeError: 'set' object has no attribute 'detach'
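
The traceback shows `one_hot_to_list` receiving a `set` where it expects a tensor. The following is a hypothetical defensive variant, not the project's fix: it assumes that a set already holds label ids and can be passed through, which may or may not match what the upstream code intends.

```python
from typing import Any, List


def one_hot_to_list(in_tensor: Any) -> List[int]:
    """Return the sorted indices whose value is 1.

    Accepts three kinds of input (the set branch is the assumption
    this sketch adds):
      * a set of label ids, returned as a sorted list;
      * a tensor-like object exposing detach()/cpu()/tolist();
      * a plain sequence of 0/1 values.
    """
    if isinstance(in_tensor, (set, frozenset)):
        # Already a collection of ids: nothing to decode.
        return sorted(in_tensor)
    if hasattr(in_tensor, "detach"):
        # Tensor-like: move to CPU and convert to a Python list.
        in_tensor = in_tensor.detach().cpu().tolist()
    return [idx for idx, value in enumerate(in_tensor) if value == 1]
```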

Operating system

Ubuntu 22.04.2 LTS

Python version

3.10.6

Output of pip freeze

addict==2.4.0
aiohttp==3.8.4
aiosignal==1.3.1
aliyun-python-sdk-core==2.13.36
aliyun-python-sdk-kms==2.16.1
async-timeout==4.0.2
attrs==23.1.0
certifi==2023.5.7
cffi==1.15.1
charset-normalizer==3.1.0
cmake==3.26.3
crcmod==1.7
cryptography==41.0.1
datasets==2.8.0
dill==0.3.6
einops==0.6.1
filelock==3.12.0
frozenlist==1.3.3
fsspec==2023.5.0
gast==0.5.4
huggingface-hub==0.15.1
idna==3.4
Jinja2==3.1.2
jmespath==0.10.0
joblib==1.2.0
lit==16.0.5
MarkupSafe==2.1.2
modelscope==1.6.0
mpmath==1.3.0
multidict==6.0.4
multiprocess==0.70.14
networkx==3.1
numpy==1.22.0
nvidia-cublas-cu11==11.10.3.66
nvidia-cuda-cupti-cu11==11.7.101
nvidia-cuda-nvrtc-cu11==11.7.99
nvidia-cuda-runtime-cu11==11.7.99
nvidia-cudnn-cu11==8.5.0.96
nvidia-cufft-cu11==10.9.0.58
nvidia-curand-cu11==10.2.10.91
nvidia-cusolver-cu11==11.4.0.1
nvidia-cusparse-cu11==11.7.4.91
nvidia-nccl-cu11==2.14.3
nvidia-nvtx-cu11==11.7.91
oss2==2.18.0
packaging==23.1
pandas==1.5.3
Pillow==9.5.0
pyarrow==12.0.0
pycparser==2.21
pycryptodome==3.18.0
python-dateutil==2.8.2
pytz==2023.3
PyYAML==6.0
regex==2023.5.5
requests==2.31.0
responses==0.18.0
scikit-learn==1.2.2
scipy==1.10.1
seqeval==1.2.2
simplejson==3.19.1
six==1.16.0
sortedcontainers==2.4.0
sympy==1.12
threadpoolctl==3.1.0
tokenizers==0.13.3
tomli==2.0.1
torch==1.13.1
torchvision==0.14.1
tqdm==4.65.0
transformers==4.29.2
triton==2.0.0
typing_extensions==4.6.3
urllib3==2.0.2
xxhash==3.2.0
yapf==0.33.0
yarl==1.9.2

How to reproduce

python3  -m scripts.train -c examples/NPCRF/configs/ufet_concat_npcrf.yaml

Code of Conduct

I agree to follow this project's Code of Conduct

Add example documentation

Is your feature request related to a problem?

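1. Please add some relation extraction examples.

2. Return the positions of the extracted results even when the input is longer than 512.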

Describe the solution you'd like.

No response

Describe alternatives you've considered.

No response

Additional context.

No response

Code of Conduct

I agree to follow this project's Code of Conduct

ncbi and bc5cdr

What is your question?

When using Bert-CRF for named entity recognition, where can I find the datasets ncbi and bc5cdr, as well as the configuration files?

What have you tried?

No response

Code (if necessary)

No response

What's your environment?

AdaSeq Version (e.g., 1.0 or master):
ModelScope Version (e.g., 1.0 or master):
PyTorch Version (e.g., 1.12.1):
OS (e.g., Ubuntu 20.04):
Python version:
CUDA/cuDNN version:
GPU models and configuration:
Any other relevant information:

Code of Conduct

I agree to follow this project's Code of Conduct

[Question] Which F1 criterion is used to evaluate the NER task? Where can I find the evaluation code?

What is your question?

Is your criterion strict or relaxed?
You only report the evaluation results (P/R/F), but where can I see the actual code?

What have you tried?

I tried to find it in the code myself, but there are so many layers of wrapping that I could not locate it.
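
For readers unsure what "strict" means here: under a strict criterion, a predicted entity counts as correct only if both its boundaries and its type match a gold entity exactly. The sketch below illustrates that definition on BIO tags with micro-averaged P/R/F1. It is not AdaSeq's evaluation code, and the helper names `bio_to_spans` and `strict_prf` are made up for this illustration.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, entity_type)


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Decode BIO tags into a set of (start, end, type) spans."""
    spans: Set[Span] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel flushes the last span
        # An I- tag whose type differs from the open span is treated as a
        # new span start (a lenient choice for ill-formed sequences).
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != etype
        )
        if start is not None and (tag == "O" or starts_new):
            spans.add((start, i, etype))
            start, etype = None, None
        if starts_new:
            start, etype = i, tag[2:]
    return spans


def strict_prf(
    gold: List[List[str]], pred: List[List[str]]
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F1 over exactly matching spans."""
    tp = n_pred = n_gold = 0
    for g_tags, p_tags in zip(gold, pred):
        g, p = bio_to_spans(g_tags), bio_to_spans(p_tags)
        tp += len(g & p)  # exact boundary and type match only
        n_pred += len(p)
        n_gold += len(g)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

A relaxed criterion would instead give credit for partial overlap or for a correct boundary with the wrong type; which of the two a given toolkit reports has to be checked in its metric implementation.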

Code (if necessary)

No response

What's your environment?

No response

Code of Conduct

I agree to follow this project's Code of Conduct

[Feature] Add a constrain_crf option to the SequenceLabelingModel

Is your feature request related to a problem?

There is already a CRF module in the repository called CRFwithConstraints that implements constraints for BIO and BIOES decoding, but it is only used in the definition of the TwoStageNERModel. I would like to use the constrained CRF on a simple BERT-CRF model with the SequenceLabelingModel class.

Describe the solution you'd like.

I think it is a simple solution, just add a new argument called constrain_crf that activates the CRFwithConstraints module:

adaseq/models/sequence_labeling_model.py

@@ -10,7 +10,7 @@ from modelscope.utils.config import ConfigDict
 from adaseq.data.constant import PAD_LABEL_ID
 from adaseq.metainfo import Models, Pipelines, Tasks
 from adaseq.models.base import Model
-from adaseq.modules.decoders import CRF, PartialCRF
+from adaseq.modules.decoders import CRF, PartialCRF, CRFwithConstraints
 from adaseq.modules.dropouts import WordDropout
 from adaseq.modules.embedders import Embedder
 from adaseq.modules.encoders import Encoder
@@ -52,6 +52,7 @@ class SequenceLabelingModel(Model):
         mv_interpolation: Optional[float] = 0.5,
         partial: Optional[bool] = False,
         chunk: Optional[bool] = False,
+        constrain_crf: Optional[bool] = False,
         **kwargs
     ) -> None:
         super().__init__(**kwargs)
@@ -84,8 +85,14 @@ class SequenceLabelingModel(Model):
                 self.dropout = nn.Dropout(dropout)

         self.use_crf = use_crf
+        self.constrain_crf = constrain_crf
         if use_crf:
-            if partial:
+            if constrain_crf:
+                id2label_list = [v for k, v in self.id_to_label.items()]
+                self.crf = CRFwithConstraints(
+                    id2label_list, batch_first=True, add_constraint=True
+                )
+            elif partial:
                 self.crf = PartialCRF(self.num_labels, batch_first=True)
             else:
                 self.crf = CRF(self.num_labels, batch_first=True)

Enabling CRFwithConstraints in config.yaml would then look something like:

model:
  type: sequence-labeling-model
  embedder:
    model_name_or_path: sijunhe/nezha-cn-base
  word_dropout: 0.0
  use_crf: true
  constrain_crf: true
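
To make concrete what "constraints for BIO decoding" refers to, here is a minimal sketch that enumerates the label transitions permitted under the BIO scheme. It is not the implementation of CRFwithConstraints; the function name `allowed_bio_transitions` is hypothetical, and start/end constraints and the BIOES scheme are left out.

```python
from typing import List, Tuple


def allowed_bio_transitions(labels: List[str]) -> List[Tuple[int, int]]:
    """Return (from_id, to_id) pairs that are valid under the BIO scheme."""
    allowed = []
    for i, src in enumerate(labels):
        for j, dst in enumerate(labels):
            if dst.startswith("I-"):
                # I-X may only follow B-X or I-X of the same entity type.
                if src in ("B-" + dst[2:], "I-" + dst[2:]):
                    allowed.append((i, j))
            else:
                # O and B-* may follow any label.
                allowed.append((i, j))
    return allowed


labels = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC"]
pairs = set(allowed_bio_transitions(labels))
print((0, 2) in pairs)  # O -> I-PER
print((1, 2) in pairs)  # B-PER -> I-PER
```

A constrained CRF typically uses such a list to assign a very low score to every transition outside it, so that decoding cannot produce sequences like O followed by I-PER.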

Describe alternatives you've considered.

No response

Additional context.

No response

Code of Conduct

I agree to follow this project's Code of Conduct

[Question]

What is your question?

Hello Team,
I am using your dataset for the MultiCoNER competition. Many thanks for making the dataset public. Could you please upload the external sentences for the test dataset as well?

What have you tried?

No response

Code (if necessary)

No response

What's your environment?

AdaSeq Version (e.g., 1.0 or master):
ModelScope Version (e.g., 1.0 or master):
PyTorch Version (e.g., 1.12.1):
OS (e.g., Ubuntu 20.04):
Python version:
CUDA/cuDNN version:
GPU models and configuration:
Any other relevant information:

Code of Conduct

I agree to follow this project's Code of Conduct

cannot run the examples/bert_crf/resume.yaml under modelscope==1.1.2

What is your question?

cannot run the examples/bert_crf/resume.yaml under modelscope==1.1.2

What have you tried?

No response

Code (if necessary)

No response

What's your environment?

AdaSeq Version (e.g., 1.0 or master):
ModelScope Version (e.g., 1.0 or master):
PyTorch Version (e.g., 1.12.1):
OS (e.g., Ubuntu 20.04):
Python version:
CUDA/cuDNN version:
GPU models and configuration:
Any other relevant information:

Code of Conduct

I agree to follow this project's Code of Conduct

Error SequenceLabelingMetric: Can't instantiate abstract class SequenceLabelingMetric with abstract method merge

Checklist before your report.

I have verified that the issue exists against the master branch of AdaSeq.
I have read the relevant section in the contribution guide on reporting bugs.
I have checked the issues list for similar or identical bug reports.
I have checked the pull requests list for existing proposed fixes.
I have checked the commit log to find out if the bug was already fixed in the master branch.

What happened?

An error occurs when evaluation begins. The traceback is attached below

Python traceback

Traceback (most recent call last):
File "/notebooks/Sem-eval/AdaSeq/scripts/train.py", line 51, in <module>
main(args)
File "/notebooks/Sem-eval/AdaSeq/scripts/train.py", line 17, in main
train_model(
File "/notebooks/Sem-eval/AdaSeq/adaseq/commands/train.py", line 136, in train_model
trainer.train(checkpoint_path)
File "/notebooks/Sem-eval/AdaSeq/adaseq/training/default_trainer.py", line 146, in train
return super().train(checkpoint_path=checkpoint_path, *args, **kwargs)
File "/usr/local/lib/python3.9/dist-packages/modelscope/trainers/trainer.py", line 495, in train
self.train_loop(self.train_dataloader)
File "/usr/local/lib/python3.9/dist-packages/modelscope/trainers/trainer.py", line 895, in train_loop
self.invoke_hook(TrainerStages.after_train_epoch)
File "/usr/local/lib/python3.9/dist-packages/modelscope/trainers/trainer.py", line 1034, in invoke_hook
getattr(hook, fn_name)(self)
File "/usr/local/lib/python3.9/dist-packages/modelscope/trainers/hooks/evaluation_hook.py", line 33, in after_train_epoch
self.do_evaluate(trainer)
File "/usr/local/lib/python3.9/dist-packages/modelscope/trainers/hooks/evaluation_hook.py", line 45, in do_evaluate
eval_res = trainer.evaluate()
File "/usr/local/lib/python3.9/dist-packages/modelscope/trainers/trainer.py", line 505, in evaluate
metric_classes = [build_metric(metric) for metric in self.metrics]
File "/usr/local/lib/python3.9/dist-packages/modelscope/trainers/trainer.py", line 505, in <listcomp>
metric_classes = [build_metric(metric) for metric in self.metrics]
File "/usr/local/lib/python3.9/dist-packages/modelscope/metrics/builder.py", line 79, in build_metric
return build_from_cfg(
File "/usr/local/lib/python3.9/dist-packages/modelscope/utils/registry.py", line 215, in build_from_cfg
raise type(e)(f'{obj_cls.__name__}: {e}')
TypeError: SequenceLabelingMetric: Can't instantiate abstract class SequenceLabelingMetric with abstract method merge
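
This TypeError is what Python's `abc` machinery raises when a subclass does not implement every abstract method of its base class. It suggests, as an inference rather than a confirmed diagnosis, that the installed ModelScope metric base class declares a `merge` method that this AdaSeq version's SequenceLabelingMetric does not implement. The sketch below reproduces the mechanism with stand-in classes; none of them are the real ModelScope or AdaSeq classes.

```python
from abc import ABC, abstractmethod


class Metric(ABC):
    """Stand-in base class declaring three abstract methods."""

    @abstractmethod
    def add(self, outputs, inputs): ...

    @abstractmethod
    def evaluate(self): ...

    @abstractmethod
    def merge(self, other): ...


class OldSequenceLabelingMetric(Metric):
    """Written before `merge` was required, so it cannot be instantiated."""

    def __init__(self):
        self.preds, self.golds = [], []

    def add(self, outputs, inputs):
        self.preds.append(outputs)
        self.golds.append(inputs)

    def evaluate(self):
        return {"num_samples": len(self.preds)}


class PatchedSequenceLabelingMetric(OldSequenceLabelingMetric):
    """Same metric with the missing abstract method filled in."""

    def merge(self, other):
        # Combine the state collected by another instance of the metric.
        self.preds.extend(other.preds)
        self.golds.extend(other.golds)


try:
    OldSequenceLabelingMetric()
except TypeError as err:
    print(f"TypeError: {err}")

print(PatchedSequenceLabelingMetric().evaluate())
```

In practice the options are usually to align the ModelScope and AdaSeq versions, or to add a `merge` method to the metric class along these lines.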

Operating system

Ubuntu 20.4

Python version

3.9

Output of pip freeze

How to reproduce

Code of Conduct

I agree to follow this project's Code of Conduct

[Question] running self trained NER model causes errors

What is your question?

error message:

Traceback (most recent call last):
  File "/root/raid/electrolyte_papers_extraction/NER/ner.py", line 31, in <module>
    ner = NER('ckpt/ner/240731013859.938009/output_best', device = 'gpu')
  File "/root/raid/electrolyte_papers_extraction/NER/ner.py", line 12, in __init__
    self.pipeline = pipeline(Tasks.named_entity_recognition, abspath(ckpt), device = device)
  File "/usr/local/lib/python3.10/dist-packages/modelscope/pipelines/builder.py", line 169, in pipeline
    return build_pipeline(cfg, task_name=task)
  File "/usr/local/lib/python3.10/dist-packages/modelscope/pipelines/builder.py", line 65, in build_pipeline
    return build_from_cfg(
  File "/usr/local/lib/python3.10/dist-packages/modelscope/utils/registry.py", line 215, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
RuntimeError: SequenceLabelingPipeline: SequenceLabelingModel: TransformerEmbedder: Try loading from huggingface and modelscope failed 

huggingface:
The request model: google-bert/bert-base-cased does not exist!

modelscope:
The request model: google-bert/bert-base-cased does not exist!

self trained NER checkpoint:

https://github.com/breadbread1984/electrolyte_papers_extraction/tree/main/NER/ckpt/ner/240731013859.938009

What have you tried?

Under ckpt/ner/, I edited <path/to/latest/checkpoint>/output_best/configuration.json to change the following lines

from

    "plugins": [
        "adaseq"
    ]

to

    "plugins": [
        "https://files.pythonhosted.org/packages/49/47/ddf684253dbb4c3e0716fcda67094aa3c407237d5eb8930ede0a91b9feb8/adaseq-0.6.6-py3-none-any.whl"
    ]

Code (if necessary)

source code:

from os.path import abspath
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
pipeline_ = pipeline(Tasks.named_entity_recognition, abspath('ckpt/ner/240731013859.938009/output_best'), device = 'gpu')

What's your environment?

AdaSeq Version (e.g., 1.0 or master): 0.6.6
ModelScope Version (e.g., 1.0 or master): 1.16.1
PyTorch Version (e.g., 1.12.1): 2.2.0a0+81ea7a4
OS (e.g., Ubuntu 20.04): Ubuntu 22.04.3 LTS
Python version: 3.10.12
CUDA/cuDNN version: 8.9.7.29-1+cuda12.2
GPU models and configuration: A100
Any other relevant information:

Code of Conduct

I agree to follow this project's Code of Conduct

[Feature] Results returned by the NER pipeline

Is your feature request related to a problem?

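As with the pipelines in transformers, when the input is longer than 512, the pipeline should still return results for positions beyond 512 instead of raising an error.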
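
One common way to support long inputs is a sliding window: split the text into overlapping chunks, run the model on each chunk, and shift the predicted offsets back to positions in the full text. The sketch below assumes a hypothetical `predict` callable that returns entities as dicts with `start`, `end` and `type` keys, and it measures windows in characters rather than model tokens. It is not how AdaSeq or transformers implement this.

```python
from typing import Callable, Dict, List

Entity = Dict[str, object]  # e.g. {"start": 3, "end": 6, "type": "PER"}


def predict_long_text(
    text: str,
    predict: Callable[[str], List[Entity]],
    max_len: int = 512,
    overlap: int = 64,
) -> List[Entity]:
    """Run `predict` over overlapping windows and merge the results."""
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")
    step = max_len - overlap
    seen = set()
    results: List[Entity] = []
    for offset in range(0, max(len(text), 1), step):
        window = text[offset:offset + max_len]
        for ent in predict(window):
            # Shift window-relative offsets to full-text offsets.
            start, end = ent["start"] + offset, ent["end"] + offset
            key = (start, end, ent["type"])
            if key not in seen:  # drop duplicates from the overlap region
                seen.add(key)
                results.append({**ent, "start": start, "end": end})
        if offset + max_len >= len(text):
            break  # the current window already reached the end of the text
    return sorted(results, key=lambda e: (e["start"], e["end"]))
```

An entity cut by a window boundary can still come back truncated from one window; the overlap makes it likely that the next window sees it whole, but a production version would also need to discard or reconcile such partial spans.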

Describe the solution you'd like.

No response

Describe alternatives you've considered.

No response

Additional context.

No response

Code of Conduct

I agree to follow this project's Code of Conduct

[Question] i can't find the TBD dataset

What is your question?

No response

What have you tried?

No response

Code (if necessary)

No response

What's your environment?

AdaSeq Version (e.g., 1.0 or master):
ModelScope Version (e.g., 1.0 or master):
PyTorch Version (e.g., 1.12.1):
OS (e.g., Ubuntu 20.04):
Python version:
CUDA/cuDNN version:
GPU models and configuration:
Any other relevant information:

Code of Conduct

I agree to follow this project's Code of Conduct

[Question] Can you provide image caption of twitter15/17, snap, wikidiverse, mnre.

What is your question?

Hi, I notice the dataset used in MoRe does not contain the image caption of its original images. For example, twitter15 only has text, and twitter15-img and twitter15-txt have retrieved content, instead of their original image caption. Can you provide it? Thank you.

What have you tried?

None

Code (if necessary)

No response

What's your environment?

AdaSeq Version (e.g., 1.0 or master):
ModelScope Version (e.g., 1.0 or master):
PyTorch Version (e.g., 1.12.1):
OS (e.g., Ubuntu 20.04):
Python version:
CUDA/cuDNN version:
GPU models and configuration:
Any other relevant information:

Code of Conduct

I agree to follow this project's Code of Conduct

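CRF training loss is unstable

I am training on custom NER data with BIEOS labels. The model is BERT+CRF, and I only import and use CRF and CRFwithConstraints from adaseq. When training locally, the following two situations occur:
(1) The loss is extremely large and does not converge. I have tried learning rates of different magnitudes for BERT and the CRF, e.g. 5e-5 for BERT and 0.5 for the CRF.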

{'loss': 4.413724667306784e+18, 'learning_rate': 5e-05, 'epoch': 0.0}
{'loss': 4.028261839082933e+18, 'learning_rate': 5e-05, 'epoch': 0.0}

(2) The loss decreases for some steps, but at some intermediate step it becomes very large again; sudden jumps in the loss also appear across epochs.

{'loss': 139.975, 'learning_rate': 5e-05, 'epoch': 0.12}                                                                                                                       
{'loss': 19.9296, 'learning_rate': 5e-05, 'epoch': 0.18}                                                                                                                       
{'loss': 138.8168, 'learning_rate': 5e-05, 'epoch': 0.24}                                                                                                                      
{'loss': 14.5336, 'learning_rate': 5e-05, 'epoch': 0.29}                                                                                                                       
{'loss': 12.7167, 'learning_rate': 5e-05, 'epoch': 0.35}                                                                                                                       
{'loss': 8.8402, 'learning_rate': 5e-05, 'epoch': 0.41}                                                                                                                        
{'loss': 7.3277, 'learning_rate': 5e-05, 'epoch': 0.47}                                                                                                                        
{'loss': 130.5398, 'learning_rate': 5e-05, 'epoch': 0.53}                                                                                                                      
{'loss': 6.4985, 'learning_rate': 5e-05, 'epoch': 0.59}                                                                                                                        
{'loss': 380.6562, 'learning_rate': 5e-05, 'epoch': 0.65}                                                                                                                      
{'loss': 130.0721, 'learning_rate': 5e-05, 'epoch': 0.71}                                                                                                                      
{'loss': 3.8882, 'learning_rate': 5e-05, 'epoch': 0.76}                                                                                                                        
{'loss': 4.0297, 'learning_rate': 5e-05, 'epoch': 0.82}                                                                                                                        
{'loss': 7.1636, 'learning_rate': 5e-05, 'epoch': 0.88}                                                                                                                        
{'loss': 4.571, 'learning_rate': 5e-05, 'epoch': 0.94}                                                                                                                         
{'loss': 4.482, 'learning_rate': 5e-05, 'epoch': 1.0}                                                                                                                          
{'loss': 129.3293, 'learning_rate': 5e-05, 'epoch': 1.06}                                                                                                                      
{'loss': 4.5055, 'learning_rate': 5e-05, 'epoch': 1.12}                                                                                                                        
{'loss': 129.5204, 'learning_rate': 5e-05, 'epoch': 1.18}                                                                                                                      
{'loss': 3.5977, 'learning_rate': 5e-05, 'epoch': 1.24}                                                                                                                        
{'loss': 3.5342, 'learning_rate': 5e-05, 'epoch': 1.29}                                                                                                                        
{'loss': 3.283, 'learning_rate': 5e-05, 'epoch': 1.35}                                                                                                                         
{'loss': 252.8688, 'learning_rate': 5e-05, 'epoch': 1.41}                                                                                                                      
{'loss': 3.2223, 'learning_rate': 5e-05, 'epoch': 1.47}                                                                                                                        
{'loss': 2.3803, 'learning_rate': 5e-05, 'epoch': 1.53}                                                                                                                        
{'loss': 2.8814, 'learning_rate': 5e-05, 'epoch': 1.59}

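What could be the cause?

During preprocessing, CLS and SEP are added before and after the tokens, with tag O. The padded positions are also tagged O.
The model details are as follows: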

outputs = self.bert(
    input_ids,
    attention_mask=attention_mask,
    token_type_ids=token_type_ids,
    position_ids=position_ids,
    head_mask=head_mask,
    inputs_embeds=inputs_embeds,
    encoder_hidden_states=encoder_hidden_states,
    encoder_attention_mask=encoder_attention_mask,
    output_attentions=output_attentions,
    output_hidden_states=output_hidden_states,
    return_dict=return_dict,
)
sequence_output = outputs[0]
sequence_output = self.dropout(sequence_output)
logits = self.classifier(sequence_output)
# logits = self.activation(logits)

loss = None
if labels is not None:
    loss = -self.crf(logits, labels, attention_mask)
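
One thing worth checking, offered as a guess rather than a diagnosis: with a constrained CRF such as CRFwithConstraints, a gold tag sequence that breaks the BIEOS scheme (for example an entity cut off by truncation, leaving B-X followed directly by O) has to pass through a forbidden transition, which could make the loss for that batch spike. The sketch below is a plain-Python checker for such sequences. The function name and the exact rules are illustrative assumptions, not AdaSeq's own constraint logic.

```python
from typing import List, Tuple


def find_bioes_violations(tags: List[str]) -> List[Tuple[int, str, str]]:
    """Return (position, previous_tag, tag) for every illegal BIOES transition.

    Hypothetical helper for illustration; the rules are the usual BIOES ones.
    A position equal to len(tags) means the sequence ended inside an entity.
    """
    violations = []
    prev = "O"  # treat the sequence start as if it were preceded by O
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel O closes open spans
        prev_prefix, _, prev_type = prev.partition("-")
        prefix, _, ent_type = tag.partition("-")
        inside = prev_prefix in ("B", "I")  # an entity is still open
        if inside:
            # An open entity may only continue with I-/E- of the same type.
            ok = prefix in ("I", "E") and ent_type == prev_type
        else:
            # Outside an entity, only O, B- or S- may appear.
            ok = prefix in ("O", "B", "S")
        if not ok:
            violations.append((i, prev, tag))
        prev = tag
    return violations
```

Running this over the gold labels of the training set (after CLS/SEP/padding tags have been added) would show whether any sample contains a transition that the constrained CRF forbids.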

[Question] build_dataset error log: 'sequence-labeling-model is not in the custom_datasets registry group named-entity-recognition. Please make sure the correct version of ModelScope library is used

What is your question?

I get the following error when using the AdaSeq framework for an NER task:
build_dataset error log: 'sequence-labeling-model is not in the custom_datasets registry group named-entity-recognition. Please make sure the correct version of ModelScope library is used

What have you tried?

I was originally using modelscope 1.7.1. I then tried upgrading to 1.9.5, but it still fails and gets stuck at this point.

Full error output:
2024-06-20 10:39:37,335 - modelscope - INFO - PyTorch version 1.10.1+cu111 Found.
2024-06-20 10:39:37,336 - modelscope - INFO - Loading ast index from /home/tangjielong/.cache/modelscope/ast_indexer
2024-06-20 10:39:37,365 - modelscope - INFO - Loading done! Current index file version is 1.7.1, with md5 55cda22e675324e12989be11cd8d8653 and a total number of 861 components indexed
2024-06-20 10:39:38,305 - modelscope - WARNING - The reference has been Deprecated in modelscope v1.4.0+, please use from modelscope.msdatasets.dataset_cls.custom_datasets import TorchCustomDataset
2024-06-20 10:39:38,401 - INFO - adaseq.data.dataset_manager - Will use a custom loading script: /data0/tangjielong/MNER_LLM/AdaSeq-master/adaseq/data/dataset_builders/named_entity_recognition_dataset_builder.py
Downloading and preparing dataset named_entity_recognition_dataset_builder/default to /data0/tangjielong/.cache/huggingface/datasets/named_entity_recognition_dataset_builder/default-df0a7beb617cd5ee/0.0.0/db737b9bb893f20fb03d04403a30bf7c033256c212b7e9f0ebc6e9c958535c51...
Downloading data: 216kB [00:00, 1.40MB/s]
Dataset named_entity_recognition_dataset_builder downloaded and prepared to /data0/tangjielong/.cache/huggingface/datasets/named_entity_recognition_dataset_builder/default-df0a7beb617cd5ee/0.0.0/db737b9bb893f20fb03d04403a30bf7c033256c212b7e9f0ebc6e9c958535c51. Subsequent calls will reuse this data.
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 829.68it/s]
2024-06-20 10:39:40,242 - INFO - adaseq.data.dataset_manager - First sample in train set: {'id': '0', 'tokens': ['New', 'Post', ':', 'Blackburn', 'Festival', 'of', 'Voice', '2017'], 'spans': [{'start': 3, 'end': 7, 'type': 'MISC'}], 'mask': [True, True, True, True, True, True, True, True]}
2024-06-20 10:39:40,643 - INFO - adaseq.data.preprocessors.sequence_labeling_preprocessor - label_to_id: {'O': 0, 'B-LOC': 1, 'I-LOC': 2, 'E-LOC': 3, 'S-LOC': 4, 'B-MISC': 5, 'I-MISC': 6, 'E-MISC': 7, 'S-MISC': 8, 'B-ORG': 9, 'I-ORG': 10, 'E-ORG': 11, 'S-ORG': 12, 'B-PER': 13, 'I-PER': 14, 'E-PER': 15, 'S-PER': 16}
Some weights of the model checkpoint at xlm-roberta-large were not used when initializing XLMRobertaModel: ['lm_head.layer_norm.weight', 'lm_head.layer_norm.bias', 'lm_head.dense.bias', 'lm_head.dense.weight', 'lm_head.bias']

This IS expected if you are initializing XLMRobertaModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
This IS NOT expected if you are initializing XLMRobertaModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
** build_dataset error log: 'sequence-labeling-model is not in the custom_datasets registry group named-entity-recognition. Please make sure the correct version of ModelScope library is used.'
** build_dataset error log: 'sequence-labeling-model is not in the custom_datasets registry group named-entity-recognition. Please make sure the correct version of ModelScope library is used.'

Code (if necessary)

No response

What's your environment?

ModelScope Version: 1.7.1 and 1.9.5
PyTorch Version: 1.10.1
transformers: 4.30.2
Python version: 3.7

Code of Conduct

I agree to follow this project's Code of Conduct

[Question] How to save pred.txt

What is your question?

Hi team,

When I run the script python scripts/test.py -w ${checkpoint_dir}, pred.txt is not generated in the checkpoint directory.

What have you tried?

'python scripts/test.py -w ${checkpoint_dir}'

Code (if necessary)

No response

What's your environment?

AdaSeq Version (e.g., 1.0 or master):
ModelScope Version (e.g., 1.0 or master):
PyTorch Version (e.g., 1.11.0):
OS (e.g., Ubuntu 18.04):
Python version: 3.7.5
CUDA/cuDNN version: 11.3
GPU models and configuration:
Any other relevant information:

Code of Conduct

I agree to follow this project's Code of Conduct

Training crashes after the first epoch on Windows 11.

Run python scripts/train.py -c examples/bert_crf/configs/resume.yaml

Environment: Windows 11, i7-12700H, NVIDIA RTX 3070 laptop GPU.
modelscope==1.0.3 is installed; the same setup works on Linux.

2022-12-03 00:15:25,329 - modelscope - INFO - epoch [1][200/239]        lr: 5.000e-05, eta: 0:26:17, iter_time: 0.319, data_load_time: 0.005, memory: 4263, loss: 17.1283
2022-12-03 00:15:37,843 - modelscope - WARNING - ('METRICS', 'default', 'ner-metric') not found in ast index file
2022-12-03 00:15:37,843 - modelscope - WARNING - ('METRICS', 'default', 'ner-dumper') not found in ast index file
Total test samples:   0%|                                                                      | 0/463 [00:00<?, ?it/s]2022-12-03 00:15:38,091 - modelscope - INFO - PyTorch version 1.12.0 Found.
2022-12-03 00:15:38,093 - modelscope - INFO - Loading ast index from C:\Users\zx920\.cache\modelscope\ast_indexer
2022-12-03 00:15:38,147 - modelscope - INFO - Loading done! Current index file version is 1.0.3, with md5 ab126a3e272314963017d9feade29ae0
Traceback (most recent call last):
  File "<string>", line 1, in <module>
Total test samples:   0%|                                                                      | 0/463 [00:02<?, ?it/s]
Traceback (most recent call last):
  File "scripts/train.py", line 54, in <module>
    main(args)
  File "scripts/train.py", line 21, in main
  File "C:\Users\zx920\.conda\envs\adas\lib\multiprocessing\spawn.py", line 105, in spawn_main
    trainer.train(args.checkpoint_path)
  File "C:\Users\zx920\workspace\AdaSeq\adaseq\trainers\default_trainer.py", line 354, in train
    exitcode = _main(fd)
  File "C:\Users\zx920\.conda\envs\adas\lib\multiprocessing\spawn.py", line 115, in _main
    return super().train(checkpoint_path=checkpoint_path, *args, **kwargs)
  File "C:\Users\zx920\.conda\envs\adas\lib\site-packages\modelscope\trainers\trainer.py", line 459, in train
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
    self.train_loop(self.train_dataloader)
  File "C:\Users\zx920\.conda\envs\adas\lib\site-packages\modelscope\trainers\trainer.py", line 871, in train_loop
    self.invoke_hook(TrainerStages.after_train_epoch)
  File "C:\Users\zx920\.conda\envs\adas\lib\site-packages\modelscope\trainers\trainer.py", line 977, in invoke_hook
    getattr(hook, fn_name)(self)
  File "C:\Users\zx920\.conda\envs\adas\lib\site-packages\modelscope\trainers\hooks\evaluation_hook.py", line 31, in after_train_epoch
    self.do_evaluate(trainer)
  File "C:\Users\zx920\.conda\envs\adas\lib\site-packages\modelscope\trainers\hooks\evaluation_hook.py", line 35, in do_evaluate
    eval_res = trainer.evaluate()
  File "C:\Users\zx920\.conda\envs\adas\lib\site-packages\modelscope\trainers\trainer.py", line 484, in evaluate
    metric_classes)
  File "C:\Users\zx920\.conda\envs\adas\lib\site-packages\modelscope\trainers\trainer.py", line 921, in evaluation_loop
    data_loader_iters=self._eval_iters_per_epoch)
  File "C:\Users\zx920\.conda\envs\adas\lib\site-packages\modelscope\trainers\utils\inference.py", line 51, in single_gpu_test
    for i, data in enumerate(data_loader):
  File "C:\Users\zx920\.conda\envs\adas\lib\site-packages\torch\utils\data\dataloader.py", line 438, in __iter__
    return self._get_iterator()
  File "C:\Users\zx920\.conda\envs\adas\lib\site-packages\torch\utils\data\dataloader.py", line 384, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "C:\Users\zx920\.conda\envs\adas\lib\site-packages\torch\utils\data\dataloader.py", line 1048, in __init__
    w.start()
  File "C:\Users\zx920\.conda\envs\adas\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "C:\Users\zx920\.conda\envs\adas\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Users\zx920\.conda\envs\adas\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Users\zx920\.conda\envs\adas\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Users\zx920\.conda\envs\adas\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
  File "C:\Users\zx920\.conda\envs\adas\lib\site-packages\torch\multiprocessing\reductions.py", line 145, in reduce_tensor
    raise RuntimeError("Cowardly refusing to serialize non-leaf tensor which requires_grad, "
RuntimeError: Cowardly refusing to serialize non-leaf tensor which requires_grad, since autograd does not support crossing process boundaries.  If you just want to transfer the data, call detach() on the tensor before serializing (e.g., putting it on the queue).
[W C:\cb\pytorch_1000000000000\work\torch\csrc\CudaIPCTypes.cpp:15] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]
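
The interleaved tracebacks point at the evaluation DataLoader starting worker processes. On Windows, multiprocessing uses spawn, so the objects handed to each worker are pickled, and here the pickling fails on a tensor that still requires grad. A common workaround, sketched below under the assumption that loading data in the main process is acceptable, is to fall back to zero workers on Windows. The helper name is hypothetical, and how the worker count reaches the trainer depends on the AdaSeq/ModelScope configuration in use.

```python
import sys


def pick_num_workers(requested: int, platform: str = sys.platform) -> int:
    """Return 0 on Windows, otherwise the requested worker count.

    Hypothetical helper: with 0 workers the DataLoader runs in the main
    process, so nothing has to be pickled and sent to a child process.
    """
    if platform.startswith("win"):
        return 0
    return max(0, requested)
```

This avoids the crash without addressing why a tensor with requires_grad ends up inside the object being pickled; detaching that tensor, as the error message suggests, would be the more direct fix.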

[Question]

What is your question?

How do I run inference?

What have you tried?

After training the model, a pred.txt file appears automatically. Where in the code is this file generated?

Code (if necessary)

No response

What's your environment?

AdaSeq Version (e.g., 1.0 or master):
ModelScope Version (e.g., 1.0 or master):
PyTorch Version (e.g., 1.12.1):
OS (e.g., Ubuntu 20.04):
Python version:
CUDA/cuDNN version:
GPU models and configuration:
Any other relevant information:

Code of Conduct

I agree to follow this project's Code of Conduct

Cannot run ./examples/ICASSP2023_MUG_track4/end2end.sh under modelscope==1.1.2, 1.1.1 or 1.0.4

Checklist before your report.

I have verified that the issue exists against the master branch of AdaSeq.
I have read the relevant section in the contribution guide on reporting bugs.
I have checked the issues list for similar or identical bug reports.
I have checked the pull requests list for existing proposed fixes.
I have checked the commit log to find out if the bug was already fixed in the master branch.

What happened?

AttributeError: 'ConfigDict' object has no attribute 'safe_get'

Python traceback

start training....
2023-01-05 16:53:32,579 - modelscope - INFO - PyTorch version 1.12.1 Found.
2023-01-05 16:53:32,581 - modelscope - INFO - Loading ast index from /root/.cache/modelscope/ast_indexer
2023-01-05 16:53:32,953 - modelscope - INFO - Loading done! Current index file version is 1.1.0, with md5 7c254489cd639574f9416078442cc193
Traceback (most recent call last):
  File "scripts/train.py", line 51, in <module>
    main(args)
  File "scripts/train.py", line 25, in main
    checkpoint_path=args.checkpoint_path,
  File "/mnt/workspace/workgroup/yuhai/kpe/adaseq/adaseq/commands/train.py", line 106, in train_model
    config.safe_get('experiment.exp_dir', 'experiments/'),
  File "/mnt/workspace/workgroup/anaconda3/envs/modelscope/lib/python3.7/site-packages/modelscope/utils/config.py", line 296, in __getattr__
    return getattr(self._cfg_dict, name)
  File "/mnt/workspace/workgroup/anaconda3/envs/modelscope/lib/python3.7/site-packages/modelscope/utils/config.py", line 55, in __getattr__
    raise ex
AttributeError: 'ConfigDict' object has no attribute 'safe_get'
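
The traceback shows AdaSeq calling config.safe_get, which the installed ModelScope config object does not provide. That suggests a version mismatch between the two packages. The lookup itself is simple; the sketch below is a stand-alone equivalent for plain nested dicts, useful for seeing what the call is expected to return. It is a hypothetical stand-in, not ModelScope's implementation.

```python
from collections.abc import Mapping
from typing import Any


def safe_get(cfg: Mapping, dotted_key: str, default: Any = None) -> Any:
    """Look up 'a.b.c' in nested mappings; return default if any part is missing."""
    node: Any = cfg
    for part in dotted_key.split("."):
        if not isinstance(node, Mapping) or part not in node:
            return default
        node = node[part]
    return node


cfg = {"experiment": {"exp_name": "demo"}}
# exp_dir is absent, so the default from the traceback's call is returned.
print(safe_get(cfg, "experiment.exp_dir", "experiments/"))
```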

Operating system

Ubuntu 18.04.5 LTS

Python version

3.8.15

Output of pip freeze

show/hide

How to reproduce

bash ./examples/ICASSP2023_MUG_track4/end2end.sh ${sdk_token}

Code of Conduct

I agree to follow this project's Code of Conduct

[Question] Hello, could you upload the code for the paper Improving Low-resource Named Entity Recognition with Graph Propagated Data Augmentation?

What is your question?

Hello, could you upload the code for the paper Improving Low-resource Named Entity Recognition with Graph Propagated Data Augmentation? I recently read your paper and found it very inspiring, so I would like to build my research on top of your work. Thank you very much.

What have you tried?

No response

Code (if necessary)

No response

What's your environment?

AdaSeq Version (e.g., 1.0 or master):
ModelScope Version (e.g., 1.0 or master):
PyTorch Version (e.g., 1.12.1):
OS (e.g., Ubuntu 20.04):
Python version:
CUDA/cuDNN version:
GPU models and configuration:
Any other relevant information:

Code of Conduct

I agree to follow this project's Code of Conduct

[Question] How to run this code on the MNRE dataset?

What is your question?

I found that you have uploaded the MNRE dataset to ModelScope, but I have not found a config file for MNRE here.

What have you tried?

I changed the dataset link in the SNAP config file to MNRE and used it to run MNRE. It runs, but the final result seems wrong, because the F1 score is almost equal to 1.

Code (if necessary)

No response

What's your environment?

AdaSeq Version (e.g., 1.0 or master):
ModelScope Version (e.g., 1.0 or master):
PyTorch Version (e.g., 1.12.1):
OS (e.g., Ubuntu 20.04):
Python version:
CUDA/cuDNN version:
GPU models and configuration:
Any other relevant information:

Code of Conduct

I agree to follow this project's Code of Conduct

[Question] AdaSeq for Sequence Labeling, How are the metrics calculated? What are the available models?

What is your question?

Hello, I am working on the SemEval2023 MultiCoNER-II task.

First of all, thank you for sharing this amazing repo; it's saving me a lot of time and effort.

With regard to the metrics, I was training an xlm-roberta-large model on the English dataset and noticed that the F1 score was low but the accuracy was high (F1 = 0.38 and accuracy = ~0.8).
If you have looked at the English dataset for MultiCoNER-II, you will see that the Other tag (aka 'O') is more frequent than any other tag by a large margin. Hence it's possible that the model(s) may overfit and just start predicting the tag O for most tokens in a sequence.

My question is: when calculating the metrics, do you take the O tags into account? In other words, do you mask the tokens in the target sequence whose gold tag is O when calculating the loss/accuracy/F1?

My next question has to do with the configurations we can control.
Which models (transformers) can we use?
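
On the metrics question, a gap like F1 = 0.38 with accuracy around 0.8 is what one would expect if accuracy is counted per token (O included) while F1 is counted per entity span, which is the usual convention for NER. Whether AdaSeq's ner-metric does exactly this is not confirmed here. The sketch below only illustrates the two quantities on a toy BIO example, with hypothetical helper names.

```python
from typing import List, Set, Tuple


def bio_spans(tags: List[str]) -> Set[Tuple[int, int, str]]:
    """Collect (start, end_exclusive, type) spans from a BIO tag sequence."""
    spans, start, cur = set(), None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel flushes the last span
        if start is not None and tag != "I-" + cur:
            spans.add((start, i, cur))  # current span ends before position i
            start, cur = None, None
        if tag.startswith("B-"):
            start, cur = i, tag[2:]
    return spans


def token_accuracy(gold: List[str], pred: List[str]) -> float:
    """Fraction of positions where the predicted tag equals the gold tag."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def span_f1(gold: List[str], pred: List[str]) -> float:
    """Exact-match F1 over entity spans; O tokens never count as hits."""
    g, p = bio_spans(gold), bio_spans(pred)
    tp = len(g & p)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(p), tp / len(g)
    return 2 * precision * recall / (precision + recall)


gold = ["O", "O", "B-PER", "I-PER", "O", "O", "B-LOC", "O", "O", "O"]
pred = ["O", "O", "B-PER", "O", "O", "O", "O", "O", "O", "O"]
print(token_accuracy(gold, pred))  # 0.8
print(span_f1(gold, pred))         # 0.0
```

In the toy example the prediction gets eight of ten tokens right, mostly O, yet recovers neither entity exactly, so accuracy is high while span-level F1 is zero.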

What have you tried?

multiconer2-en-exp#1.zip
The attached file contains the configuration I used.
The model was early stopped at epoch=7

Code (if necessary)

No response

What's your environment?

AdaSeq Version (e.g., 1.0 or master): 0.5.0
ModelScope Version (e.g., 1.0 or master): 1.1.1
PyTorch Version (e.g., 1.12.1): 1.12.1+cu102
OS (e.g., Ubuntu 20.04): Linux-5.10.147+-x86_64-with-glibc2.27 in Google colab
Python version: 3.8.16
CUDA/cuDNN version: 11.2
GPU models and configuration: Tesla T4 15109MiB
Any other relevant information: Used on Google Colab

Code of Conduct

I agree to follow this project's Code of Conduct

[Question] CUDA out of memory

What is your question?

How much GPU memory is required to train a BERT model?

To start, I ran the command from your README, adaseq train -c demo.yaml, and hit an out-of-memory error.

OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 1.96 GiB total capacity; 1.09 GiB already allocated; 4.81 MiB free; 1.14 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
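
As a rough back-of-the-envelope check (an estimate, not a measurement): full fine-tuning in fp32 with an Adam-style optimizer keeps weights, gradients and two optimizer moments for every parameter, about 16 bytes each, before any activations are counted. Assuming a BERT-base-sized encoder of roughly 110 million parameters (the size of the model used by demo.yaml is an assumption here), the sketch below computes that footprint.

```python
def training_state_gib(num_params: int, bytes_per_value: int = 4) -> float:
    """Estimate memory for weights + gradients + two Adam moments, in GiB.

    Activations, the CUDA context and allocator overhead are NOT included.
    """
    copies = 4  # weights, gradients, Adam first and second moments
    return num_params * bytes_per_value * copies / 2**30


bert_base_params = 110_000_000  # approximate, assumed model size
print(round(training_state_gib(bert_base_params), 2))  # 1.64
```

Under these assumptions the training state alone takes about 1.64 GiB, which leaves very little of a 1.96 GiB card for activations and the CUDA context, so a failure on a 2 GB GPU is unsurprising even with a small batch size.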

However, this is just the first of several questions. I will ask the rest here, since this issue currently blocks me from checking them and finding the answers on my own.

Can your trained PyTorch models be converted to TensorFlow Lite models? (I need this format to import them on an Android device.)

Does your BABERT model provide a fully trained model for splitting Chinese text into words (word segmentation)?

Is this checkpoint file (chinese-babert-base.tar) only for validating the model?

What have you tried?

$pip install adaseq
$adaseq train -c demo.yaml

Code (if necessary)

No response

What's your environment?

AdaSeq Version (e.g., 1.0 or master):
ModelScope Version (master):
PyTorch Version (default one, supplied with adaseq):
OS (Ubuntu 22.04.2 LTS):
Python version: Python 3.10.12
CUDA/cuDNN version: NVIDIA-SMI 470.199.02 Driver Version: 470.199.02 CUDA Version: 11.4
GPU models and configuration: NVIDIA GeForce GTX 860M
Any other relevant information:

Code of Conduct

I agree to follow this project's Code of Conduct

[Question] Dataset

What is your question?

Placing the zip file in a local folder produces an error: - WARNING - adaseq.data.dataset_manager - Training set not found!

What have you tried?

No response

Code (if necessary)

No response

What's your environment?

AdaSeq Version (e.g., 1.0 or master):
ModelScope Version (e.g., 1.0 or master):
PyTorch Version (e.g., 1.12.1):
OS (e.g., Ubuntu 20.04):
Python version:
CUDA/cuDNN version:
GPU models and configuration:
Any other relevant information:

Code of Conduct

I agree to follow this project's Code of Conduct

[Question] Error when running twitter-17-txt.yaml and twitter-17-img.yaml

What is your question?

Running twitter-17-txt.yaml and twitter-17-img.yaml fails with an error.

What have you tried?

No response

Code (if necessary)

Traceback (most recent call last):
File "/home/nlp/.pycharm_helpers/pydev/pydevd.py", line 1496, in _exec
pydev_imports.execfile(file, globals, locals) # execute the script
File "/home/nlp/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "/home/nlp/code/AdaSeq/examples/MoRe/train.py", line 37, in <module>
train_model_from_args(args)
File "/home/nlp/code/AdaSeq/adaseq/commands/train.py", line 93, in train_model_from_args
checkpoint_path=args.checkpoint_path,
File "/home/nlp/code/AdaSeq/adaseq/commands/train.py", line 164, in train_model
trainer.train(checkpoint_path)
File "/home/nlp/code/AdaSeq/adaseq/training/default_trainer.py", line 146, in train
return super().train(checkpoint_path=checkpoint_path, *args, **kwargs)
File "/home/nlp/anaconda3/envs/MoRe/lib/python3.7/site-packages/modelscope/trainers/trainer.py", line 676, in train
self.train_loop(self.train_dataloader)
File "/home/nlp/anaconda3/envs/MoRe/lib/python3.7/site-packages/modelscope/trainers/trainer.py", line 1171, in train_loop
self.train_step(self.model, data_batch, **kwargs)
File "/home/nlp/anaconda3/envs/MoRe/lib/python3.7/site-packages/modelscope/trainers/trainer.py", line 841, in train_step
train_outputs = model.forward(**inputs)
File "/home/nlp/code/AdaSeq/adaseq/models/sequence_labeling_model.py", line 129, in forward
loss = self._calculate_loss(logits, label_ids, crf_mask)
File "/home/nlp/code/AdaSeq/adaseq/models/sequence_labeling_model.py", line 176, in _calculate_loss
targets = targets * mask
RuntimeError: The size of tensor a (269) must match the size of tensor b (270) at non-singleton dimension 1

Traceback (most recent call last):
File "/home/nlp/.pycharm_helpers/pydev/pydevd.py", line 1496, in _exec
pydev_imports.execfile(file, globals, locals) # execute the script
File "/home/nlp/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "/home/nlp/code/AdaSeq/examples/MoRe/train.py", line 37, in <module>
train_model_from_args(args)
File "/home/nlp/code/AdaSeq/adaseq/commands/train.py", line 93, in train_model_from_args
checkpoint_path=args.checkpoint_path,
File "/home/nlp/code/AdaSeq/adaseq/commands/train.py", line 164, in train_model
trainer.train(checkpoint_path)
File "/home/nlp/code/AdaSeq/adaseq/training/default_trainer.py", line 146, in train
return super().train(checkpoint_path=checkpoint_path, *args, **kwargs)
File "/home/nlp/anaconda3/envs/MoRe/lib/python3.7/site-packages/modelscope/trainers/trainer.py", line 676, in train
self.train_loop(self.train_dataloader)
File "/home/nlp/anaconda3/envs/MoRe/lib/python3.7/site-packages/modelscope/trainers/trainer.py", line 1171, in train_loop
self.train_step(self.model, data_batch, **kwargs)
File "/home/nlp/anaconda3/envs/MoRe/lib/python3.7/site-packages/modelscope/trainers/trainer.py", line 841, in train_step
train_outputs = model.forward(**inputs)
File "/home/nlp/code/AdaSeq/adaseq/models/sequence_labeling_model.py", line 129, in forward
loss = self._calculate_loss(logits, label_ids, crf_mask)
File "/home/nlp/code/AdaSeq/adaseq/models/sequence_labeling_model.py", line 176, in _calculate_loss
targets = targets * mask
RuntimeError: The size of tensor a (299) must match the size of tensor b (300) at non-singleton dimension 1

How can I solve this problem?
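
The error says that the label tensor and the mask differ by one position along the sequence dimension (269 vs 270, and 299 vs 300). The cause is not identified in this report. A reasonable first step, sketched below, is to scan the preprocessed samples and list those where the two lengths disagree, so the offending sentences can be inspected. The field names are taken loosely from the traceback (label_ids, crf_mask) and are assumptions about how the samples are stored.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def find_length_mismatches(
    samples: Iterable[Dict[str, Sequence]],
    label_key: str = "label_ids",  # assumed field name
    mask_key: str = "crf_mask",    # assumed field name
) -> List[Tuple[int, int, int]]:
    """Return (sample_index, n_labels, n_mask) for samples whose lengths differ."""
    bad = []
    for idx, sample in enumerate(samples):
        n_labels, n_mask = len(sample[label_key]), len(sample[mask_key])
        if n_labels != n_mask:
            bad.append((idx, n_labels, n_mask))
    return bad


samples = [
    {"label_ids": [0, 1, 2], "crf_mask": [True, True, True]},
    {"label_ids": [0, 1], "crf_mask": [True, True, True]},  # off by one
]
print(find_length_mismatches(samples))
```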

What's your environment?

AdaSeq Version (e.g., 1.0 or master):
ModelScope Version (e.g., 1.0 or master):
PyTorch Version (e.g., 1.12.1):
OS (e.g., Ubuntu 20.04):
Python version:
CUDA/cuDNN version:
GPU models and configuration:
Any other relevant information:

Code of Conduct

I agree to follow this project's Code of Conduct

NotImplementedError

Is your feature request related to a problem?

2023-06-08 14:43:53,296 - INFO - modelscope - Checkpoints will be saved to NER/koubei/230608144347.471030
2023-06-08 14:43:53,296 - INFO - adaseq.training.hooks.text_logger_hook - Text logs will be saved to: NER/koubei/230608144347.471030/metrics.json
2023-06-08 14:43:53,369 - INFO - modelscope - tensorboard files will be saved to NER/koubei/230608144347.471030/tensorboard_output
/data/ningfeim/.myconda/envs/modelscope/lib/python3.7/site-packages/adaseq/modules/decoders/crf.py:293: UserWarning: where received a uint8 condition tensor. This behavior is deprecated and will be removed in a future version of PyTorch. Use a boolean condition instead. (Triggered internally at ../aten/src/ATen/native/TensorCompare.cpp:402.)
score = torch.where(mask[i].unsqueeze(1), next_score, score)
2023-06-08 14:44:00,504 - INFO - modelscope - epoch [1][50/478] lr: 5.000e-05, eta: 0:05:33, iter_time: 0.143, data_load_time: 0.013, memory: 2636, loss: 24.8958
2023-06-08 14:44:05,806 - INFO - modelscope - epoch [1][100/478] lr: 5.000e-05, eta: 0:04:44, iter_time: 0.106, data_load_time: 0.011, memory: 2689, loss: 6.3790
2023-06-08 14:44:11,301 - INFO - modelscope - epoch [1][150/478] lr: 5.000e-05, eta: 0:04:27, iter_time: 0.110, data_load_time: 0.012, memory: 2689, loss: 4.3439
2023-06-08 14:44:16,351 - INFO - modelscope - epoch [1][200/478] lr: 5.000e-05, eta: 0:04:11, iter_time: 0.101, data_load_time: 0.012, memory: 2689, loss: 3.1956
2023-06-08 14:44:22,130 - INFO - modelscope - epoch [1][250/478] lr: 5.000e-05, eta: 0:04:06, iter_time: 0.116, data_load_time: 0.014, memory: 2800, loss: 3.9149
2023-06-08 14:44:28,207 - INFO - modelscope - epoch [1][300/478] lr: 5.000e-05, eta: 0:04:02, iter_time: 0.122, data_load_time: 0.015, memory: 2800, loss: 3.5845
2023-06-08 14:44:33,858 - INFO - modelscope - epoch [1][350/478] lr: 5.000e-05, eta: 0:03:55, iter_time: 0.113, data_load_time: 0.014, memory: 2800, loss: 2.7284
2023-06-08 14:44:38,887 - INFO - modelscope - epoch [1][400/478] lr: 5.000e-05, eta: 0:03:46, iter_time: 0.101, data_load_time: 0.012, memory: 2800, loss: 2.6073
2023-06-08 14:44:44,202 - INFO - modelscope - epoch [1][450/478] lr: 5.000e-05, eta: 0:03:39, iter_time: 0.106, data_load_time: 0.012, memory: 2800, loss: 2.7159
Total test samples: 100%|██████████████████████████████████████████████████████████████████████████████████████| 463/463 [00:02<00:00, 227.71it/s]
Traceback (most recent call last):
File "/data/ningfeim/.myconda/envs/modelscope/bin/adaseq", line 8, in <module>
sys.exit(run())
File "/data/ningfeim/.myconda/envs/modelscope/lib/python3.7/site-packages/adaseq/main.py", line 13, in run
main(prog='adaseq')
File "/data/ningfeim/.myconda/envs/modelscope/lib/python3.7/site-packages/adaseq/commands/init.py", line 29, in main
args.func(args)
File "/data/ningfeim/.myconda/envs/modelscope/lib/python3.7/site-packages/adaseq/commands/train.py", line 85, in train_model_from_args
checkpoint_path=args.checkpoint_path,
File "/data/ningfeim/.myconda/envs/modelscope/lib/python3.7/site-packages/adaseq/commands/train.py", line 144, in train_model
trainer.train(checkpoint_path)
File "/data/ningfeim/.myconda/envs/modelscope/lib/python3.7/site-packages/adaseq/training/default_trainer.py", line 146, in train
return super().train(checkpoint_path=checkpoint_path, *args, **kwargs)
File "/data/ningfeim/.myconda/envs/modelscope/lib/python3.7/site-packages/modelscope/trainers/trainer.py", line 676, in train
self.train_loop(self.train_dataloader)
File "/data/ningfeim/.myconda/envs/modelscope/lib/python3.7/site-packages/modelscope/trainers/trainer.py", line 1181, in train_loop
self.invoke_hook(TrainerStages.after_train_epoch)
File "/data/ningfeim/.myconda/envs/modelscope/lib/python3.7/site-packages/modelscope/trainers/trainer.py", line 1328, in invoke_hook
getattr(hook, fn_name)(self)
File "/data/ningfeim/.myconda/envs/modelscope/lib/python3.7/site-packages/modelscope/trainers/hooks/evaluation_hook.py", line 35, in after_train_epoch
self.do_evaluate(trainer)
File "/data/ningfeim/.myconda/envs/modelscope/lib/python3.7/site-packages/modelscope/trainers/hooks/evaluation_hook.py", line 47, in do_evaluate
eval_res = trainer.evaluate()
File "/data/ningfeim/.myconda/envs/modelscope/lib/python3.7/site-packages/modelscope/trainers/trainer.py", line 764, in evaluate
metric_classes)
File "/data/ningfeim/.myconda/envs/modelscope/lib/python3.7/site-packages/modelscope/trainers/trainer.py", line 1239, in evaluation_loop
data_loader_iters=self._eval_iters_per_epoch)
File "/data/ningfeim/.myconda/envs/modelscope/lib/python3.7/site-packages/modelscope/trainers/utils/inference.py", line 77, in single_gpu_test
return get_metric_values(metric_classes)
File "/data/ningfeim/.myconda/envs/modelscope/lib/python3.7/site-packages/modelscope/trainers/utils/inference.py", line 194, in get_metric_values
metric_values.update(metric_cls.evaluate())
File "/data/ningfeim/.myconda/envs/modelscope/lib/python3.7/site-packages/adaseq/data/dataset_dumpers/base.py", line 26, in evaluate
self.dump()
File "/data/ningfeim/.myconda/envs/modelscope/lib/python3.7/site-packages/adaseq/data/dataset_dumpers/named_entity_recognition_dataset_dumper.py", line 40, in dump
raise NotImplementedError
NotImplementedError

Environment:
python3.7
ubuntu 22.04
cuda11.7

Describe the solution you'd like.

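The line in the dependency that raises the error (last frame of the traceback):

This part checks the training set type (conll).

The training YAML is as follows: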
experiment:
  exp_dir: NER/ # root folder for all experiments
  exp_name: koubei # experiment name for this config file
  seed: 42 # random seed

dataset:
  data_file: # data files
    train: '/data/ningfeim/test/NER/koubei/datas/train.txt'
    dev: '/data/ningfeim/test/NER/koubei/datas/dev.txt'
    test: '/data/ningfeim/test/NER/koubei/datas/test.txt'
  data_type: conll # data format

task: named-entity-recognition # task name, used to load the built-in DatasetBuilder (if needed)

preprocessor:
  type: sequence-labeling-preprocessor # preprocessor name
  model_dir: /data/ningfeim/project/Bert-ner-yiwei/bert # huggingface/modelscope model name or path, used to initialize the tokenizer; optional
  max_length: 512 # maximum input length supported by the pretrained model

data_collator: SequenceLabelingDataCollatorWithPadding # name of the data_collator used for batch conversion

model:
  type: sequence-labeling-model # model name
  embedder:
    model_name_or_path: /data/ningfeim/test/NER/koubei/nlp_raner_named-entity-recognition_chinese-base-news # pretrained model name or path; can be a huggingface/modelscope backbone model, or a task model from modelscope
  dropout: 0.1 # dropout probability
  use_crf: true # whether to use a CRF

train:
  hooks:
    - type: TensorboardHook
  max_epochs: 5 # maximum number of training epochs
  dataloader:
    batch_size_per_gpu: 8 # training batch size
  optimizer:
    type: AdamW # pytorch optimizer name
    lr: 5.0e-5 # global learning rate
    param_groups:
      - regex: crf
        lr: 5.0e-1
  lr_scheduler:
    type: LinearLR # lr_scheduler name from transformers or pytorch
    start_factor: 1.0
    end_factor: 0.0
    total_iters: 20

evaluation:
  dataloader:
    batch_size_per_gpu: 16 # evaluation batch size
  metrics:
    - type: ner-metric # all implemented metrics are listed in the Metrics class in adaseq/metainfo.py
    - type: ner-dumper # dump the prediction results
      model_type: sequence_labeling
      dump_format: column
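
The traceback above ends in the NER dataset dumper's dump method raising NotImplementedError, and this config sets dump_format: column. One plausible reading, not confirmed in this thread, is that the dumper does not implement that format. A minimal pre-flight check could look like the sketch below. The helper name and the allow-list are hypothetical (only conll appears in another config on this page) and are not part of the AdaSeq API.

```python
# Assumption: formats the NER dumper accepts. Only "conll" is seen in
# another config on this page; adjust after checking the dumper source.
ASSUMED_NER_DUMP_FORMATS = ("conll",)


def find_unsupported_dump_formats(metrics, supported=ASSUMED_NER_DUMP_FORMATS):
    """Return the dump_format values of ner-dumper entries not in `supported`."""
    unsupported = []
    for metric in metrics:
        if metric.get("type") == "ner-dumper":
            fmt = metric.get("dump_format")
            if fmt not in supported:
                unsupported.append(fmt)
    return unsupported


# The evaluation.metrics section of the config above, as plain Python data.
metrics = [
    {"type": "ner-metric"},
    {"type": "ner-dumper", "model_type": "sequence_labeling", "dump_format": "column"},
]
print(find_unsupported_dump_formats(metrics))
```

Running a check like this before training would surface the problem up front instead of after the first epoch's evaluation.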

Describe alternatives you've considered.

No response

Additional context.

No response

Code of Conduct

I agree to follow this project's Code of Conduct

Is there a sample config for the relation extraction task? I did not see one in the examples [Question]

What is your question?

How can I train a relation extraction task with this framework?

What have you tried?

No response

Code (if necessary)

No response

What's your environment?

AdaSeq Version (e.g., 1.0 or master):
ModelScope Version (e.g., 1.0 or master):
PyTorch Version (e.g., 1.12.1):
OS (e.g., Ubuntu 20.04):
Python version:
CUDA/cuDNN version:
GPU models and configuration:
Any other relevant information:

Code of Conduct

I agree to follow this project's Code of Conduct

[Feature] Suggest upgrading the PyTorch version to 2.0

Is your feature request related to a problem?

My machine has an NVIDIA RTX 3060, with CUDA 12.1 and torch 2.0. AdaSeq currently uses torch 1.12.1 and CUDA 10.2, which is quite inconvenient.

Describe the solution you'd like.

No response

Describe alternatives you've considered.

No response

Additional context.

No response

Code of Conduct

I agree to follow this project's Code of Conduct

[Question] About the search engine

What is your question?

As I understand it, for all datasets this system uses a search engine to retrieve and rank related text for training the model. Where is this part of the code? Is it used online during training, or offline to build the dataset before training?

Thanks for your response.

Question on EMNLP22: Named Entity and Relation Extraction with Multi-Modal Retrieval

What is your question?

Thanks for your work on EMNLP22: Named Entity and Relation Extraction with Multi-Modal Retrieval.

However, your readme file does not provide environment information, such as the version of the modelscope.

Could you please clarify the specific operating environment and precautions?

The current code seems unable to load the xlm-roberta-large model.

What have you tried?

I read the relevant files of modelscope in https://www.modelscope.cn/docs/环境安装, but still cannot determine the reason why the xlm-roberta-large model cannot be loaded.

In addition, your repo lacks MoE code, could you please add it? Thanks.

Code (if necessary)

ERROR - modelscope - Authentication token does not exist, failed to access model xlm-roberta-large which may not exist or may be private. Please login first.

RuntimeError: Try loading from huggingface and modelscope failed

modelscope: 404 Client Error: Not Found for url: http://www.modelscope.cn/api/v1/models/xlm-roberta-large/revisions?EndTime=1670428800

What's your environment?

AdaSeq Version (master):
modelscope==1.1.1
torch==1.13.1
OS Ubuntu 20.04.5
Python version: 3.9.13
CUDA Version: 12.0
GPU models: 3090

Code of Conduct

I agree to follow this project's Code of Conduct

[Question] Where is the MoRe code?

What is your question?

No response

What have you tried?

No response

Code (if necessary)

No response

What's your environment?

AdaSeq Version (e.g., 1.0 or master):
ModelScope Version (e.g., 1.0 or master):
PyTorch Version (e.g., 1.12.1):
OS (e.g., Ubuntu 20.04):
Python version:
CUDA/cuDNN version:
GPU models and configuration:
Any other relevant information:

Code of Conduct

I agree to follow this project's Code of Conduct

[Question]

What is your question?

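I can never reach huggingface (ping always fails). I copied the model to my local machine, but I do not know how to load it into the project.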

What have you tried?

In the YAML file, I changed the model location following the format used for loading a local dataset. It did not work.

Code (if necessary)

requests.exceptions.ConnectionError: (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /xlm-roberta-large/resolve/main/config.json (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f00c74d8880>: Failed to establish a new connection: [Errno 101] Network is unreachable'))"), '(Request ID: bedc4561-50b0-4bff-b4b8-d3349f1e15a0)')
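
The request in the error above still targets /xlm-roberta-large/resolve/main/config.json on huggingface.co, which suggests the config still names the hub id rather than a local folder. Another config on this page sets model_name_or_path under embedder to a local directory. A minimal sketch for sanity-checking such a directory before pointing the config at it, assuming the usual Hugging Face file layout (the helper name is hypothetical):

```python
import os
from pathlib import Path


def check_local_model_dir(model_dir):
    """Return the expected files that are missing from a local model directory."""
    path = Path(model_dir)
    missing = []
    if not (path / "config.json").is_file():
        missing.append("config.json")
    # Assumption: weights are stored under one of the two common file names.
    weight_names = ("pytorch_model.bin", "model.safetensors")
    if not any((path / name).is_file() for name in weight_names):
        missing.append("pytorch_model.bin or model.safetensors")
    return missing


# Ask transformers / huggingface_hub not to contact huggingface.co.
# These must be set before those libraries are imported.
os.environ["TRANSFORMERS_OFFLINE"] = "1"
os.environ["HF_HUB_OFFLINE"] = "1"
```

If the function returns an empty list, the directory at least looks like a complete local copy, and its absolute path can be used as the model_name_or_path value.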

What's your environment?

AdaSeq Version (e.g., 1.0 or master):
ModelScope Version (e.g., 1.0 or master):
PyTorch Version (e.g., 1.12.1):
OS (e.g., Ubuntu 20.04):
Python version:
CUDA/cuDNN version:
GPU models and configuration:
Any other relevant information:

Code of Conduct

I agree to follow this project's Code of Conduct

[Question] ValueError: unknown url type: 'adaseq-0.6.6-py3-none-any.whl.metadata'

What is your question?

When running inference with the code below, I get ValueError: unknown url type: 'adaseq-0.6.6-py3-none-any.whl.metadata'.
What is causing this?

from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

p = pipeline(Tasks.named_entity_recognition, '/data3/huyan/liheng/tmp/AdaSeq/AdaSeq/experiments/eBay/231125005305.028950/output_best')
result = p('Nike Reax TR Mesh Herren Sneaker low Turnschuhe Sportschuhe Freizeitschuhe')

print(result)

The error is as follows:
2023-11-25 22:15:49,690 - modelscope - INFO - PyTorch version 2.1.1+cu118 Found.
2023-11-25 22:15:49,691 - modelscope - INFO - Loading ast index from /home/huyan/liheng/.cache/modelscope/ast_indexer
2023-11-25 22:15:49,788 - modelscope - INFO - Loading done! Current index file version is 1.9.5, with md5 c652a785900e4613e32639cfe65e325f and a total number of 945 components indexed
Traceback (most recent call last):
File "/data3/huyan/liheng/tmp/AdaSeq/AdaSeq/test.py", line 4, in <module>
p = pipeline(Tasks.named_entity_recognition, '/data3/huyan/liheng/tmp/AdaSeq/AdaSeq/experiments/eBay/231125005305.028950/output_best')
File "/data3/huyan/liheng/conda/envs/pytorch_d/lib/python3.9/site-packages/modelscope/pipelines/builder.py", line 135, in pipeline
register_plugins_repo(cfg.safe_get('plugins'))
File "/data3/huyan/liheng/conda/envs/pytorch_d/lib/python3.9/site-packages/modelscope/utils/plugins.py", line 446, in register_plugins_repo
module_name, module_version, _ = get_modules_from_package(plugin)
File "/data3/huyan/liheng/conda/envs/pytorch_d/lib/python3.9/site-packages/modelscope/utils/plugins.py", line 736, in get_modules_from_package
data = get(package, tmpdir=tmpdir)
File "/data3/huyan/liheng/conda/envs/pytorch_d/lib/python3.9/site-packages/modelscope/utils/plugins.py", line 688, in get
target, _headers = _download_dist(url, scratch_file, index_url,
File "/data3/huyan/liheng/conda/envs/pytorch_d/lib/python3.9/site-packages/modelscope/utils/plugins.py", line 549, in _download_dist
target, _headers = urlretrieve(url, scratch_file, auth=auth)
File "/data3/huyan/liheng/conda/envs/pytorch_d/lib/python3.9/site-packages/modelscope/utils/plugins.py", line 504, in urlretrieve
res = opener.open(url, data=data)
File "/data3/huyan/liheng/conda/envs/pytorch_d/lib/python3.9/urllib/request.py", line 501, in open
req = Request(fullurl, data)
File "/data3/huyan/liheng/conda/envs/pytorch_d/lib/python3.9/urllib/request.py", line 320, in __init__
self.full_url = url
File "/data3/huyan/liheng/conda/envs/pytorch_d/lib/python3.9/urllib/request.py", line 346, in full_url
self._parse()
File "/data3/huyan/liheng/conda/envs/pytorch_d/lib/python3.9/urllib/request.py", line 375, in _parse
raise ValueError("unknown url type: %r" % self.full_url)
ValueError: unknown url type: 'adaseq-0.6.6-py3-none-any.whl.metadata'

However, I have in fact already installed adaseq == 0.6.6 successfully.
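
The traceback shows pipeline() passing cfg.safe_get('plugins') to register_plugins_repo, which then tries to fetch the package. A small sketch for inspecting what that field contains in the saved model directory may help narrow this down. It assumes the model directory holds a configuration.json file with an optional plugins entry; the helper name is hypothetical, and whether editing that field resolves the error is not confirmed here.

```python
import json
from pathlib import Path


def read_plugins(model_dir, config_name="configuration.json"):
    """Return the `plugins` entries declared in a saved model's config file."""
    config_path = Path(model_dir) / config_name
    with config_path.open(encoding="utf-8") as f:
        cfg = json.load(f)
    plugins = cfg.get("plugins") or []
    # Normalize a single string to a one-element list.
    return list(plugins) if isinstance(plugins, (list, tuple)) else [plugins]
```

Calling read_plugins on the output_best directory used above would show which package names modelscope will try to resolve.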

What have you tried?

I tried installing various versions of adaseq and modelscope.

Code (if necessary)

from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

print(result)

What's your environment?

AdaSeq Version (e.g., 1.0 or master): 0.6.6
ModelScope Version (e.g., 1.0 or master): 1.9.5
PyTorch Version (e.g., 1.12.1): 2.1.1
OS (e.g., Ubuntu 20.04): Ubuntu 20.04
Python version: 3.9
CUDA/cuDNN version: 11.8
GPU models and configuration:
Any other relevant information:

Code of Conduct

I agree to follow this project's Code of Conduct

[Question] Why does model.save_pretrained always raise an error, so that the pred file cannot be generated?

What is your question?

The code was copied from Tianchi, but every time a model finishes training and model.save_pretrained runs, it raises an error, as follows:
~/opt/anaconda3/lib/python3.9/site-packages/modelscope/trainers/hooks/checkpoint_hook.py in copy_files_and_dump_config(trainer, output_dir, config, bin_file)
261 if hasattr(model, 'save_pretrained'):
262 # Save pretrained of model, skip saving checkpoint
--> 263 model.save_pretrained(
264 output_dir,
265 bin_file,

~/opt/anaconda3/lib/python3.9/site-packages/adaseq/models/base.py in save_pretrained(self, target_folder, save_checkpoint_names, save_function, config, save_config_function, with_meta, **kwargs)
161 for field in ['experiment', 'dataset', 'train', 'evaluation']:
162 if field in config:
--> 163 del config[field]
164
165 if (

AttributeError: __delitem__

What have you tried?

No response

Code (if necessary)

from modelscope.utils.config import Config

config = Config.from_string("""
experiment:
  exp_dir: experiments/
  exp_name: transformer_crf
  seed: 42

task: named-entity-recognition

dataset:
  data_file:
    train: /Users/yyy/data/train.conll
    valid: /Users/yyy/data/dev.conll
    test: /Users/yyy/data/test.conll
  data_type: conll

preprocessor:
  type: sequence-labeling-preprocessor
  max_length: 80

data_collator: SequenceLabelingDataCollatorWithPadding

model:
  type: sequence-labeling-model
  embedder:
    model_name_or_path: damo/nlp_raner_named-entity-recognition_chinese-base-news
  dropout: 0.1
  use_crf: true

train:
  max_epochs: 20
  dataloader:
    batch_size_per_gpu: 16
  optimizer:
    type: AdamW
    lr: 5.0e-5
    param_groups:
      - regex: crf
        lr: 5.0e-1
  lr_scheduler:
    type: StepLR
    step_size: 2
    gamma: 0.8
  hooks:
    - type: TensorboardHook

evaluation:
  dataloader:
    batch_size_per_gpu: 128
  metrics:
    - type: ner-metric
    - type: ner-dumper
      model_type: sequence_labeling
      dump_format: conll
""", file_format='.yaml')

# initialize a trainer

import os
from adaseq.commands.train import build_trainer_from_partial_objects

work_dir = 'experiments/transformer_crf'
os.makedirs(work_dir, exist_ok=True)

trainer = build_trainer_from_partial_objects(
    config,
    work_dir=work_dir,
    seed=42,
    device='cuda:0'
)

# do training

trainer.train()

# do testing

trainer.test()

What's your environment?

AdaSeq Version (e.g., 1.0 or master):
ModelScope Version (e.g., 1.0 or master):
PyTorch Version (e.g., 1.12.1):
OS (e.g., Ubuntu 20.04):
Python version:
CUDA/cuDNN version:
GPU models and configuration:
Any other relevant information:

Code of Conduct

I agree to follow this project's Code of Conduct

Dataset format [Question]

What is your question?

I load my data locally in conll format. If my data uses BIO tags but the log shows BIOES, will that have any effect?
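
For reference, BIO and BIOES encode the same spans, so one can be derived from the other without losing information. The sketch below shows the standard BIO to BIOES mapping. Whether AdaSeq performs this conversion internally when it logs BIOES is an assumption not confirmed in this thread, and the function name is illustrative.

```python
from typing import List


def bio_to_bioes(tags: List[str]) -> List[str]:
    """Convert a well-formed BIO tag sequence to BIOES."""
    converted = []
    for i, tag in enumerate(tags):
        if tag == "O":
            converted.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        # The span continues only if the next tag is I- with the same label.
        span_continues = next_tag == f"I-{label}"
        if prefix == "B":
            converted.append(f"B-{label}" if span_continues else f"S-{label}")
        else:  # prefix == "I"
            converted.append(f"I-{label}" if span_continues else f"E-{label}")
    return converted


print(bio_to_bioes(["B-PER", "I-PER", "O", "B-LOC"]))
```

A single-token entity becomes S-, and the last token of a multi-token entity becomes E-; all other tags are unchanged.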

What have you tried?

No response

Code (if necessary)

No response

What's your environment?

AdaSeq Version (e.g., 1.0 or master):
ModelScope Version (e.g., 1.0 or master):
PyTorch Version (e.g., 1.12.1):
OS (e.g., Ubuntu 20.04):
Python version:
CUDA/cuDNN version:
GPU models and configuration:
Any other relevant information:

Code of Conduct

I agree to follow this project's Code of Conduct

[Question] Can not import BERT model because config lacks `Task` field

What is your question?

I need a Chinese BERT-like classifier to split Chinese sentences into separate words on a mobile client. The mobile client allows custom models, but only in the TensorFlow Lite format. It is possible to convert PyTorch models to TensorFlow Lite, but I need to know the input shape of the model.

I wrote a simple Python script to do this, but ran into several issues I have not solved yet.

How can I get the input shape for SequenceLabelingModel? It seems SequenceLabelingModel is used for this babert model.

What have you tried?

Load model as a plain pytorch model and get its parameters:

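import torch

def loadModel():
    # pytorch_model.bin conventionally holds a state dict (name -> tensor),
    # not an nn.Module, so read tensor shapes directly instead of calling .parameters().
    state_dict = torch.load("/home/gelassen/Downloads/chinese_babert-base/pytorch_model.bin", map_location="cpu")
    first_name, first_tensor = next(iter(state_dict.items()))
    # This is the shape of the first stored weight, not the model's input shape.
    print(f"Model shape ({first_name}): {tuple(first_tensor.shape)}")

loadModel()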

Load model via Model.from_config() factory method:

def buildAdaseqModel() -> nn.Module:
    return Model.from_config("/home/gelassen/Downloads/chinese_babert-base/config.json")

Code (if necessary)

No response

What's your environment?

AdaSeq Version (master):
ModelScope Version (master):
PyTorch Version (default one, supplied with adaseq):
OS (e.g., Ubuntu 22.04):
Python version: Python 3.10.12
CUDA/cuDNN version: Driver Version: 470.199.02 CUDA Version: 11.4
GPU models and configuration: NVIDIA GeForce GTX 860M
Any other relevant information:

Code of Conduct

I agree to follow this project's Code of Conduct

[Question] How to configure continue training for a model

What is your question?

I trained a baseline checkpoint for 50 epochs with a LinearLR schedule going from 1.0 to 0.0 over 50 iters. If I try to continue training that checkpoint on the same dataset or a new one, the learning rate is set to zero, and eval accuracy stays the same even though the loss is decreasing. Can you please help me continue training so that I can reuse existing baselines on new datasets?

What have you tried?

I tried training a model with a baseline config and then updating it with a new config to continue training.
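
The zero learning rate is consistent with how a linear schedule behaves once it has run its full length. The sketch below reproduces the closed-form multiplier of PyTorch's LinearLR in plain Python. The base learning rate is a hypothetical value, and it is an assumption that the resumed run restores the scheduler's step counter from the checkpoint.

```python
def linear_lr_factor(step, start_factor=1.0, end_factor=0.0, total_iters=50):
    """Multiplier applied to the base learning rate after `step` scheduler steps."""
    progress = min(step, total_iters) / total_iters
    return start_factor + (end_factor - start_factor) * progress


base_lr = 5.0e-5  # hypothetical base learning rate
for step in (0, 25, 50, 60):
    # From step 50 onward the factor stays at end_factor, here 0.0.
    print(step, base_lr * linear_lr_factor(step))
```

Under that assumption, a run resumed at step 50 starts with a factor equal to end_factor, so continuing would need a fresh scheduler state or a schedule whose end_factor is above zero.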

Code (if necessary)

No response

What's your environment?

AdaSeq Version (e.g., 1.0 or master):
ModelScope Version (e.g., 1.0 or master):
PyTorch Version (e.g., 1.12.1):
OS (e.g., Ubuntu 20.04):
Python version: 3.7.5
CUDA/cuDNN version:
GPU models and configuration:
Any other relevant information:

Code of Conduct

I agree to follow this project's Code of Conduct

[Question]How to solve [datasets.builder.DatasetGenerationError: An error occurred while generating the dataset]

What is your question?

Traceback (most recent call last):
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\datasets\builder.py", line 1618, in _prepare_split_single
writer = writer_class(
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\datasets\arrow_writer.py", line 334, in __init__
self.stream = self._fs.open(fs_token_paths[2][0], "wb")
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\fsspec\spec.py", line 1309, in open
f = self._open(
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\fsspec\implementations\local.py", line 180, in _open
return LocalFileOpener(path, mode, fs=self, **kwargs)
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\fsspec\implementations\local.py", line 298, in __init__
self._open()
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\fsspec\implementations\local.py", line 303, in _open
self.f = open(self.path, mode=self.mode)
FileNotFoundError: [Errno 2] No such file or directory: 'C:/Users/shawn/.cache/huggingface/datasets/named_entity_recognition_dataset_builder/default-c270794ce0d23d06/0.0.0/db737b9bb893f20fb03d04403a30bf7c033256c212b7e9f0ebc6e9c958535c51.incomplete/named_entity_recognition_dataset_builder-train-00000-00000-of-NNNNN.arrow'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "C:\Users\shawn\anaconda3\envs\pytorch\Scripts\adaseq.exe\__main__.py", line 7, in <module>
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\adaseq\main.py", line 13, in run
main(prog='adaseq')
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\adaseq\commands\__init__.py", line 29, in main
args.func(args)
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\adaseq\commands\train.py", line 84, in train_model_from_args
train_model(
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\adaseq\commands\train.py", line 156, in train_model
trainer = build_trainer_from_partial_objects(
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\adaseq\commands\train.py", line 185, in build_trainer_from_partial_objects
dm = DatasetManager.from_config(task=config.task, **config.dataset)
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\adaseq\data\dataset_manager.py", line 182, in from_config
hfdataset = hf_load_dataset(path, name=name, **kwargs)
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\datasets\load.py", line 1797, in load_dataset
builder_instance.download_and_prepare(
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\datasets\builder.py", line 909, in download_and_prepare
self._download_and_prepare(
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\datasets\builder.py", line 1670, in _download_and_prepare
super()._download_and_prepare(
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\datasets\builder.py", line 1004, in _download_and_prepare
self._prepare_split(split_generator, **prepare_split_kwargs)
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\datasets\builder.py", line 1508, in _prepare_split
for job_id, done, content in self._prepare_split_single(
File "C:\Users\shawn\anaconda3\envs\pytorch\lib\site-packages\datasets\builder.py", line 1665, in _prepare_split_single
raise DatasetGenerationError("An error occurred while generating the dataset") from e
datasets.builder.DatasetGenerationError: An error occurred while generating the dataset

What have you tried?

Set an HTTP proxy and successfully connected to YouTube.

Code (if necessary)

No response

What's your environment?

AdaSeq Version (e.g., 1.0 or master):
ModelScope Version (e.g., 1.0 or master):
PyTorch Version (e.g., 1.12.1):
OS (e.g., Ubuntu 20.04):
Python version:
CUDA/cuDNN version:
GPU models and configuration:
Any other relevant information:

Code of Conduct

I agree to follow this project's Code of Conduct
