geekjuruo / lead
A Chinese Spell Checking Model Released on EMNLP2022.
Hi, I want to train this model on multiple GPUs, but I cannot find a configuration option for multi-GPU training.
Apart from this, I found that GPU utilization reaches only 10% on my NVIDIA 3090.
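The paper's ideas on contrastive learning and on incorporating a dictionary are impressive. If I want to understand how the code implements them, which part of the code should I focus on?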
The line “from reader.BasicReaderWithDict import BasicReader” in WordDictReader.py raises an error because the module “reader.BasicReaderWithDict” cannot be found. After I changed it to “from reader import BasicReader”, I got a different error:
  File "/home/nlp/MyProject/LEAD/LEAD-main/LEAD-main/reader/HybridReader.py", line 6, in <module>
    from reader.WordDictReader import WordDictReader
  File "/home/nlp/MyProject/LEAD/LEAD-main/LEAD-main/reader/WordDictReader.py", line 15, in <module>
    class WordDictReader(BasicReader):
TypeError: module() takes at most 2 arguments (3 given)
Why does this happen, and how can I solve it?
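This TypeError typically appears when the name used as a base class is bound to a module rather than to a class. “from reader import BasicReader” most likely imports the module reader/BasicReader.py, not the class defined inside it. The sketch below reproduces the effect with a throwaway module object; all names in it are stand-ins for illustration, not the repository's actual code.

```python
import types

# Stand-in for the module file reader/BasicReader.py (hypothetical contents).
basic_reader_module = types.ModuleType("BasicReader")


class BasicReader:
    """Stand-in for the class defined inside that module."""


basic_reader_module.BasicReader = BasicReader


def try_subclass(base):
    """Try to define a subclass of `base`; return the error message or None."""
    try:
        class WordDictReader(base):
            pass
    except TypeError as exc:
        return str(exc)
    return None


# Inheriting from the module object fails ...
module_error = try_subclass(basic_reader_module)
# ... while inheriting from the class inside the module works.
class_error = try_subclass(basic_reader_module.BasicReader)

print("module as base:", module_error)
print("class as base:", class_error)
```

If the class really lives in reader/BasicReader.py (an assumption about the repository layout), importing it as “from reader.BasicReader import BasicReader” would bind the class instead of the module.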
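In the loss function Lk, s denotes that the s-th character in the mini-batch is erroneous. Is this position s the one annotated in the dataset, or is it determined by some other method?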
The following error occurs at runtime. How should I handle it?
100%|██████████| 284201/284201 [00:01<00:00, 192911.68it/s]
Train Size: 284201, Valid Train Size: 0
Traceback (most recent call last):
  File "/root/.pycharm_helpers/pydev/pydevd.py", line 1496, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/root/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/remote-home/cs_tcci_renjun/RENJUN/LEAD-main/main.py", line 49, in <module>
    pipeline.initialize()
  File "/remote-home/cs_tcci_renjun/RENJUN/LEAD-main/pipeline/BasicPipeline.py", line 107, in initialize
    self.get_loader()
  File "/remote-home/cs_tcci_renjun/RENJUN/LEAD-main/pipeline/BasicPipeline.py", line 69, in get_loader
    self.data_loaders[key] = DataLoader(dataset=value, collate_fn=lambda data: self.processor.process(data, key), shuffle=shuffle, drop_last=False, batch_size=batch_size)
  File "/remote-home/cs_tcci_renjun/envs/rjlead/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 277, in __init__
    sampler = RandomSampler(dataset, generator=generator)  # type: ignore[arg-type]
  File "/remote-home/cs_tcci_renjun/envs/rjlead/lib/python3.9/site-packages/torch/utils/data/sampler.py", line 97, in __init__
    raise ValueError("num_samples should be a positive integer "
ValueError: num_samples should be a positive integer value, but got num_samples=0
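The log line “Train Size: 284201, Valid Train Size: 0” suggests that every sample was discarded before the DataLoader was built, so the sampler received an empty dataset. A minimal sketch of an early check that reports this more directly; the function names, the sample format, and the validity rule are hypothetical and not taken from the repository:

```python
from typing import Callable, Sequence


def filter_and_check(samples: Sequence[dict],
                     is_valid: Callable[[dict], bool],
                     split: str) -> list:
    """Keep valid samples and fail early with a readable message if none remain."""
    kept = [s for s in samples if is_valid(s)]
    print(f"{split} Size: {len(samples)}, Valid {split} Size: {len(kept)}")
    if not kept:
        raise ValueError(
            f"All {len(samples)} '{split}' samples were filtered out; "
            "check the data path, file format, and filtering rule."
        )
    return kept


# Hypothetical rule: a sample is valid when source and target have equal length.
def same_length(sample: dict) -> bool:
    return len(sample["src"]) == len(sample["tgt"])


good = filter_and_check([{"src": "abc", "tgt": "abd"}], same_length, "Train")

try:
    filter_and_check([{"src": "abc", "tgt": "ab"}], same_length, "Train")
    failure = None
except ValueError as exc:
    failure = str(exc)
print(failure)
```

Printing a few of the rejected samples at this point would show which filtering condition they fail.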
Download the glyph-enhanced pretrained model from GCC and put the model files in resources/glyph
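The pytorch_model.bin file downloaded from that URL raises a model parameter mismatch error. After ignoring the error, the correction results contain many unk tokens and the quality is very poor. Is there another download URL that provides replacement model parameters?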
Runtime error:
Traceback (most recent call last):
  File "/home/nlp/MyProject/LEAD/main.py", line 49, in <module>
    pipeline.initialize()
  File "/home/nlp/MyProject/LEAD/pipeline/BasicPipeline.py", line 104, in initialize
    self.init_model()
  File "/home/nlp/MyProject/LEAD/pipeline/MultiModelPipeline.py", line 17, in init_model
    super(MultiModelPipeline, self).init_model()
  File "/home/nlp/MyProject/LEAD/pipeline/BasicPipeline.py", line 84, in init_model
    model = model_class()
  File "/home/nlp/MyProject/LEAD/model/GlyphClassifier.py", line 13, in __init__
    self.bert = GlyphEncoder()
  File "/home/nlp/MyProject/LEAD/model/GlyphEncoder.py", line 16, in __init__
    bert_config = BertConfig.from_pretrained(os.path.join(glyph_path, "config.json"))
  File "/home/nlp/anaconda3/envs/lead/lib/python3.9/site-packages/transformers/configuration_utils.py", line 501, in from_pretrained
    config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/home/nlp/anaconda3/envs/lead/lib/python3.9/site-packages/transformers/configuration_utils.py", line 550, in get_config_dict
    configuration_file = get_configuration_file(
  File "/home/nlp/anaconda3/envs/lead/lib/python3.9/site-packages/transformers/configuration_utils.py", line 841, in get_configuration_file
    all_files = get_list_of_files(
  File "/home/nlp/anaconda3/envs/lead/lib/python3.9/site-packages/transformers/file_utils.py", line 1952, in get_list_of_files
    return list_repo_files(path_or_repo, revision=revision, token=token)
  File "/home/nlp/anaconda3/envs/lead/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 112, in _inner_fn
    validate_repo_id(arg_value)
  File "/home/nlp/anaconda3/envs/lead/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 160, in validate_repo_id
    raise HFValidationError(
huggingface_hub.utils._validators.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': './resources/glyph/config.json'. Use repo_type argument if needed.
It seems the error occurs because there is no ./resources/glyph/config.json file. Which glyph-enhanced pretrained model exactly should be downloaded: the pretrained GCC model from the link, or the GCC model after fine-tuning? And where can I get the required configuration files? Could you provide them?
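The traceback suggests that the local path was passed on to the Hugging Face Hub repo-id validator, which is consistent with the file not existing on disk. A small sketch of a pre-flight check that could be run before starting training; the list of required file names is an assumption based only on the files mentioned in these reports (config.json and pytorch_model.bin), and the helper name is hypothetical:

```python
import os
import tempfile


def missing_model_files(model_dir, required=("config.json", "pytorch_model.bin")):
    """Return the names in `required` that are not regular files in `model_dir`."""
    return [name for name in required
            if not os.path.isfile(os.path.join(model_dir, name))]


# Demonstration on a temporary directory standing in for ./resources/glyph.
with tempfile.TemporaryDirectory() as glyph_dir:
    before = missing_model_files(glyph_dir)
    # Create a placeholder config so the check passes for that file.
    with open(os.path.join(glyph_dir, "config.json"), "w") as fh:
        fh.write("{}")
    after = missing_model_files(glyph_dir)

print("missing before:", before)
print("missing after:", after)
```

Calling missing_model_files("./resources/glyph") from the project root would list which of the assumed files still need to be downloaded.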