ymy-k / dptext-detr

[AAAI'23 Oral] DPText-DETR: Towards Better Scene Text Detection with Dynamic Points in Transformer

License: Other

Python 79.19% C++ 2.04% Cuda 18.76%
detection-transformer scene-text-detection dynamic-point-query

dptext-detr's People

Contributors: ymy-k

dptext-detr's Issues

Training time

Hello, thanks for your work. Could you tell me how long pre-training and fine-tuning take to train, and on what GPU configuration?

Rotated text detection problem

@ymy-k @chaimi2013
How should data be prepared for training? Data example:
x1,y1,x2,y2,x3,y3,x4,y4,text
x1,y1,x2,y2,x3,y3,x4,y4,text
x1,y1,x2,y2,x3,y3,x4,y4,text
Each annotation text file contains bounding box coordinates with corresponding text.
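A minimal parsing sketch for this layout, assuming each line holds eight coordinates followed by the transcription (which may itself contain commas); this is a hypothetical helper, not code from the repo:

def parse_annotation(path):
    # Parse one annotation file into (polygon, text) pairs.
    # Assumes "x1,y1,...,x4,y4,text" per line; the text field may contain commas.
    samples = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.strip().split(",")
            if len(parts) < 9:
                continue  # skip malformed lines
            coords = list(map(float, parts[:8]))
            text = ",".join(parts[8:])
            poly = [(coords[i], coords[i + 1]) for i in range(0, 8, 2)]
            samples.append((poly, text))
    return samples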

How to evaluate custom datasets?

There is no polygon label information in my dataset, only bbox in the form of xyxy. What should I do if I want to evaluate the effect of the trained model on my dataset?
In other words, do I just need to convert my dataset format to the provided zip format? Is there any code that can do this?
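One possible first step (my own sketch, not code shipped with the repo) is to expand each xyxy box into a 4-point polygon; converting the result into the evaluation zip format would still be a separate step:

def bbox_to_polygon(x1, y1, x2, y2):
    # Clockwise 4-point polygon: top-left, top-right, bottom-right, bottom-left.
    return [(x1, y1), (x2, y1), (x2, y2), (x1, y2)]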

Layer dropout in the EFSA

Dear author,

Have you verified the effectiveness of the dropout in the EFSA? Since many previous works report that adding dropout to DETR architectures is unnecessary, why is the situation different here?

Error when building setup.py: ModuleNotFoundError: No module named 'tools.version_utils'

Has anyone run into this error? On Windows 10, I installed detectron2 (tested OK), then ran python setup.py build develop as instructed and got ModuleNotFoundError: No module named 'tools.version_utils'. What causes this, and how can it be fixed?

Downloading https://files.pythonhosted.org/packages/fa/d0/724c8204f87b6f807e3e67de32b8b4922d579154a448ce94e89129064bf1/scipy-1.11.0.tar.gz#sha256=f9b0248cb9d08eead44cde47cbf6339f1e9aa0dfde28f5fb27950743e317bd5d
Best match: scipy 1.11.0
Processing scipy-1.11.0.tar.gz
Writing C:\Users\ADMINI~1\AppData\Local\Temp\easy_install-t3od1aen\scipy-1.11.0\setup.cfg
Running scipy-1.11.0\setup.py -q bdist_egg --dist-dir C:\Users\ADMINI~1\AppData\Local\Temp\easy_install-t3od1aen\scipy-1.11.0\egg-dist-tmp-8kzh7iwu
Traceback (most recent call last):
File "D:\Anaconda3\envs\detr\lib\site-packages\setuptools\sandbox.py", line 156, in save_modules
yield saved
File "D:\Anaconda3\envs\detr\lib\site-packages\setuptools\sandbox.py", line 198, in setup_context
yield
File "D:\Anaconda3\envs\detr\lib\site-packages\setuptools\sandbox.py", line 259, in run_setup
_execfile(setup_script, ns)
File "D:\Anaconda3\envs\detr\lib\site-packages\setuptools\sandbox.py", line 46, in _execfile
exec(code, globals, locals)
File "C:\Users\ADMINI~1\AppData\Local\Temp\easy_install-t3od1aen\scipy-1.11.0\setup.py", line 27, in

ModuleNotFoundError: No module named 'tools.version_utils'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "setup.py", line 65, in
setup(
File "D:\Anaconda3\envs\detr\lib\site-packages\setuptools_init_.py", line 153, in setup
return distutils.core.setup(**attrs)
File "D:\Anaconda3\envs\detr\lib\distutils\core.py", line 148, in setup
dist.run_commands()
File "D:\Anaconda3\envs\detr\lib\distutils\dist.py", line 966, in run_commands
self.run_command(cmd)
File "D:\Anaconda3\envs\detr\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "D:\Anaconda3\envs\detr\lib\site-packages\setuptools\command\develop.py", line 34, in run
self.install_for_development()
File "D:\Anaconda3\envs\detr\lib\site-packages\setuptools\command\develop.py", line 129, in install_for_development
self.process_distribution(None, self.dist, not self.no_deps)
File "D:\Anaconda3\envs\detr\lib\site-packages\setuptools\command\easy_install.py", line 750, in process_distribution
distros = WorkingSet([]).resolve(
File "D:\Anaconda3\envs\detr\lib\site-packages\pkg_resources_init_.py", line 771, in resolve
dist = best[req.key] = env.best_match(
File "D:\Anaconda3\envs\detr\lib\site-packages\pkg_resources_init_.py", line 1056, in best_match
return self.obtain(req, installer)
File "D:\Anaconda3\envs\detr\lib\site-packages\pkg_resources_init_.py", line 1068, in obtain
return installer(requirement)
File "D:\Anaconda3\envs\detr\lib\site-packages\setuptools\command\easy_install.py", line 675, in easy_install
return self.install_item(spec, dist.location, tmpdir, deps)
File "D:\Anaconda3\envs\detr\lib\site-packages\setuptools\command\easy_install.py", line 701, in install_item
dists = self.install_eggs(spec, download, tmpdir)
File "D:\Anaconda3\envs\detr\lib\site-packages\setuptools\command\easy_install.py", line 896, in install_eggs
return self.build_and_install(setup_script, setup_base)
File "D:\Anaconda3\envs\detr\lib\site-packages\setuptools\command\easy_install.py", line 1168, in build_and_install
self.run_setup(setup_script, setup_base, args)
File "D:\Anaconda3\envs\detr\lib\site-packages\setuptools\command\easy_install.py", line 1152, in run_setup
run_setup(setup_script, args)
File "D:\Anaconda3\envs\detr\lib\site-packages\setuptools\sandbox.py", line 262, in run_setup
raise
File "D:\Anaconda3\envs\detr\lib\contextlib.py", line 131, in exit
self.gen.throw(type, value, traceback)
File "D:\Anaconda3\envs\detr\lib\site-packages\setuptools\sandbox.py", line 198, in setup_context
yield
File "D:\Anaconda3\envs\detr\lib\contextlib.py", line 131, in exit
self.gen.throw(type, value, traceback)
File "D:\Anaconda3\envs\detr\lib\site-packages\setuptools\sandbox.py", line 169, in save_modules
saved_exc.resume()
File "D:\Anaconda3\envs\detr\lib\site-packages\setuptools\sandbox.py", line 143, in resume
raise exc.with_traceback(self._tb)
File "D:\Anaconda3\envs\detr\lib\site-packages\setuptools\sandbox.py", line 156, in save_modules
yield saved
File "D:\Anaconda3\envs\detr\lib\site-packages\setuptools\sandbox.py", line 198, in setup_context
yield
File "D:\Anaconda3\envs\detr\lib\site-packages\setuptools\sandbox.py", line 259, in run_setup
_execfile(setup_script, ns)
File "D:\Anaconda3\envs\detr\lib\site-packages\setuptools\sandbox.py", line 46, in _execfile
exec(code, globals, locals)
File "C:\Users\ADMINI~1\AppData\Local\Temp\easy_install-t3od1aen\scipy-1.11.0\setup.py", line 27, in

ModuleNotFoundError: No module named 'tools.version_utils'

How to prepare custom data for training

How should data be prepared for training? Data example:
x1,y1,x2,y2,x3,y3,x4,y4,text
x1,y1,x2,y2,x3,y3,x4,y4,text
x1,y1,x2,y2,x3,y3,x4,y4,text
Each annotation text file contains bounding box coordinates with corresponding text.
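Since the released code asserts num_ctrl_points == 16, a 4-point annotation has to be up-sampled. A hedged sketch (my own helper; the top-side-then-bottom-side point layout is an assumption based on the CTW1500-style JSON):

import numpy as np

def quad_to_16_points(quad):
    # Resample a clockwise 4-point quad (TL, TR, BR, BL) into 16 control
    # points: 8 left-to-right along the top edge, then 8 right-to-left
    # along the bottom edge.
    p0, p1, p2, p3 = [np.asarray(p, dtype=float) for p in quad]
    t = np.linspace(0.0, 1.0, 8)[:, None]
    top = p0 + t * (p1 - p0)
    bottom = p2 + t * (p3 - p2)
    return np.concatenate([top, bottom], axis=0)  # shape (16, 2)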

Nice, but installation is only for the code writer, not for humans

Your work is interesting to maybe 10 people in the world, and you wrote code that is only understandable with hints; in other words, the setup text is just for you.
Download the dataset: where do I put it? Create a new folder in the directory, or look it up somewhere?
The Polygon3 library is dead: error: legacy-install-failure, ERROR: Failed building wheel for Polygon3.
pip install detectron2 is dead too. I tried to clone it instead, but 16 tools are not working properly.
Running gives the error: ./datasets/totaltext/train_poly_ori.json is missing...

I wish to use it, but there is no way to solve your puzzle.

Question about the Positional Label Form in Section 3.2 of the paper

"If the original top side of text instance lies in the bottom position, the starting point is adjusted to the other side." I am not sure how to understand this sentence: if the original top side of a text instance lies at the bottom, the starting point is adjusted to the other side. But how do you decide which side of a text instance is the top and which is the bottom?
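One plausible reading (my own interpretation, not taken from the released code): treat the first half of the control points as the nominal top side, compare its mean y with that of the second half, and reverse the traversal when the nominal top actually lies lower in the image:

import numpy as np

def normalize_start(points):
    # points: (2n, 2) array, first n points = nominal top side.
    # If the nominal top side lies lower (larger mean y in image
    # coordinates), reverse the order: the old bottom side becomes the
    # new top side, and the starting point moves to the other side.
    points = np.asarray(points, dtype=float)
    half = len(points) // 2
    if points[:half, 1].mean() > points[half:, 1].mean():
        points = points[::-1]
    return points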

Problems with Polygon3

Thank you for this project. Unfortunately, installation fails in Conda, PyCharm, and cmd:

error: legacy-install-failure
× Encountered error while trying to install package.
╰─> Polygon3
ERROR: Could not find a version that satisfies the requirement detectron2 (from versions: none)
ERROR: No matching distribution found for detectron2

evaluation on CTW1500

Hi, thanks for the great work.
I have some issues with the evaluation performance on the CTW1500 dataset.
If I evaluate using the code from this repo, the performance matches the reported numbers.
However, when I change the output format to match the CTW1500 official test code, the performance drops a lot.

I added the code below to demo/demo.py:

import numpy as np

def write_txt(pred, fname):
    # Dump each predicted polygon as one comma-separated line.
    polys = pred["instances"].polygons
    txtfname = fname[:-3] + "txt"  # replace the 3-char image extension with "txt"
    with open(txtfname, 'w', encoding='utf-8') as wf:
        for p in polys:
            line = np.round(p.detach().cpu().numpy()).astype(np.int32).tolist()
            wf.write(",".join(map(str, line)) + '\n')

and edited the output section of demo/demo.py as below:

            if args.output:
                if os.path.isdir(args.output):
                    assert os.path.isdir(args.output), args.output
                    out_filename = os.path.join(args.output, os.path.basename(path))
                else:
                    assert len(args.input) == 1, "Please specify a directory with args.output"
                    out_filename = args.output
                
                write_txt(predictions, out_filename) # add this line
                visualized_output.save(out_filename)

The CTW1500 official test code gives
(Prec. / Recall / F1-score: 91.4 / 79.0 / 84.7),
while the reported numbers are
(Prec. / Recall / F1-score: 91.7 / 86.2 / 88.8).
Are there any mistakes in my edits or inference?

License issue

@ymy-k
Is it possible to use this repo for commercial purposes?
When I checked the code inside adet, there seems to be no hard requirement on adet; the library could be replaced directly with detectron2.
Could you please let me know what I can do about using this commercially?

How to convert a custom model to ONNX?

Hi, I trained my own model but cannot convert it to ONNX; the error is shown in the attached screenshot.
[screenshot: 企业微信截图_91785ac4-0f0e-495f-b3eb-2bb4dd25c41c]
I tried rebuilding onnx from source and switching between different torch and framework versions, but nothing solved it. Can this model be converted to ONNX at all?

Installation not working without cuda

Hello,

I am trying to do the installation locally where I don't have a GPU. I downloaded pytorch and torchvision without the cuda extension and the setup.py command is not working.

How can I properly do the installation without CUDA?

Thanks in advance,
Filipe Lauar.

Performance on ICDAR2015

Thanks for sharing this amazing work, which achieves excellent performance on datasets with irregularly shaped text. Have you verified the effectiveness of DPText-DETR on the ICDAR2015 dataset, since it is a common benchmark for scene text detection?

ONNX export

Hey,

nice work! Is it possible to export the network to ONNX?

Thanks and keep up the good work
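I have not verified export for this model; the call pattern itself is standard (shown below on a stand-in module), but the custom deformable-attention CUDA op has no ONNX symbolic out of the box and is the usual blocker:

import torch
import torch.nn as nn

# Stand-in module just to show the export call; exporting DPText-DETR
# itself would additionally require a wrapper with plain tensor I/O and
# an ONNX-compatible replacement for the deformable-attention op.
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU()).eval()
dummy = torch.randn(1, 3, 64, 64)
torch.onnx.export(model, dummy, "toy.onnx", opset_version=16,
                  input_names=["image"], output_names=["features"])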

Cannot find the core code

Hello, thank you for your great work. While reading the source code, I could not find the concrete implementation of the paper's novel contributions:

  1. For the improved composite queries, which .py file contains the code that computes point1 ... pointn in Equation (2) of the paper?
  2. Which .py file contains the local circular convolution in the EFSA? (A generic sketch of the idea follows after this list.)
  3. Which .py file contains the prediction heads in the decoder layer?
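Not the repo's actual layer, just a minimal PyTorch sketch of the idea behind a local circular convolution over control-point queries (d_model and the kernel size are placeholders):

import torch
import torch.nn as nn

# Conv1d with circular padding treats the first and last control points
# as neighbors, i.e. the convolution runs over a closed contour.
d_model, kernel_size = 256, 3
conv = nn.Conv1d(d_model, d_model, kernel_size,
                 padding=kernel_size // 2, padding_mode="circular")

queries = torch.randn(2, 16, d_model)                # (batch, n_points, channels)
out = conv(queries.transpose(1, 2)).transpose(1, 2)  # same shape as queries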

Triton server

Hello,
has anyone built this repo in Triton Inference Server?
I face a CUDA_HOME issue during the build on nvcr.io/nvidia/tritonserver:23.03-py3.
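Not Triton-specific, but a quick check inside the container shows what the extension build will pick up (a sketch; the build needs CUDA_HOME to point at a full toolkit with nvcc and headers, not just the runtime libraries bundled with the PyTorch wheel):

import torch
from torch.utils.cpp_extension import CUDA_HOME

# If CUDA_HOME prints None, install the CUDA toolkit in the image (or
# use a -devel base image) before running setup.py.
print("torch:", torch.__version__, "built for CUDA:", torch.version.cuda)
print("CUDA_HOME:", CUDA_HOME)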

How to train on custom data

I have a COCO-format dataset, but since I only have bounding boxes, I set use_polygon to false for training and got this error: assert self.use_polygon and self.num_ctrl_points == 16 # only the polygon version is released now
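If only boxes are available, one workaround (my own suggestion, not an officially supported path) is to synthesize flat 16-point polygon annotations from the boxes so the released polygon-only code path still applies:

import numpy as np

def bbox_to_ctrl_points(x, y, w, h, n_side=8):
    # Turn a COCO xywh box into 16 control points: n_side points
    # left-to-right along the top edge, n_side right-to-left along the
    # bottom edge, flattened to [x1, y1, x2, y2, ...].
    t = np.linspace(0.0, 1.0, n_side)
    top = np.stack([x + t * w, np.full(n_side, float(y))], axis=1)
    bottom = np.stack([x + (1.0 - t) * w, np.full(n_side, float(y + h))], axis=1)
    return np.concatenate([top, bottom], axis=0).reshape(-1).tolist()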

Cannot run the inference code

Dears,

Thanks for sharing your nice work!

Unfortunately, I cannot run the inference code, as I am facing the following error:
KeyError: 'Non-existent config key: MODEL.TRANSFORMER'

I added this line to solve it:
cfg.set_new_allowed(args.config_file)

Then, I faced this error:
KeyError: "No object named 'TransformerPureDetector' found in 'META_ARCH' registry!"

Your help would be much appreciated!
Thanks in advance!
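If I remember the AdelaiDet-style layout correctly (worth double-checking against this repo's demo/demo.py), both errors suggest the stock detectron2 config and model registry are being used instead of the project's own; a sketch of the intended setup:

# Importing the project's config module pulls in adet, which both adds
# the MODEL.TRANSFORMER keys and registers TransformerPureDetector in
# the META_ARCH registry.
from adet.config import get_cfg

cfg = get_cfg()
cfg.merge_from_file("configs/DPText_DETR/Pretrain/R_50_poly.yaml")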

Evaluation error: Exception: The sample 0003115 not present in GT

Thank you very much for open-sourcing this work. I hit this problem while training on my custom data. Does it mean something is missing from my data labels? Any advice would be greatly appreciated, thank you.
...model training output omitted...
[05/24 12:35:40 d2.utils.events]: eta: 0:59:39 iter: 959 total_loss: 5.663 loss_ce: 0.07613 loss_ctrl_points: 0.7339 loss_ce_0: 0.2235 loss_ctrl_points_0: 0.9569 loss_ce_1: 0.1119 loss_ctrl_points_1: 0.7779 loss_ce_2: 0.08364 loss_ctrl_points_2: 0.7342 loss_ce_3: 0.07504 loss_ctrl_points_3: 0.7327 loss_ce_4: 0.0761 loss_ctrl_points_4: 0.7341 loss_ce_enc: 0.0678 loss_bbox_enc: 0.09829 loss_giou_enc: 0.3514 time: 0.3070 data_time: 0.0014 lr: 2e-05 max_mem: 5302M
[05/24 12:35:47 d2.utils.events]: eta: 0:59:38 iter: 979 total_loss: 6.035 loss_ce: 0.05531 loss_ctrl_points: 0.7331 loss_ce_0: 0.2261 loss_ctrl_points_0: 0.9199 loss_ce_1: 0.1128 loss_ctrl_points_1: 0.7784 loss_ce_2: 0.0791 loss_ctrl_points_2: 0.73 loss_ce_3: 0.06397 loss_ctrl_points_3: 0.7315 loss_ce_4: 0.05708 loss_ctrl_points_4: 0.732 loss_ce_enc: 0.04719 loss_bbox_enc: 0.1012 loss_giou_enc: 0.3392 time: 0.3073 data_time: 0.0015 lr: 2e-05 max_mem: 5302M
[05/24 12:35:53 adet.data.datasets.text]: Loaded 435 images in COCO format from datasets/ctw1500/test_poly.json
[05/24 12:35:53 d2.data.dataset_mapper]: [DatasetMapper] Augmentations used in inference: [ResizeShortestEdge(short_edge_length=(1000, 1000), max_size=1280, sample_style='choice')]
[05/24 12:35:53 d2.data.common]: Serializing 435 elements to byte tensors and concatenating them all ...
[05/24 12:35:53 d2.data.common]: Serialized dataset takes 5.59 MiB
[05/24 12:35:53 d2.evaluation.evaluator]: Start inference on 435 batches
[05/24 12:35:55 d2.evaluation.evaluator]: Inference done 11/435. Dataloading: 0.0004 s/iter. Inference: 0.1324 s/iter. Eval: 0.0002 s/iter. Total: 0.1330 s/iter. ETA=0:00:56
[05/24 12:36:00 d2.evaluation.evaluator]: Inference done 46/435. Dataloading: 0.0008 s/iter. Inference: 0.1438 s/iter. Eval: 0.0002 s/iter. Total: 0.1448 s/iter. ETA=0:00:56
[05/24 12:36:05 d2.evaluation.evaluator]: Inference done 85/435. Dataloading: 0.0008 s/iter. Inference: 0.1369 s/iter. Eval: 0.0002 s/iter. Total: 0.1379 s/iter. ETA=0:00:48
[05/24 12:36:10 d2.evaluation.evaluator]: Inference done 124/435. Dataloading: 0.0008 s/iter. Inference: 0.1342 s/iter. Eval: 0.0008 s/iter. Total: 0.1358 s/iter. ETA=0:00:42
[05/24 12:36:15 d2.evaluation.evaluator]: Inference done 163/435. Dataloading: 0.0008 s/iter. Inference: 0.1325 s/iter. Eval: 0.0007 s/iter. Total: 0.1340 s/iter. ETA=0:00:36
[05/24 12:36:20 d2.evaluation.evaluator]: Inference done 203/435. Dataloading: 0.0008 s/iter. Inference: 0.1314 s/iter. Eval: 0.0006 s/iter. Total: 0.1328 s/iter. ETA=0:00:30
[05/24 12:36:25 d2.evaluation.evaluator]: Inference done 241/435. Dataloading: 0.0008 s/iter. Inference: 0.1316 s/iter. Eval: 0.0005 s/iter. Total: 0.1329 s/iter. ETA=0:00:25
[05/24 12:36:30 d2.evaluation.evaluator]: Inference done 280/435. Dataloading: 0.0008 s/iter. Inference: 0.1310 s/iter. Eval: 0.0007 s/iter. Total: 0.1325 s/iter. ETA=0:00:20
[05/24 12:36:35 d2.evaluation.evaluator]: Inference done 321/435. Dataloading: 0.0008 s/iter. Inference: 0.1298 s/iter. Eval: 0.0007 s/iter. Total: 0.1313 s/iter. ETA=0:00:14
[05/24 12:36:41 d2.evaluation.evaluator]: Inference done 360/435. Dataloading: 0.0008 s/iter. Inference: 0.1299 s/iter. Eval: 0.0006 s/iter. Total: 0.1313 s/iter. ETA=0:00:09
[05/24 12:36:46 d2.evaluation.evaluator]: Inference done 400/435. Dataloading: 0.0008 s/iter. Inference: 0.1295 s/iter. Eval: 0.0006 s/iter. Total: 0.1309 s/iter. ETA=0:00:04
[05/24 12:36:50 d2.evaluation.evaluator]: Total inference time: 0:00:56.166945 (0.130621 s / iter per device, on 1 devices)
[05/24 12:36:50 d2.evaluation.evaluator]: Total inference pure compute time: 0:00:55 (0.129224 s / iter per device, on 1 devices)
[05/24 12:36:50 adet.evaluation.text_evaluation_det]: Saving results to output/r_50_poly/ctw1500/finetune/inference/text_results.json
An invalid detection in temp_det_results/0003001.txt line 3 is removed ...
An invalid detection in temp_det_results/0003001.txt line 20 is removed ...
An invalid detection in temp_det_results/0003355.txt line 14 is removed ...
An invalid detection in temp_det_results/0003355.txt line 16 is removed ...
An invalid detection in temp_det_results/0003355.txt line 21 is removed ...
An invalid detection in temp_det_results/0003355.txt line 38 is removed ...
An invalid detection in temp_det_results/0003356.txt line 11 is removed ...
An invalid detection in temp_det_results/0003356.txt line 19 is removed ...
...error output omitted...
Traceback (most recent call last):
  File "/home/lixinru/devdata/code_env/DPText-DETR/tools/train_net.py", line 291, in <module>
    launch(
  File "/home/lixinru/anaconda3/envs/dpdetr/lib/python3.9/site-packages/detectron2/engine/launch.py", line 82, in launch
    main_func(*args)
  File "/home/lixinru/devdata/code_env/DPText-DETR/tools/train_net.py", line 285, in main
    return trainer.train()
  File "/home/lixinru/devdata/code_env/DPText-DETR/tools/train_net.py", line 103, in train
    self.train_loop(self.start_iter, self.max_iter)
  File "/home/lixinru/devdata/code_env/DPText-DETR/tools/train_net.py", line 93, in train_loop
    self.after_step()
  File "/home/lixinru/anaconda3/envs/dpdetr/lib/python3.9/site-packages/detectron2/engine/train_loop.py", line 180, in after_step
    h.after_step()
  File "/home/lixinru/anaconda3/envs/dpdetr/lib/python3.9/site-packages/detectron2/engine/hooks.py", line 552, in after_step
    self._do_eval()
  File "/home/lixinru/anaconda3/envs/dpdetr/lib/python3.9/site-packages/detectron2/engine/hooks.py", line 525, in _do_eval
    results = self._func()
  File "/home/lixinru/anaconda3/envs/dpdetr/lib/python3.9/site-packages/detectron2/engine/defaults.py", line 453, in test_and_save_results
    self._last_eval_results = self.test(self.cfg, self.model)
  File "/home/lixinru/anaconda3/envs/dpdetr/lib/python3.9/site-packages/detectron2/engine/defaults.py", line 608, in test
    results_i = inference_on_dataset(model, data_loader, evaluator)
  File "/home/lixinru/anaconda3/envs/dpdetr/lib/python3.9/site-packages/detectron2/evaluation/evaluator.py", line 204, in inference_on_dataset
    results = evaluator.evaluate()
  File "/home/lixinru/devdata/code_env/DPText-DETR/adet/evaluation/text_evaluation_det.py", line 219, in evaluate
    text_result = self.evaluate_with_official_code(result_path, self._text_eval_gt_path)
  File "/home/lixinru/devdata/code_env/DPText-DETR/adet/evaluation/text_evaluation_det.py", line 178, in evaluate_with_official_code
    return text_eval_script_det.text_eval_main_det(det_file=result_path, gt_file=gt_path)
  File "/home/lixinru/devdata/code_env/DPText-DETR/adet/evaluation/text_eval_script_det.py", line 318, in text_eval_main_det
    return rrc_evaluation_funcs_det.main_evaluation(None, det_file, gt_file, default_evaluation_params, validate_data,
  File "/home/lixinru/devdata/code_env/DPText-DETR/adet/evaluation/rrc_evaluation_funcs_det.py", line 397, in main_evaluation
    validate_data_fn(p['g'], p['s'], evalParams)
  File "/home/lixinru/devdata/code_env/DPText-DETR/adet/evaluation/text_eval_script_det.py", line 54, in validate_data
    raise Exception("The sample %s not present in GT" %k)
Exception: The sample 0003115 not present in GT
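A quick diagnostic I would run (paths taken from this log; the GT zip entry naming may differ, so treat this as a sketch):

import os
import zipfile

# Compare the sample IDs dumped to temp_det_results/ with the entries
# inside the evaluation GT archive to see which IDs disagree.
det_ids = {os.path.splitext(f)[0] for f in os.listdir("temp_det_results")}
with zipfile.ZipFile("datasets/evaluation/gt_ctw1500.zip") as zf:
    gt_ids = {os.path.splitext(os.path.basename(n))[0] for n in zf.namelist()}
print("in detections but not in GT:", sorted(det_ids - gt_ids)[:10])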

Detailed config of the ablation studies

Hello, could you share the detailed configuration of the ablation experiments?
I have run several experiments using fewer training iterations and less training data, but the model performance cannot reach the numbers reported in the paper.

Character segmentation

Is it possible to separate all characters by changing options with the pre-trained model? I need all characters detected separately from each other instead of one segment per word.

How to train/fine-tune on custom dataset?

I modified the dataset-reading function in text.py to adapt it to the custom dataset and also added the custom dataset's key and path to _PREDEFINED_SPLITS_TEXT in builtin.py, but the following error is still reported:

Traceback (most recent call last):
  File "/home/xxx/anaconda3/envs/deepsolo/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
    fn(i, *args)
  File "/home/xxx/anaconda3/envs/deepsolo/lib/python3.8/site-packages/detectron2/engine/launch.py", line 126, in _distributed_worker
    main_func(*args)
  File "/home/xxx/code/DPText-DETR/tools/train_net.py", line 279, in main
    trainer = Trainer(cfg)
  File "/home/xxx/anaconda3/envs/deepsolo/lib/python3.8/site-packages/detectron2/engine/defaults.py", line 378, in __init__
    data_loader = self.build_train_loader(cfg)
  File "/home/xxx/code/DPText-DETR/tools/train_net.py", line 118, in build_train_loader
    return build_detection_train_loader(cfg, mapper=mapper)
  File "/home/xxx/anaconda3/envs/deepsolo/lib/python3.8/site-packages/detectron2/config/config.py", line 207, in wrapped
    explicit_args = _get_args_from_config(from_config, *args, **kwargs)
  File "/home/xxx/anaconda3/envs/deepsolo/lib/python3.8/site-packages/detectron2/config/config.py", line 245, in _get_args_from_config
    ret = from_config_func(*args, **kwargs)
  File "/home/xxx/anaconda3/envs/deepsolo/lib/python3.8/site-packages/detectron2/data/build.py", line 337, in _train_loader_from_config
    dataset = get_detection_dataset_dicts(
  File "/home/xxx/anaconda3/envs/deepsolo/lib/python3.8/site-packages/detectron2/data/build.py", line 240, in get_detection_dataset_dicts
    dataset_dicts = [DatasetCatalog.get(dataset_name) for dataset_name in names]
  File "/home/xxx/anaconda3/envs/deepsolo/lib/python3.8/site-packages/detectron2/data/build.py", line 240, in <listcomp>
    dataset_dicts = [DatasetCatalog.get(dataset_name) for dataset_name in names]
  File "/home/xxx/anaconda3/envs/deepsolo/lib/python3.8/site-packages/detectron2/data/catalog.py", line 53, in get
    raise KeyError(
KeyError: "Dataset 'xxx' is not registered! Available datasets are: coco_2014_train, coco_2014_val, coco_2014_minival, coco_2014_minival_100, coco_2014_valminusminival, coco_2017_train, coco_2017_val, coco_2017_test, coco_2017_test-dev, coco_2017_val_100, keypoints_coco_2014_train, keypoints_coco_2014_val, keypoints_coco_2014_minival, keypoints_coco_2014_valminusminival, keypoints_coco_2014_minival_100, keypoints_coco_2017_train, keypoints_coco_2017_val, keypoints_coco_2017_val_100, coco_2017_train_panoptic_separated, coco_2017_train_panoptic_stuffonly, coco_2017_train_panoptic, coco_2017_val_panoptic_separated, coco_2017_val_panoptic_stuffonly, coco_2017_val_panoptic, coco_2017_val_100_panoptic_separated, coco_2017_val_100_panoptic_stuffonly, coco_2017_val_100_panoptic, lvis_v1_train, lvis_v1_val, lvis_v1_test_dev, lvis_v1_test_challenge, lvis_v0.5_train, lvis_v0.5_val, lvis_v0.5_val_rand_100, lvis_v0.5_test, lvis_v0.5_train_cocofied, lvis_v0.5_val_cocofied, cityscapes_fine_instance_seg_train, cityscapes_fine_sem_seg_train, cityscapes_fine_instance_seg_val, cityscapes_fine_sem_seg_val, cityscapes_fine_instance_seg_test, cityscapes_fine_sem_seg_test, cityscapes_fine_panoptic_train, cityscapes_fine_panoptic_val, voc_2007_trainval, voc_2007_train, voc_2007_val, voc_2007_test, voc_2012_trainval, voc_2012_train, voc_2012_val, ade20k_sem_seg_train, ade20k_sem_seg_val, pic_person_train, pic_person_val"

Following the documented steps, I hit a misaligned address error at the training step

Hi, I launched training with python tools/train_net.py --config-file configs/DPText_DETR/Pretrain/R_50_poly.yaml --num-gpus 2, and the full error output is as follows:

terminate called after throwing an instance of 'c10::CUDAError'                                                                                                                                   
  what():  CUDA error: misaligned address                                                                                                                                                         
Exception raised from create_event_internal at ../c10/cuda/CUDACachingAllocator.cpp:1055 (most recent call first):                                                                                
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7fb64ea53a22 in /home/cunjian/anaconda3/envs/DPText-DETR/lib/python3.8/site-packages/torch/lib/libc10.so)                 
frame #1: <unknown function> + 0x10aa3 (0x7fb64ee10aa3 in /home/cunjian/anaconda3/envs/DPText-DETR/lib/python3.8/site-packages/torch/lib/libc10_cuda.so)                                          
frame #2: c10::cuda::CUDACachingAllocator::raw_delete(void*) + 0x1a7 (0x7fb64ee12147 in /home/cunjian/anaconda3/envs/DPText-DETR/lib/python3.8/site-packages/torch/lib/libc10_cuda.so)            
frame #3: c10::TensorImpl::release_resources() + 0x54 (0x7fb64ea3d5a4 in /home/cunjian/anaconda3/envs/DPText-DETR/lib/python3.8/site-packages/torch/lib/libc10.so)                                
frame #4: std::vector<c10d::Reducer::Bucket, std::allocator<c10d::Reducer::Bucket> >::~vector() + 0x2f9 (0x7fb6f47612e9 in /home/cunjian/anaconda3/envs/DPText-DETR/lib/python3.8/site-packages/to
rch/lib/libtorch_python.so)                                                                                                                                                                       
frame #5: c10d::Reducer::~Reducer() + 0x276 (0x7fb6f4757d16 in /home/cunjian/anaconda3/envs/DPText-DETR/lib/python3.8/site-packages/torch/lib/libtorch_python.so)                                 
frame #6: std::_Sp_counted_ptr<c10d::Reducer*, (__gnu_cxx::_Lock_policy)2>::_M_dispose() + 0x12 (0x7fb6f4786e32 in /home/cunjian/anaconda3/envs/DPText-DETR/lib/python3.8/site-packages/torch/lib/
libtorch_python.so)                                                                                                                                                                               
frame #7: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 0x46 (0x7fb6f3ef70f6 in /home/cunjian/anaconda3/envs/DPText-DETR/lib/python3.8/site-packages/torch/lib/libtorch_python
.so)                                                                                                                                                                                              
frame #8: std::_Sp_counted_ptr<c10d::Logger*, (__gnu_cxx::_Lock_policy)2>::_M_dispose() + 0x1d (0x7fb6f478b47d in /home/cunjian/anaconda3/envs/DPText-DETR/lib/python3.8/site-packages/torch/lib/l
ibtorch_python.so)                                                                                                                                                                                
frame #9: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 0x46 (0x7fb6f3ef70f6 in /home/cunjian/anaconda3/envs/DPText-DETR/lib/python3.8/site-packages/torch/lib/libtorch_python
.so)                                                                                                                                                                                              
frame #10: <unknown function> + 0xd891ef (0x7fb6f47891ef in /home/cunjian/anaconda3/envs/DPText-DETR/lib/python3.8/site-packages/torch/lib/libtorch_python.so)                                    
frame #11: <unknown function> + 0x4ff8d0 (0x7fb6f3eff8d0 in /home/cunjian/anaconda3/envs/DPText-DETR/lib/python3.8/site-packages/torch/lib/libtorch_python.so)                                    
frame #12: <unknown function> + 0x500b3e (0x7fb6f3f00b3e in /home/cunjian/anaconda3/envs/DPText-DETR/lib/python3.8/site-packages/torch/lib/libtorch_python.so)                                    
frame #13: /home/cunjian/anaconda3/envs/DPText-DETR/bin/python() [0x4d3abe]                                                                                                                       
frame #14: /home/cunjian/anaconda3/envs/DPText-DETR/bin/python() [0x4f9606]                                                                                                                       
frame #15: /home/cunjian/anaconda3/envs/DPText-DETR/bin/python() [0x4d3abe]                                                                                                                       
frame #16: /home/cunjian/anaconda3/envs/DPText-DETR/bin/python() [0x4f9606]                                                                                                                       
frame #17: /home/cunjian/anaconda3/envs/DPText-DETR/bin/python() [0x4d3abe]                                                                                                                       
frame #18: /home/cunjian/anaconda3/envs/DPText-DETR/bin/python() [0x5a726b]                                                                                                                       

Traceback (most recent call last):                                                               
  File "tools/train_net.py", line 291, in <module>                                               
    launch(                                     
  File "/home/cunjian/anaconda3/envs/DPText-DETR/lib/python3.8/site-packages/detectron2/engine/launch.py", line 67, in launch
    mp.spawn(                                   
  File "/home/cunjian/anaconda3/envs/DPText-DETR/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')                                                                                                                  
  File "/home/cunjian/anaconda3/envs/DPText-DETR/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
    while not context.join():                                                                    
  File "/home/cunjian/anaconda3/envs/DPText-DETR/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 150, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)                                                                                                                            
torch.multiprocessing.spawn.ProcessRaisedException:                                              

-- Process 0 terminated with the following error:                                                
Traceback (most recent call last):                                                               
  File "/home/cunjian/anaconda3/envs/DPText-DETR/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
    fn(i, *args)                                
  File "/home/cunjian/anaconda3/envs/DPText-DETR/lib/python3.8/site-packages/detectron2/engine/launch.py", line 126, in _distributed_worker
    main_func(*args)                                                     
  File "/home/cunjian/anaconda3/envs/DPText-DETR/lib/python3.8/site-packages/detectron2/engine/defaults.py", line 494, in run_step
    self._trainer.run_step()                    
  File "/home/cunjian/anaconda3/envs/DPText-DETR/lib/python3.8/site-packages/detectron2/engine/train_loop.py", line 285, in run_step
    losses.backward()                           
  File "/home/cunjian/anaconda3/envs/DPText-DETR/lib/python3.8/site-packages/torch/_tensor.py", line 255, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)                                                                                                            
  File "/home/cunjian/anaconda3/envs/DPText-DETR/lib/python3.8/site-packages/torch/autograd/__init__.py", line 147, in backward
    Variable._execution_engine.run_backward(                                                     
RuntimeError: cuDNN error: CUDNN_STATUS_MAPPING_ERROR

My PyTorch version matches the requirement in your README. My cudatoolkit version (i.e., nvcc -V) is 11.6, whereas yours is 11.1; I've read that CUDA releases within the same major version are compatible. Do I need to switch to 11.1? Or could you offer some debugging advice based on the error message? Many thanks!

How to get polygon coordinates of segmentation masks

Hello! Thank you for your work! I'm trying to get coordinates of polygons that represent segmentation masks in .txt file after executing demo/demo.py, I want to use them to cut the input image into a bunch of small images that consist of detected text areas. Is there a way to do it?
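A possible starting point, reusing the polygons field shown in the CTW1500 evaluation issue above (the field name and shapes are assumptions to verify against your checkpoint's output):

import cv2
import numpy as np

def crop_polygons(image, predictions, out_prefix):
    # Write each polygon as one comma-separated line and crop its
    # axis-aligned bounding rectangle out of the input image.
    polys = predictions["instances"].polygons
    with open(out_prefix + ".txt", "w") as f:
        for i, p in enumerate(polys):
            pts = p.detach().cpu().numpy().reshape(-1, 2).astype(np.int32)
            f.write(",".join(map(str, pts.reshape(-1).tolist())) + "\n")
            x, y, w, h = cv2.boundingRect(pts)
            cv2.imwrite(f"{out_prefix}_{i}.jpg", image[y:y + h, x:x + w])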

Format conversion between training and evaluation for a self-made dataset

Hello, and many thanks for open-sourcing this work! I have a small question.
I have a batch of domain-specific data that I would like to fine-tune your model on. My annotations are all four-point quads (irregular 4-point polygons; the starting point of each text instance is not fixed, and both clockwise and counter-clockwise orders occur), i.e. [x1,y1,x2,y2,x3,y3,x4,y4]. By uniformly sampling points I can generate the 16-point clockwise format used in datasets/ctw1500/test_poly.json, but I noticed that the final metrics are computed against the 14-point counter-clockwise format in datasets/evaluation/gt_ctw1500.zip. Is the gt_ctw1500.zip format mandatory, or is any polygon annotation with a fixed, consistent direction acceptable? For example, is it OK if in one image some text instances are annotated clockwise starting from the top-left corner while others are annotated counter-clockwise starting from the bottom-right corner?
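On the orientation question, a normalization sketch I would apply before converting annotations (my own helper; the sign test assumes image coordinates with y pointing down, where a positive shoelace sum means clockwise):

import numpy as np

def make_clockwise(points):
    # Shoelace signed area; reverse the point order when the polygon is
    # counter-clockwise so every annotation ends up clockwise.
    pts = np.asarray(points, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    area = np.sum(x * np.roll(y, -1) - np.roll(x, -1) * y)
    return pts if area > 0 else pts[::-1]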

About class_loss and the matcher

Hi, while reading the source code I noticed that the one-hot encoding in the loss_labels function of your losses.py seems problematic. sigmoid_focal_loss requires target and input to have the same shape, with value 1 meaning text and 0 meaning background. The input shape in your code should be [bs, num_queries, num_pts, 1], but following your code that builds the one-hot GT, the generated tensor has shape [bs, num_queries, num_pts, 1] with all values equal to 0, since the source initializes the target_classes matrix with num_classes, which is 1.
I have the same doubt about the classification cost matrix in the matcher: in the BoxHungarianMatcher class, cost_class = pos_cost_class[:, tgt_ids] - neg_cost_class[:, tgt_ids], where pos_cost_class.shape = [bs*num_queries, 1]; since the targets are all text, tgt_ids should all be 1, which even produces an out-of-bounds index here. Perhaps I have misunderstood something; I would really appreciate a clarification. Thank you very much.
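For reference, the usual Deformable-DETR-style construction (which I would expect, but have not re-checked, this code to follow) builds the one-hot target with num_classes + 1 channels and slices off the last "background" channel, so background rows come out all-zero rather than out of range:

import torch

num_classes = 1
target_classes = torch.full((2, 100), num_classes)  # init: everything background
target_classes[0, :3] = 0                           # mark a few queries as text
onehot = torch.zeros(2, 100, num_classes + 1)
onehot.scatter_(2, target_classes.unsqueeze(-1), 1)
onehot = onehot[..., :-1]  # (2, 100, 1): text rows are 1, background rows all-zero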

Setup error

Hello, and thank you for this work. I hit a compilation problem while setting up the environment and have tried several fixes without success; I am not sure what is misconfigured:
/home/pr-316/anaconda3/envs/DPText-DETR/lib/python3.8/site-packages/torch/include/ATen/cuda/CUDAContext.h:5:10: fatal error: cuda_runtime_api.h: No such file or directory
