
tgrnet's Introduction

TGRNet: A Table Graph Reconstruction Network for Table Structure Recognition

Xue, Wenyuan, et al. "TGRNet: A Table Graph Reconstruction Network for Table Structure Recognition." arXiv preprint arXiv:2106.10598 (2021).

This work has been accepted for presentation at ICCV 2021. A preprint is available on arXiv (https://arxiv.org/abs/2106.10598).

Abstract

A table arranging data in rows and columns is a very effective data structure, which has been widely used in business and scientific research. Considering large-scale tabular data in online and offline documents, automatic table recognition has attracted increasing attention from the document analysis community. Though humans can easily understand the structure of tables, it remains a challenge for machines to do so, especially due to a variety of different table layouts and styles. Existing methods usually model a table as either the markup sequence or the adjacency matrix between different table cells, failing to address the importance of the logical location of table cells, e.g., a cell is located in the first row and the second column of the table. In this paper, we reformulate the problem of table structure recognition as the table graph reconstruction, and propose an end-to-end trainable table graph reconstruction network (TGRNet) for table structure recognition. Specifically, the proposed method has two main branches, a cell detection branch and a cell logical location branch, to jointly predict the spatial location and the logical location of different cells. Experimental results on three popular table recognition datasets and a new dataset with table graph annotations (TableGraph-350K) demonstrate the effectiveness of the proposed TGRNet for table structure recognition.

Getting Started

Requirements

Create the environment from the environment.yml file:

conda env create --file environment.yml

Alternatively, install the required packages in your own environment independently. If you run into problems installing PyTorch Geometric, follow the official installation instructions (https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html).

dependencies:
  - python==3.7.0
  - pip==20.2.4
  - pip:
    - dominate==2.5.1
    - imageio==2.8.0
    - networkx==2.3
    - numpy==1.18.2
    - opencv-python==4.4.0.46
    - pandas==1.0.3
    - pillow==7.1.1
    - torchfile==0.1.0
    - tqdm==4.45.0
    - visdom==0.1.8.9
    - Polygon3==3.0.8

PyTorch Installation

# CUDA 10.2
pip install torch==1.5.0 torchvision==0.6.0
# CUDA 10.1
pip install torch==1.5.0+cu101 torchvision==0.6.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
# CUDA 9.2
pip install torch==1.5.0+cu92 torchvision==0.6.0+cu92 -f https://download.pytorch.org/whl/torch_stable.html

PyTorch Geometric Installation

pip install torch-scatter==2.0.4 -f https://pytorch-geometric.com/whl/torch-1.5.0+${CUDA}.html
pip install torch-sparse==0.6.3 -f https://pytorch-geometric.com/whl/torch-1.5.0+${CUDA}.html
pip install torch-cluster==1.5.4 -f https://pytorch-geometric.com/whl/torch-1.5.0+${CUDA}.html
pip install torch-spline-conv==1.2.0 -f https://pytorch-geometric.com/whl/torch-1.5.0+${CUDA}.html
pip install torch-geometric

where ${CUDA} should be replaced by your specific CUDA version (cu92, cu101, cu102).
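
As a minimal sketch of the ${CUDA} substitution described above, the following Python helper prints the PyTorch Geometric install commands for a given CUDA tag. The function names `pyg_wheel_index` and `pyg_install_commands` are hypothetical and not part of this repository; the package versions and URL pattern are copied from the commands above.

```python
# Hypothetical helper (not part of TGRNet): expands ${CUDA} in the commands above.
TORCH_VERSION = "1.5.0"
SUPPORTED_CUDA_TAGS = ("cu92", "cu101", "cu102")

# Pinned versions copied from the install commands in this section.
PINNED_PACKAGES = (
    ("torch-scatter", "2.0.4"),
    ("torch-sparse", "0.6.3"),
    ("torch-cluster", "1.5.4"),
    ("torch-spline-conv", "1.2.0"),
)


def pyg_wheel_index(cuda_tag, torch_version=TORCH_VERSION):
    """Return the wheel index URL for one CUDA tag, e.g. 'cu101'."""
    tag = cuda_tag.lower()
    if tag not in SUPPORTED_CUDA_TAGS:
        raise ValueError(f"unsupported CUDA tag {cuda_tag!r}; expected one of {SUPPORTED_CUDA_TAGS}")
    return f"https://pytorch-geometric.com/whl/torch-{torch_version}+{tag}.html"


def pyg_install_commands(cuda_tag):
    """Return the five pip commands with ${CUDA} replaced by cuda_tag."""
    index = pyg_wheel_index(cuda_tag)
    commands = [f"pip install {name}=={version} -f {index}" for name, version in PINNED_PACKAGES]
    commands.append("pip install torch-geometric")
    return commands


if __name__ == "__main__":
    for command in pyg_install_commands("cu101"):
        print(command)
```

The commands are only printed, not executed, so they can be reviewed before being pasted into a shell.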

Datasets Preparation

cd ./datasets
tar -zxvf datasets.tar.gz
## The './datasets/' folder should look like:
- datasets/
  - cmdd/
  - icdar13table/
  - icdar19_ctdar/
  - tablegraph24k/
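
A quick way to confirm the extraction produced the layout above is to check for the four subfolders. This is a sketch only; `missing_subfolders` is a hypothetical helper, and it checks folder names, not folder contents.

```python
from pathlib import Path

# Folder names copied from the expected layout above.
EXPECTED_DATASETS = ("cmdd", "icdar13table", "icdar19_ctdar", "tablegraph24k")


def missing_subfolders(root, expected=EXPECTED_DATASETS):
    """Return the expected subfolder names that do not exist under root."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_subfolders("./datasets")
    if missing:
        print("missing dataset folders:", ", ".join(missing))
    else:
        print("all dataset folders found")
```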

Pretrained Models Preparation

IMPORTANT: According to user feedback (and my own tests), the pretrained models may not work in some environments. The following environment has been tested and works as expected:

  - CUDA 9.2
  - torch 1.7.0 + torchvision 0.8.0
  - torch-cluster 1.5.9
  - torch-geometric 1.6.3
  - torch-scatter 2.0.6
  - torch-sparse 0.6.9
  - torch-spline-conv 1.2.1
  • Download the pretrained models from Google Drive or Alibaba Cloud.
  • Put checkpoints.tar.gz in "./checkpoints/" and extract it.
cd ./checkpoints
tar -zxvf checkpoints.tar.gz
## The './checkpoints/' folder should look like:
- checkpoints/
  - cmdd_overall/
  - icdar13table_overall/
  - icdar19_lloc/
  - tablegraph24k_overall/
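
Because the pretrained models are reported to depend on the environment, it may help to compare the installed package versions with the tested ones before running the tests. The sketch below assumes the distribution names on PyPI match the names in the tested-environment list; `version_mismatches` and `installed_versions` are hypothetical helpers, and the CUDA version itself is not checked. On Python 3.7, `installed_versions` needs the `importlib-metadata` backport.

```python
# Versions copied from the tested environment listed in this section.
TESTED_ENV = {
    "torch": "1.7.0",
    "torchvision": "0.8.0",
    "torch-cluster": "1.5.9",
    "torch-geometric": "1.6.3",
    "torch-scatter": "2.0.6",
    "torch-sparse": "0.6.9",
    "torch-spline-conv": "1.2.1",
}


def base_version(version):
    """Drop a local suffix such as '+cu92', so '1.7.0+cu92' compares equal to '1.7.0'."""
    return version.split("+", 1)[0]


def version_mismatches(installed, tested=TESTED_ENV):
    """Return {package: (installed_version_or_None, tested_version)} for every difference."""
    mismatches = {}
    for name, wanted in tested.items():
        have = installed.get(name)
        if have is None or base_version(have) != wanted:
            mismatches[name] = (have, wanted)
    return mismatches


def installed_versions(names):
    """Look up installed versions; packages that are not installed are omitted."""
    try:
        from importlib import metadata  # standard library on Python 3.8+
    except ImportError:
        import importlib_metadata as metadata  # backport package for Python 3.7
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass
    return found


if __name__ == "__main__":
    for name, (have, wanted) in version_mismatches(installed_versions(TESTED_ENV)).items():
        print(f"{name}: installed {have}, tested {wanted}")
```

An empty result only means the package versions match; it does not guarantee that the checkpoints load correctly.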

Test

Test scripts are provided for each dataset; run the one you need.

- test_cmdd.sh
- test_icdar13table.sh
- test_tablegraph-24k.sh
- test_icdar19ctdar.sh

Train

Todo

tgrnet's People

Contributors

xuewenyuan


tgrnet's Issues

Getting 'KeyError' while loading weights

Hi Team,
I was running the tests on cmdd dataset using test_cmdd.sh
I am able to create the dataset correctly.
dataset [TbRecCMDDDataset] was created
The number of test images = 104. Testset: ./datasets/cmdd
/home/ashishk/.pyenv/versions/3.7.0/envs/tgr/lib/python3.7/site-packages/torchvision/models/_utils.py:209: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and will be removed in 0.15, please use 'weights' instead.
  f"The parameter '{pretrained_param}' is deprecated since 0.13 and will be removed in 0.15, "
/home/ashishk/.pyenv/versions/3.7.0/envs/tgr/lib/python3.7/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and will be removed in 0.15. The current behavior is equivalent to passing weights=None.
  warnings.warn(msg)
initialize network with normal
initialize network with normal
model [TbRecModel] was created
Load SUfficx best
loading the model from ./checkpoints/cmdd_overall/best_net_Backbone.pth

However, when loading the pretrained weights for the cell logical location prediction model, I get the following error:

Traceback (most recent call last):
  File "test.py", line 32, in <module>
    model.setup(opt) # regular setup: load and print networks; create schedulers
  File "/home/ashishk/research/TGRNet/models/base_model.py", line 92, in setup
    self.load_networks(load_suffix, rm_layers)
  File "/home/ashishk/research/TGRNet/models/base_model.py", line 219, in load_networks
    net.load_state_dict(state_dict,strict=False)
  File "/home/ashishk/.pyenv/versions/3.7.0/envs/tgr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1590, in load_state_dict
    load(self)
  File "/home/ashishk/.pyenv/versions/3.7.0/envs/tgr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1578, in load
    load(child, prefix + name + '.')
  File "/home/ashishk/.pyenv/versions/3.7.0/envs/tgr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1578, in load
    load(child, prefix + name + '.')
  File "/home/ashishk/.pyenv/versions/3.7.0/envs/tgr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1575, in load
    state_dict, prefix, local_metadata, True, missing_keys, unexpected_keys, error_msgs)
  File "/home/ashishk/.pyenv/versions/3.7.0/envs/tgr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1474, in _load_from_state_dict
    hook(state_dict, prefix, local_metadata, strict, missing_keys, unexpected_keys, error_msgs)
  File "/home/ashishk/.pyenv/versions/3.7.0/envs/tgr/lib/python3.7/site-packages/torch_geometric/nn/dense/linear.py", line 140, in _lazy_load_hook
    weight = state_dict[prefix + 'weight']
KeyError: 'gconv_row.lin.weight'

Could you please provide any suggestions here?
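
The traceback ends with a lookup of 'gconv_row.lin.weight' that the checkpoint does not contain. One way to see how the checkpoint and the current model disagree is to compare their parameter names. The sketch below works on plain lists of names; `compare_keys` and `same_layer_candidates` are hypothetical helpers. With PyTorch the names would come from `net.state_dict().keys()` and, assuming the .pth file stores a plain state dict, from `torch.load(path, map_location="cpu").keys()`. The key names in the example are illustrative, not read from the real checkpoint.

```python
def compare_keys(model_keys, checkpoint_keys):
    """Return the parameter names present on one side only."""
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    return {
        "missing_in_checkpoint": sorted(model_keys - checkpoint_keys),
        "unexpected_in_checkpoint": sorted(checkpoint_keys - model_keys),
    }


def same_layer_candidates(missing_key, checkpoint_keys):
    """Return checkpoint keys that share the first name component (e.g. 'gconv_row')."""
    layer = missing_key.split(".", 1)[0]
    return sorted(key for key in checkpoint_keys if key.split(".", 1)[0] == layer)


if __name__ == "__main__":
    # Illustrative names only; the real checkpoint may use different keys.
    model_keys = ["gconv_row.lin.weight", "gconv_row.bias"]
    checkpoint_keys = ["gconv_row.weight", "gconv_row.bias"]
    report = compare_keys(model_keys, checkpoint_keys)
    print(report)
    for key in report["missing_in_checkpoint"]:
        print(key, "->", same_layer_candidates(key, checkpoint_keys))
```

If the names differ in a systematic way, that points to a library version mismatch rather than a corrupted file, which is consistent with the tested environment given in the README.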

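After matching the environment, all Lloc results are still close to 0

@EmperorKaiser Thanks.

@hsdjkfnsfc Sorry for the late reply. I reproduced your problem under CUDA 10.2, but after switching to CUDA 9.2 the results were normal. So the cause is most likely software differences between CUDA versions, especially in torch geometric.
My test environment: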

  • cuda 9.2
  • torch 1.7.0+torchvision 0.8.0
  • torch-cluster 1.5.9
  • torch-geometric 1.6.3
  • torch-scatter 2.0.6
  • torch-sparse 0.6.9
  • torch-spline-conv 1.2.1

Originally posted by @xuewenyuan in #2 (comment)

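Training process

During training I get an error about the dimension being [-1,0], got 1, so the dimensions do not match. Is this because the parameters in the best .pth file are different?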

graph_edge

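Hello, after extracting the datasets, all three txt files contain paths like this:
/data/xuewenyuan/data/icdar19_ctdar/graph_edge/cTDaR_t00496_0_edge.csv
However, I could not find the graph_edge folder.
How can I get this folder? Do I need to generate it myself?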

Testing model on ICDAR 2013 and 2019

Hi,
I am trying to run your model on the ICDAR 2013 and 2019 datasets but I was getting the error below:

[Screenshot: Screen Shot 1400-12-17 at 13 35 38]

It seems the paths in the pickle files are set to your local directory. Could you please help?

Thanks

TableGraph-350K dataset

Hi Yuan,
Where can I get the TableGraph-350K dataset? The README only provides the 24K dataset.

Inference

Is there a script to run inference with the checkpoints? I do not want to test the algorithm on a whole dataset, only to run inference.

The paths are not suitable.

Hi guys,
When I tried to test on icdar13table, I followed the guide in README.md. However, I ran into problems when running test_icdar13table.sh.

  • In the files test.txt, train.txt, and val.txt in the icdar13table folder, the paths to the pkl files do not match the directory layout described in the README, for example: /data/xuewenyuan/data/icdar13table/gt/eu-012_3_1_0.pkl. I think it should be './datasets/icdar13table/gt/eu-012_3_1_0.pkl'. Fortunately, this is easy to fix with a find-and-replace.
  • The other problem appears in the following stack trace. I believe the paths inside the pkl files are wrong.
Traceback (most recent call last):
  File "test.py", line 27, in <module>
    evaluator = IC15Evaluator(opt)
  File "/home/ubuntu/trihuynh/TGRNet/util/evaluator_vis.py", line 19, in __init__
    self.gt_dict, self.scale_dict, self.gt_box_sum, self.masks = self.create_gt_dict(self.data_list)
  File "/home/ubuntu/trihuynh/TGRNet/util/evaluator_vis.py", line 141, in create_gt_dict
    table_img  = Image.open(table_anno['image_path']).convert("RGB")
  File "/home/ubuntu/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages/PIL/Image.py", line 2953, in open
    fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: '/data/xuewenyuan/data/icdar13table/image/eu-012_3_1_0.jpg'

These problems could be handled by restructuring the datasets. I also suggest updating the instructions in the README or providing new pkl files 😃.
I did not try the other datasets, so I do not know whether they have the same problems.
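
The find-and-replace described in this issue can be sketched in Python as follows. This is an illustration under assumptions, not a fix shipped with the repository: the helper names are hypothetical, the old prefix is taken from the paths quoted above, and the pkl files are assumed to hold plain dicts, lists, and tuples with string paths such as 'image_path' (the key seen in the traceback). Back up the files first, since both functions overwrite them in place.

```python
import pickle
from pathlib import Path

OLD_PREFIX = "/data/xuewenyuan/data"  # prefix quoted in the issue
NEW_PREFIX = "./datasets"             # layout described in the README


def rewrite_list_file(list_path, old=OLD_PREFIX, new=NEW_PREFIX):
    """Replace the old prefix in a text list file; return the number of replacements."""
    path = Path(list_path)
    text = path.read_text()
    count = text.count(old)
    if count:
        path.write_text(text.replace(old, new))
    return count


def rewrite_strings(obj, old, new):
    """Recursively replace old with new in strings inside plain dicts, lists, and tuples."""
    if isinstance(obj, str):
        return obj.replace(old, new)
    if type(obj) is dict:
        return {key: rewrite_strings(value, old, new) for key, value in obj.items()}
    if type(obj) is list:
        return [rewrite_strings(value, old, new) for value in obj]
    if type(obj) is tuple:
        return tuple(rewrite_strings(value, old, new) for value in obj)
    return obj  # other types (numbers, arrays, custom objects) are left untouched


def rewrite_pickle(pkl_path, old=OLD_PREFIX, new=NEW_PREFIX):
    """Load a pickle, rewrite every string path in it, and save it back in place."""
    path = Path(pkl_path)
    with path.open("rb") as handle:
        annotation = pickle.load(handle)
    with path.open("wb") as handle:
        pickle.dump(rewrite_strings(annotation, old, new), handle)
```

For example, `rewrite_list_file("./datasets/icdar13table/test.txt")` followed by `rewrite_pickle(p)` for each `p` in `Path("./datasets/icdar13table/gt").glob("*.pkl")` would cover the two kinds of files mentioned above.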

model

Please kindly provide the training process of the model. Thank you very much.

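How to run inference on a single image?

Hi, Yuan

Thank you for your impressive work! I would like to know whether it is possible to run inference on a single image. Could you tell me how to prepare the input and how to run the script?

Thank you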

Dataset and graph_edge

Hello, the test.txt, train.txt, and val.txt files in the provided dataset labels all mention graph_edge, but where is it?

How to run inference on a single image?

Hi, Yuan

Thank you for your impressive work! I am wondering if I can run inference on a single image. Could you tell me how to prepare the input and how to run the script?

Thank you

How to generate seg_label file?

I want to use my dataset to train TGRNet model. My dataset contains spatial and logical location, so I want to know how to generate seg_label file.

no prt_net_Backbone.pth when training ICDAR2013 dataset from scratch

Hi, I am trying to train this model on the ICDAR2013 dataset using train_icdar13table.sh. It seems that the model is loaded from prt_net_Backbone.pth, but no such .pth file is available for download. Could you share some details about the model in that file, or provide the prt_net_Backbone file itself? Thank you so much.

Model loading error during testing

RuntimeError: Error(s) in loading state_dict for Cell_Lloc_Pre:
size mismatch for gconv_row.lin.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([512, 768]).
size mismatch for gconv_col.lin.weight: copying a param with shape torch.Size([20]) from checkpoint, the shape in current model is torch.Size([512, 768]).

========================
How can this problem be solved?
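
To list every parameter affected by a size mismatch like the one reported here, the shapes in the checkpoint can be compared with the shapes in the current model. The sketch below is a diagnostic only and does not repair the checkpoint; `shape_mismatches` is a hypothetical helper that takes plain name-to-shape mappings. With PyTorch these could be built as `{k: tuple(v.shape) for k, v in state_dict.items()}`. The example shapes are the ones quoted in the error message.

```python
def shape_mismatches(model_shapes, checkpoint_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for shared names whose shapes differ."""
    mismatches = {}
    for name, model_shape in model_shapes.items():
        if name not in checkpoint_shapes:
            continue  # a missing key is a different problem from a size mismatch
        checkpoint_shape = tuple(checkpoint_shapes[name])
        if checkpoint_shape != tuple(model_shape):
            mismatches[name] = (checkpoint_shape, tuple(model_shape))
    return mismatches


if __name__ == "__main__":
    # Shapes taken from the error message in this issue.
    model_shapes = {"gconv_row.lin.weight": (512, 768), "gconv_col.lin.weight": (512, 768)}
    checkpoint_shapes = {"gconv_row.lin.weight": (512,), "gconv_col.lin.weight": (20,)}
    for name, (found, expected) in shape_mismatches(model_shapes, checkpoint_shapes).items():
        print(f"{name}: checkpoint {found}, model {expected}")
```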

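Could the author provide the pretrained model?

Hello, after adjusting the relevant paths, your test scripts run without problems, but the pretrained model cannot be found during training. The files downloaded from the pretrained-model link on your GitHub are the fully trained .pth model files. Could you provide the original pretrained model?
The error is: FileNotFoundError: [Errno 2] No such file or directory: './checkpoints/cmdd_overall/prt_net_Backbone.pth'
Could you provide this prt_net_Backbone.pth file?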

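pkl modification problem and dataset problem

During testing I found that image_path is inconsistent, and I could not find a suitable way to modify the pkl files. Could you provide the ground-truth pkl files again? In addition, the graph_edge folder is missing.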

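A small problem with the dataset

In your dataset, the pickle files and text files contain many absolute paths from your own machine. I hope you can change them.

Also, when I ran your test file, the Cell Spatial Location results matched your paper, but the Cell Logical Location results did not match at all.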

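All Cell Logical Location metrics are almost 0

Hello, when I run the test file, Cell Spatial Location reproduces the results in the paper, but all Cell Logical Location metrics are almost 0. What could be the reason? My configuration is CUDA 10.1; everything else is the same as in the README.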