tablenet-pytorch's Introduction

TableNet-pytorch

PyTorch implementation of the TableNet research paper: https://arxiv.org/abs/2001.01469

TableNet Architecture

Description

In this project we implement an end-to-end deep learning architecture that not only localizes the table in an image but also recovers the table's structure by segmenting its columns. After detecting the table structure, we use the Pytesseract OCR package to read the contents of the table.

To learn more about the approach, refer to my Medium blog posts:

Part 1: https://asagar60.medium.com/tablenet-deep-learning-model-for-end-to-end-table-detection-and-tabular-data-extraction-from-b1547799fe29

Part 2: https://asagar60.medium.com/tablenet-deep-learning-model-for-end-to-end-table-detection-and-tabular-data-extraction-from-a49ac4cbffd4

Data

We use both the Marmot and Marmot Extended datasets for table recognition. The Marmot dataset contains table bounding-box coordinates, and the extended version contains column bounding-box coordinates.

Marmot dataset: https://www.icst.pku.edu.cn/cpdp/docs/20190424190300041510.zip

Marmot Extended dataset: https://drive.google.com/drive/folders/1QZiv5RKe3xlOBdTzuTVuYRxixemVIODp

Download processed Marmot dataset: https://drive.google.com/file/d/1irIm19B58-o92IbD9b5qd6k3F31pqp1o/view?usp=sharing
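Since both datasets provide bounding boxes while the model predicts segmentation masks, the boxes have to be rasterized into binary masks at some point. A minimal sketch of that step, assuming boxes are given as inclusive pixel coordinates `(xmin, ymin, xmax, ymax)`; the function name and the box convention are illustrative assumptions, not the annotation format of the Marmot files:

```python
import numpy as np


def boxes_to_mask(boxes, height, width):
    """Rasterize bounding boxes into a binary mask of shape (height, width).

    Assumption: each box is (xmin, ymin, xmax, ymax) in pixel coordinates,
    with both corners inclusive.
    """
    mask = np.zeros((height, width), dtype=np.uint8)
    for xmin, ymin, xmax, ymax in boxes:
        # Clip the box to the image so out-of-range annotations do not wrap.
        x0, x1 = max(0, xmin), min(width, xmax + 1)
        y0, y1 = max(0, ymin), min(height, ymax + 1)
        mask[y0:y1, x0:x1] = 1
    return mask


# One table box and two column boxes inside it (made-up coordinates).
table_mask = boxes_to_mask([(2, 1, 7, 4)], height=6, width=10)
column_mask = boxes_to_mask([(2, 1, 3, 4), (5, 1, 7, 4)], height=6, width=10)
```

The table mask and the column mask can then serve as the two segmentation targets for the same input image.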

Model

We use DenseNet121 as the encoder and build the model on top of it.

Trainable Params

Params

Download saved model : https://drive.google.com/file/d/1TKALmlwUM_n4gULh6A6Q35VPRUpWDmJZ/view?usp=sharing

Performance compared to other encoder models (ResNet18, EfficientNet-B0, EfficientNet-B1, VGG19)

Table Detection - F1

Table F1

Table Detection - Loss

Table Loss

Column Detection - F1

Column F1

Column Detection - Loss

Column Loss
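The text does not state how the F1 scores above are computed. A minimal sketch, assuming a pixel-wise F1 between a binarized predicted mask and the ground-truth mask; the function name and the `eps` smoothing term are illustrative choices:

```python
import numpy as np


def mask_f1(pred, target, eps=1e-7):
    """Pixel-wise F1 score between two binary masks of the same shape."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    true_pos = np.logical_and(pred, target).sum()
    false_pos = np.logical_and(pred, ~target).sum()
    false_neg = np.logical_and(~pred, target).sum()
    precision = true_pos / (true_pos + false_pos + eps)
    recall = true_pos / (true_pos + false_neg + eps)
    # eps keeps the score defined (as 0) when both masks are empty.
    return float(2 * precision * recall / (precision + recall + eps))


target = np.zeros((4, 4), dtype=np.uint8)
target[0:2, :] = 1          # 8 foreground pixels
pred = np.zeros((4, 4), dtype=np.uint8)
pred[0:2, 0:2] = 1          # 4 pixels, all inside the target
score = mask_f1(pred, target)   # precision 1.0, recall 0.5
```

The same function applies to the table mask and the column mask separately.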

Predictions

Predictions from the model

Prediction 1

After fixing table mask using contours

Prediction 2

After fixing column mask using contours

Prediction 3
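The two mask-fixing steps above turn ragged predicted regions into clean rectangles. The following is a dependency-light sketch of that idea, not the repository's code: instead of OpenCV contours it finds 4-connected blobs with a flood fill, drops blobs smaller than a hypothetical `min_area` threshold, and fills the bounding rectangle of each remaining blob.

```python
from collections import deque

import numpy as np


def rectify_mask(mask, min_area=4):
    """Replace each 4-connected blob in a binary mask by its filled bounding box.

    Returns the cleaned mask and the boxes as inclusive (xmin, ymin, xmax, ymax).
    """
    mask = np.asarray(mask) > 0
    height, width = mask.shape
    seen = np.zeros((height, width), dtype=bool)
    fixed = np.zeros((height, width), dtype=np.uint8)
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if not mask[y0, x0] or seen[y0, x0]:
                continue
            # Breadth-first flood fill collects one connected blob.
            queue = deque([(y0, x0)])
            seen[y0, x0] = True
            ymin = ymax = y0
            xmin = xmax = x0
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                ymin, ymax = min(ymin, y), max(ymax, y)
                xmin, xmax = min(xmin, x), max(xmax, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny, nx] and not seen[ny, nx]:
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            if area < min_area:
                continue  # treat tiny blobs as noise
            fixed[ymin:ymax + 1, xmin:xmax + 1] = 1
            boxes.append((xmin, ymin, xmax, ymax))
    return fixed, boxes


noisy = np.zeros((8, 10), dtype=np.uint8)
noisy[1:6, 1:5] = 1   # predicted region
noisy[3, 2] = 0       # hole inside the region
noisy[1, 4] = 0       # ragged corner
noisy[7, 9] = 1       # isolated noise pixel
fixed, boxes = rectify_mask(noisy, min_area=4)
```

The pure-Python loop is slow on full-size pages; it is meant to show the logic, and a contour-based implementation would be the practical choice.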

After processing it through pytesseract

Prediction 4
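Once the column rectangles are known, the OCR step amounts to cropping each column and passing the crop to the OCR engine. A minimal sketch under stated assumptions: `read_columns` is a hypothetical helper, boxes are inclusive `(xmin, ymin, xmax, ymax)` tuples, and the OCR function is passed in as a parameter so that the helper does not depend on a Tesseract installation.

```python
import numpy as np


def read_columns(image, column_boxes, ocr):
    """Crop each column box from the image and run the OCR callable on it.

    `ocr` is any callable that maps an image array to a string.
    Columns are processed left to right.
    """
    texts = []
    for xmin, ymin, xmax, ymax in sorted(column_boxes):
        crop = image[ymin:ymax + 1, xmin:xmax + 1]
        texts.append(ocr(crop).strip())
    return texts


# Stand-in OCR function for demonstration: reports the crop size only.
def fake_ocr(crop):
    return f" {crop.shape[1]}x{crop.shape[0]} "


page = np.zeros((4, 5), dtype=np.uint8)
columns = read_columns(page, [(3, 0, 4, 3), (0, 1, 2, 3)], ocr=fake_ocr)

# With Pytesseract installed, the real call would look like:
# import pytesseract
# columns = read_columns(page, boxes, ocr=pytesseract.image_to_string)
```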

Deployed application

https://vimeo.com/577282006

Future Work

  • Deploy this application on a remote server using AWS, Streamlit Sharing, or Heroku.
  • Quantize the model for faster inference.
  • Train for more epochs and compare the performances.
  • Increase data size by adding data from ICDAR 2013 Table recognition dataset.

tablenet-pytorch's People

Contributors

asagar60

tablenet-pytorch's Issues

No module named 'model'

I am facing this problem when trying to run the project on Colab. I would be grateful if you could help me solve it.


ModuleNotFoundError Traceback (most recent call last)

in
13 import pytesseract
14 from io import StringIO
---> 15 from model import TableNet

ModuleNotFoundError: No module named 'model'


NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
"Open Examples" button below.
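This error usually means that Python cannot see the file that defines `TableNet`. A possible workaround, assuming the repository has been cloned into the Colab session and that a `model.py` sits in its top-level directory (the traceback suggests this, but the layout is not confirmed here), is to add that directory to `sys.path` before importing. The helper name and the clone location are hypothetical:

```python
import sys
from pathlib import Path


def add_repo_to_path(repo_dir, module_file="model.py"):
    """Make modules in repo_dir importable, failing early if the file is absent."""
    repo_dir = Path(repo_dir).resolve()
    if not (repo_dir / module_file).is_file():
        raise FileNotFoundError(f"{module_file} not found in {repo_dir}")
    if str(repo_dir) not in sys.path:
        # Put the repository first so its modules take precedence.
        sys.path.insert(0, str(repo_dir))
    return repo_dir


# Usage in a notebook cell (the path is an assumption about where the clone lives):
# add_repo_to_path("/content/TableNet-pytorch")
# from model import TableNet
```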

No densenet model found

Hi Arun,
I'm not able to find any model within this project, particularly densenet_config_4_model_checkpoint.pth.tar

Could you please provide densenet_config_4_model_checkpoint.pth.tar at least ?

Thanks in advance

Tessar

I found the project very interesting, but there is an error message. Can you help me?
error

Unable to Run the project successfully

FileNotFoundError: [Errno 2] No such file or directory: 'densenet_config_4_model_checkpoint.pth.tar'
Traceback:
File "C:\Users\user\Anaconda3\envs\streamlitenv\lib\site-packages\streamlit\scriptrunner\script_runner.py", line 557, in _run_script
exec(code, module.__dict__)
File "app.py", line 238, in <module>
model = load_model()
File "C:\Users\user\Anaconda3\envs\streamlitenv\lib\site-packages\streamlit\legacy_caching\caching.py", line 573, in wrapped_func
return get_or_create_cached_value()
File "C:\Users\user\Anaconda3\envs\streamlitenv\lib\site-packages\streamlit\legacy_caching\caching.py", line 557, in get_or_create_cached_value
return_value = func(*args, **kwargs)
File "app.py", line 156, in load_model
model.load_state_dict(torch.load("densenet_config_4_model_checkpoint.pth.tar")['state_dict'])
File "C:\Users\user\Anaconda3\envs\streamlitenv\lib\site-packages\torch\serialization.py", line 699, in load
with _open_file_like(f, 'rb') as opened_file:
File "C:\Users\user\Anaconda3\envs\streamlitenv\lib\site-packages\torch\serialization.py", line 230, in _open_file_like
return _open_file(name_or_buffer, mode)
File "C:\Users\user\Anaconda3\envs\streamlitenv\lib\site-packages\torch\serialization.py", line 211, in __init__
super(_open_file, self).__init__(open(name, mode))
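The traceback shows `app.py` calling `torch.load` with a bare file name, so the checkpoint is looked up in the current working directory. A small sketch of a more forgiving lookup, assuming the checkpoint has been downloaded separately (whether the saved model linked in the README is this exact file is not confirmed here); `find_checkpoint` and the searched directories are hypothetical:

```python
from pathlib import Path

CHECKPOINT_NAME = "densenet_config_4_model_checkpoint.pth.tar"


def find_checkpoint(search_dirs, name=CHECKPOINT_NAME):
    """Return the first existing checkpoint path, or raise a descriptive error."""
    for directory in search_dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    searched = ", ".join(str(Path(d)) for d in search_dirs)
    raise FileNotFoundError(
        f"{name} not found in: {searched}. "
        "Download the saved model and place it in one of these directories."
    )


# Possible use inside load_model() in app.py (sketch only):
# path = find_checkpoint([Path.cwd(), Path(__file__).parent])
# model.load_state_dict(torch.load(path)["state_dict"])
```

Running the Streamlit app from the directory that contains the checkpoint file avoids the error without any code change.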

Dependencies

Hello, can you tell me where I can find the dependencies for this project? Searched but did not spot them :) Greetings

could not find MARK

Hi, this project is very helpful for me to write my thesis, can you solve the problem about the installation guide of dependencies for local deployment?
