tomassosorio / ocr_tablenet

TableNet Implementation on Pytorch

Python 100.00%

ocr_tablenet's Issues

A Bug?

Hi, I am testing this model on ordinary images that I took at random from papers.
However, I always hit this error:

ValueError: operands could not be broadcast together with shapes (896,896,4) (3,) (896,896,4)

I figured out that this probably comes from the Normalize transform: the image has four channels, while the mean and std have three. I changed the image loading to

image = Image.open(image_path).convert("RGB")

and it works. Maybe this could be added as a fix?

Excuse me, can this detect wireless tables?

RuntimeError: [enforce fail at inline_container.cc:145] . PytorchStreamReader failed reading zip archive: invalid header or archive is corrupted

Hi, while running predict.py in Google Colab I get the error below. I have uploaded the best_model.ckpt file as well.

RuntimeError: [enforce fail at inline_container.cc:145] . PytorchStreamReader failed reading zip archive: invalid header or archive is corrupted

ModuleNotFoundError: No module named 'tensorboard' when running in a Conda virtual environment

I use Anaconda to manage the virtual environment and run the package there to avoid package conflicts.
I then hit the "module not found" error.
I checked with pip list and the package was listed as installed,
which means tensorboard was installed but could not be found from inside the conda environment.
Later I figured out that these additional commands are needed when running the script in a conda virtual environment:
$ conda install -c conda-forge tensorboard==2.4.1
$ conda install -y -c conda-forge protobuf==3.14.0

Reference:
https://stackoverflow.com/questions/61320572/modulenotfounderror-no-module-named-tensorboard
https://stackoverflow.com/questions/58686400/can-not-get-pytorch-working-with-tensorboard
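
Before reinstalling anything, it can help to confirm which interpreter is actually running and whether that interpreter can see tensorboard, since pip list and python may belong to different environments. A minimal sketch using only the standard library; the helper name locate_module is made up for this example:

```python
import importlib.util
import sys


def locate_module(name):
    """Return the file path of an importable top-level module, or None if
    the running interpreter cannot find it."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # origin can also be None for namespace packages.
    return spec.origin


if __name__ == "__main__":
    # The interpreter path shows which environment is really in use.
    print("interpreter:", sys.executable)
    print("tensorboard:", locate_module("tensorboard"))
```

If the interpreter path is outside the conda environment, or the second line prints None, the package was probably installed into a different environment than the one running the script.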

add license

Please add license information to the project.

RuntimeError: CUDA out of memory.

Tried to allocate 1.91 GiB (GPU 0; 7.93 GiB total capacity; 5.33 GiB already allocated; 1.91 GiB free; 5.36 GiB reserved in total by PyTorch)

This happens when I try to run python train.py.

How can I convert the model to ONNX, or is there an ONNX version to download? And how do I use it?

Python version

Which Python version is needed so that it is compatible with all the required modules?

unable to predict on an image using "predict.py".

Traceback (most recent call last):
  File "predict.py", line 142, in <module>
    predict()
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "predict.py", line 135, in predict
    pred = Predict(model_weights, transforms)
  File "predict.py", line 38, in __init__
    self.model = TableNetModule.load_from_checkpoint('/content/best_model.ckpt')
  File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/core/saving.py", line 137, in load_from_checkpoint
    checkpoint = pl_load(checkpoint_path, map_location=lambda storage, loc: storage)
  File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/utilities/cloud_io.py", line 32, in load
    return torch.load(f, map_location=map_location)
  File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 587, in load
    with _open_zipfile_reader(opened_file) as opened_zipfile:
  File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 242, in __init__
    super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: [enforce fail at inline_container.cc:145] . PytorchStreamReader failed reading zip archive: failed finding central directory
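
Recent PyTorch checkpoints are zip archives, and "failed finding central directory" usually means the end of the archive is missing, for example because an upload or download stopped early. This is a possible cause, not a confirmed diagnosis for this report. A small standard-library sketch to check the file before loading it; check_checkpoint_archive is a name invented here:

```python
import os
import zipfile


def check_checkpoint_archive(path):
    """Return (size_in_bytes, is_zip) for a checkpoint file.

    zipfile.is_zipfile looks for the end-of-central-directory record,
    which is the part a truncated file no longer has.
    """
    size = os.path.getsize(path)
    return size, zipfile.is_zipfile(path)


if __name__ == "__main__":
    # Path taken from the traceback above.
    print(check_checkpoint_archive("/content/best_model.ckpt"))
```

If is_zip is False, or the size is smaller than the file you downloaded originally, uploading the checkpoint again and waiting for the upload to finish is worth trying.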

How can I get the coordinates of the cells?

Thanks for your great work.
I need to get the bounding boxes of the cells. Do you have any ideas for this?
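
One possible starting point, sketched under assumptions: if a predicted table or column mask is available as a 2-D grid of 0/1 values (how predict.py exposes its masks is not shown in this thread), the bounding box of each connected region can be found with a flood fill. The function name mask_bounding_boxes is hypothetical, and this yields column or table boxes only; splitting columns into cells would still need row boundaries, for instance from the OCR line positions.

```python
from collections import deque


def mask_bounding_boxes(mask):
    """Return (x_min, y_min, x_max, y_max) for each 4-connected region of
    truthy pixels in a 2-D grid, in row-major order of discovery."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Flood-fill one region while tracking its extent.
            queue = deque([(x, y)])
            seen[y][x] = True
            x_min = x_max = x
            y_min = y_max = y
            while queue:
                cx, cy = queue.popleft()
                x_min, x_max = min(x_min, cx), max(x_max, cx)
                y_min, y_max = min(y_min, cy), max(y_max, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x_min, y_min, x_max, y_max))
    return boxes
```

The coordinates are in mask pixels, so they need to be scaled back if the image was resized before prediction.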

error while predicting with default image

All the requirements are installed, but predicting with the default image raises this error:
pytesseract.pytesseract.TesseractNotFoundError: tesseract is not installed or it's not in your PATH. See README file for more information.
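
pytesseract is only a Python wrapper; the Tesseract program itself is a separate install that pip does not provide. A minimal standard-library check of whether the executable is visible on PATH (find_tesseract is a name chosen for this sketch):

```python
import shutil


def find_tesseract():
    """Return the full path of the tesseract executable on PATH, or None."""
    return shutil.which("tesseract")


if __name__ == "__main__":
    location = find_tesseract()
    if location is None:
        print("tesseract is not on PATH; install it with your system package manager")
    else:
        print("tesseract found at", location)
```

If Tesseract is installed somewhere that is not on PATH, pytesseract also lets you point to it by setting pytesseract.pytesseract.tesseract_cmd to the full path of the executable.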

Suggest to loosen the dependency on albumentations

Hi, your project OCR_tablenet (commit id: 5f4d781) requires "albumentations==0.5.2" in its dependencies. After analyzing the source code, we found that albumentations 0.5.1 is also suitable, since none of the functions that you use from the package directly (7 APIs: albumentations.pytorch.transforms.ToTensorV2.__init__, albumentations.augmentations.transforms.Resize.__init__, albumentations.core.composition.Compose.__init__, albumentations.augmentations.transforms.RandomResizedCrop.__init__, albumentations.augmentations.transforms.VerticalFlip.__init__, albumentations.augmentations.transforms.HorizontalFlip.__init__, albumentations.augmentations.transforms.Normalize.__init__) or indirectly (propagating to 12 of albumentations' internal APIs and 0 external APIs) have changed between these versions, so your usage is not affected.

Therefore, we believe that it is quite safe to loosen your dependency on albumentations from "albumentations==0.5.2" to "albumentations>=0.5.1,<=0.5.2". This will improve the applicability of OCR_tablenet and reduce the possibility of further dependency conflicts with other projects.

May I open a pull request to loosen the dependency on albumentations?

By the way, could you please tell us whether such an automatic tool for dependency analysis would be helpful for making dependency maintenance easier during your development?

How to process png image

First of all, thank you for your wonderful work!
I want to process a PNG file with this repo, but at around line 50 of predict.py it raises ValueError: operands could not be broadcast together with shapes (896,896) (3,) (896,896). What should I do to avoid this?
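
The shapes in the message suggest the image array is grayscale, (896,896), while the normalization mean has three entries, (3,), so NumPy cannot line up the last axes. One way around it, sketched here with a helper name (to_three_channels) that is not part of the repo, is to coerce the array to three channels before the transforms run:

```python
import numpy as np


def to_three_channels(image):
    """Return an (H, W, 3) array from a grayscale (H, W), RGB (H, W, 3)
    or RGBA (H, W, 4) array."""
    if image.ndim == 2:
        # Grayscale: repeat the single channel three times.
        return np.stack([image] * 3, axis=-1)
    if image.ndim == 3 and image.shape[2] == 4:
        # RGBA: drop the alpha channel (transparency is ignored).
        return image[:, :, :3]
    if image.ndim == 3 and image.shape[2] == 3:
        return image
    raise ValueError(f"unsupported image shape: {image.shape}")
```

Opening the file with Image.open(image_path).convert("RGB"), as reported in another issue for the four-channel case, should have the same effect at the PIL level.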

How to improve recognition quality on images outside the training set?

Thanks for your project!
Table position detection performs well. But when I test with images that are not from the training set, the table content extracted with pytesseract OCR is very poor. Maybe there is a problem with my image format?

Import error

Hi there @tomassosorio

Quick introduction: I need to extract data from PDF/images containing tables. Unfortunately, I have several different formats and traditional tools (PDFPlumber, Tabula, Camelot) do not seem to work for every possible format.
So now I'm trying a DL approach, and looking for some TableNet implementation code I found this repo.

I'm trying to use your code on Google Colab, but unfortunately I was not able to make it work. Note that I have very little experience with DL libraries, so I apologise if my question is trivial.

Anyway, here's my code:

# Mount drive
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

!pip install -r  /content/drive/MyDrive/TableNet/requirements.txt

!python /content/drive/MyDrive/TableNet/predict.py --model_weights='/content/drive/MyDrive/TableNet/best_model.ckpt' --image_path='/content/drive/MyDrive/TableNet/TablesImages/Test_table.png'

This is the error I get:

Traceback (most recent call last):
  File "/content/drive/MyDrive/TableNet/predict.py", line 19, in <module>
    from tablenet import TableNetModule
  File "/content/drive/MyDrive/TableNet/tablenet/__init__.py", line 3, in <module>
    from .marmot import MarmotDataModule
  File "/content/drive/MyDrive/TableNet/tablenet/marmot.py", line 7, in <module>
    import pytorch_lightning as pl
  File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/__init__.py", line 66, in <module>
    from pytorch_lightning import metrics
  File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/metrics/__init__.py", line 14, in <module>
    from pytorch_lightning.metrics.metric import Metric
  File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/metrics/metric.py", line 23, in <module>
    from pytorch_lightning.metrics.utils import _flatten, dim_zero_cat, dim_zero_mean, dim_zero_sum
  File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/metrics/utils.py", line 18, in <module>
    from pytorch_lightning.utilities import rank_zero_warn
  File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/utilities/__init__.py", line 24, in <module>
    from pytorch_lightning.utilities.apply_func import move_data_to_device
  File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/utilities/apply_func.py", line 25, in <module>
    from torchtext.data import Batch
  File "/usr/local/lib/python3.7/dist-packages/torchtext/__init__.py", line 6, in <module>
    from . import experimental
  File "/usr/local/lib/python3.7/dist-packages/torchtext/experimental/__init__.py", line 2, in <module>
    from . import transforms
  File "/usr/local/lib/python3.7/dist-packages/torchtext/experimental/transforms.py", line 4, in <module>
    from torchtext._torchtext import RegexTokenizer as RegexTokenizerPybind
ImportError: /usr/local/lib/python3.7/dist-packages/torchtext/_torchtext.so: undefined symbol: _ZNK3c104Type14isSubtypeOfExtERKSt10shared_ptrIS0_EPSo

I have to admit I have no idea what is causing the error. Could you please help me?

Thanks a lot and great work!
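
The missing symbol in the last line contains c10, which is the namespace of PyTorch's core library, so a likely (but unconfirmed) cause is that the installed torchtext binary was built against a different torch version than the one present in the runtime. A small sketch to list the relevant versions; it uses importlib.metadata, which needs Python 3.8 or newer, so on the Python 3.7 runtime shown in the traceback pip list gives the same information. The function name installed_versions is made up for this example.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(
        ["torch", "torchtext", "torchvision", "pytorch-lightning"]
    ).items():
        print(f"{name}: {version}")
```

If torch does not match the version pinned in requirements.txt while torchtext came preinstalled with the Colab image, aligning the two (or uninstalling torchtext if nothing else needs it) is a reasonable thing to try.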

SList Object to Pandas DataFrame Conversion

Hi, I get an "SList" object after predicting on my test images. I need to convert it to CSV or a pandas DataFrame. Can anybody suggest how I can do that?

Thanks in advance.

Finetuning "best_model.ckpt" on the same dataset

I took 18 samples from your data.zip, together with their table and column masks, for training. I wanted to overfit on those samples. While training, even after 100+ epochs, I don't see the loss decreasing much.

I am using your training script in Colab.
Attached are the 18 training samples.

Uploading 18Samples.zip…
.

Error: invalid load key, '<'

I am using Google Colab. Upon running python predict.py, I am getting the following error:

Traceback (most recent call last):
  File "predict.py", line 142, in <module>
    predict()
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "predict.py", line 135, in predict
    pred = Predict(model_weights, transforms)
  File "predict.py", line 38, in __init__
    self.model = TableNetModule.load_from_checkpoint(checkpoint_path)
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/core/saving.py", line 137, in load_from_checkpoint
    checkpoint = pl_load(checkpoint_path, map_location=lambda storage, loc: storage)
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/utilities/cloud_io.py", line 32, in load
    return torch.load(f, map_location=map_location)
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 595, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 764, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '<'.

Also, I am unable to install many of the dependencies in the requirements.txt file using pip 21.0.1.
For example, numpy 1.20.0 is not available.
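
The unpickler reports the first byte it could not interpret, and a binary checkpoint would not normally start with '<', whereas an HTML or XML page does. So one plausible explanation, not confirmed here, is that a web page was saved under the checkpoint's file name instead of the weights. A quick standard-library check of the first bytes (sniff_checkpoint is a hypothetical helper):

```python
def sniff_checkpoint(path, length=64):
    """Classify a checkpoint by its first bytes: 'zip', 'markup' or 'unknown'."""
    with open(path, "rb") as handle:
        head = handle.read(length)
    if head.startswith(b"PK\x03\x04"):
        # Zip local-file header, as written by recent torch.save versions.
        return "zip"
    if head.lstrip().startswith(b"<"):
        # Looks like HTML/XML text rather than binary weights.
        return "markup"
    return "unknown"
```

If this returns 'markup', opening the file in a text editor usually shows what was actually downloaded, and fetching the checkpoint again from the real download link should fix the load error.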

I cannot unzip the zip file after downloading the repo from GitHub

Entry point not found - torchvision

Hi,

I get a pop-up error when running py predict.py:

"The procedure entry point
torchCheckFail@detail[....] could not be located in the dynamic link library C:\Users[...]\tablenetVirtualEnv\Lib\site-packages\torchvision_C.pyd.
"

It may be linked to the fact that I cannot install torchvision 0.8.2; pip doesn't find it.
I can't find a build of that version for Windows either (I am using Windows 10, Python 3.9.2): https://pypi.org/project/torchvision/0.8.2/#files

So I tried 0.9.2 instead, but then this pop-up appears. In the end I can dismiss it and the extraction still runs:

py predict.py
[ 0 1 2 3 4 5
0 Wellname ‘Toss (ft) LOT (psi) ‘Sv (psi) ‘Sv-LOT (psi) Normalized
1 117-4 4265 2785 2871 116 LoTisv
2 117-4 631 4465 4782 216 0.96
3 117-4 7615 5690 5045 246 0.94
4 117-4 9528 7439 m2 333 0.96
5 172 414 2042 089 247 0.96
6 172 821 4684 sigt 505 0.92
7 172 6361 6815 464 0.90
8 172 8550 10368 10150 218 0.93
9 Popeye 11982 aria 2828 247 1.02
10 Average 4491 4742 4901 16 0.95
11 161-4 6778 6075 7105 159 0.96
12 161-4 2639 2784 130 0.97
13 161-4 9154 27a 2857 145 0.98
14 205-2 4442 5989 6105 16 0.95
15 205-2 4518 6685 6786 16 0.96
16 205-2 8091 6670 eats 401 0.98
17 205-2 9812 2087 3103 145 0.99
18 205-2 8845 3085 161 16 0.98
19 205-42 4806 4597 arta 16 0.96
20 205.414 4862 4713 4829 16 0.96
21 205.14 6588 16 0.97
22 205.414 NaN NaN 131 0.98
23 Genesis 6703 NaN NaN 0.97
24 Average NaN NaN NaN
25 NaN NaN NaN NaN NaN]

How can I get rid of this problem?

GPU memory fills up after 3-4 epochs

Hi,
I wrote TableNet code with a ResNet backbone with the help of your code, but there is one issue that I was not able to solve:
during training, the GPU memory fills up after 3-4 epochs. Please help me.

I am giving a link to my model code.
https://discuss.pytorch.org/t/memory-getting-filled-up-after-3-or-4-epoch/118333?u=njnitesh

ValueError: `Dataloader` returned 0 length. Please make sure that your Dataloader at least returns 1 batch

Sorry, but I am having some trouble when training. Can anybody give me some advice?
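
This error means the data loader has zero batches. For a map-style dataset with default batching, PyTorch's DataLoader length follows directly from the number of samples and the batch size, so the usual suspects are an empty dataset (for example a wrong data path) or drop_last=True with fewer samples than one batch. Whether this repo sets drop_last is not stated in the thread; the sketch below only reproduces the arithmetic, with num_batches as an illustrative name:

```python
import math


def num_batches(num_samples, batch_size, drop_last=False):
    """Number of batches a map-style PyTorch DataLoader reports."""
    if drop_last:
        # The incomplete final batch is discarded.
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)
```

For example, num_batches(0, 2) and num_batches(1, 2, drop_last=True) are both 0, so printing len(dataset) before training starts is a quick way to tell the two cases apart.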

How to use it to train a model on my own dataset?

Thank you very much for your work. I am very interested in it. However, when I tried to use this model to train on my own dataset, I ran into some difficulties and hope to get your help.
My dataset is in the form of CSV files, in the CSV datasets format described at https://github.com/yhenon/pytorch-retinanet#annotations-format:

CSV datasets
The CSVGenerator provides an easy way to define your own datasets. It uses two CSV files: one file containing annotations and one file containing a class name to ID mapping.

Annotations format
The CSV file with annotations should contain one annotation per line. Images with multiple bounding boxes should use one row per bounding box. Note that indexing for pixel values starts at 0. The expected format of each line is:

path/to/image.jpg,x1,y1,x2,y2,class_name
Some images may not contain any labeled objects. To add these images to the dataset as negative examples, add an annotation where x1, y1, x2, y2 and class_name are all empty:

path/to/image.jpg,,,,,
A full example:

/data/imgs/img_001.jpg,837,346,981,456,cow
/data/imgs/img_002.jpg,215,312,279,391,cat
/data/imgs/img_002.jpg,22,5,89,84,bird
/data/imgs/img_003.jpg,,,,,
This defines a dataset with 3 images. img_001.jpg contains a cow. img_002.jpg contains a cat and a bird. img_003.jpg contains no interesting objects/animals.

Class mapping format
The class name to ID mapping file should contain one mapping per line. Each line should use the following format:

class_name,id
Indexing for classes starts at 0. Do not include a background class as it is implicit.

For example:

cow,0
cat,1
bird,2
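
Since training here works from table and column masks rather than from box lists, one possible route is to rasterize the CSV boxes into binary masks first. The sketch below makes several assumptions that are not from this repo: the class names "table" and "column" are hypothetical, x2 and y2 are treated as inclusive, and the mask is a plain grid of 0/1 values that would still have to be saved in whatever image format the training script expects.

```python
import csv
from collections import defaultdict


def read_boxes(csv_path):
    """Group (x1, y1, x2, y2, class_name) boxes by image path.

    Rows of the form path,,,,, (images without objects) are skipped.
    """
    boxes = defaultdict(list)
    with open(csv_path, newline="") as handle:
        for row in csv.reader(handle):
            if not row:
                continue  # ignore blank lines
            path, x1, y1, x2, y2, class_name = row
            if not class_name:
                continue  # negative example, nothing to draw
            boxes[path].append((int(x1), int(y1), int(x2), int(y2), class_name))
    return boxes


def boxes_to_mask(boxes, width, height, wanted_class):
    """Rasterize the boxes of one class into a height x width grid of 0/1."""
    mask = [[0] * width for _ in range(height)]
    for x1, y1, x2, y2, class_name in boxes:
        if class_name != wanted_class:
            continue
        # Clip to the image and fill the rectangle (bounds inclusive).
        for y in range(max(y1, 0), min(y2, height - 1) + 1):
            for x in range(max(x1, 0), min(x2, width - 1) + 1):
                mask[y][x] = 1
    return mask
```

Calling boxes_to_mask once with "table" and once with "column" for each image would give the two masks per sample.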

ValueError: operands could not be broadcast together with shapes (896,896,4) (3,) (896,896,4)

I am testing the model with different cases, and something strange happens: if I feed it a .png screenshot of a table (sample attached)

and run the command
python predict.py --model_weights='./tablenet_pretrained.ckpt' --image_path='./sample_table2.png'

it gives me ValueError: operands could not be broadcast together with shapes (896,896,4) (3,) (896,896,4)

here is the full error log:

  File "/home/bluespinach/Documents/projects/tool_exp/OCR_tablenet/predict.py", line 148, in <module>
    predict()
  File "/home/bluespinach/miniconda3/envs/tablenet/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/bluespinach/miniconda3/envs/tablenet/lib/python3.9/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/bluespinach/miniconda3/envs/tablenet/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/bluespinach/miniconda3/envs/tablenet/lib/python3.9/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/bluespinach/Documents/projects/tool_exp/OCR_tablenet/predict.py", line 144, in predict
    print(pred.predict(image))
  File "/home/bluespinach/Documents/projects/tool_exp/OCR_tablenet/predict.py", line 50, in predict
    processed_image = self.transforms(image=np.array(image))["image"]
  File "/home/bluespinach/miniconda3/envs/tablenet/lib/python3.9/site-packages/albumentations/core/composition.py", line 182, in __call__
    data = t(force_apply=force_apply, **data)
  File "/home/bluespinach/miniconda3/envs/tablenet/lib/python3.9/site-packages/albumentations/core/transforms_interface.py", line 89, in __call__
    return self.apply_with_params(params, **kwargs)
  File "/home/bluespinach/miniconda3/envs/tablenet/lib/python3.9/site-packages/albumentations/core/transforms_interface.py", line 102, in apply_with_params
    res[key] = target_function(arg, **dict(params, **target_dependencies))
  File "/home/bluespinach/miniconda3/envs/tablenet/lib/python3.9/site-packages/albumentations/augmentations/transforms.py", line 1496, in apply
    return F.normalize(image, self.mean, self.std, self.max_pixel_value)
  File "/home/bluespinach/miniconda3/envs/tablenet/lib/python3.9/site-packages/albumentations/augmentations/functional.py", line 141, in normalize
    img -= mean
ValueError: operands could not be broadcast together with shapes (896,896,4) (3,) (896,896,4)

Strangely, if I input a .png file converted from a PDF page, it works well. Here is my test sample, the PDF page.

Could you help explain why the smaller table screenshot doesn't work?

Thank you very much!

Best regards,
JKyang01
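
The shape (896,896,4) in the error shows that the screenshot has four channels. One possible explanation, not confirmed in this thread, is that the screenshot tool saved the PNG with an alpha channel while the PDF converter wrote plain RGB. The color type is stored in the PNG header, so the two files can be compared without any imaging library; png_color_type is a name invented for this sketch:

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
COLOR_TYPES = {
    0: "grayscale (1 channel)",
    2: "RGB (3 channels)",
    3: "palette (indexed)",
    4: "grayscale + alpha (2 channels)",
    6: "RGBA (4 channels)",
}


def png_color_type(path):
    """Read the IHDR chunk of a PNG file and describe its color type."""
    with open(path, "rb") as handle:
        header = handle.read(26)
    if len(header) < 26 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # IHDR data: width (4 bytes), height (4), bit depth (1), color type (1).
    width, height, bit_depth, color_type = struct.unpack(">IIBB", header[16:26])
    return COLOR_TYPES.get(color_type, f"unknown ({color_type})")
```

If the screenshot reports RGBA and the PDF-derived file reports RGB, converting the image to RGB before it reaches the transforms should make both behave the same.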

Error when loading landscape image

Hi,

I am getting an error when trying to load an image in landscape mode. Image dimensions: 2128 × 1695

ValueError: operands could not be broadcast together with shapes (896,896,4) (3,) (896,896,4)

Any thoughts?

Can wireless tables be detected?

Excuse me, I have two questions:
I want to do table detection only, without identifying the table structure. What should I do?
Can wireless tables be detected?