tomassosorio / ocr_tablenet
TableNet Implementation on Pytorch
Hi, while running predict.py (in Google Colab), I get the error below. I have uploaded the best_model.ckpt file as well.
RuntimeError: [enforce fail at inline_container.cc:145] . PytorchStreamReader failed reading zip archive: invalid header or archive is corrupted
Hi,
I wrote the TableNet code with a ResNet backbone with the help of your code, but there is one issue that I was not able to solve.
During training, the GPU memory fills up after 3-4 epochs. Please help me.
Here is a link to my model code:
https://discuss.pytorch.org/t/memory-getting-filled-up-after-3-or-4-epoch/118333?u=njnitesh
How can I convert the model to ONNX, or is there anywhere to download an ONNX version? And how do I use the ONNX model?
Hi,
I am getting an error when trying to load an image in landscape mode. Image dimensions: 2128 × 1695
ValueError: operands could not be broadcast together with shapes (896,896,4) (3,) (896,896,4)
Any thoughts?
Thank you very much for your work. I am very interested in your work. However, when I plan to use this model to train on my own data set, I encountered some difficulties and hope to get your help.
My dataset is in the form of CSV files, in the "CSV datasets" format
mentioned in (https://github.com/yhenon/pytorch-retinanet#annotations-format):
CSV datasets
The CSVGenerator provides an easy way to define your own datasets. It uses two CSV files: one file containing annotations and one file containing a class name to ID mapping.
Annotations format
The CSV file with annotations should contain one annotation per line. Images with multiple bounding boxes should use one row per bounding box. Note that indexing for pixel values starts at 0. The expected format of each line is:
path/to/image.jpg,x1,y1,x2,y2,class_name
Some images may not contain any labeled objects. To add these images to the dataset as negative examples, add an annotation where x1, y1, x2, y2 and class_name are all empty:
path/to/image.jpg,,,,,
A full example:
/data/imgs/img_001.jpg,837,346,981,456,cow
/data/imgs/img_002.jpg,215,312,279,391,cat
/data/imgs/img_002.jpg,22,5,89,84,bird
/data/imgs/img_003.jpg,,,,,
This defines a dataset with 3 images. img_001.jpg contains a cow. img_002.jpg contains a cat and a bird. img_003.jpg contains no interesting objects/animals.
Class mapping format
The class name to ID mapping file should contain one mapping per line. Each line should use the following format:
class_name,id
Indexing for classes starts at 0. Do not include a background class as it is implicit.
For example:
cow,0
cat,1
bird,2
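The two CSV formats quoted above can be read with the standard library alone. The following is a minimal sketch, not code from this repository or from pytorch-retinanet: the function names read_class_mapping and read_annotations are hypothetical, and the sample data is the example from the quoted README.

```python
import csv
import io

# Sample data copied from the quoted README example.
ANNOTATIONS = """/data/imgs/img_001.jpg,837,346,981,456,cow
/data/imgs/img_002.jpg,215,312,279,391,cat
/data/imgs/img_002.jpg,22,5,89,84,bird
/data/imgs/img_003.jpg,,,,,
"""

CLASSES = """cow,0
cat,1
bird,2
"""


def read_class_mapping(text):
    """Parse 'class_name,id' lines into a dict of name -> integer id."""
    mapping = {}
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        name, class_id = row
        mapping[name] = int(class_id)
    return mapping


def read_annotations(text, classes):
    """Group boxes by image path.

    Rows whose box fields are all empty are negative examples:
    the image is registered with an empty list of boxes.
    """
    boxes = {}
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue
        path, x1, y1, x2, y2, name = row
        image_boxes = boxes.setdefault(path, [])
        if (x1, y1, x2, y2, name) == ("", "", "", "", ""):
            continue
        image_boxes.append((int(x1), int(y1), int(x2), int(y2), classes[name]))
    return boxes


classes = read_class_mapping(CLASSES)
annotations = read_annotations(ANNOTATIONS, classes)
for path, image_boxes in annotations.items():
    print(path, image_boxes)
```

Note that TableNet is trained on segmentation masks (table and column), so boxes read this way would still have to be rasterized into masks before they could be used with this repository's training script.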
Excuse me, I have two questions:
I want to do table detection only without identifying the table structure. What should I do?
Can borderless tables (tables without ruling lines) be detected?
First of all, thank you for your wonderful work!
I want to process a PNG file with the repo, but in predict.py, at about line 50, it raises ValueError: operands could not be broadcast together with shapes (896,896) (3,) (896,896). What should I do to avoid this?
Which Python version is needed so that it is compatible with all the required modules?
Hi there @tomassosorio
Quick introduction: I need to extract data from PDF/images containing tables. Unfortunately, I have several different formats and traditional tools (PDFPlumber, Tabula, Camelot) do not seem to work for every possible format.
So now I'm trying a DL approach, and looking for some TableNet implementation code I found this repo.
I'm trying to use your code on Google Colab, but unfortunately I was not able to make it work. Note that I have very little experience with DL libraries, so I apologise if my question is trivial.
Anyway, here's my code:
# Mount drive
from google.colab import drive
drive.mount('/content/drive', force_remount=True)
!pip install -r /content/drive/MyDrive/TableNet/requirements.txt
!python /content/drive/MyDrive/TableNet/predict.py --model_weights='/content/drive/MyDrive/TableNet/best_model.ckpt' --image_path='/content/drive/MyDrive/TableNet/TablesImages/Test_table.png'
This is the error I get:
Traceback (most recent call last):
File "/content/drive/MyDrive/TableNet/predict.py", line 19, in <module>
from tablenet import TableNetModule
File "/content/drive/MyDrive/TableNet/tablenet/__init__.py", line 3, in <module>
from .marmot import MarmotDataModule
File "/content/drive/MyDrive/TableNet/tablenet/marmot.py", line 7, in <module>
import pytorch_lightning as pl
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/__init__.py", line 66, in <module>
from pytorch_lightning import metrics
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/metrics/__init__.py", line 14, in <module>
from pytorch_lightning.metrics.metric import Metric
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/metrics/metric.py", line 23, in <module>
from pytorch_lightning.metrics.utils import _flatten, dim_zero_cat, dim_zero_mean, dim_zero_sum
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/metrics/utils.py", line 18, in <module>
from pytorch_lightning.utilities import rank_zero_warn
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/utilities/__init__.py", line 24, in <module>
from pytorch_lightning.utilities.apply_func import move_data_to_device
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/utilities/apply_func.py", line 25, in <module>
from torchtext.data import Batch
File "/usr/local/lib/python3.7/dist-packages/torchtext/__init__.py", line 6, in <module>
from . import experimental
File "/usr/local/lib/python3.7/dist-packages/torchtext/experimental/__init__.py", line 2, in <module>
from . import transforms
File "/usr/local/lib/python3.7/dist-packages/torchtext/experimental/transforms.py", line 4, in <module>
from torchtext._torchtext import RegexTokenizer as RegexTokenizerPybind
ImportError: /usr/local/lib/python3.7/dist-packages/torchtext/_torchtext.so: undefined symbol: _ZNK3c104Type14isSubtypeOfExtERKSt10shared_ptrIS0_EPSo
I have to admit I have no idea what is causing the error. Could you please help me?
Thanks a lot and great work!
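An undefined symbol inside _torchtext.so usually means that the compiled torchtext extension was built against a different torch version than the one currently installed. A first step is to list the versions that are actually present in the environment. This is a minimal sketch: the helper name installed_versions and the package list are assumptions for illustration, and importlib.metadata requires Python 3.8 or newer.

```python
from importlib import metadata

# Packages whose versions have to match each other (assumed list).
PACKAGES = ("torch", "torchvision", "torchtext", "pytorch-lightning")


def installed_versions(packages=PACKAGES):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

If torchtext turns out to be a preinstalled package that does not correspond to the torch version pinned in requirements.txt, installing a matching torchtext release (or removing it if nothing else needs it) is the usual remedy.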
Hi, I am testing this model on regular images that I took at random from papers.
However, I always encounter the error:
ValueError: operands could not be broadcast together with shapes (896,896,4) (3,) (896,896,4)
I figured out that this may come from Normalize, so I changed the input to
image = Image.open(image_path).convert("RGB")
and it works. Maybe a fix?
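The reason the conversion helps can be shown in isolation: a PNG with an alpha channel becomes a (height, width, 4) array and a grayscale PNG becomes (height, width), and neither can be combined with a 3-element per-channel mean. The sketch below uses Pillow and NumPy only. The helper name to_rgb_array is hypothetical, and the mean and std values are the common ImageNet statistics, assumed here for illustration rather than taken from this repository.

```python
import numpy as np
from PIL import Image

# Assumed per-channel statistics (common ImageNet values), for illustration.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def to_rgb_array(image):
    """Return an (H, W, 3) uint8 array regardless of the source image mode."""
    return np.array(image.convert("RGB"))


def normalize(array):
    """Scale to [0, 1] and apply per-channel mean/std; needs 3 channels."""
    return (array.astype(np.float32) / 255.0 - MEAN) / STD


# Synthetic stand-ins for a screenshot with alpha and a grayscale scan.
rgba_image = Image.new("RGBA", (8, 6), (255, 0, 0, 128))
gray_image = Image.new("L", (8, 6), 200)

print(np.array(rgba_image).shape)  # four channels
print(np.array(gray_image).shape)  # no channel axis

for image in (rgba_image, gray_image):
    rgb = to_rgb_array(image)
    print(rgb.shape, normalize(rgb).shape)
```

Converting right after Image.open keeps the rest of the pipeline unchanged, which is why it is a small and low-risk fix.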
Hi, your project OCR_tablenet (commit id: 5f4d781) requires "albumentations==0.5.2" in its dependencies. After analyzing the source code, we found that another version of albumentations is also suitable, namely albumentations 0.5.1, since all functions that you use directly (7 APIs: albumentations.pytorch.transforms.ToTensorV2.__init__, albumentations.augmentations.transforms.Resize.__init__, albumentations.core.composition.Compose.__init__, albumentations.augmentations.transforms.RandomResizedCrop.__init__, albumentations.augmentations.transforms.VerticalFlip.__init__, albumentations.augmentations.transforms.HorizontalFlip.__init__, albumentations.augmentations.transforms.Normalize.__init__) or indirectly (propagating to 12 of albumentations' internal APIs and 0 outside APIs) have not changed between these versions, so your usage is not affected.
Therefore, we believe that it is quite safe to loosen your dependency on albumentations from "albumentations==0.5.2" to "albumentations>=0.5.1,<=0.5.2". This will improve the applicability of OCR_tablenet and reduce the possibility of further dependency conflicts with other projects.
May I open a pull request to loosen the dependency on albumentations?
By the way, could you please tell us whether such an automatic tool for dependency analysis may be potentially helpful for maintaining dependencies easier during your development?
I took 18 samples from your data.zip, together with their table and column masks, for training. I wanted to overfit on those samples. While training, even after 100+ epochs, I don't see the loss decreasing much.
I am using your training script in Colab.
Attached are the 18 training samples.
Please add license information to the project.
I am using Google Colab. Upon running python predict.py, I am getting the following error:
Traceback (most recent call last):
File "predict.py", line 142, in <module>
predict()
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "predict.py", line 135, in predict
pred = Predict(model_weights, transforms)
File "predict.py", line 38, in __init__
self.model = TableNetModule.load_from_checkpoint(checkpoint_path)
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/core/saving.py", line 137, in load_from_checkpoint
checkpoint = pl_load(checkpoint_path, map_location=lambda storage, loc: storage)
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/utilities/cloud_io.py", line 32, in load
return torch.load(f, map_location=map_location)
File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 595, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 764, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '<'.
Also, I am unable to install many of the dependencies in the requirements.txt file using pip 21.0.1.
For example, numpy 1.20.0 is not available.
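The load key '<' in the traceback above indicates that the file begins with the character '<', which is typical of an HTML page saved in place of the binary checkpoint (for example a download or login page). The zip-related errors reported in other issues point in a similar direction: a file that is not a complete archive. The following sketch inspects a file without importing torch. The function name diagnose_checkpoint and its return codes are hypothetical, and the check relies on the fact that checkpoints written by recent PyTorch versions are zip archives.

```python
import zipfile


def diagnose_checkpoint(path):
    """Classify a checkpoint file by its first bytes and zip structure.

    Returns one of: "empty", "html", "zip-ok", "zip-truncated", "unknown".
    """
    with open(path, "rb") as handle:
        head = handle.read(16)
    if not head:
        return "empty"
    if head.lstrip()[:1] == b"<":
        # Markup instead of binary data, e.g. a saved web page.
        return "html"
    if head[:4] == b"PK\x03\x04":
        # A zip local-file header; is_zipfile also looks for the
        # end-of-central-directory record at the end of the file.
        if zipfile.is_zipfile(path):
            return "zip-ok"
        return "zip-truncated"
    # Could be a legacy (non-zip) pickle checkpoint or something else.
    return "unknown"
```

Calling diagnose_checkpoint on the uploaded best_model.ckpt and getting "html" or "zip-truncated" would suggest downloading or uploading the file again and comparing its size with the original.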
Thanks for your project!
Table position detection performs well. But when I test on images that are not from the training set, the table content extracted with pytesseract OCR is very poor. Maybe there is a problem with my image format?
Traceback (most recent call last):
File "predict.py", line 142, in <module>
predict()
File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "predict.py", line 135, in predict
pred = Predict(model_weights, transforms)
File "predict.py", line 38, in __init__
self.model = TableNetModule.load_from_checkpoint('/content/best_model.ckpt')
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/core/saving.py", line 137, in load_from_checkpoint
checkpoint = pl_load(checkpoint_path, map_location=lambda storage, loc: storage)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/utilities/cloud_io.py", line 32, in load
return torch.load(f, map_location=map_location)
File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 587, in load
with _open_zipfile_reader(opened_file) as opened_zipfile:
File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 242, in __init__
super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: [enforce fail at inline_container.cc:145] . PytorchStreamReader failed reading zip archive: failed finding central directory
All the requirements are installed, but this error is raised:
pytesseract.pytesseract.TesseractNotFoundError: tesseract is not installed or it's not in your PATH. See README file for more information.
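pytesseract is only a Python wrapper: the tesseract executable is a separate program that pip does not install, so it has to be installed through the operating system (for example the tesseract-ocr package on Debian or Ubuntu) and be reachable on PATH. The sketch below checks for the binary with the standard library. The helper name find_tesseract is hypothetical; pytesseract.pytesseract.tesseract_cmd is the attribute the wrapper reads for the executable location.

```python
import shutil


def find_tesseract(extra_candidates=()):
    """Return the path of the tesseract binary, or None if it is not found.

    extra_candidates may hold explicit paths to try, such as a Windows
    install location (hypothetical values supplied by the caller).
    """
    for candidate in ("tesseract", *extra_candidates):
        resolved = shutil.which(candidate)
        if resolved:
            return resolved
    return None


tesseract_path = find_tesseract()
if tesseract_path is None:
    print("tesseract binary not found; install it with the system package manager")
else:
    print("tesseract found at", tesseract_path)
    try:
        import pytesseract

        # Point the wrapper at the binary explicitly.
        pytesseract.pytesseract.tesseract_cmd = tesseract_path
    except ImportError:
        print("pytesseract itself is not installed in this environment")
```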
Hi, I get an "SList" object after predicting on my test images. I need to convert it to CSV or a pandas DataFrame. Can anybody suggest how I can do that?
Thanks in advance.
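An SList is what IPython returns when the output of a shell command is captured (for example out = !python predict.py ...): it is a list of printed text lines, so the table structure is already lost at that point. An alternative is to call the prediction code from Python and keep the DataFrame objects. The sketch below assumes that the prediction step yields a list of pandas DataFrames, which is what the printed output in other issues suggests. The helper name save_tables is hypothetical, and the sample frame is stand-in data.

```python
import tempfile
from pathlib import Path

import pandas as pd


def save_tables(tables, out_dir, stem="table"):
    """Write each DataFrame in `tables` to <out_dir>/<stem>_<i>.csv."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, frame in enumerate(tables):
        path = out_dir / f"{stem}_{index}.csv"
        # The frames use integer column labels, so no header row is written.
        frame.to_csv(path, index=False, header=False)
        paths.append(path)
    return paths


# Stand-in for the list of tables produced by the prediction step.
sample_tables = [pd.DataFrame([["Wellname", "LOT (psi)"], ["117-4", "2785"]])]

with tempfile.TemporaryDirectory() as tmp_dir:
    written = save_tables(sample_tables, tmp_dir)
    restored = pd.read_csv(written[0], header=None, dtype=str)
    print(restored)
```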
When I try to run python train.py, I get this error:
Tried to allocate 1.91 GiB (GPU 0; 7.93 GiB total capacity; 5.33 GiB already allocated; 1.91 GiB free; 5.36 GiB reserved in total by PyTorch)
Thanks for your great work.
I need to get the bounding boxes of the cells. Do you have any ideas for this?
Hi,
I get a pop-up error when running py predict.py:
"The procedure entry point
torchCheckFail@detail[....] could not be located in the dynamic link library C:\Users[...]\tablenetVirtualEnv\Lib\site-packages\torchvision_C.pyd.
"
It may be linked to the fact that I cannot install torchvision 0.8.2: pip doesn't find it.
I can't find a build for Windows either (I am using Windows 10, Python 3.9.2): https://pypi.org/project/torchvision/0.8.2/#files
So I tried 0.9.2 instead, but then this pop-up appears. In the end I can dismiss it and the extraction still runs:
py predict.py
[ 0 1 2 3 4 5
0 Wellname ‘Toss (ft) LOT (psi) ‘Sv (psi) ‘Sv-LOT (psi) Normalized
1 117-4 4265 2785 2871 116 LoTisv
2 117-4 631 4465 4782 216 0.96
3 117-4 7615 5690 5045 246 0.94
4 117-4 9528 7439 m2 333 0.96
5 172 414 2042 089 247 0.96
6 172 821 4684 sigt 505 0.92
7 172 6361 6815 464 0.90
8 172 8550 10368 10150 218 0.93
9 Popeye 11982 aria 2828 247 1.02
10 Average 4491 4742 4901 16 0.95
11 161-4 6778 6075 7105 159 0.96
12 161-4 2639 2784 130 0.97
13 161-4 9154 27a 2857 145 0.98
14 205-2 4442 5989 6105 16 0.95
15 205-2 4518 6685 6786 16 0.96
16 205-2 8091 6670 eats 401 0.98
17 205-2 9812 2087 3103 145 0.99
18 205-2 8845 3085 161 16 0.98
19 205-42 4806 4597 arta 16 0.96
20 205.414 4862 4713 4829 16 0.96
21 205.14 6588 16 0.97
22 205.414 NaN NaN 131 0.98
23 Genesis 6703 NaN NaN 0.97
24 Average NaN NaN NaN
25 NaN NaN NaN NaN NaN]
How could I remove that problem?
I use Anaconda to manage the virtual environment and run the package, to avoid package conflicts.
Then I encountered a "module not found" error.
I checked with pip list and found that the package is installed.
That means the tensorboard package is installed but cannot be found inside the conda environment.
Later I figured out that these additional commands are needed when running the script in a conda virtual environment:
$ conda install -c conda-forge tensorboard==2.4.1
$ conda install -y -c conda-forge protobuf==3.14.0
Reference:
https://stackoverflow.com/questions/61320572/modulenotfounderror-no-module-named-tensorboard
https://stackoverflow.com/questions/58686400/can-not-get-pytorch-working-with-tensorboard
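A package that shows up in pip list but cannot be imported is often installed for a different interpreter than the one running the script. The sketch below reports which interpreter is active and where, if anywhere, a module would be imported from. The helper name module_report is hypothetical, and only the standard library is used.

```python
import importlib.util
import sys


def module_report(name):
    """Describe where `name` would be imported from in this interpreter."""
    spec = importlib.util.find_spec(name)
    return {
        "interpreter": sys.executable,
        "module": name,
        # None means the module is not importable from this interpreter.
        "location": spec.origin if spec is not None else None,
    }


print(module_report("tensorboard"))
```

Running this with the same python command that launches the script, and comparing the interpreter path with the one that python -m pip --version prints, shows whether pip and the script are using the same environment.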
I am testing the model on different cases, and something strange happens: if I feed it a screenshot .png picture of a table (sample attached)
and run the command
python predict.py --model_weights='./tablenet_pretrained.ckpt' --image_path='./sample_table2.png'
it gives me a ValueError: operands could not be broadcast together with shapes (896,896,4) (3,) (896,896,4)
Here is the full error log:
File "/home/bluespinach/Documents/projects/tool_exp/OCR_tablenet/predict.py", line 148, in <module>
predict()
File "/home/bluespinach/miniconda3/envs/tablenet/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/home/bluespinach/miniconda3/envs/tablenet/lib/python3.9/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
File "/home/bluespinach/miniconda3/envs/tablenet/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/bluespinach/miniconda3/envs/tablenet/lib/python3.9/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/home/bluespinach/Documents/projects/tool_exp/OCR_tablenet/predict.py", line 144, in predict
print(pred.predict(image))
File "/home/bluespinach/Documents/projects/tool_exp/OCR_tablenet/predict.py", line 50, in predict
processed_image = self.transforms(image=np.array(image))["image"]
File "/home/bluespinach/miniconda3/envs/tablenet/lib/python3.9/site-packages/albumentations/core/composition.py", line 182, in __call__
data = t(force_apply=force_apply, **data)
File "/home/bluespinach/miniconda3/envs/tablenet/lib/python3.9/site-packages/albumentations/core/transforms_interface.py", line 89, in __call__
return self.apply_with_params(params, **kwargs)
File "/home/bluespinach/miniconda3/envs/tablenet/lib/python3.9/site-packages/albumentations/core/transforms_interface.py", line 102, in apply_with_params
res[key] = target_function(arg, **dict(params, **target_dependencies))
File "/home/bluespinach/miniconda3/envs/tablenet/lib/python3.9/site-packages/albumentations/augmentations/transforms.py", line 1496, in apply
return F.normalize(image, self.mean, self.std, self.max_pixel_value)
File "/home/bluespinach/miniconda3/envs/tablenet/lib/python3.9/site-packages/albumentations/augmentations/functional.py", line 141, in normalize
img -= mean
ValueError: operands could not be broadcast together with shapes (896,896,4) (3,) (896,896,4)
The strange thing is that if I input a .png file converted from a PDF page, it works well. Here is my testing sample, the PDF page.
Could you help work out why the smaller table screenshot doesn't work?
Thank you very much!
Best regards,
JKyang01
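The shapes in the error message give a hint: (896,896,4) means the array has four channels, which is what a PNG with an alpha channel (common for screenshots) produces, while a page rendered from a PDF is often saved as plain three-channel RGB. The sketch below reproduces the failure with NumPy alone. The mean values are the common ImageNet statistics, assumed for illustration, and subtract_mean is a hypothetical stand-in for the in-place step shown in the traceback, not the library's actual function.

```python
import numpy as np

# Assumed per-channel mean (common ImageNet values), for illustration only.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)


def subtract_mean(image):
    """In-place style mean subtraction, similar to `img -= mean`."""
    out = image.astype(np.float32)
    out -= MEAN
    return out


rgba = np.zeros((896, 896, 4), dtype=np.uint8)  # stand-in for an RGBA screenshot

try:
    subtract_mean(rgba)
    broadcast_failed = False
except ValueError as error:
    broadcast_failed = True
    print("4-channel input fails:", error)

# Dropping the alpha channel leaves three channels, which broadcast fine.
rgb = rgba[..., :3]
result = subtract_mean(rgb)
print("3-channel input works:", result.shape)
```

Checking np.array(image).shape for both files before the transform would confirm whether the screenshot carries the extra channel.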