green-wood / bttr

Official implementation for ICDAR 2021 best poster paper "Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer"

Home Page: https://arxiv.org/abs/2105.02412

License: MIT License

Python 94.35% Jupyter Notebook 5.65%

deep-learning handwritten-text-recognition icdar2021 latex math-recognition pytorch pytorch-lightning transformer

bttr's Introduction

Hi there 👋

🔭 Student at Peking University, Beijing, China
🧐 Multimodal Deep Learning with Transformer
🌱 Learning OCaml and functional programming style


bttr's Issues

how to use TensorBoard?

Hello, I don't know how to add a scalar to TensorBoard. I want to work on this topic and hope to improve ExpRate, but I don't know much about using TensorBoard with Lightning.

How can I get the pretrained model?

Hi,
I want to test your BTTR model, but that first requires training, which takes a lot of time.
Could you share a link to a pretrained model?

Best regards.

test.py error occurs

When I run test.py, the following error occurs. Could I get some help?

In test.py I set:
test_year = "2016"
ckp_path = "pretrained model"

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
Load data from: /home/motive/PycharmProjects/BTTR/bttr/datamodule/../../data.zip
Extract data from: 2016, with data size: 1147
total  1147 batch data loaded
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Testing: 100%|██████████| 1147/1147 [07:34<00:00,  2.01s/it]ExpRate: 0.32258063554763794
length of total file: 1147
Testing: 100%|██████████| 1147/1147 [07:34<00:00,  2.52it/s]
--------------------------------------------------------------------------------
DATALOADER:0 TEST RESULTS
{}
--------------------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/motive/PycharmProjects/BTTR/test.py", line 17, in <module>
    trainer.test(model, datamodule=dm)
  File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 579, in test
    results = self._run(model)
  File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 759, in _run
    self.post_dispatch()
  File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 789, in post_dispatch
    self.accelerator.teardown()
  File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/pytorch_lightning/accelerators/gpu.py", line 51, in teardown
    self.lightning_module.cpu()
  File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/pytorch_lightning/utilities/device_dtype_mixin.py", line 141, in cpu
    return super().cpu()
  File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 471, in cpu
    return self._apply(lambda t: t.cpu())
  File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 359, in _apply
    module._apply(fn)
  File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/torchmetrics/metric.py", line 317, in _apply
    setattr(this, key, [fn(cur_v) for cur_v in current_val])
  File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/torchmetrics/metric.py", line 317, in <listcomp>
    setattr(this, key, [fn(cur_v) for cur_v in current_val])
  File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 471, in <lambda>
    return self._apply(lambda t: t.cpu())
AttributeError: 'tuple' object has no attribute 'cpu'

ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data' (/usr/local/lib/python3.8/dist-packages/torchmetrics/utilities/data.py)

Traceback (most recent call last):
  File "train.py", line 1, in <module>
    from pytorch_lightning.utilities.cli import LightningCLI
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/__init__.py", line 20, in <module>
    from pytorch_lightning import metrics  # noqa: E402
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/metrics/__init__.py", line 15, in <module>
    from pytorch_lightning.metrics.classification import (  # noqa: F401
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/metrics/classification/__init__.py", line 14, in <module>
    from pytorch_lightning.metrics.classification.accuracy import Accuracy  # noqa: F401
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/metrics/classification/accuracy.py", line 18, in <module>
    from pytorch_lightning.metrics.utils import deprecated_metrics
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/metrics/utils.py", line 22, in <module>
    from torchmetrics.utilities.data import get_num_classes as _get_num_classes
ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data' (/usr/local/lib/python3.8/dist-packages/torchmetrics/utilities/data.py)
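
An ImportError like this one usually points to a version mismatch: the installed pytorch_lightning expects a helper that the installed torchmetrics no longer ships. Before pinning anything, it can help to print what is actually installed. The following is a minimal sketch using only the standard library (Python 3.8+); the helper name `installed_versions` is hypothetical and not part of this repository.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Distribution names as published on PyPI.
report = installed_versions(["torch", "pytorch-lightning", "torchmetrics"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

Comparing this output with the versions pinned in the repository's environment or requirements file should show which package needs to be downgraded or upgraded.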

How long does BTTR take to train?

Hi, thank you for the great repository!

How long did training take for the experiments in the paper?
I mean training on CROHME 2014/2016/2019 with four NVIDIA 1080Ti GPUs.

Thanks,

predicting on gpu is slower

Hi,

On CPU, this model is somewhat slower than the existing state-of-the-art model.
So I tried running predictions on a GPU and, surprisingly, it is even slower on the GPU than on the CPU.

Here is the code I am using:

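import time

import torch
from PIL import Image
from torchvision.transforms import ToTensor

from bttr.lit_bttr import LitBTTR  # import path assumed from the repository layout

device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')

model = LitBTTR.load_from_checkpoint('pretrained-2014.ckpt', map_location=device)

img = Image.open(img_path)
img = ToTensor()(img)
img = img.to(device)  # Tensor.to() is not in-place; keep the returned tensor

t1 = time.time()
hyp = model.beam_search(img)
t2 = time.time()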

Could you help me reduce the prediction time?

FYI: I am using the GPU on an AWS g4dn.xlarge instance.
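
When comparing CPU and GPU latency, a few things are worth ruling out first: `Tensor.to(device)` returns a new tensor rather than moving the original, the model itself has to be on the same device as the input, the first GPU calls include one-time start-up cost, and CUDA kernels run asynchronously, so the clock should only be read after `torch.cuda.synchronize()`. The sketch below shows one way to measure this. It uses a small stand-in network rather than BTTR, and `mean_latency` is a hypothetical helper, not part of the repository.

```python
import time

import torch
from torch import nn


def mean_latency(model, batch, device, warmup=2, runs=5):
    """Average seconds per forward pass on `device`, excluding warm-up calls."""
    model = model.to(device).eval()  # nn.Module.to() moves parameters in place
    batch = batch.to(device)         # Tensor.to() returns a new tensor
    with torch.no_grad():
        for _ in range(warmup):      # absorb one-time initialisation cost
            model(batch)
        if device.type == "cuda":
            torch.cuda.synchronize() # wait for queued kernels before timing
        start = time.perf_counter()
        for _ in range(runs):
            model(batch)
        if device.type == "cuda":
            torch.cuda.synchronize() # make sure the timed work has finished
        elapsed = time.perf_counter() - start
    return elapsed / runs


device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")

# Stand-in network and input; replace with the real model and image tensor.
toy_model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 1, kernel_size=3, padding=1),
)
toy_batch = torch.rand(1, 1, 64, 256)

seconds = mean_latency(toy_model, toy_batch, device)
print(f"{device.type}: {seconds * 1000:.2f} ms per forward pass")
```

For a single small image, a GPU may still show little or no advantage, because an autoregressive beam search launches many small kernels one step at a time; batching several images per call is the usual way to make better use of the device.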

can you provide transfer learning code?

Hi~ @Green-Wood

I want to apply transfer learning using the pretrained model, but the training code is wrapped in LightningCLI(), which makes it difficult to customize.

Thanks & best regards.

val_exprate=0 and save checkpoint

Hello! Thanks for your time!
Whether I modify some of the decoder code or use it unchanged, val_exprate is always 0.000, and I don't know why.
Another problem: I noticed that this code does not seem to save checkpoints. Could you give me some help? Thanks again!
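
When val_exprate stays at exactly 0, it can help to recompute the metric by hand on a few validation samples. The sketch below assumes that val_exprate follows the usual CROHME definition of ExpRate, namely the fraction of expressions whose predicted token sequence matches the ground truth exactly; the function name `exp_rate` and the sample tokens are illustrative only.

```python
from typing import Sequence


def exp_rate(predictions: Sequence[Sequence[str]],
             references: Sequence[Sequence[str]]) -> float:
    """Fraction of predictions that match their reference token for token."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(list(p) == list(r) for p, r in zip(predictions, references))
    return correct / len(references)


# Illustrative token sequences: the second prediction differs in one token.
preds = [["x", "^", "2"], ["a", "+", "b"], ["\\frac", "{", "1", "}", "{", "2", "}"]]
refs = [["x", "^", "2"], ["a", "-", "b"], ["\\frac", "{", "1", "}", "{", "2", "}"]]
print(exp_rate(preds, refs))
```

Because a single wrong token makes the whole expression count as incorrect, printing a handful of prediction and reference pairs side by side often shows quickly whether the decoder output is systematically off, for example in its start or end tokens.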

can you provide predict.py code?

Hi ~ @Green-Wood.

I am grateful for your help.
I would like a predict.py script that prints the LaTeX for an input image.
If this script were provided, it would be very useful to others as well.

Best regards.

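Beam search takes 4 times as long as training; is this normal?

How well does rotary position embedding work?

While reading the code, I noticed there is an ImageRotaryEmbed.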

Error after adding a new token to the dictionary

Hi,
I get the following error after adding a new token to dictionary.txt:

Error(s) in loading state_dict for LitBTTR:
size mismatch for bttr.decoder.word_embed.0.weight: copying a param with shape torch.Size([113, 256]) from checkpoint, the shape in current model is torch.Size([115, 256]).
size mismatch for bttr.decoder.proj.weight: copying a param with shape torch.Size([113, 256]) from checkpoint, the shape in current model is torch.Size([115, 256]).
size mismatch for bttr.decoder.proj.bias: copying a param with shape torch.Size([113]) from checkpoint, the shape in current model is torch.Size([115]).

Could you help me fix this error?
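
The mismatch arises because the checkpoint was trained with 113 vocabulary entries while the edited dictionary yields 115, so the embedding and output projection no longer have the same shape. One possible workaround is to copy every tensor whose shape still matches and, for the enlarged ones, copy only the trained rows and leave the new rows at their fresh initialisation. The sketch below demonstrates this on a toy module; `load_with_resized_vocab` and `ToyDecoder` are hypothetical names, and the approach assumes the new tokens were appended at the end of the dictionary so that existing token indices are unchanged.

```python
import torch
from torch import nn


def load_with_resized_vocab(model: nn.Module, old_state: dict) -> list:
    """Load `old_state` into `model`, tolerating tensors whose first dim grew.

    Returns the names of tensors that were only partially initialised.
    """
    merged = {k: v.clone() for k, v in model.state_dict().items()}
    resized = []
    for name, old in old_state.items():
        if name not in merged:
            continue  # tensor does not exist in the new model
        new = merged[name]
        if new.shape == old.shape:
            merged[name] = old.clone()
        elif (new.dim() == old.dim()
              and new.shape[1:] == old.shape[1:]
              and new.shape[0] > old.shape[0]):
            # Vocabulary grew: keep trained rows, new rows stay freshly initialised.
            new[: old.shape[0]] = old
            resized.append(name)
    model.load_state_dict(merged)
    return resized


class ToyDecoder(nn.Module):
    """Stand-in with a vocabulary-sized embedding and output projection."""

    def __init__(self, vocab_size: int, d_model: int = 256):
        super().__init__()
        self.word_embed = nn.Embedding(vocab_size, d_model)
        self.proj = nn.Linear(d_model, vocab_size)


old_model = ToyDecoder(vocab_size=113)
new_model = ToyDecoder(vocab_size=115)
resized = load_with_resized_vocab(new_model, old_model.state_dict())
print(resized)
```

With a Lightning checkpoint, the weights are stored under the "state_dict" key of the object returned by `torch.load`, so that dictionary would take the place of `old_model.state_dict()` here. The new rows are untrained, so some fine-tuning on data containing the new tokens is still needed.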