bentoml / ocr-as-a-service

Turn any OCR model into an online inference API endpoint 🚀 🌖

Home Page: https://bentoml.com

Python 100.00%
ocr ocr-python ai-applications model-deployment model-serving

ocr-as-a-service's Introduction

OCR as a Service


Turn any OCR model into an online inference API endpoint 🚀
Powered by BentoML 🍱

📖 Introduction 📖

This project demonstrates how to effortlessly serve an OCR model using BentoML. It accepts PDFs as input and returns the text contained within. The service employs Microsoft's DiT, built on Meta's detectron2, for document layout segmentation, and EasyOCR for text recognition.

Architecture

πŸƒβ€β™‚οΈ Running the Service πŸƒβ€β™‚οΈ

Containers

The most convenient way to run this service is through containers, as the project relies on numerous external dependencies. We provide two pre-built containers optimized for CPU and GPU usage, respectively.

To run the service, you'll need a container engine such as Docker, Podman, etc. Quickly test the service by running the appropriate container:

# cpu
docker run -p 3000:3000 ghcr.io/bentoml/ocr-as-a-service:cpu

# gpu
docker run --gpus all -p 3000:3000 ghcr.io/bentoml/ocr-as-a-service:gpu
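Once the container is up, you can wait for the server to become ready before sending real requests. Below is a minimal standard-library sketch, assuming the default port mapping above and BentoML's usual `/healthz` health endpoint:

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(url: str, timeout: float = 60.0) -> bool:
    """Poll `url` until it returns HTTP 200 or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Server not accepting connections yet; retry until the deadline.
            pass
        time.sleep(1)
    return False
```

For example, `wait_until_ready("http://127.0.0.1:3000/healthz")` returns `True` once the service is accepting requests.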

BentoML CLI


Prerequisites 📋

✅ Python

This project requires Python 3.8 or higher.

✅ Poppler, to convert PDFs to images

On macOS, make sure to install poppler to use pdf2image:

brew install poppler

On Linux distros, install pdftoppm and pdftocairo using your package manager, e.g., with apt:

sudo apt install poppler-utils

✅ Python Development Package

To build the Detectron2 wheel, the python3-dev package is required. On Linux distros, run the following:

sudo apt install python3-dev

You may need to install a specific version of python3-dev, e.g., python3.10-dev for Python 3.10.
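The matching package name can be derived from the running interpreter itself. A small sketch (the `dev_package` variable name is ours, for illustration):

```python
import sys

# Derive the apt package name matching the running interpreter,
# e.g. "python3.10-dev" when this script runs under Python 3.10.
dev_package = f"python3.{sys.version_info.minor}-dev"
print(dev_package)
```

Pass the printed name to `sudo apt install`.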

On macOS, the Python development headers are installed by default.

Refer to the Detectron2 installation page for platform-specific instructions and further troubleshooting.


Once you have all prerequisites installed, clone the repository and install the dependencies:

git clone https://github.com/bentoml/OCR-as-a-Service.git && cd OCR-as-a-Service

pip install -r requirements/pypi.txt

# This depends on PyTorch, hence needs to be installed afterwards
pip install 'git+https://github.com/facebookresearch/detectron2.git'

To serve the model with BentoML:

bentoml serve

You can then open your browser at http://127.0.0.1:3000 and interact with the service through Swagger UI.

🌐 Interacting with the Service 🌐

BentoML's default model serving method is through an HTTP server. In this section, we demonstrate various ways to interact with the service:

cURL

curl -X 'POST' \
  'http://localhost:3000/image_to_text' \
  -H 'accept: application/pdf' \
  -H 'Content-Type: multipart/form-data' \
  -F file=@path-to-pdf

Replace path-to-pdf with the file path of the PDF you want to send to the service.
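The same request can be issued from Python. Here is a minimal sketch using the `requests` library, mirroring the `file` field name and endpoint of the cURL call above; the request is only prepared, not sent, so you can inspect it before pointing it at a live server:

```python
import requests


def prepare_ocr_request(pdf_bytes: bytes, filename: str = "document.pdf",
                        base_url: str = "http://localhost:3000"):
    """Build (without sending) a multipart/form-data POST to /image_to_text."""
    req = requests.Request(
        "POST",
        f"{base_url}/image_to_text",
        files={"file": (filename, pdf_bytes, "application/pdf")},
    )
    return req.prepare()


# To actually send it against a running service:
# with open("path-to-pdf", "rb") as f:
#     resp = requests.Session().send(prepare_ocr_request(f.read()))
#     print(resp.text)
```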

Via BentoClient 🐍

To send requests in Python, you can use bentoml.client.Client. Check out client.py for example code.

Swagger UI

You can use Swagger UI to quickly explore the available endpoints of any BentoML service.

🚀 Deploying to Production 🚀

Effortlessly transition your project into a production-ready application using BentoCloud, a platform for managing and deploying machine learning models.

Start by creating a BentoCloud account. Once you've signed up, log in to your BentoCloud account using the command:

bentoml cloud login --api-token <your-api-token> --endpoint <bento-cloud-endpoint>

Note: Replace <your-api-token> and <bento-cloud-endpoint> with your specific API token and the BentoCloud endpoint respectively.

Next, build your BentoML service using the build command:

bentoml build

Then, push your freshly-built Bento service to BentoCloud using the push command:

bentoml push <name:version>

Lastly, deploy this application to BentoCloud with a single bentoml deployment create command following the deployment instructions.

BentoML offers a number of options for deploying and hosting online ML services in production. Learn more at Deploying a Bento.

👥 Community 👥

BentoML has a thriving open-source community where thousands of ML/AI practitioners contribute to the project, help other users, and discuss the future of AI. 👉 Pop into our Slack community!

ocr-as-a-service's People

Contributors

aarnphm, haivilo, jianshen92


ocr-as-a-service's Issues

Can't test the docker containers as is

Hi.

I ran the command from the README:

docker run --gpus all -p 3000:3000 ghcr.io/bentoml/ocr-as-a-service:gpu

and after pulling it throws an error:

FileNotFoundError: BentoML config file specified in ENV VAR not found: 'BENTOML_CONFIG=./config/default.yaml'

I've tried to create a dummy config with this content:

version: 1
api_server:
  workers: 4

which results in another error:

2023-09-05T07:20:23+0000 [ERROR] [cli] Failed to download https://layoutlm.blob.core.windows.net/dit/dit-fts/publaynet_dit-b_cascade.pth
Traceback (most recent call last):
  File "/usr/local/bin/bentoml", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/bentoml_cli/utils.py", line 334, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/bentoml_cli/utils.py", line 305, in wrapper
    return_value = func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/bentoml_cli/utils.py", line 262, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/bentoml_cli/env_manager.py", line 122, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/bentoml_cli/serve.py", line 218, in serve
    serve_http_production(
  File "/usr/local/lib/python3.9/site-packages/simple_di/__init__.py", line 139, in _
    return func(*_inject_args(bind.args), **_inject_kwargs(bind.kwargs))
  File "/usr/local/lib/python3.9/site-packages/bentoml/serve.py", line 266, in serve_http_production
    svc = load(bento_identifier, working_dir=working_dir)
  File "/usr/local/lib/python3.9/site-packages/bentoml/_internal/service/loader.py", line 328, in load
    svc = load_bento_dir(bento_path, standalone_load=standalone_load)
  File "/usr/local/lib/python3.9/site-packages/bentoml/_internal/service/loader.py", line 236, in load_bento_dir
    return _load_bento(bento, standalone_load)
  File "/usr/local/lib/python3.9/site-packages/bentoml/_internal/service/loader.py", line 246, in _load_bento
    svc = import_service(
  File "/usr/local/lib/python3.9/site-packages/simple_di/__init__.py", line 139, in _
    return func(*_inject_args(bind.args), **_inject_kwargs(bind.kwargs))
  File "/usr/local/lib/python3.9/site-packages/bentoml/_internal/service/loader.py", line 137, in import_service
    module = importlib.import_module(module_name, package=working_dir)
  File "/usr/local/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/home/bentoml/bento/src/service.py", line 12, in <module>
    from warmup import convert_pdf_to_images
  File "/home/bentoml/bento/src/warmup.py", line 121, in <module>
    predictor = dit.get_predictor(cfg)
  File "/home/bentoml/bento/src/dit/__init__.py", line 38, in get_predictor
    return DefaultPredictor(get_cfg(cfg))
  File "/usr/local/lib/python3.9/site-packages/detectron2/engine/defaults.py", line 288, in __init__
    checkpointer.load(cfg.MODEL.WEIGHTS)
  File "/usr/local/lib/python3.9/site-packages/detectron2/checkpoint/detection_checkpoint.py", line 61, in load
    path = self.path_manager.get_local_path(path)
  File "/usr/local/lib/python3.9/site-packages/iopath/common/file_io.py", line 1197, in get_local_path
    bret = handler._get_local_path(path, force=force, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/iopath/common/file_io.py", line 797, in _get_local_path
    cached = download(path, dirname, filename=filename)
  File "/usr/local/lib/python3.9/site-packages/iopath/common/download.py", line 58, in download
    tmp, _ = request.urlretrieve(url, filename=tmp, reporthook=hook(t))
  File "/usr/local/lib/python3.9/urllib/request.py", line 239, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/usr/local/lib/python3.9/urllib/request.py", line 214, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/local/lib/python3.9/urllib/request.py", line 523, in open
    response = meth(req, response)
  File "/usr/local/lib/python3.9/urllib/request.py", line 632, in http_response
    response = self.parent.error(
  File "/usr/local/lib/python3.9/urllib/request.py", line 561, in error
    return self._call_chain(*args)
  File "/usr/local/lib/python3.9/urllib/request.py", line 494, in _call_chain
    result = func(*args)
  File "/usr/local/lib/python3.9/urllib/request.py", line 641, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 409: Public access is not permitted on this storage account.

The same issue occurs with the CPU docker image.

I suspect the config file should include some credentials I don't have access to. How can I fix this issue to test the service?

EasyOCR does not work with Python 3.11

Collecting opencv-python-headless<=4.5.4.60 (from easyocr->-r requirements/pypi.txt (line 4))
  Using cached opencv-python-headless-4.5.4.60.tar.gz (89.8 MB)
  Installing build dependencies ... error
  error: subprocess-exited-with-error
  
  × pip subprocess to install build dependencies did not run successfully.
  │ exit code: 1
  ╰─> [19 lines of output]
      Ignoring numpy: markers 'python_version == "3.6" and platform_machine != "aarch64" and platform_machine != "arm64"' don't match your environment
      Ignoring numpy: markers 'python_version == "3.7" and platform_machine != "aarch64" and platform_machine != "arm64"' don't match your environment
      Ignoring numpy: markers 'python_version == "3.8" and platform_machine != "aarch64" and platform_machine != "arm64"' don't match your environment
      Ignoring numpy: markers 'python_version <= "3.9" and sys_platform == "linux" and platform_machine == "aarch64"' don't match your environment
      Ignoring numpy: markers 'python_version <= "3.9" and sys_platform == "darwin" and platform_machine == "arm64"' don't match your environment
      Ignoring numpy: markers 'python_version == "3.9" and platform_machine != "aarch64" and platform_machine != "arm64"' don't match your environment
      Collecting setuptools
        Using cached setuptools-67.7.2-py3-none-any.whl (1.1 MB)
      Collecting wheel
        Using cached wheel-0.40.0-py3-none-any.whl (64 kB)
      Collecting scikit-build
        Using cached scikit_build-0.17.3-py3-none-any.whl (82 kB)
      Collecting cmake
        Using cached cmake-3.26.3-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (24.0 MB)
      Collecting pip
        Using cached pip-23.1.2-py3-none-any.whl (2.1 MB)
      ERROR: Ignored the following versions that require a different python version: 1.21.2 Requires-Python >=3.7,<3.11; 1.21.3 Requires-Python >=3.7,<3.11; 1.21.4 Requires-Python >=3.7,<3.11; 1.21.5 Requires-Python >=3.7,<3.11; 1.21.6 Requires-Python >=3.7,<3.11
      ERROR: Could not find a version that satisfies the requirement numpy==1.21.2 (from versions: 1.3.0, 1.4.1, 1.5.0, 1.5.1, 1.6.0, 1.6.1, 1.6.2, 1.7.0, 1.7.1, 1.7.2, 1.8.0, 1.8.1, 1.8.2, 1.9.0, 1.9.1, 1.9.2, 1.9.3, 1.10.0.post2, 1.10.1, 1.10.2, 1.10.4, 1.11.0, 1.11.1, 1.11.2, 1.11.3, 1.12.0, 1.12.1, 1.13.0, 1.13.1, 1.13.3, 1.14.0, 1.14.1, 1.14.2, 1.14.3, 1.14.4, 1.14.5, 1.14.6, 1.15.0, 1.15.1, 1.15.2, 1.15.3, 1.15.4, 1.16.0, 1.16.1, 1.16.2, 1.16.3, 1.16.4, 1.16.5, 1.16.6, 1.17.0, 1.17.1, 1.17.2, 1.17.3, 1.17.4, 1.17.5, 1.18.0, 1.18.1, 1.18.2, 1.18.3, 1.18.4, 1.18.5, 1.19.0, 1.19.1, 1.19.2, 1.19.3, 1.19.4, 1.19.5, 1.20.0, 1.20.1, 1.20.2, 1.20.3, 1.21.0, 1.21.1, 1.22.0, 1.22.1, 1.22.2, 1.22.3, 1.22.4, 1.23.0rc1, 1.23.0rc2, 1.23.0rc3, 1.23.0, 1.23.1, 1.23.2, 1.23.3, 1.23.4, 1.23.5, 1.24.0rc1, 1.24.0rc2, 1.24.0, 1.24.1, 1.24.2, 1.24.3)
      ERROR: No matching distribution found for numpy==1.21.2
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

Can't pull image, permission denied

Unable to pull the image from ghcr.io. I initially thought it was due to missing login information, but even after logging in with a Personal Access Token, it still shows "Cannot pull image, permission denied". I don't know if this is an exception or if other people encounter the same problem.
