Tibetan Column Detection

Overview

This Python project generates training data for detecting columns or text blocks in Tibetan texts by embedding Tibetan text into images.

It includes functions to create lorem ipsum-like Tibetan text, read random Tibetan text files from a directory, and fit and embed text within specified bounding boxes in images. Tibetan script is handled so that it is displayed and formatted properly within the images.
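To illustrate the idea of filler text and fitting it into a box, here is a minimal sketch. It is not the project's own code: the names tibetan_lorem and wrap_at_tsheg, the small syllable pool, and the choice to break lines only after a tsheg (the Tibetan syllable delimiter) or a space are all assumptions made for this example.

```python
import random

TSHEG = "\u0f0b"  # ་ syllable delimiter
SHAD = "\u0f0d"   # ། clause/sentence terminator

# Small illustrative syllable pool (assumption, not taken from the project).
SYLLABLES = ["བོད", "ཡིག", "དེབ", "ཆོས", "མི", "ལམ", "ནང", "རི"]


def tibetan_lorem(n_sentences, rng, min_syl=3, max_syl=8):
    """Build filler text: random syllables joined by tsheg, each sentence closed by a shad."""
    sentences = []
    for _ in range(n_sentences):
        count = rng.randint(min_syl, max_syl)
        syllables = [rng.choice(SYLLABLES) for _ in range(count)]
        sentences.append(TSHEG.join(syllables) + SHAD)
    return " ".join(sentences)


def split_breakable(text):
    """Split text into chunks that each end at a legal break point (tsheg or space)."""
    chunks, current = [], ""
    for ch in text:
        current += ch
        if ch in (TSHEG, " "):
            chunks.append(current)
            current = ""
    if current:
        chunks.append(current)
    return chunks


def wrap_at_tsheg(text, max_width, measure=len):
    """Greedy line wrap; `measure` returns the width of a string (code points by default)."""
    lines, line = [], ""
    for chunk in split_breakable(text):
        if line and measure(line + chunk) > max_width:
            lines.append(line)
            line = chunk
        else:
            line += chunk
    if line:
        lines.append(line)
    return lines


if __name__ == "__main__":
    sample = tibetan_lorem(3, random.Random(0))
    for row in wrap_at_tsheg(sample, max_width=20):
        print(row)
```

The default measure counts code points, which is only a rough stand-in for rendered width. When drawing with Pillow, a pixel-based measure such as lambda s: draw.textlength(s, font=font) could be passed instead, with max_width set to the bounding box width in pixels.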

Features

Automated Data Generation: Simplifies the process of generating training data for Tibetan NLP tasks.
Customizable Input: Allows users to specify various input parameters like images, labels, directories for backgrounds and corporate images, etc.
Image Processing: Utilizes the PIL library for image manipulation.
Bounding Box Preparation: Includes a utility function prepare_bbox_string for handling bounding boxes.
Multiprocessing Support: Leverages multiprocessing for efficient data processing.
Debugging Mode: Includes a debug mode for troubleshooting and ensuring correct data processing.
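For readers unfamiliar with YOLO labels, the sketch below shows the standard label line format: a class index followed by the box centre, width, and height, all normalised by the image size. The function name to_yolo_label and its signature are hypothetical. It is not the project's prepare_bbox_string, whose exact behaviour is not documented here.

```python
def to_yolo_label(class_id, box, image_size):
    """Convert a pixel box (x_min, y_min, x_max, y_max) into a YOLO label line.

    image_size is (width, height) in pixels. Hypothetical helper for illustration.
    """
    x_min, y_min, x_max, y_max = box
    img_w, img_h = image_size
    # YOLO stores the box centre and size as fractions of the image dimensions.
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


if __name__ == "__main__":
    print(to_yolo_label(0, (100, 200, 300, 600), (1000, 1000)))
```

One such line is written per text block into a .txt file that shares its base name with the image.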

Getting Started

Prerequisites

Python 3.x
Pillow (the maintained fork of PIL, the Python Imaging Library)
Ultralytics (provides the yolo command and YOLO utilities for bounding box handling)
Additional Python libraries: numpy, tqdm, yaml (PyYAML)

Installation

Clone the repository to your local machine:

git clone https://github.com/nih23/Tibetan-NLP.git
cd Tibetan-NLP

Generating training data

Training data is generated by running generate_training_data.py. Make sure to update the folders for background images first.

python generate_training_data.py

Train YOLOv8n

YOLOv8n is trained through the Ultralytics CLI:

yolo detect train data=data/yolo_tibetan/tibetan_text_boxes.yml epochs=1000 imgsz=1024

The trained model is then exported to TorchScript for inference (adjust the run folder, here train9, to match your training run):

yolo detect export model=runs/detect/train9/weights/best.pt
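The data= argument of the training command points at a dataset description file. As a rough sketch of what such a file usually contains in the Ultralytics format (a root path, train and val image folders, and a names mapping from class index to class name), the snippet below builds the text for one. The folder layout and the class name text_block are assumptions for illustration, and the actual tibetan_text_boxes.yml in the repository may differ.

```python
def dataset_yaml(root, class_names, train_dir="images/train", val_dir="images/val"):
    """Return the text of a minimal Ultralytics-style dataset description.

    All arguments are illustrative; check them against the real dataset layout.
    """
    lines = [
        f"path: {root}",        # dataset root directory
        f"train: {train_dir}",  # training images, relative to path
        f"val: {val_dir}",      # validation images, relative to path
        "names:",               # class index -> class name
    ]
    lines += [f"  {index}: {name}" for index, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "text_block" is a hypothetical class name.
    print(dataset_yaml("data/yolo_tibetan", ["text_block"]), end="")
```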

Inference

The trained model can now be used to detect and classify Tibetan text blocks as follows:

yolo predict task=detect model=runs/detect/train9/weights/best.torchscript imgsz=1024 source=data/my_inference_data/*.jpg

The results are then saved to the folder runs/detect/predict.
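If the predictions are also needed as coordinates, one option is to have Ultralytics write text labels by adding save_txt=True (and optionally save_conf=True) to the predict command. Each label line then holds a class index, a normalised centre, width, and height, and, with save_conf, a confidence value. The following sketch converts such a line back to a pixel box. The helper name parse_yolo_line is hypothetical and not part of the project.

```python
def parse_yolo_line(line, image_size):
    """Parse 'class xc yc w h [conf]' into (class_id, pixel_box, confidence).

    image_size is (width, height) in pixels. The confidence is None when the
    line has no sixth field.
    """
    parts = line.split()
    class_id = int(parts[0])
    x_center, y_center, width, height = (float(value) for value in parts[1:5])
    confidence = float(parts[5]) if len(parts) > 5 else None
    img_w, img_h = image_size
    # Undo the normalisation and convert centre/size to corner coordinates.
    box = (
        round((x_center - width / 2) * img_w),
        round((y_center - height / 2) * img_h),
        round((x_center + width / 2) * img_w),
        round((y_center + height / 2) * img_h),
    )
    return class_id, box, confidence


if __name__ == "__main__":
    print(parse_yolo_line("0 0.2 0.4 0.2 0.4 0.91", (1000, 1000)))
```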

Contributions

Contributions to this project are welcome! Please fork the repository and submit a pull request with your proposed changes.

License

This project is licensed under the MIT License - see the LICENSE file for details.
