
X-AnyLabeling

  • Grounding DINO: SOTA zero-shot open-set object detection model
  • Recognize Anything Model: strong image tagging model
  • Segment Anything Model: powerful object segmentation model
  • PULC PersonAttribute Model: advanced multi-label classification model


🥳 What's New ⏏️

  • Nov. 2023:
    • 🤗 Release the latest version 2.1.0 🤗
    • Support the InternImage model (CVPR'23).
    • Release version 2.0.0.
    • Added support for Grounding-SAM, combining GroundingDINO with HQ-SAM to achieve SOTA zero-shot high-quality predictions!
    • Enhanced support for HQ-SAM model to achieve high-quality mask predictions.
    • Support the PersonAttribute and VehicleAttribute models for the multi-label classification task.
    • Introducing a new multi-label attribute annotation functionality.
    • Release version 1.1.0.
    • Support pose estimation: YOLOv8-Pose.
    • Support object-level tagging with yolov5_ram.
    • Add a new feature enabling batch labeling for arbitrary unknown categories based on Grounding-DINO.
  • Oct. 2023:
    • Release version 1.0.0.
    • Add a new feature for rotation box.
    • Support YOLOv5-OBB with DroneVehicle and DOTA-v1.0/v1.5/v2.0 models.
    • SOTA Zero-Shot Object Detection - GroundingDINO is released.
    • SOTA Image Tagging Model - Recognize Anything is released.
    • Support YOLOv5-SAM and YOLOv8-EfficientViT_SAM union task.
    • Support YOLOv5 and YOLOv8 segmentation task.
    • Release Gold-YOLO and DAMO-YOLO models.
    • Release MOT algorithms: OC_Sort (CVPR'23).
    • Add a new feature for small object detection using SAHI.
  • Sep. 2023:
    • Release version 0.2.4.
    • Release EfficientViT-SAM (ICCV'23), SAM-Med2D, MedSAM, and YOLOv5-SAM.
    • Support ByteTrack (ECCV'22) for MOT task.
    • Support PP-OCRv4 model.
    • Add video annotation feature.
    • Add yolo/coco/voc/mot/dota export functionality.
    • Add the ability to process all images at once.
  • Aug. 2023:
    • Release version 0.2.0.
    • Release LVMSAM and its variants BUID, ISIC, and Kvasir.
    • Support lane detection algorithm: CLRNet (CVPR'22).
    • Support 2D human whole-body pose estimation: DWPose (ICCV'23 Workshop).

👋 Brief Introduction ⏏️

X-AnyLabeling is an exceptional annotation tool that draws inspiration from renowned projects like LabelImg, roLabelImg, Labelme, and Anylabeling. It transcends the realm of ordinary annotation tools, representing a significant stride into the future of automated data annotation. This cutting-edge tool not only simplifies the annotation process but also seamlessly integrates state-of-the-art AI models to deliver superior results. With a strong focus on practical applications, X-AnyLabeling is purpose-built to provide developers with an industrial-grade, feature-rich solution for automating annotation and data processing across a wide range of complex tasks.

🔥 Highlight ⏏️

🗝️Key Features

  • Support for importing images and videos.
  • CPU and GPU inference support with on-demand selection.
  • Compatibility with multiple SOTA deep-learning algorithms.
  • Single-frame prediction and one-click processing for all images.
  • Export options for formats like COCO-JSON, VOC-XML, YOLOv5-TXT, DOTA-TXT and MOT-CSV.
  • Integration with popular frameworks such as PaddlePaddle, OpenMMLab, timm, and others.
  • Providing comprehensive help documentation along with active developer community support.
  • Accommodation of various visual tasks such as detection, segmentation, face recognition, and so on.
  • Modular design that empowers users to compile the system according to their specific needs and supports customization and further development.
  • Image annotation capabilities for polygons, rectangles, rotation, circles, lines, and points, as well as text detection, recognition, and KIE annotations.
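
As a reference for the YOLOv5-TXT export format listed above, here is a minimal sketch of the conversion from a pixel-space rectangle to a normalized YOLO line (the function name is illustrative; X-AnyLabeling performs this conversion internally):

```python
def to_yolo_line(class_id, box, img_w, img_h):
    """Convert a pixel-space box (x_min, y_min, x_max, y_max)
    into a YOLO-TXT line: class x_center y_center width height,
    with all coordinates normalized to [0, 1]."""
    x_min, y_min, x_max, y_max = box
    xc = (x_min + x_max) / 2 / img_w
    yc = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"

print(to_yolo_line(0, (100, 200, 300, 400), 640, 480))
# → 0 0.312500 0.625000 0.312500 0.416667
```

Each exported .txt file contains one such line per object in the image.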

⛏️Model Zoo

  • Object Detection, SOD with SAHI, Facial Landmark Detection, 2D Pose Estimation
  • 2D Lane Detection, OCR, MOT, Instance Segmentation
  • Image Tagging, Grounding DINO, Recognition, Rotation
  • SAM, BC-SAM, Skin-SAM, Polyp-SAM

For more details, please refer to models_list.

📖 Tutorials ⏏️

🔜Quick Start

Download and run the GUI version directly from Release or Baidu Disk.

Note:

  • For macOS:

    • After installation, go to the Applications folder.
    • Right-click on the application and choose Open.
    • From the second time onwards, you can open the application normally using Launchpad.
  • Due to the lack of necessary hardware, executable versions are currently provided only for Windows and Linux. If you need an executable for another operating system, e.g., macOS, please follow the steps below to compile it yourself.

  • To obtain more stable performance and feature support, it is strongly recommended to build from source code.

👨🏼‍💻Build from source

  • Install the required libraries:
pip install -r requirements.txt

If you need GPU inference, install the corresponding requirements-gpu.txt file and download the appropriate version of onnxruntime-gpu based on your local CUDA and cuDNN versions. For more details, refer to the FAQ.

  • Generate resources [Optional]:
pyrcc5 -o anylabeling/resources/resources.py anylabeling/resources/resources.qrc
  • Run the application:
python anylabeling/app.py

📦Build executable

It's essential to note that these steps are not obligatory for regular users; they are intended for scenarios where customization or re-distribution of executable files is necessary.

# Windows-CPU
bash scripts/build_executable.sh win-cpu

# Windows-GPU
bash scripts/build_executable.sh win-gpu

# Linux-CPU
bash scripts/build_executable.sh linux-cpu

# Linux-GPU
bash scripts/build_executable.sh linux-gpu
Note:
  1. Before compiling, modify the __preferred_device__ parameter in the "anylabeling/app_info.py" file to match the target GPU or CPU build.
  2. To compile the GPU version, first install the corresponding environment with "pip install -r requirements-gpu.txt". Then manually edit the "datas" list in the "anylabeling-*-gpu.spec" file to include the relevant dynamic libraries (*.dll or *.so) of your local onnxruntime-gpu installation. Also make sure the onnxruntime-gpu package you download is compatible with your CUDA version; refer to the official documentation for the compatibility table.
  3. For macOS versions, you can make modifications by referring to the "anylabeling-win-*.spec" script.
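
As a sketch of step 2, the "datas" list in a PyInstaller spec file takes (source, destination) tuples; the library file names and paths below are illustrative and depend on your local onnxruntime-gpu installation (typically under site-packages/onnxruntime/capi):

```python
# Illustrative fragment of an anylabeling-*-gpu.spec 'datas' list.
# Replace the source paths with the actual locations of your local
# onnxruntime-gpu dynamic libraries (*.dll on Windows, *.so on Linux).
datas = [
    # (path on disk, destination folder inside the bundle)
    ("path/to/onnxruntime/capi/onnxruntime_providers_cuda.dll", "onnxruntime/capi"),
    ("path/to/onnxruntime/capi/onnxruntime_providers_shared.dll", "onnxruntime/capi"),
]
```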

📋 Usage ⏏️

📌Basic usage

  1. Build and launch using the instructions above.
  2. Click Change Output Dir in the Menu/File to specify an output directory; otherwise, labels are saved by default in the current image path.
  3. Click Open/Open Dir/Open Video to select a specific file, folder, or video.
  4. Click the Start drawing xxx button on the left-hand toolbar or the Auto Labeling control to initiate.
  5. Click and drag the left mouse button to select the region for the rectangle annotation. Alternatively, press the "Run (i)" key for one-click processing.

Note: The annotation will be saved to the folder you specify, and you can refer to the hotkeys below to speed up your workflow.
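
The saved annotations can be post-processed with a few lines of Python. A minimal sketch, assuming the per-image JSON follows the LabelMe-style layout with a "shapes" list (the field names here illustrate that convention and are not a guaranteed schema):

```python
import collections
import json
import pathlib
import tempfile

def count_labels(label_dir):
    """Tally label occurrences across LabelMe-style JSON files in a folder."""
    counts = collections.Counter()
    for path in pathlib.Path(label_dir).glob("*.json"):
        data = json.loads(path.read_text(encoding="utf-8"))
        counts.update(shape["label"] for shape in data.get("shapes", []))
    return counts

# Demo with a throwaway file (the shape layout is illustrative):
with tempfile.TemporaryDirectory() as d:
    record = {"shapes": [{"label": "car"}, {"label": "car"}, {"label": "person"}]}
    (pathlib.Path(d) / "img001.json").write_text(json.dumps(record))
    print(count_labels(d))  # → Counter({'car': 2, 'person': 1})
```

Point count_labels at your output directory to sanity-check class balance after a labeling session.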

🚀Advanced usage

  • Select the Auto Labeling button on the left side or press the shortcut key "Ctrl + A" to activate auto labeling.
  • Select one of the Segment Anything-like models from the Model dropdown menu, where Quant indicates the quantization level of the model.
  • Use the auto-segmentation marking tools to mark the object:
    • +Point: Add a point that belongs to the object.
    • -Point: Remove a point that you want to exclude from the object.
    • +Rect: Draw a rectangle that contains the object. Segment Anything will automatically segment the object.
    • Clear: Clear all auto segmentation markings.
    • Finish Object (f): Finish the current marking. After finishing the object, you can enter the label name and save the object.
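
The marking workflow above can be sketched as a small state machine: prompts accumulate until Finish Object hands them off to the segmentation model. A toy model of that flow (not X-AnyLabeling's actual API; class and method names are illustrative):

```python
class PromptSession:
    """Toy model of the SAM-style prompt workflow: collect positive and
    negative points plus an optional box, then 'finish' returns them."""

    def __init__(self):
        self.clear()

    def add_point(self, x, y, positive=True):
        # +Point records label 1, -Point records label 0,
        # following the Segment Anything point-label convention.
        self.points.append(((x, y), 1 if positive else 0))

    def set_rect(self, x_min, y_min, x_max, y_max):
        self.box = (x_min, y_min, x_max, y_max)

    def clear(self):
        # Corresponds to the Clear tool: drop all markings.
        self.points, self.box = [], None

    def finish(self, label):
        # Corresponds to Finish Object (f): name the object and reset.
        prompts = {"label": label, "points": self.points, "box": self.box}
        self.clear()
        return prompts

session = PromptSession()
session.add_point(120, 80)                 # +Point inside the object
session.add_point(10, 10, positive=False)  # -Point on the background
obj = session.finish("dog")
print(obj["points"])  # → [((120, 80), 1), ((10, 10), 0)]
```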

📜Docs

🧷Hotkeys

| Shortcut | Function |
| --- | --- |
| d | Open next file |
| a | Open previous file |
| p | Create polygon |
| o | Create rotation |
| r | Create rectangle |
| i | Run model |
| + | +Point in SAM mode |
| - | -Point in SAM mode |
| g | Group selected shapes |
| u | Ungroup selected shapes |
| Ctrl + q | Quit |
| Ctrl + i | Open image file |
| Ctrl + o | Open video file |
| Ctrl + u | Load all images from a directory |
| Ctrl + e | Edit label |
| Ctrl + j | Edit polygon |
| Ctrl + d | Duplicate polygon |
| Ctrl + p | Toggle keep-previous mode |
| Ctrl + y | Toggle auto-use last label |
| Ctrl + m | Run all images at once |
| Ctrl + a | Enable auto annotation |
| Ctrl + s | Save current information |
| Ctrl + Shift + s | Change output directory |
| Ctrl + - | Zoom out |
| Ctrl + 0 | Zoom to original size |
| Ctrl + + / Ctrl + = | Zoom in |
| Ctrl + f | Fit window |
| Ctrl + Shift + f | Fit width |
| Ctrl + z | Undo the last operation |
| Ctrl + Delete | Delete file |
| Delete | Delete polygon |
| Esc | Cancel the selected object |
| ↑ → ↓ ← | Move the selected object |
| Backspace | Remove selected point |
| z / x / c / v | Rotate the selected rect box |

📧 Contact ⏏️

🤗 Enjoying this project? Please give it a star! 🤗

If you find this project helpful or interesting, consider starring it to show your support. If you have any questions or encounter any issues while using this project, feel free to reach out to the maintainers for assistance.

✅ License ⏏️

This project is released under the GPL-3.0 license.

🏷️ Citing ⏏️

BibTeX

If you use this software in your research, please cite it as below:

@misc{X-AnyLabeling,
  year = {2023},
  author = {Wei Wang},
  publisher = {Github},
  organization = {CVHub},
  journal = {Github repository},
  title = {Advanced Auto Labeling Solution with Added Features},
  howpublished = {\url{https://github.com/CVHub520/X-AnyLabeling}}
}
