Coder Social home page Coder Social logo

kadirnar / segment-anything-video Goto Github PK

View Code? Open in Web Editor NEW
942.0 12.0 67.0 887 KB

MetaSeg: Packaged version of the Segment Anything repository

License: Apache License 2.0

Python 100.00%
object-detection segmentation segment-anything object-segmentation yolov5 yolov6 yolov7 yolov8

segment-anything-video's Introduction

MetaSeg: Packaged version of the Segment Anything repository

teaser
downloads HuggingFace Spaces

Package version Download Count Supported Python versions Project Status pre-commit.ci

This repo is a packaged version of the segment-anything model.

Installation

pip install metaseg

Usage

from metaseg import SegAutoMaskPredictor, SegManualMaskPredictor

# If gpu memory is not enough, reduce the points_per_side and points_per_batch.

# For image
results = SegAutoMaskPredictor().image_predict(
    source="image.jpg",
    model_type="vit_l", # vit_l, vit_h, vit_b
    points_per_side=16,
    points_per_batch=64,
    min_area=0,
    output_path="output.jpg",
    show=True,
    save=False,
)

# For video
results = SegAutoMaskPredictor().video_predict(
    source="video.mp4",
    model_type="vit_l", # vit_l, vit_h, vit_b
    points_per_side=16,
    points_per_batch=64,
    min_area=1000,
    output_path="output.mp4",
)

# For manuel box and point selection

# For image
results = SegManualMaskPredictor().image_predict(
    source="image.jpg",
    model_type="vit_l", # vit_l, vit_h, vit_b
    input_point=[[100, 100], [200, 200]],
    input_label=[0, 1],
    input_box=[100, 100, 200, 200], # or [[100, 100, 200, 200], [100, 100, 200, 200]]
    multimask_output=False,
    random_color=False,
    show=True,
    save=False,
)

# For video

results = SegManualMaskPredictor().video_predict(
    source="video.mp4",
    model_type="vit_l", # vit_l, vit_h, vit_b
    input_point=[0, 0, 100, 100],
    input_label=[0, 1],
    input_box=None,
    multimask_output=False,
    random_color=False,
    output_path="output.mp4",
)

SAHI + Segment Anything

pip install sahi metaseg
from metaseg.sahi_predict import SahiAutoSegmentation, sahi_sliced_predict

image_path = "image.jpg"
boxes = sahi_sliced_predict(
    image_path=image_path,
    detection_model_type="yolov5",  # yolov8, detectron2, mmdetection, torchvision
    detection_model_path="yolov5l6.pt",
    conf_th=0.25,
    image_size=1280,
    slice_height=256,
    slice_width=256,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
)

SahiAutoSegmentation().image_predict(
    source=image_path,
    model_type="vit_b",
    input_box=boxes,
    multimask_output=False,
    random_color=False,
    show=True,
    save=False,
)

teaser

FalAI(Cloud GPU) + Segment Anything

pip install metaseg fal_serverless
fal-serverless auth login
# For Auto Mask
from metaseg import falai_automask_image

image = falai_automask_image(
    image_path="image.jpg",
    model_type="vit_b",
    points_per_side=16,
    points_per_batch=32,
    min_area=0,
)
image.show() # Show image
image.save("output.jpg") # Save image

# For Manual Mask
from metaseg import falai_manuelmask_image

image = falai_manualmask_image(
    image_path="image.jpg",
    model_type="vit_b",
    input_point=[[100, 100], [200, 200]],
    input_label=[0, 1],
    input_box=[100, 100, 200, 200], # or [[100, 100, 200, 200], [100, 100, 200, 200]],
    multimask_output=False,
    random_color=False,
)
image.show() # Show image
image.save("output.jpg") # Save image

Extra Features

  • Support for Yolov5/8, Detectron2, Mmdetection, Torchvision models
  • Support for video and web application(Huggingface Spaces)
  • Support for manual single multi box and point selection
  • Support for pip installation
  • Support for SAHI library
  • Support for FalAI

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.