kadirnar / segment-anything-video

MetaSeg: Packaged version of the Segment Anything repository

License: Apache License 2.0

Python 100.00%
object-detection segmentation segment-anything object-segmentation yolov5 yolov6 yolov7 yolov8

segment-anything-video's Introduction


This repo is a packaged version of the segment-anything model.

Installation

pip install metaseg

Usage

from metaseg import SegAutoMaskPredictor, SegManualMaskPredictor

# If GPU memory is not enough, reduce points_per_side and points_per_batch.

# For image
results = SegAutoMaskPredictor().image_predict(
    source="image.jpg",
    model_type="vit_l", # vit_l, vit_h, vit_b
    points_per_side=16,
    points_per_batch=64,
    min_area=0,
    output_path="output.jpg",
    show=True,
    save=False,
)

# For video
results = SegAutoMaskPredictor().video_predict(
    source="video.mp4",
    model_type="vit_l", # vit_l, vit_h, vit_b
    points_per_side=16,
    points_per_batch=64,
    min_area=1000,
    output_path="output.mp4",
)

# For manual box and point selection

# For image
results = SegManualMaskPredictor().image_predict(
    source="image.jpg",
    model_type="vit_l", # vit_l, vit_h, vit_b
    input_point=[[100, 100], [200, 200]],
    input_label=[0, 1],
    input_box=[100, 100, 200, 200], # or [[100, 100, 200, 200], [100, 100, 200, 200]]
    multimask_output=False,
    random_color=False,
    show=True,
    save=False,
)

# For video

results = SegManualMaskPredictor().video_predict(
    source="video.mp4",
    model_type="vit_l", # vit_l, vit_h, vit_b
    input_point=[0, 0, 100, 100],
    input_label=[0, 1],
    input_box=None,
    multimask_output=False,
    random_color=False,
    output_path="output.mp4",
)
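
The comment at the top of the usage example suggests lowering points_per_side and points_per_batch when GPU memory runs out. A rough sketch of why that helps, assuming (as in the original Segment Anything automatic mask generator) that prompts are sampled on a points_per_side x points_per_side grid and processed points_per_batch at a time. The helper name prompt_budget is hypothetical and crop layers are ignored:

```python
import math


def prompt_budget(points_per_side, points_per_batch):
    """Return (total point prompts, number of batches) for one image."""
    total_points = points_per_side ** 2
    batches = math.ceil(total_points / points_per_batch)
    return total_points, batches


for pps, ppb in [(32, 64), (16, 64), (16, 32)]:
    total, batches = prompt_budget(pps, ppb)
    print(f"points_per_side={pps}, points_per_batch={ppb}: "
          f"{total} prompts in {batches} batches")
```

Under this assumption, a smaller points_per_batch lowers the number of prompts held on the GPU at once, while a smaller points_per_side lowers the total work but can miss small objects.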

SAHI + Segment Anything

pip install sahi metaseg
from metaseg.sahi_predict import SahiAutoSegmentation, sahi_sliced_predict

image_path = "image.jpg"
boxes = sahi_sliced_predict(
    image_path=image_path,
    detection_model_type="yolov5",  # yolov8, detectron2, mmdetection, torchvision
    detection_model_path="yolov5l6.pt",
    conf_th=0.25,
    image_size=1280,
    slice_height=256,
    slice_width=256,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
)

SahiAutoSegmentation().image_predict(
    source=image_path,
    model_type="vit_b",
    input_box=boxes,
    multimask_output=False,
    random_color=False,
    show=True,
    save=False,
)

FalAI(Cloud GPU) + Segment Anything

pip install metaseg fal_serverless
fal-serverless auth login
# For Auto Mask
from metaseg import falai_automask_image

image = falai_automask_image(
    image_path="image.jpg",
    model_type="vit_b",
    points_per_side=16,
    points_per_batch=32,
    min_area=0,
)
image.show() # Show image
image.save("output.jpg") # Save image

# For Manual Mask
from metaseg import falai_manuelmask_image

image = falai_manuelmask_image(
    image_path="image.jpg",
    model_type="vit_b",
    input_point=[[100, 100], [200, 200]],
    input_label=[0, 1],
    input_box=[100, 100, 200, 200], # or [[100, 100, 200, 200], [100, 100, 200, 200]],
    multimask_output=False,
    random_color=False,
)
image.show() # Show image
image.save("output.jpg") # Save image

Extra Features

  • Support for YOLOv5/8, Detectron2, MMDetection, Torchvision models
  • Support for video and web application (Hugging Face Spaces)
  • Support for manual single- and multi-box and point selection
  • Support for pip installation
  • Support for SAHI library
  • Support for FalAI

segment-anything-video's Issues

AttributeError: 'list' object has no attribute 'astype'

results = SegManualMaskPredictor().image_predict(
    source="C:\software\sam_checkpoint\3515.jpg",
    model_type="vit_h", # vit_l, vit_h, vit_b
    input_point=[[548, 1031], [1121, 769]],
    input_label=[0, 1],
    input_box=[229, 684, 800, 800], # or [[100, 100, 200, 200], [100, 100, 200, 200]]
    multimask_output=False,
    random_color=False,
    # output_path="C:\software\sam_checkpoint\output2.jpg",
    show=False,
    save=True,
)

======================
AttributeError Traceback (most recent call last)
Cell In[43], line 1
----> 1 results = SegManualMaskPredictor().image_predict(
2 source="C:\software\sam_checkpoint\3515.jpg",
3 model_type="vit_h", # vit_l, vit_h, vit_b
4 input_point=[[548, 1031], [1121, 769]],
5 input_label=[0, 1],
6 input_box=[229, 684, 800, 800], # or [[100, 100, 200, 200], [100, 100, 200, 200]]
7 multimask_output=False,
8 random_color=False,
9 #output_path="C:\software\sam_checkpoint\output2.jpg",
10 show=False,
11 save=True,
12 )

File ~\anaconda3\lib\site-packages\metaseg\mask_predictor.py:175, in SegManualMaskPredictor.image_predict(self, source, model_type, input_box, input_point, input_label, multimask_output, output_path, random_color, show, save)
172 elif type(input_box[0]) == int:
173 input_boxes = np.array(input_box)[None, :]
--> 175 masks, _, _ = predictor.predict(
176 point_coords=input_point,
177 point_labels=input_label,
178 box=input_boxes,
179 multimask_output=multimask_output,
180 )
181 mask_image = load_mask(masks, random_color)
182 image = load_box(input_box, image)

File ~\anaconda3\lib\site-packages\metaseg\generator\predictor.py:139, in SamPredictor.predict(self, point_coords, point_labels, box, mask_input, multimask_output, return_logits)
137 if point_coords is not None:
138 assert point_labels is not None, "point_labels must be supplied if point_coords is supplied."
--> 139 point_coords = self.transform.apply_coords(point_coords, self.original_size)
140 coords_torch = torch.as_tensor(point_coords, dtype=torch.float, device=self.device)
141 labels_torch = torch.as_tensor(point_labels, dtype=torch.int, device=self.device)

File ~\anaconda3\lib\site-packages\metaseg\utils\transforms.py:40, in ResizeLongestSide.apply_coords(self, coords, original_size)
38 old_h, old_w = original_size
39 new_h, new_w = self.get_preprocess_shape(original_size[0], original_size[1], self.target_length)
---> 40 coords = deepcopy(coords).astype(float)
41 coords[..., 0] = coords[..., 0] * (new_w / old_w)
42 coords[..., 1] = coords[..., 1] * (new_h / old_h)

AttributeError: 'list' object has no attribute 'astype'
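
The traceback above shows apply_coords calling .astype(float) on point_coords, which fails because a plain Python list reached it. One possible workaround, sketched under the assumption that the predictor accepts NumPy arrays for points and labels (as the .astype call implies), is to convert the prompts before passing them in. The helper to_point_arrays is hypothetical and not part of metaseg:

```python
import numpy as np


def to_point_arrays(input_point, input_label):
    """Convert list-style prompts to NumPy arrays of matching length."""
    points = np.asarray(input_point, dtype=float).reshape(-1, 2)  # (N, 2) x/y pairs
    labels = np.asarray(input_label, dtype=int).reshape(-1)       # (N,) one label per point
    if len(points) != len(labels):
        raise ValueError("each point needs exactly one label")
    return points, labels


points, labels = to_point_arrays([[548, 1031], [1121, 769]], [0, 1])
print(points.shape, labels.shape)  # (2, 2) (2,)
```

Whether a given metaseg release does this conversion internally is not shown in the traceback, so treat this as a caller-side guard rather than a confirmed fix.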

segmented result json

Great job on the project. Is there a way to get each segment's x, y position as JSON, along with its own PNG file?
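
For the x, y position part of this question, one possible approach is to derive a bounding box from the mask pixels and serialize it. This is a minimal sketch that assumes each mask is available as a 2-D boolean NumPy array; the format metaseg actually returns is not shown here. The helper mask_to_record is hypothetical:

```python
import json

import numpy as np


def mask_to_record(mask, mask_id):
    """Summarize one boolean mask as a JSON-friendly dict with an [x, y, w, h] box."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return {"id": mask_id, "bbox": None, "area": 0}
    x0, y0 = int(xs.min()), int(ys.min())
    w, h = int(xs.max()) - x0 + 1, int(ys.max()) - y0 + 1
    return {"id": mask_id, "bbox": [x0, y0, w, h], "area": int(xs.size)}


mask = np.zeros((6, 8), dtype=bool)
mask[2:5, 3:7] = True  # rows 2..4, columns 3..6
record = mask_to_record(mask, 0)
print(json.dumps(record))  # {"id": 0, "bbox": [3, 2, 4, 3], "area": 12}
```

The same box could then be used to crop the source image and write the crop out as a PNG with an image library of your choice.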

Suggestion - Integrate MobileSAM into the pipeline for lightweight and faster inference

Reference: https://github.com/ChaoningZhang/MobileSAM

Our project performs on par with the original SAM and keeps exactly the same pipeline as the original SAM except for a change to the image encoder; therefore, it is easy to integrate into any project.

MobileSAM is around 60 times smaller and around 50 times faster than the original SAM, and it is around 7 times smaller and around 5 times faster than the concurrent FastSAM. The comparison of the whole pipeline is summarized as follows:

image

image

Best Wishes,

Qiao

Input: OpenCV images

Hello,

It seems the AutoDetection example takes a filename. Can it accept an in-memory image as well?

Thanks

I have this problem: IndexError: list index out of range.

[]
vit_b model already exists as 'vit_b.pth'. Skipping download.
Traceback (most recent call last):
  File "e:/AIGC/segment-anything-video/test", line 17, in <module>
    SahiAutoSegmentation().predict(
  File "e:\AIGC\segment-anything-video\metaseg\sahi_predict.py", line 87, in predict
    if type(input_box[0]) == list:
IndexError: list index out of range

How should I deal with it?

def predict(
    self,
    source,
    model_type,
    input_box=None,
    input_point=None,
    input_label=None,
    multimask_output=False,
    random_color=False,
    show=False,
    save=False,
):

    read_image = load_image(source)
    model = self.load_model(model_type)
    predictor = SamPredictor(model)
    predictor.set_image(read_image)

    # This is the line that fails:
    if type(input_box[0]) == list:
        input_boxes, new_boxes = multi_boxes(input_box, predictor, read_image)

        masks, _, _ = predictor.predict_torch(
            point_coords=None,
            point_labels=None,
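
The `[]` printed before the traceback suggests the detector returned no boxes, so input_box[0] has nothing to index. A minimal caller-side guard, written as a sketch (classify_boxes is a hypothetical helper, not a metaseg function), checks for the empty case before looking at the first element:

```python
def classify_boxes(input_box):
    """Return 'none', 'single', or 'multi' without indexing into an empty list."""
    if not input_box:  # covers both None and []
        return "none"
    if isinstance(input_box[0], (list, tuple)):
        return "multi"
    return "single"


for boxes in ([], [100, 100, 200, 200], [[100, 100, 200, 200], [10, 10, 50, 50]]):
    print(boxes, "->", classify_boxes(boxes))
```

When the result is "none", one option is to skip the segmentation call for that image, or to retry detection with a lower conf_th.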

Installed metaseg but it doesn't download models

I tried to use metaseg in an image and video project, but it fails because the constructor does not accept model_type: self.segmentor = SegAutoMaskPredictor(model_type="vit_l", points_per_side=16, points_per_batch=64)
TypeError: __init__() got an unexpected keyword argument 'model_type'

Code:
import cv2
from PyQt6 import QtCore, QtGui, QtWidgets
from led_grid import LedGrid
from metaseg import SegAutoMaskPredictor

class HandTracker(QtCore.QObject):
    def __init__(self, grid):
        super().__init__()
        self.grid = grid
        self.cap = cv2.VideoCapture(0)
        self.segmentor = SegAutoMaskPredictor(model_type="vit_l", points_per_side=16, points_per_batch=64)

    def load_video(self, filepath):
        self.cap = cv2.VideoCapture(filepath)

    def step(self):
        success, image = self.cap.read()
        if not success:
            print("Ignoring empty camera frame.")
            return

        image = cv2.cvtColor(cv2.flip(image, 1), cv2.COLOR_BGR2RGB)
        results = self.segmentor.image_predict(
            source=image,
            min_area=1000,
            show=False,
            save=False,
        )

Failed to load image Python extension: libtorch_cuda_cu.so: cannot open shared object file: No such file or directory

(base) root@185:~/Track-Anything# python app.py --device cuda:0 --sam_model_type vit_h --port 80
/root/anaconda3/lib/python3.10/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: libtorch_cuda_cu.so: cannot open shared object file: No such file or directory
warn(f"Failed to load image Python extension: {e}")
Initializing BaseSegmenter to cuda:0

Issue with download_model() Function Not Completing

Hello,

I've encountered an issue with the download_model() function in your software. It seems to be failing and not progressing as expected. Below are the details of the problem:

Function in Question: download_model()
Issue: The function does not successfully complete its operation.
Behavior Observed: The process halts for an extended period without any progress. There is no change in the tqdm progress bar either.
Error Messages and Logs:

Segmenting input.mp4
OpenCV: FFMPEG: tag 0x44495658/'XVID' is not supported with codec id 12 and format 'mp4 / MP4 (MPEG-4 Part 14)'
OpenCV: FFMPEG: fallback to use tag 0x7634706d/'mp4v'
  0%|                                                                                        | 0/742 [00:00<?, ?it/s]
Downloading vit_b.pth model 

The process then seems to halt at the initial stage of downloading the 'vit_b.pth' model with no progress indicated in the tqdm bar.

I would appreciate any guidance or fix for this issue. If there are any further details or logs that can assist in resolving this problem, please let me know, and I'll provide them.

Thank you for your time and assistance.

I found the solution, but a new problem has emerged.

What I want to do is to segment a video and label each class. My first idea is to assign different class labels to different mask_image colors (you can see what I did for this below). However, I noticed that the output mask video changes the colors between different frames, making it difficult for me to track the labels (such as cookie/person and so on). I checked your code and found that you did the same thing to the video as the images. So, it is not surprising to get such a result.

Therefore, I wonder if you could share some of your ideas regarding this. Thanks!

What I did (In sam_predictor.py line 139):
'''
combined_mask = mask_image # combined_mask = cv2.add(frame, mask_image)
out.write(combined_mask)
'''

Originally posted by @CRH400AF-A in #91 (comment)

ONNX support / Segmentation output

Thank you for this great wrapper!

I was wondering if there is support for ONNX models for faster inference.
Also, is it possible to export each layer individually, as in the SAM demo?

Finally, I tried camera streaming (setting source=0), but with no success so far!

Can metaseg input a video and output the class label?

Thanks for your great work!

I have a specific requirement for my project and I'm wondering if metaseg can cater to it. I need to input an image with dimensions H*W*3 (height * width * 3 channels) and obtain an "image" output with class labels in the form of H*W*1 (height * width * 1 channel). The "1" in this context represents that the pixels belong to different classes, rather than representing exact semantic labels.

Before I proceed, I'd like to confirm if metaseg has the capability to handle such a task. Your response would be highly valuable to me. Thank you for your time, and I'm looking forward to hearing from you.
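
One way to picture the H*W*1 output described above is to paint each mask into a single integer map. This is a sketch only: it assumes the masks are available as 2-D boolean NumPy arrays of the image size, which is not confirmed here for metaseg's return value, and the labels are per-segment ids rather than semantic classes. The helper masks_to_label_map is hypothetical:

```python
import numpy as np


def masks_to_label_map(masks, height, width):
    """Paint masks into an (H, W, 1) integer map; 0 means 'no segment'."""
    label_map = np.zeros((height, width), dtype=np.int32)
    # Paint larger masks first so smaller ones stay visible where they overlap.
    order = sorted(range(len(masks)), key=lambda i: int(masks[i].sum()), reverse=True)
    for label, i in enumerate(order, start=1):
        label_map[masks[i]] = label
    return label_map[..., None]


big = np.zeros((4, 4), dtype=bool)
big[0:3, 0:3] = True
small = np.zeros((4, 4), dtype=bool)
small[1:2, 1:2] = True

labels = masks_to_label_map([small, big], 4, 4)
print(labels.shape)   # (4, 4, 1)
print(labels[..., 0])
```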

ImportError: cannot import name 'SamAutomaticMaskGenerator' from partially initialized module 'metaseg' (on Google Colab)

Hello,

I seem to be getting the following error when running the following import on Google Colab:

from metaseg import SegAutoMaskGenerator

Error Output:

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-6-5cdada3bd05d> in <cell line: 1>()
----> 1 from metaseg import SegAutoMaskGenerator

1 frames
/usr/local/lib/python3.9/dist-packages/metaseg/auto_mask_demo.py in <module>
      3 import torch
      4 
----> 5 from metaseg import SamAutomaticMaskGenerator, sam_model_registry
      6 from metaseg.utils import download_model, load_image, load_video
      7 

ImportError: cannot import name 'SamAutomaticMaskGenerator' from partially initialized module 'metaseg' (most likely due to a circular import) (/usr/local/lib/python3.9/dist-packages/metaseg/__init__.py)

---------------------------------------------------------------------------
NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
"Open Examples" button below.
---------------------------------------------------------------------------

image

I have restarted the runtime and tried again but I get the same issue.
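
The error message points at a circular import: a submodule does `from metaseg import ...` while metaseg/__init__.py is still running. The following self-contained sketch reproduces that pattern with two throwaway packages (toy_circular and toy_fixed are hypothetical names, and the file layout is an illustration, not the actual metaseg source). It shows that importing from the defining submodule instead of from the package avoids the problem:

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_package(root, name, demo_import):
    """Write a tiny package whose __init__ imports `demo` before `core`."""
    pkg = root / name
    pkg.mkdir()
    (pkg / "core.py").write_text("class Generator:\n    pass\n")
    (pkg / "demo.py").write_text(
        f"{demo_import}\n\nclass Predictor:\n    generator_cls = Generator\n"
    )
    (pkg / "__init__.py").write_text(
        f"from {name}.demo import Predictor\nfrom {name}.core import Generator\n"
    )


def try_import(name):
    """Import a package and report success or the ImportError text."""
    try:
        importlib.import_module(name)
        return "ok"
    except ImportError as exc:
        return f"ImportError: {exc}"


root = Path(tempfile.mkdtemp())
sys.path.insert(0, str(root))
# demo.py imports from the package itself while __init__ is still executing.
make_package(root, "toy_circular", "from toy_circular import Generator")
# demo.py imports from the submodule that actually defines the class.
make_package(root, "toy_fixed", "from toy_fixed.core import Generator")
importlib.invalidate_caches()

bad_result = try_import("toy_circular")
good_result = try_import("toy_fixed")
print(bad_result)
print(good_result)
```

A fix of this kind has to happen inside the package, so for users the practical options are usually to upgrade or pin to a release where the import order is resolved.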

AttributeError: module 'cv2' has no attribute 'write'

I was trying out this package - amazing work so quickly after the release! - and I'm getting:

[/usr/local/lib/python3.9/dist-packages/metaseg/mask_predictor.py](https://localhost:8080/#) in save_image(self, source, model_type, input_box, input_point, input_label, multimask_output, output_path)
    193 
    194         combined_mask = cv2.add(image, mask_image)
--> 195         cv2.write(output_path, combined_mask)
    196 
    197         return output_path

AttributeError: module 'cv2' has no attribute 'write'

Should that be cv2.imwrite?

metaseg-0.7.3 and metaseg-0.5.8 issues

metaseg-0.5.8 : AttributeError: 'list' object has no attribute 'astype'
metaseg-0.7.3: ImportError: Please install FalAI library using 'pip install fal_serverless'.

Supporting Apple M1?

Hello,

Does anyone know how we can use device=mps on an Apple M1 chip in MetaSeg apps?

Thanks

How to get more objects, like the Segment Anything online demo?

The image has a high resolution: 5472x3648.
How can we achieve detection performance similar to the online demo for numerous small targets when using SAM + YOLOv8, which currently detects only a few targets?

pests

https://segment-anything.com/demo result: found a lot of objects
image

SAM + yolov8-seg result:
image

from metaseg import SahiAutoSegmentation, sahi_sliced_predict

image_path = "pests.jpg"
boxes = sahi_sliced_predict(
    image_path=image_path,
    detection_model_type="yolov8", # yolov8, detectron2, mmdetection, torchvision
    detection_model_path="yolov8x-seg.pt",
    conf_th=0.25,
    image_size=1024,
    slice_height=256,
    slice_width=256,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
)

SahiAutoSegmentation().image_predict(
    source=image_path,
    model_type="vit_b",
    input_box=boxes,
    multimask_output=False,
    random_color=False,
    show=True,
    save=False,
)
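
To get a feel for what the slicing parameters in this report imply, here is a worked estimate of the number of 256x256 tiles for a 5472x3648 image with 20% overlap. It assumes a plain sliding window whose step is the tile size minus the overlap in pixels, which may differ in detail from what SAHI does internally. The helper tiles_along is hypothetical:

```python
import math


def tiles_along(length, tile, overlap_ratio):
    """Number of sliding-window tiles needed to cover `length` pixels."""
    if length <= tile:
        return 1
    step = tile - int(tile * overlap_ratio)  # 256 - 51 = 205 for the values below
    return math.ceil((length - tile) / step) + 1


cols = tiles_along(5472, 256, 0.2)
rows = tiles_along(3648, 256, 0.2)
print(cols, rows, cols * rows)  # 27 18 486
```

Note that in the code above SAM only receives the boxes returned by sahi_sliced_predict, so an object the detector misses will not get a mask regardless of how many tiles are used.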

SegAutoMaskPredictor producing random color

Hello, first of all, thank you for this awesome work!

I am following the instructions in the README for SegAutoMaskPredictor, specifically this one:

# For video
results = SegAutoMaskPredictor().video_predict(
    source="video.mp4",
    model_type="vit_l", # vit_l, vit_h, vit_b
    points_per_side=16,
    points_per_batch=64,
    min_area=1000,
    output_path="output.mp4",
)

on my private mp4 data. However, I noticed that although the segments seem perfect, they often change color between frames. For example, a chair was red in one frame but green in the next. I wonder whether there is any way to enforce color consistency between frames. Any pointers would be appreciated!
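
One possible direction for the color flicker described above is to match each frame's masks to the previous frame's masks by overlap (IoU) and reuse the color of the best match. The sketch below is not how metaseg works; it is a simple greedy matcher that assumes masks are 2-D boolean NumPy arrays, and ColorTracker is a hypothetical class:

```python
import numpy as np


def mask_iou(a, b):
    """Intersection over union of two boolean masks."""
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0


class ColorTracker:
    def __init__(self, iou_threshold=0.5, seed=0):
        self.iou_threshold = iou_threshold
        self.rng = np.random.default_rng(seed)
        self.prev = []     # (track_id, mask) pairs from the previous frame
        self.colors = {}   # track_id -> (r, g, b)
        self.next_id = 0

    def assign(self, masks):
        """Return one (track_id, color) per mask, reusing ids via greedy IoU matching."""
        results, current, used = [], [], set()
        for mask in masks:
            best_id, best_iou = None, self.iou_threshold
            for track_id, prev_mask in self.prev:
                iou = mask_iou(mask, prev_mask)
                if track_id not in used and iou >= best_iou:
                    best_id, best_iou = track_id, iou
            if best_id is None:  # no good match: start a new track with a new color
                best_id = self.next_id
                self.next_id += 1
                self.colors[best_id] = tuple(int(c) for c in self.rng.integers(0, 256, size=3))
            used.add(best_id)
            current.append((best_id, mask))
            results.append((best_id, self.colors[best_id]))
        self.prev = current
        return results


def square(top, left):
    m = np.zeros((10, 10), dtype=bool)
    m[top:top + 4, left:left + 4] = True
    return m


tracker = ColorTracker()
first = tracker.assign([square(0, 0), square(5, 5)])
# Next frame: both objects moved one pixel right and arrive in a different order.
second = tracker.assign([square(5, 6), square(0, 1)])
print(first)
print(second)
```

Greedy matching like this breaks down with fast motion or heavy occlusion; a dedicated tracker would be more robust.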

ImportError with SegAutoMaskGenerator from metaseg package

Description:

I'm getting an ImportError when trying to import SegAutoMaskGenerator from the metaseg package. Here's the error message:

ImportError: cannot import name 'SamAutomaticMaskGenerator' from partially initialized module 'metaseg' (most likely due to a circular import) (/usr/local/lib/python3.9/dist-packages/metaseg/__init__.py)

Screenshot:
image

Steps to reproduce:

  1. Install the metaseg package (pip install metaseg) on Google Colab.
  2. Run the following code:
    from metaseg import SegAutoMaskGenerator

Expected behavior:

The SegAutoMaskGenerator class should be successfully imported without any errors.

Actual behavior:

The ImportError is raised when trying to import SegAutoMaskGenerator.

Environment:

Python version: 3.9.16
Runtime: Google colab (with standard gpu)

SegAutoMaskPredictor().save_image returning error

Hello,

It seems something has changed in the code and .save_image isn't working anymore.

I have tried the following but there is no output image generated.

autoseg_image = SegAutoMaskPredictor().image_predict(
    source="smudge.png",
    model_type="vit_l",
    points_per_side=16, 
    points_per_batch=64,
    min_area=0,
    output_path='output.jpg'
)

I have also tried setting save=True and I'm getting the following error:
image

Could you help, please?

Use SAHI to segment video

I have a problem when I run this code.
I don't know why I get ModuleNotFoundError: No module named 'yolov5'.
I installed the yolov5 module with the following command.

pip install ultralytics

Can someone give me some advice? Thank you very much!

from metaseg.sahi_predict import SahiAutoSegmentation, sahi_sliced_predict
import cv2

cap = cv2.VideoCapture('./test_data/red_girl.mp4')
fourcc = cv2.VideoWriter_fourcc(*'MP4V')  # video codec
fps = cap.get(cv2.CAP_PROP_FPS)  # frame rate
width, height = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))  # width and height
out = cv2.VideoWriter('./output/read_girl_shi.mp4', fourcc, fps, (width, height))  # video writer
# Read the first frame
ret, frame = cap.read()
while ret:
    cv2.imwrite("./test_data/temp.jpg", frame)

    image_path = "./test_data/temp.jpg"
    boxes = sahi_sliced_predict(
        image_path=image_path,
        detection_model_type="yolov5", #yolov8, detectron2, mmdetection, torchvision
        detection_model_path="yolov5l6.pt",
        conf_th=0.25,
        image_size=1280,
        slice_height=256,
        slice_width=256,
        overlap_height_ratio=0.2,
        overlap_width_ratio=0.2,
    )
    SahiAutoSegmentation().predict(
        source=image_path,
        model_type="vit_b",
        input_box=boxes,
        multimask_output=False,
        random_color=False,
        show=True,
        save=True,
        output_path="./output/temp.jpg"
    )
    image = cv2.imread("./output/temp.jpg")
    # image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    out.write(image)
    # Read the next frame; without this the loop never advances
    ret, frame = cap.read()

cap.release()
out.release()

Small error in the documentation

In your README.md, in the Usage section, there is a code error that shows up when I copy the demo and run it. You should add ',' after this line: "input_point=[0, 0, 100, 100]".

# For video

results = SegManualMaskPredictor().video_predict(
    source="test.mp4",
    model_type="vit_l", # vit_l, vit_h, vit_b
    input_point=[0, 0, 100, 100]
    input_label=[0, 1],
    input_box=None,
    multimask_output=False,
    random_color=False,
    output_path="output.mp4",
)

how to get semantic label

I want to know how to get the class label information corresponding to the segmentation mask area

I have this issue after updating the code.

Traceback (most recent call last):
  File "e:/AIGC/segment-anything-video/test", line 1, in <module>
    from metaseg import sahi_sliced_predict, SahiAutoSegmentation
  File "e:\AIGC\segment-anything-video\metaseg\__init__.py", line 7, in <module>
    from metaseg.falai_demo import falai_automask_image, falai_manuelmask_image
  File "e:\AIGC\segment-anything-video\metaseg\falai_demo.py", line 5, in <module>
    from metaseg import SegAutoMaskPredictor, SegManualMaskPredictor
ImportError: cannot import name 'SegAutoMaskPredictor' from partially initialized module 'metaseg' (most likely due to a circular import) (e:\AIGC\segment-anything-video\metaseg\__init__.py)

Can you save sections of the image that are masked after using SegAutoMaskPredictor()

I can't find this in the documentation. After running

results = SegAutoMaskPredictor().image_predict(
    source="firststeve.png",
    model_type="vit_h", # vit_l, vit_h, vit_b
    points_per_side=4,
    points_per_batch=16,
    min_area=0,
    output_path="output.png",
    show=False,
    save=True,
)

I can save the segmented image, but I can't pull out the individual segmented pieces. Does this package support doing something like this or do I have to use the original repository? I just want to know if something like this exists:
image
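
If the individual masks can be obtained as 2-D boolean arrays (an assumption; what image_predict returns is not documented in this thread), each piece could be cut out of the source image with a transparent background. A minimal sketch, with cut_out as a hypothetical helper:

```python
import numpy as np


def cut_out(image, mask):
    """Return an RGBA crop of `image` that is transparent outside `mask`."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1
    alpha = mask[y0:y1, x0:x1].astype(np.uint8) * 255  # opaque only inside the mask
    return np.dstack([image[y0:y1, x0:x1], alpha])


image = np.full((5, 5, 3), 200, dtype=np.uint8)
mask = np.zeros((5, 5), dtype=bool)
mask[1:4, 2:4] = True  # rows 1..3, columns 2..3
piece = cut_out(image, mask)
print(piece.shape)  # (3, 2, 4)
```

Each returned array can be written to its own PNG file, since PNG supports the alpha channel.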
