
Comments (9)

zhang-tao-whu avatar zhang-tao-whu commented on July 19, 2024

There is no problem with single GPU inference. I believe it is highly likely that the CPU memory is insufficient, leading to an interrupt and forced termination. The OVIS dataset contains many videos with hundreds of frames, and DVIS processes all frames in testing before converting the results from the highly memory-consuming mask format to RLE format. Therefore, there is a high memory requirement.
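As a rough illustration of why dense masks are so expensive, here is a back-of-the-envelope sketch. The object count, frame count, and resolution in the usage note are made-up numbers, not measurements from OVIS or DVIS, and `run_lengths` is only a simplified stand-in for real RLE encoding (its counts start with a run of zeros, as COCO-style RLE does), not the encoder DVIS uses.

```python
def dense_mask_bytes(n_objects, n_frames, height, width, bytes_per_pixel=1):
    """Bytes needed to keep every object's (n_frames, H, W) mask in dense form."""
    return n_objects * n_frames * height * width * bytes_per_pixel


def run_lengths(flat_mask):
    """Run-length encode a flat 0/1 sequence; the first count is a run of zeros."""
    counts = []
    current = 0  # the encoding always starts by counting zeros
    run = 0
    for value in flat_mask:
        if value == current:
            run += 1
        else:
            counts.append(run)
            current = value
            run = 1
    counts.append(run)
    return counts
```

For example, `dense_mask_bytes(10, 500, 720, 1280)` gives 4,608,000,000 bytes (about 4.6 GB) for 10 hypothetical objects over 500 frames at 1280x720, while a mask row such as `[0, 0, 1, 1, 1, 0]` collapses to the three counts `[2, 3, 1]`.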

from dvis.

danyow-cheung avatar danyow-cheung commented on July 19, 2024

image
image
Is there any solution?


zhang-tao-whu avatar zhang-tao-whu commented on July 19, 2024

The test pipeline needs to be modified to support clip-by-clip inference. You can refer to demo_long_video.py.
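To make the idea concrete, here is a minimal sketch of clip-by-clip processing. It is not the code in demo_long_video.py: `infer_clip` and `compress` are hypothetical callables that you would supply (for example, a wrapper around the model and a mask-to-RLE converter), and the sketch ignores how object identities are associated across clips.

```python
def iter_clips(frames, clip_len):
    """Yield (start_index, clip) pairs that cover `frames` in order."""
    if clip_len <= 0:
        raise ValueError("clip_len must be positive")
    for start in range(0, len(frames), clip_len):
        yield start, frames[start:start + clip_len]


def run_clip_by_clip(frames, clip_len, infer_clip, compress):
    """Run inference one clip at a time and compress each result immediately.

    `infer_clip` and `compress` are hypothetical, caller-supplied functions.
    """
    compressed = []
    for start, clip in iter_clips(frames, clip_len):
        dense = infer_clip(clip)  # dense output for this clip only
        compressed.append((start, compress(dense)))
        # `dense` is rebound on the next iteration, so only one clip's
        # uncompressed output needs to be held at a time.
    return compressed
```

The point of the design is that the memory-hungry dense result never exists for the whole video at once; only the compressed per-clip results accumulate.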


danyow-cheung avatar danyow-cheung commented on July 19, 2024

I tried this solution, but every attempt failed.

In my edited code, I changed something like this:
image

and then I encountered an error:

Traceback (most recent call last):
  File "/home/hs/AIGC/DVIS-main/demo_video/demo_long_video.py", line 133, in <module>
    predictions, visualized_output = demo.run_on_video(vid_frames, keep=False)
  File "/home/hs/AIGC/DVIS-main/demo_video/predictor.py", line 217, in run_on_video
    vis_output = visualizer.draw_instance_predictions(predictions=ins, ids=pred_ids)
  File "/home/hs/AIGC/DVIS-main/demo_video/visualizer.py", line 92, in draw_instance_predictions
    masks = [GenericMask(x, self.output.height, self.output.width) for x in masks]
  File "/home/hs/AIGC/DVIS-main/demo_video/visualizer.py", line 92, in <listcomp>
    masks = [GenericMask(x, self.output.height, self.output.width) for x in masks]
  File "/home/hs/AIGC/detectron2/detectron2/utils/visualizer.py", line 90, in __init__
    assert m.shape == (
AssertionError: mask shape: (3, 2160, 3840), target dims: 2160, 3840

Could you share the details of what the predictions contain?


danyow-cheung avatar danyow-cheung commented on July 19, 2024

Also, my PyTorch version is 1.11.


zhang-tao-whu avatar zhang-tao-whu commented on July 19, 2024

Please refer to lines 829-836 in meta_architecture.py, where predictions refers to the prediction results directly returned by the network.

predictions = {
    "image_size": (output_height, output_width),
    "pred_scores": out_scores,  # list of length n_obj, i.e., [obj1_score, ..., obj_n_score]
    "pred_labels": out_labels,  # list of length n_obj, i.e., [obj1_label, ..., obj_n_label]
    "pred_masks": out_masks,  # list of length n_obj, i.e., [torch.Tensor(n_frames, H, W), ..., torch.Tensor(n_frames, H, W)]
    "pred_ids": out_ids,  # list of length n_obj, i.e., [obj1_id, ..., obj_n_id]
    "task": "vis",
}

You can also refer to the function _get_objects_from_outputs (line 21) in the file predictor.py to understand the meaning of information in the predictions.


zhang-tao-whu avatar zhang-tao-whu commented on July 19, 2024

If you only need to obtain predictions for a portion of the video, I recommend directly extracting the prediction results from demo_long_video.py and storing them locally. This way, you will not need to modify the code extensively.
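One possible way to store part of the results locally is sketched below. It assumes a `predictions` dict with the keys shown earlier in this thread; `save_frame_range`, `load_predictions`, and the extra `frame_range` key are hypothetical names introduced only for this example. The test data can be plain nested lists; with real `torch.Tensor` masks you would probably want to move them to CPU before pickling.

```python
import pickle


def save_frame_range(predictions, start, end, path):
    """Keep only frames [start, end) of every object's mask and pickle the result."""
    subset = dict(predictions)  # shallow copy so the original dict is untouched
    # Slicing the first axis works for nested lists and for (T, H, W) tensors.
    subset["pred_masks"] = [masks[start:end] for masks in predictions["pred_masks"]]
    subset["frame_range"] = (start, end)  # hypothetical bookkeeping key
    with open(path, "wb") as f:
        pickle.dump(subset, f)
    return subset


def load_predictions(path):
    """Load a predictions dict previously written by save_frame_range."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

Scores, labels, and ids are copied unchanged because each object has a single value for the whole video; only the masks are per-frame.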


danyow-cheung avatar danyow-cheung commented on July 19, 2024

I got the prediction information for a single frame. Thanks anyway.


zhang-tao-whu avatar zhang-tao-whu commented on July 19, 2024

For an object, the entire video has only one score and one category. However, please note that the size of the mask is (T, H, W).
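The assertion in the traceback above reports a mask of shape (3, 2160, 3840) where (2160, 3840) was expected, which looks consistent with a whole (T, H, W) mask being passed where a single frame is needed. A minimal sketch of picking out one frame per object follows. `frame_masks` is a hypothetical helper, and the shape check is written for nested lists so it runs without PyTorch; with tensors you could compare `tuple(mask.shape)` to `(height, width)` instead.

```python
def frame_masks(predictions, frame_idx):
    """Return [(obj_id, mask_2d), ...] for one frame of a predictions dict."""
    height, width = predictions["image_size"]
    per_object = []
    for obj_id, masks in zip(predictions["pred_ids"], predictions["pred_masks"]):
        mask = masks[frame_idx]  # index the T axis to get one (H, W) mask
        # Check that we really have a 2-D (H, W) mask before visualizing it.
        assert len(mask) == height and all(len(row) == width for row in mask)
        per_object.append((obj_id, mask))
    return per_object
```

The score and label of each object do not need slicing, since they apply to the whole video.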

