
Comments (2)

MaxTeselkin commented on May 25, 2024

Honestly, I was able to make it work on short sequences using a simple trick to artificially lengthen the input sequence, @nikitakaraevv. If the input sequence is shorter than 11 frames, I lengthen it by duplicating the last frame as many times as needed. For example, if the input length is 4 (the input frame with points + 3 frames to track), I append 7 duplicates of the last frame to the frames list, pass them to the model, and trim the predictions list at the end.

Here is what it looks like:

import numpy as np
import torch

# CoTracker fails to track short sequences, so lengthen them by duplicating the last frame
lengthened = False
if len(rgb_images) < 11:
    lengthened = True
    original_length = len(rgb_images) - 1  # number of frames to track (excludes the input frame)
    while len(rgb_images) < 11:
        rgb_images.append(rgb_images[-1])  # note: this modifies rgb_images in place
# disable gradient calculation (this setting is global)
torch.set_grad_enabled(False)
# stack frames (T, H, W, C) into a float tensor of shape (1, T, C, H, W)
input_video = torch.from_numpy(np.array(rgb_images)).permute(0, 3, 1, 2)[None].float()
input_video = input_video.to(self.device)
# single query point in the form (frame index, x, y), placed on the first frame
query = torch.tensor([[0, point_x, point_y]]).float()
query = query.to(self.device)
pred_tracks, pred_visibility = self.model(input_video, queries=query[None])
# drop the batch and point dimensions, then skip the input frame
pred_tracks = pred_tracks.squeeze().cpu()[1:]
if lengthened:
    pred_tracks = pred_tracks[:original_length]  # drop predictions for the duplicated frames

And it works perfectly - now predictions look nice even if I track on only one frame.
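The padding and trimming steps can also be pulled out into small helpers that do not modify the caller's list. This is only a sketch of the trick described above: the names `pad_with_last_frame` and `trim_predictions` are hypothetical, and the default of 11 is simply the threshold used in the snippet, not a value taken from the CoTracker code.

```python
def pad_with_last_frame(frames, min_length=11):
    """Return (padded_frames, tracked_count) without mutating `frames`.

    `min_length=11` mirrors the threshold used in the snippet above;
    it is an assumption, not a documented CoTracker constant.
    """
    if not frames:
        raise ValueError("frames must contain at least one frame")
    missing = max(0, min_length - len(frames))
    # Repeat the last frame until the sequence is long enough.
    padded = list(frames) + [frames[-1]] * missing
    # The first frame is the input frame, so it is not counted as tracked.
    tracked_count = len(frames) - 1
    return padded, tracked_count


def trim_predictions(predictions, tracked_count):
    """Keep only the predictions that belong to the original frames."""
    return predictions[:tracked_count]


# Usage with placeholder frames (strings stand in for image arrays).
frames = ["f0", "f1", "f2", "f3"]
padded, tracked_count = pad_with_last_frame(frames)
# Pretend the model returned one prediction per padded frame,
# then drop the input frame as in the snippet above.
fake_predictions = list(range(len(padded)))[1:]
kept = trim_predictions(fake_predictions, tracked_count)
```

Sequences that already have 11 or more frames pass through unchanged, and slicing to `tracked_count` is then a no-op.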

Regarding the reason for using such short sequences - I simply used them for debugging and thought that my code was incorrect until I tried tracking longer sequences :) Anyway, in my opinion CoTracker is the best model for point tracking right now, good job!

from co-tracker.

nikitakaraevv commented on May 25, 2024

Hi @MaxTeselkin, thank you for the question!

We did not expect that CoTracker would be applied to such short videos :) I think it outputs zeros because of this line:

while ind < T - self.S // 2:

self.S (window size) is 8, so T - self.S // 2 in this case is -1, which leads to the while loop being skipped.
It would be great to make CoTracker work for such short videos as well. I'll think about it!
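To see the arithmetic behind that loop condition, here is a minimal standalone sketch. It assumes `ind` starts at 0 and advances by `S // 2` on each pass; only the condition itself is quoted above, so the start value and step are assumptions, and `count_window_iterations` is a hypothetical name rather than anything in the CoTracker code.

```python
def count_window_iterations(num_frames, window_size=8):
    """Count how often `while ind < T - S // 2` would run.

    Assumes ind starts at 0 and grows by S // 2 per pass (an assumption;
    only the loop condition is quoted in the comment above).
    """
    step = window_size // 2
    ind = 0
    iterations = 0
    while ind < num_frames - step:
        iterations += 1
        ind += step
    return iterations


# With T = 3 the bound is 3 - 4 = -1, so the body never runs.
short_video = count_window_iterations(3)
# With T = 11 (the padded length used earlier in the thread) it runs.
padded_video = count_window_iterations(11)
```

This only models whether the loop body executes at all; it says nothing about the quality of the predictions for a given length.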

