
Comments (2)

nmanovic commented on May 21, 2024

When I start a job, if I wait long enough, will all the frames be loaded into the browser?
Or, are they loaded on demand as I seek through the video?

No, not all of them. To reduce server load, CVAT tries to preload the next 500 frames and continues preloading further frames as they become necessary. After a jump, it starts preloading the next frames from the new position.

Are they cached locally in memory?
Yes.
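
For illustration only, the preload-and-cache behavior described above could be sketched as follows. This is not CVAT's actual code: the class name, the `fetch` callback, and the `refill_threshold` parameter are hypothetical, and a real client would fetch asynchronously rather than in a blocking loop. Only the 500-frame window comes from the answer above.

```python
from typing import Callable, Dict


class FramePrefetcher:
    """Hypothetical sliding-window frame cache (not CVAT's real implementation)."""

    def __init__(self, fetch: Callable[[int], bytes], total_frames: int,
                 window: int = 500, refill_threshold: int = 100) -> None:
        self.fetch = fetch                        # assumed callback: frame index -> frame data
        self.total_frames = total_frames
        self.window = window                      # how many frames to preload ahead
        self.refill_threshold = refill_threshold  # assumed look-ahead that triggers a refill
        self.cache: Dict[int, bytes] = {}         # frames kept in memory, never evicted here

    def _preload_from(self, start: int) -> None:
        # Load the window starting at `start`, skipping frames already cached.
        end = min(start + self.window, self.total_frames)
        for i in range(start, end):
            if i not in self.cache:
                self.cache[i] = self.fetch(i)

    def get(self, frame: int) -> bytes:
        if not 0 <= frame < self.total_frames:
            raise IndexError(frame)
        lookahead = min(frame + self.refill_threshold, self.total_frames - 1)
        # A miss on `frame` means first access or a jump; a miss on `lookahead`
        # means playback is nearing the end of what has been preloaded.
        if frame not in self.cache or lookahead not in self.cache:
            self._preload_from(frame)
        return self.cache[frame]
```

In this sketch the cache grows without bound, so a long 4k video would need an eviction policy that the thread does not describe.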

I'm working with 4k video and the interface isn't that usable, at least for my current use model, until all frames have been loaded.

Could you please describe your annotation use case? Another way to optimize your case is to resize the video before uploading to CVAT.

What I've seen that works well is using a different color on the seek bar for frames that have been loaded.

I like the feature. I will add it to our internal roadmap. It should not be difficult to implement.
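
As a rough sketch of the seek-bar idea, assuming the client tracks the indices of loaded frames in a set (an assumption, not something CVAT is confirmed to do), the loaded frames can be collapsed into contiguous ranges that a seek bar could then paint in a different color. The function name `loaded_ranges` is hypothetical.

```python
from typing import Iterable, List, Tuple


def loaded_ranges(loaded: Iterable[int]) -> List[Tuple[int, int]]:
    """Collapse loaded frame indices into inclusive (start, end) ranges."""
    ranges: List[Tuple[int, int]] = []
    for frame in sorted(set(loaded)):
        if ranges and frame == ranges[-1][1] + 1:
            # Frame extends the current contiguous run.
            ranges[-1] = (ranges[-1][0], frame)
        else:
            # Gap found: start a new run.
            ranges.append((frame, frame))
    return ranges
```

Each returned pair maps to one colored segment on the bar, so a jump that starts a new preload window shows up as a separate segment.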

If they are demand loaded, it would be nice to have a way to force it to load them all (as long as there's enough memory available).

Let's understand your use case. If you annotate at a reasonable speed (e.g. several seconds per image), CVAT should be fast enough to preload the next frames for you.


headdab commented on May 21, 2024

We're annotating sports video from stationary cameras. Sometimes the players and ball are rather small in the resulting video, which makes scaling the video down less desirable. At the end of the day, we have to balance the accuracy of the annotations against the time it takes to download and tag the videos. I realize I can scale down the video or compress the frames more, but I need to understand the loading and caching strategy to determine the appropriate trade-offs. How is the frame cache size determined? Currently, I'm annotating 30s clips. I'm guessing the segmentation options when creating jobs are there so you can break big jobs into smaller ones to handle related issues.
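
To make the trade-off concrete, here is a back-of-the-envelope estimate of how much memory a frame cache could need. It assumes frames are held fully decoded at 4 bytes per pixel (RGBA) and that "4k" means 3840x2160. Both are assumptions: the thread does not say how CVAT stores cached frames, and compressed frames would take far less. The helper name `cache_bytes` is hypothetical.

```python
def cache_bytes(width: int, height: int, frames: int, bytes_per_pixel: int = 4) -> int:
    """Memory needed to hold `frames` decoded frames (assumes uncompressed RGBA)."""
    return width * height * bytes_per_pixel * frames


GIB = 1024 ** 3

# 500 frames is the preload window mentioned earlier in the thread.
uhd = cache_bytes(3840, 2160, 500)
full_hd = cache_bytes(1920, 1080, 500)

print(f"500 frames at 3840x2160: {uhd / GIB:.2f} GiB")
print(f"500 frames at 1920x1080: {full_hd / GIB:.2f} GiB")
```

Under these assumptions, halving both dimensions cuts the memory needed by a factor of four, which is the cost to weigh against the small players and ball becoming harder to see.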

The model I'm currently using, in interpolation mode, is to step through the video at a coarse level (c/v keys), labeling the boxes for a given player. I think that's the best way to get good per-player tracks. Then I use the scroll bar and single-step to quickly slide back and forth through the video, stopping when necessary to add more keyframes to the track.

We're still trying to find the most effective way to annotate these videos, but that's what we're doing so far. I recall from the research that it's easier to annotate one object at a time. I use the filtering functionality to show only the boxes for the current object. If you have other suggestions or ideas, please share.

