cvat-ai / cvat

Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.

Home Page: https://cvat.ai

License: MIT License

Python 38.69% HTML 0.48% JavaScript 13.99% Shell 0.14% Dockerfile 0.12% TypeScript 41.53% SCSS 1.64% Smarty 0.05% Open Policy Agent 0.87% Mustache 2.49% Jinja 0.01%
video-annotation computer-vision computer-vision-annotation deep-learning image-annotation annotation-tool annotation labeling labeling-tool image-labeling

cvat's Introduction

CVAT Platform

Start Annotating Now

Computer Vision Annotation Tool (CVAT)


CVAT is an interactive video and image annotation tool for computer vision. It is used by tens of thousands of users and companies around the world. Our mission is to help developers, companies, and organizations solve real problems using a data-centric AI approach.

Start using CVAT online: cvat.ai. You can use it for free, or subscribe to get unlimited data, organizations, auto-annotation, and Roboflow and HuggingFace integrations.

Or set CVAT up as a self-hosted solution: Self-hosted Installation Guide. We provide Enterprise support for self-hosted installations with premium features: SSO, LDAP, Roboflow and HuggingFace integrations, and advanced analytics (coming soon). We also offer training and dedicated support with a 24-hour SLA.

Quick start ⚡

Partners ❤️

CVAT is used by teams all over the world. The list below includes key companies that help us support the product or are an essential part of our ecosystem. If you use us, please drop us a line at [email protected].

  • Human Protocol uses CVAT as a way of adding annotation service to the Human Protocol.
  • FiftyOne is an open-source dataset curation and model analysis tool for visualizing, exploring, and improving computer vision datasets and models. It is tightly integrated with CVAT for annotation and label refinement.

Public datasets

ATLANTIS, an open-source dataset for semantic segmentation of waterbody images, developed by the iWERS group in the Department of Civil and Environmental Engineering at the University of South Carolina, uses CVAT.

For developing a semantic segmentation dataset using CVAT, see:

CVAT online: cvat.ai

This is an online version of CVAT. It's free, efficient, and easy to use.

cvat.ai runs the latest version of the tool. You can create up to 10 tasks there and upload up to 500 MB of data to annotate. Your data will only be visible to you and the people you assign to it.

For now, it does not have analytics features for managing and monitoring a data annotation team. It also does not allow exporting images, only the annotations.

We plan to enhance cvat.ai with new powerful features. Stay tuned!

Prebuilt Docker images 🐳

Prebuilt docker images are the easiest way to start using CVAT locally. They are available on Docker Hub.

The images have been downloaded more than 1M times so far.

Screencasts 🎦

Here are some screencasts showing how to use CVAT.

Computer Vision Annotation Course: we introduce our course series designed to help you annotate data faster and better using CVAT. This course is about CVAT deployment and integrations; it includes presentations and covers the following topics:

  • Speeding up your data annotation process: an introduction to CVAT and Datumaro. What problems CVAT and Datumaro solve, how they can speed up your model training process, and some resources you can use to learn more about them.
  • Deploying and using CVAT. Using the app online at app.cvat.ai; a local deployment; a containerized local deployment with Docker Compose (for regular use); and a local cluster deployment with Kubernetes (for enterprise users). A 2-minute tour of the interface, a breakdown of CVAT’s internals, and a demonstration of how to deploy CVAT using Docker Compose.

Product tour: in this course, we show how to use CVAT and help you get familiar with CVAT functionality and interfaces. This course does not cover integrations and is dedicated solely to CVAT. It covers the following topics:

  • Pipeline. In this video, we show how to use app.cvat.ai: how to sign up, upload your data, annotate it, and download it.

For feedback, please see Contact us

API

SDK
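
For quick orientation, here is a minimal sketch of creating a task with the cvat-sdk Python package. The host, credentials, labels, and file names are placeholders, and method signatures may differ between SDK versions, so treat this as illustrative rather than definitive:

```python
from cvat_sdk import make_client
from cvat_sdk.core.proxies.tasks import ResourceType

# Placeholders: host, credentials, labels and files are examples only.
with make_client(host="app.cvat.ai", credentials=("user", "password")) as client:
    task = client.tasks.create_from_data(
        spec={"name": "example task", "labels": [{"name": "car"}]},
        resource_type=ResourceType.LOCAL,
        resources=["image1.jpg", "image2.jpg"],
    )
    print(f"created task {task.id}")
```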

CLI

Supported annotation formats

CVAT supports multiple annotation formats. You can select the format after clicking the Upload annotation and Dump annotation buttons. The Datumaro dataset framework allows additional dataset transformations via its command-line tool and Python library.

For more information about the supported formats, see: Annotation Formats.
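
For example, a format conversion can be sketched with the Datumaro Python library as follows (paths are placeholders; the media-saving flag is named save_media in recent Datumaro versions and save_images in older ones):

```python
import datumaro as dm

# Load a dataset exported from CVAT and re-export it as MS COCO.
dataset = dm.Dataset.import_from("path/to/cvat_export", format="cvat")
dataset.export("path/to/coco_output", format="coco", save_media=True)
```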

| Annotation format | Import | Export |
| --- | --- | --- |
| CVAT for images | ✔️ | ✔️ |
| CVAT for a video | ✔️ | ✔️ |
| Datumaro | ✔️ | ✔️ |
| PASCAL VOC | ✔️ | ✔️ |
| Segmentation masks from PASCAL VOC | ✔️ | ✔️ |
| YOLO | ✔️ | ✔️ |
| MS COCO Object Detection | ✔️ | ✔️ |
| MS COCO Keypoints Detection | ✔️ | ✔️ |
| MOT | ✔️ | ✔️ |
| MOTS PNG | ✔️ | ✔️ |
| LabelMe 3.0 | ✔️ | ✔️ |
| ImageNet | ✔️ | ✔️ |
| CamVid | ✔️ | ✔️ |
| WIDER Face | ✔️ | ✔️ |
| VGGFace2 | ✔️ | ✔️ |
| Market-1501 | ✔️ | ✔️ |
| ICDAR13/15 | ✔️ | ✔️ |
| Open Images V6 | ✔️ | ✔️ |
| Cityscapes | ✔️ | ✔️ |
| KITTI | ✔️ | ✔️ |
| Kitti Raw Format | ✔️ | ✔️ |
| LFW | ✔️ | ✔️ |
| Supervisely Point Cloud Format | ✔️ | ✔️ |

Deep learning serverless functions for automatic labeling

CVAT supports automatic labeling, which can speed up the annotation process by up to 10x. Below is a list of the algorithms we support and the platforms they can run on (a request sketch follows the table):

| Name | Type | Framework | CPU | GPU |
| --- | --- | --- | --- | --- |
| Segment Anything | interactor | PyTorch | ✔️ | ✔️ |
| Deep Extreme Cut | interactor | OpenVINO | ✔️ | |
| Faster RCNN | detector | OpenVINO | ✔️ | |
| Mask RCNN | detector | OpenVINO | ✔️ | |
| YOLO v3 | detector | OpenVINO | ✔️ | |
| YOLO v7 | detector | ONNX | ✔️ | ✔️ |
| Object reidentification | reid | OpenVINO | ✔️ | |
| Semantic segmentation for ADAS | detector | OpenVINO | ✔️ | |
| Text detection v4 | detector | OpenVINO | ✔️ | |
| SiamMask | tracker | PyTorch | ✔️ | ✔️ |
| TransT | tracker | PyTorch | ✔️ | ✔️ |
| f-BRS | interactor | PyTorch | ✔️ | |
| HRNet | interactor | PyTorch | | ✔️ |
| Inside-Outside Guidance | interactor | PyTorch | ✔️ | |
| Faster RCNN | detector | TensorFlow | ✔️ | ✔️ |
| Mask RCNN | detector | TensorFlow | ✔️ | ✔️ |
| RetinaNet | detector | PyTorch | ✔️ | ✔️ |
| Face Detection | detector | OpenVINO | ✔️ | |
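
These models run as Nuclio serverless functions that CVAT calls over HTTP. As a rough illustration of the mechanism, a deployed detector can be invoked directly as sketched below; the port and the request/response schema here are assumptions for illustration, not a documented contract:

```python
import base64
import json
import urllib.request

# Hypothetical endpoint of a deployed Nuclio detector function;
# the port and payload schema are assumptions, not guaranteed.
FUNCTION_URL = "http://localhost:32768"

with open("frame.jpg", "rb") as f:
    payload = {"image": base64.b64encode(f.read()).decode("ascii")}

request = urllib.request.Request(
    FUNCTION_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(request) as response:
    detections = json.load(response)  # e.g. a list of labeled shapes
print(detections)
```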

License

The code is released under the MIT License.

This software uses LGPL-licensed libraries from the FFmpeg project. The exact steps on how FFmpeg was configured and compiled can be found in the Dockerfile.

FFmpeg is an open-source framework licensed under LGPL and GPL. See https://www.ffmpeg.org/legal.html. You are solely responsible for determining if your use of FFmpeg requires any additional licenses. CVAT.ai Corporation is not responsible for obtaining any such licenses, nor liable for any licensing fees due in connection with your use of FFmpeg.

Contact us

Gitter to ask CVAT usage-related questions. Questions are typically answered quickly by the core team or the community, and you can also browse other common questions there.

Discord is another place to ask questions or discuss anything else related to CVAT.

LinkedIn for the company and work-related questions.

YouTube to see screencasts and tutorials about CVAT.

GitHub issues for feature requests or bug reports. If it's a bug, please add the steps to reproduce it.

The #cvat tag on Stack Overflow is one more way to ask questions and get our support.

[email protected] to reach out to us if you need commercial support.

Links

cvat's People

Contributors

activechoon, alexeyalexeevxperienceai, annapetrovicheva, arvfilippov, azhavoro, benhoff, bsekachev, cvat-bot[bot], dependabot-preview[bot], dependabot[bot], dmitriyoparin, dmitriysidnev, dvkruchinin, k1won, klakhov, manasars, marishka17, mdacoca, nmanovic, novda, pktiuk, pmazarovich, sizov-kirill, snyk-bot, speclad, tosmanov, vnishukov, yasakova-anastasia, zankevich, zhiltsov-max


cvat's Issues

Sort labels in alphabetical order

Currently, labels are sorted by primary key. If all labels were provided at task creation, this results in a semi-random label order, which makes it significantly harder to find the required label when working with a large number of them. One workaround is to add labels one by one, but that is not a pleasant process...

Remove all annotations inside a range of frames

It would be a very useful option to remove all annotations from one frame to another.
I want to re-annotate part of a video, and it is not a good idea to hunt for keyframes and turn them off, or to delete lines in the XML file and re-upload the annotation.

Feature request: add tracking

It'd be great to add a tracking mode e.g. see video here.

Specifically, if I enable tracking (per track) and there are no annotations later in the track, then attempt to track the last annotated box through all future frames. If the user moves to the next frame while one of these tracked boxes is displayed, that is treated as marking it 'good', and it gets added as a keyframe. Otherwise, the user can edit it manually.

There are probably more UI considerations, but this would provide a lot of value. My use case is tracking people heads, and the standard interpolation is less useful (but still much better than without!) due to heads 'bobbing' while walking etc. Feature tracking would likely solve this in many situations (aside from when a head is occluded etc.).

Aside from UI considerations, this is pretty easy to implement. It can even be done in the browser with opencv.js (and suitable performance, depending on device).
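
For illustration, a server-side Python equivalent of this idea with an off-the-shelf OpenCV tracker might look like the sketch below. The video path, frame number, and box are placeholders, and TrackerCSRT requires the opencv-contrib-python package:

```python
import cv2

# Placeholders: video path, last annotated frame, and box (x, y, w, h).
cap = cv2.VideoCapture("video.mp4")
cap.set(cv2.CAP_PROP_POS_FRAMES, 100)  # seek to the last annotated frame
ok, frame = cap.read()

tracker = cv2.TrackerCSRT_create()       # requires opencv-contrib-python
tracker.init(frame, (50, 40, 120, 160))  # the last annotated box

while True:
    ok, frame = cap.read()
    if not ok:
        break  # end of video
    ok, box = tracker.update(frame)
    if not ok:
        break  # target lost: hand control back to the annotator
    # `box` is the proposed annotation for this frame; accepting it
    # (moving to the next frame) would add it as a keyframe
```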

The login page at localhost:8080 can't be reached

I followed the Installation instructions, but after running the docker-compose up -d command, I get a "connection was reset" error in Chrome and don't see the login page.

The output of the docker-compose up -d command was:
Creating network "cvat_default" with the default driver
Creating cvat_db ... done
Creating cvat_redis ... done
Creating cvat ... done

And docker ps outputs:

CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ec26a5065b07 cvat "/usr/bin/supervisord" 25 minutes ago Up 25 minutes 0.0.0.0:8080->8080/tcp, 8443/tcp cvat

Improve documentation for overlap parameter

My understanding is that overlap just specifies how many frames overlap when splitting a video into segments. If that's correct, what's the purpose? Does it actually do anything for the user? (E.g. if I do the tracks on the first segment, are they copied across to the next segment in the overlapping region? This doesn't appear to be the case.) Put another way: should I just set overlap=0 in my video tagging tasks, to avoid having to manually resolve different taggings from each segment in the overlap?
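
For reference, here is the splitting behavior as I understand it; this is an illustrative sketch of my assumption, not CVAT's actual implementation:

```python
def segment_ranges(num_frames: int, segment_size: int, overlap: int):
    """Yield (first, last) frame indices for overlapping segments."""
    step = segment_size - overlap
    start = 0
    while start < num_frames:
        yield start, min(start + segment_size, num_frames) - 1
        start += step

# A 300-frame video, segment size 100, overlap 5:
print(list(segment_ranges(300, 100, 5)))
# [(0, 99), (95, 194), (190, 289), (285, 299)]
```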

Sorry if I've misunderstood something obvious.

PS - great tool!

Where does the shared server directory point to?

To create huge tasks, the documentation suggests choosing the Share option in the dialog box.
While trying to select the files, I see a modal popping up with the following path as its title:
//icv-cifs/icv_projects/cvat/data

However, I cannot navigate from there (nor can I find where this path points). The documentation also does not elaborate much on tasks with a large number of frames. Any advice?


Enable video stream access

Hi,

My workflow is such that I have thousands of frames per annotation task, which amounts to extensive disk space usage (e.g. a <15 MB video (~40 s VGA@30fps) results in ~2 GB of JPEGs).

Adding the ability for CVAT to work directly on a video stream would be a significant improvement, as it would allow the user to specify only a URL/path, with an optional download and local storage capability.

One way I see this to be done is by usage of OpenCV.js (https://docs.opencv.org/3.4/d5/d10/tutorial_js_root.html).
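
On the server side, the same idea can be sketched in Python with OpenCV. This is a minimal illustration, not CVAT code, and frame seeking is not frame-accurate for every codec:

```python
import cv2

def read_frame(video_path: str, frame_number: int):
    """Decode a single frame on demand instead of storing per-frame JPEGs."""
    cap = cv2.VideoCapture(video_path)
    cap.set(cv2.CAP_PROP_POS_FRAMES, frame_number)
    ok, frame = cap.read()
    cap.release()
    if not ok:
        raise ValueError(f"cannot decode frame {frame_number} of {video_path}")
    return frame
```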

I could invest time in this. Let me know of your thoughts.

Thanks

How to configure the environment?

Can you provide a tutorial for configuring the environment for this project? After installing docker and docker-compose, I don't know what to do next.

Register new users

Hi
Thanks for the great project. It is exactly what I was looking for. I was even able to run it on an AWS EC2 instance.
When a new user tries to register, they get:

Forbidden
Your account doesn't have access to this page. To proceed, please login with an account that has access or contact your admin.

I guess this is still not implemented. Am I right?

CVAT - AWS-Deployment guide

It would be nice if we had some docs explaining how to deploy this onto an AWS CUDA deep-learning machine.
Let me add this to the CVAT docs, or if anyone could build a CVAT AMI on AWS, that would be great. Most of the time, we use CVAT on AWS. I believe it would be helpful to other teams.

Support Pascal VOC Format

Hi, it would be nice to be able to export the annotations in Pascal VOC format. I couldn't find info about supported formats in the documentation; is this feature supported?

UI becomes slow after 300-400 annotations

I'm labeling large satellite images with hundreds to a few thousand objects of interest.

I noticed that after about 300-400 annotations, the UI slows down. It might take the program ~1 sec to become responsive again after creating a new bbox. After about 800-1000 annotations, it's nearly unusable -- adding an annotation might require ~5 seconds before it will register. For now, I'm just cropping my large images into smaller pieces as a workaround, but it'd be a lot nicer to add all annotations to a single large image (as raw satellite imagery often comes in fairly long strips). I'm using a 2017 MacBook pro to do the labeling.

I don't know enough about the backend to suggest a fix, but happy to answer questions if it's helpful.

Error: Failed to execute 'inverse' on 'SVGMatrix': The matrix is not invertible

Error: Failed to execute 'inverse' on 'SVGMatrix': The matrix is not invertible.
at translateSVGPos (https://cvat-icv.inn.intel.com/static/CACHE/js/33b452232897.js:9422:54)
at ShapeCreatorView. (https://cvat-icv.inn.intel.com/static/CACHE/js/33b452232897.js:7852:30)
at HTMLDivElement.dispatch (https://cvat-icv.inn.intel.com/static/CACHE/js/716e033f0bc5.js:24801:27)
at HTMLDivElement.elemData.handle (https://cvat-icv.inn.intel.com/static/CACHE/js/716e033f0bc5.js:24609:28)

Mark ignore regions and keyframes for an object

Thanks for the great annotation server.
It would be great to have an "uncertain" flag for annotations (like the existing occlusion flag). It means that, as a human, I can see and annotate the object, but it is OK if the detector does not detect it (the algorithm should not be penalized for that).

The login page at localhost:8080 returns Bad Request

I am using Ubuntu 16.04.

I followed the tutorial and installed it successfully, but on the second day, when I ran the docker containers, the page wouldn't open, showing a 400 status code (Bad Request). How can I fix this?

The output of the docker-compose up -d command was:
Creating network "cvat_default" with the default driver
Creating cvat_db ... done
Creating cvat_redis ... done
Creating cvat ... done

And the output of docker logs cvat:
logs.txt

Mechanical Turk Integration

Integration of CVAT with MTurk for deploying work as HITs would be very useful for such projects. This would require integrating the Turkic framework from VATIC with CVAT.
I would also like to contribute to your project. Please help me set up the development environment for this.

Navigation by frames may work incorrectly

Frame navigation may work incorrectly in the following scenario:

  1. Open any task in CVAT
  2. Resize the browser to a size smaller than the CVAT workspace
  3. Scroll the browser slider to the right
  4. Try to navigate with the player progress bar

The player will not react to progress bar navigation if the cursor is near the start of the progress bar. This unresponsive area grows as you scroll the browser slider further to the right.

Keypoint Annotation

I wanted to ask about the keypoint annotation feature you are working on now. Would it have a standard configuration/format, like keypoints for annotating human pose? Would it have the same interpolation feature as the current bounding boxes? Finally, when will the feature be released? Do you have a specific date in mind? Thank you

Video/Image loading status as on youtube

Another question and likely feature suggestion.

When I start a job, if I wait long enough, will all the frames be loaded into the browser?
Or, are they loaded on demand as I seek through the video?
Are they cached locally in memory?

I'm working with 4k video and the interface isn't that usable, at least for my current use model, until all frames have been loaded.

Based on the answer above, it would be great to have feedback as to whether the frames have all been loaded or, better, which frames have been loaded. What I've seen that works well is using a different color on the seek bar for frames that have been loaded.

If they are demand loaded, it would be nice to have a way to force it to load them all (as long as there's enough memory available).

How to keep track IDs?

After annotation is done and the annotation is uploaded, the IDs of the targets get mixed up.

How can I include the ID information of the calibration targets in the exported annotation file, and re-import the annotation file without the IDs getting confused?

Release notes?

Are there release notes available anywhere?
If not:

  • should they be added?
  • what's the best way to figure out what has changed? look through git logs?

Undo functionality

Have you considered undo functionality?
Seems like that would be a very useful feature.
Thanks.

Extend contributing.md

Hi,

Could you please describe or suggest development and testing steps?
In particular, how would one perform the edit-update-run (debug) cycle for both the server and client parts?

Running on AWS EC2

I was trying to run CVAT on AWS EC2 and hit an issue accessing CVAT from outside AWS: it was returning Bad Request (400) all the time. I found a solution: add the EC2 instance's public IP to ALLOWED_HOSTS in docker-compose.override.yml, as specified in the documentation. But it is not the nicest solution; every time the IP changes, I have to update that value. It would be great if someone with more AWS experience could provide a more elegant solution. Thanks

Re-id app to merge bboxes into tracks after TF annotation

Hi, great tool. For ground-truth annotation, there are often too many objects in every frame, and it would be tremendously tedious to annotate the track for every single object. Is there any pre-trained model, or a way to run a custom model, that can detect possibly identical objects, so that all I have to do is review and merge their tracks/IDs into one?

thanks

How to run it without docker?

It's tedious to install docker and configure the settings; is there any way to run it directly?
After installing a lot of missing libraries for Django, I ran into a problem:

ERRORS:
engine.Task: (auth.E005) The permission codenamed 'view_task' clashes with a builtin permission for model 'engine.Task'.

The script is:
sudo python3 manage.py createsuperuser

Video file name / url in output file

Hi,
First of all, thanks for the tool. It works great!

When I annotate video files, and for that purpose I create an annotation task per video, I cannot seem to find any reference to the original video name / path / url inside the task itself. Moreover, inside the output XML file generated after annotating, there are no references to that information at all. The only thing I can find is the url of the corresponding task, but I don't think I can extract from that url the information I'm looking for (i.e. the name of the video).

The only workaround I can think of is naming the annotation task after the video itself, and do the same for the xml file. However, I don't really like that solution. The ideal solution for me would be to have video file name inside the xml file.

Am I missing something? Please point me in the right direction.

Thank you very much

XML file metadata: labels are incomplete

The labeling schema doesn't make it into the output XML file.

As an example, I created a job with a 'labels' spec of:

person @select=type:white,blue,ref ball

and the dumped XML file is:

<?xml version="1.0" encoding="utf-8"?>
<annotations>
  <version>1.0</version>
  <meta>
    <task>
      <id>16</id>
      <name>test</name>
      <size>902</size>
      <mode>interpolation</mode>
      <overlap>5</overlap>
      <bugtracker></bugtracker>
      <created>2018-07-26 02:58:56.014598+03:00</created>
      <updated>2018-07-26 02:58:56.014613+03:00</updated>
      <labels>
        <label>
          <name>ball</name>
          <attributes>
          </attributes>
        </label>
      </labels>
      <segments>
        <segment>
          <id>24</id>
          <start>0</start>
          <stop>901</stop>
          <url>http://13.66.164.80/?id=24</url>
        </segment>
      </segments>
      <owner>
        <username>cvat</username>
        <email>[email protected]</email>
      </owner>
    </task>
    <dumped>2018-07-26 02:59:11.669206+03:00</dumped>
  </meta>
</annotations>

Note that most of the 'labels' information is missing. The only way I was able to confirm the 'labels' spec for an existing job was that it was stored in the browser history.

Thanks.

Could not create the task. ffmpy.FFRuntimeError

Built docker image from the latest sources. Created superuser. Getting an error on task creation:

Could not create the task. ffmpy.FFRuntimeError: ffmpeg -i /home/django/data/2/.upload/20170209T193000.000000Z.mp4 -start_number 0 -b:v 10000k -vsync 0 -an -y -q:v 16 /tmp/cvat-p9csbe_h.data/%d.jpg exited with status 1 STDOUT: STDERR:


I connected to the running cvat container with docker exec -it <container id> /bin/bash and pasted the command from the error message into the terminal. It fails because the cvat-* folder in /tmp doesn't exist.

Incorrect number of frames in video

I loaded a video with resolution 4096x2178; it has 1079 frames.
In the job statistics I see 959 frames, and the XML file shows the same number.

Using the same attribute for a label twice -> stuck

There is no warning when you use the same attribute multiple times. This can easily happen when copy-pasting.

Errors I've experienced when doing this:

  1. The job doesn't start up; instead you can only see the loading screen.
  2. Can't exit out of drawing the bounding box.

EDIT: Number two has more to do with large files (3.5 GB), I think. I will investigate further.

Also, the single input line provided makes it extremely uncomfortable to type or paste labels.

Greetings

Host Container on Docker Hub

Could you please connect this repository to Docker Hub? This way it would be possible to simply download the already-built container, since the build process is rather lengthy.

I encountered difficulties on the task configuration page.

I have created a new task. After I filled out the name and labels and selected the files, I submitted the page. But I have been waiting for two or three hours on a page that only says: "Successful Request! Creating...".
So I want to know how to configure the task; can you share your configuration?
My configuration is as follows:
Name: task 1
Labels: vehicle @select=type:undefined,car,truck,bus,train ~radio=quality:good,bad ~checkbox=parked:false
Select Files: 2.mp4
