128 lines
11 KiB
Markdown
128 lines
11 KiB
Markdown
|
|
---
|
|||
|
|
title: 'Automatic annotation'
|
|||
|
|
linkTitle: 'Automatic annotation'
|
|||
|
|
weight: 16
|
|||
|
|
description: 'Automatic annotation of tasks'
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
Automatic annotation in CVAT is a tool that you can use
|
|||
|
|
to automatically pre-annotate your data with pre-trained models.
|
|||
|
|
|
|||
|
|
CVAT can use models from the following sources:
|
|||
|
|
|
|||
|
|
- [Pre-installed models](#models).
|
|||
|
|
- Models integrated from [Hugging Face and Roboflow](#adding-models-from-hugging-face-and-roboflow).
|
|||
|
|
- {{< ilink "/docs/manual/advanced/serverless-tutorial" "Self-hosted models deployed with Nuclio" >}}.
|
|||
|
|
- {{< ilink "/docs/enterprise/segment-anything-2-tracker" "AI agent functions (SAM2 tracking)" >}}
|
|||
|
|
for CVAT Online and Enterprise.
|
|||
|
|
|
|||
|
|
The following table describes the available options:
|
|||
|
|
|
|||
|
|
| | Self-hosted | Online |
|
|||
|
|
| ------------------------------------------- | ---------------------- | ------------------------------------------------------ |
|
|||
|
|
| **Price** | Free | See [Pricing](https://www.cvat.ai/pricing/cvat-online) |
|
|||
|
|
| **Models** | You have to add models | You can use pre-installed models |
|
|||
|
|
| **Hugging Face & Roboflow <br>integration** | Not supported | Supported |
|
|||
|
|
| **AI Agent Functions** | Supported (Enterprise) | Supported (SAM2 tracking available) |
|
|||
|
|
|
|||
|
|
See:
|
|||
|
|
|
|||
|
|
- [Running Automatic annotation](#running-automatic-annotation)
|
|||
|
|
- [Labels matching](#labels-matching)
|
|||
|
|
- [Models](#models)
|
|||
|
|
- [Adding models from Hugging Face and Roboflow](#adding-models-from-hugging-face-and-roboflow)
|
|||
|
|
|
|||
|
|
## Running Automatic annotation
|
|||
|
|
|
|||
|
|
To start automatic annotation, do the following:
|
|||
|
|
|
|||
|
|
1. On the top menu, click **Tasks**.
|
|||
|
|
1. Find the task you want to annotate and click **Action** > **Automatic annotation**.
|
|||
|
|
|
|||
|
|

|
|||
|
|
|
|||
|
|
1. In the Automatic annotation dialog, from the drop-down list, select a [model](#models).
|
|||
|
|
1. [Match the labels](#labels-matching) of the model and the task.
|
|||
|
|
1. (Optional) In case you need the model to return masks as polygons, switch toggle **Return masks as polygons**.
|
|||
|
|
1. (Optional) In case you need to remove all previous annotations, switch toggle **Clean old annotations**.
|
|||
|
|
1. (Optional) You can specify a **Threshold** for the model.
|
|||
|
|
If not provided, the default value from the model settings will be used.
|
|||
|
|
|
|||
|
|

|
|||
|
|
|
|||
|
|
1. Click **Annotate**.
|
|||
|
|
|
|||
|
|
CVAT will show the progress of annotation on the progress bar.
|
|||
|
|
|
|||
|
|

|
|||
|
|
|
|||
|
|
You can stop the automatic annotation at any moment by clicking cancel.
|
|||
|
|
|
|||
|
|
## Labels matching
|
|||
|
|
|
|||
|
|
Each model is trained on a dataset and supports only the dataset's labels.
|
|||
|
|
|
|||
|
|
For example:
|
|||
|
|
|
|||
|
|
- DL model has the label `car`.
|
|||
|
|
- Your task (or project) has the label `vehicle`.
|
|||
|
|
|
|||
|
|
To annotate, you need to match these two labels to give
|
|||
|
|
CVAT a hint that, in this case, `car` = `vehicle`.
|
|||
|
|
|
|||
|
|
If you have a label that is not on the list
|
|||
|
|
of DL labels, you will not be able to
|
|||
|
|
match them.
|
|||
|
|
|
|||
|
|
For this reason, supported DL models are suitable only
|
|||
|
|
for certain labels.
|
|||
|
|
|
|||
|
|
To check the list of labels for each model, see [Models](#models)
|
|||
|
|
papers and official documentation.
|
|||
|
|
|
|||
|
|
## Models
|
|||
|
|
|
|||
|
|
Automatic annotation uses pre-installed and added models.
|
|||
|
|
|
|||
|
|
{{% alert title="Note" color="primary" %}}
|
|||
|
|
For self-hosted solutions,
|
|||
|
|
you need to
|
|||
|
|
{{< ilink "/docs/administration/advanced/installation_automatic_annotation" "install Automatic Annotation first" >}}
|
|||
|
|
and {{< ilink "/docs/manual/advanced/models" "add models" >}}.
|
|||
|
|
{{% /alert %}}
|
|||
|
|
|
|||
|
|
List of pre-installed models:
|
|||
|
|
|
|||
|
|
<!--lint disable maximum-line-length-->
|
|||
|
|
|
|||
|
|
| Model | Description |
|
|||
|
|
| ------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|||
|
|
| Attributed face detection | Three OpenVINO models work together: <br><br><li> [Face Detection 0205](https://docs.openvino.ai/2022.3/omz_models_model_face_detection_0205.html): face detector based on MobileNetV2 as a backbone with a FCOS head for indoor and outdoor scenes shot by a front-facing camera. <li>[Emotions recognition retail 0003](https://docs.openvino.ai/2022.3/omz_models_model_emotions_recognition_retail_0003.html#emotions-recognition-retail-0003): fully convolutional network for recognition of five emotions (‘neutral’, ‘happy’, ‘sad’, ‘surprise’, ‘anger’). <li>[Age gender recognition retail 0013](https://docs.openvino.ai/2022.3/omz_models_model_age_gender_recognition_retail_0013.html): fully convolutional network for simultaneous Age/Gender recognition. The network can recognize the age of people in the \[18 - 75\] years old range; it is not applicable for children since their faces were not in the training set. |
|
|||
|
|
| RetinaNet R101 | RetinaNet is a one-stage object detection model that utilizes a focal loss function to address class imbalance during training. Focal loss applies a modulating term to the cross entropy loss to focus learning on hard negative examples. RetinaNet is a single, unified network composed of a backbone network and two task-specific subnetworks. <br><br>For more information, see: <li>[Site: RetinaNET](https://paperswithcode.com/lib/detectron2/retinanet) |
|
|||
|
|
| Text detection | Text detector based on PixelLink architecture with MobileNetV2, depth_multiplier=1.4 as a backbone for indoor/outdoor scenes. <br><br> For more information, see: <li>[Site: OpenVINO Text detection 004](https://docs.openvino.ai/2022.3/omz_models_model_text_detection_0004.html) |
|
|||
|
|
| YOLO v3 | YOLO v3 is a family of object detection architectures and models pre-trained on the COCO dataset. <br><br> For more information, see: <li>[Site: YOLO v3](https://docs.openvino.ai/2022.3/omz_models_model_yolo_v3_tf.html) |
|
|||
|
|
| YOLO v7 | YOLOv7 is an advanced object detection model that outperforms other detectors in terms of both speed and accuracy. It can process frames at a rate ranging from 5 to 160 frames per second (FPS) and achieves the highest accuracy with 56.8% average precision (AP) among real-time object detectors running at 30 FPS or higher on the V100 graphics processing unit (GPU). <br><br> For more information, see: <li>[GitHub: YOLO v7](https://github.com/WongKinYiu/yolov7) <li>[Paper: YOLO v7](https://arxiv.org/pdf/2207.02696.pdf) |
|
|||
|
|
|
|||
|
|
<!--lint enable maximum-line-length-->
|
|||
|
|
|
|||
|
|
## Adding models from Hugging Face and Roboflow
|
|||
|
|
|
|||
|
|
In case you did not find the model you need, you can add a model
|
|||
|
|
of your choice from [Hugging Face](https://huggingface.co/)
|
|||
|
|
or [Roboflow](https://roboflow.com/).
|
|||
|
|
|
|||
|
|
{{% alert title="Note" color="primary" %}}
|
|||
|
|
You cannot add models from Hugging Face and Roboflow to self-hosted CVAT.
|
|||
|
|
{{% /alert %}}
|
|||
|
|
|
|||
|
|
<!--lint disable maximum-line-length-->
|
|||
|
|
|
|||
|
|
For more information,
|
|||
|
|
see [Streamline annotation by integrating Hugging Face and Roboflow models](https://www.cvat.ai/post/integrating-hugging-face-and-roboflow-models).
|
|||
|
|
|
|||
|
|
This video demonstrates the process:
|
|||
|
|
|
|||
|
|
<iframe width="560" height="315" src="https://www.youtube.com/embed/SbU3aB65W5s" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>
|
|||
|
|
|
|||
|
|
<!--lint enable maximum-line-length-->
|