---
title: 'OpenCV and AI Tools'
linkTitle: 'OpenCV and AI Tools'
weight: 14
description: 'Overview of semi-automatic and automatic annotation tools available in CVAT.'
---

Label and annotate your data in semi-automatic and automatic mode with the help of **AI** and **OpenCV** tools.

While {{< ilink "/docs/manual/advanced/annotation-with-polygons/track-mode-with-polygons" "interpolation" >}}
works well for videos shot with a stable camera (for example, security footage),
**AI** and **OpenCV** tools handle both cases:
videos where the camera is stable and videos where the camera
moves together with the object or the object's movements are chaotic.

See:

- [Interactors](#interactors)
  - [AI tools: annotate with interactors](#ai-tools-annotate-with-interactors)
  - [AI tools: add extra points](#ai-tools-add-extra-points)
  - [AI tools: delete points](#ai-tools-delete-points)
  - [OpenCV: intelligent scissors](#opencv-intelligent-scissors)
  - [Settings](#settings)
  - [Interactors models](#interactors-models)
- [Detectors](#detectors)
  - [Labels matching](#labels-matching)
  - [Annotate with detectors](#annotate-with-detectors)
  - [Detectors models](#detectors-models)
- [Trackers](#trackers)
  - [AI tools: annotate with trackers](#ai-tools-annotate-with-trackers)
  - [OpenCV: annotate with trackers](#opencv-annotate-with-trackers)
  - [When tracking](#when-tracking)
  - [Trackers models](#trackers-models)
- [OpenCV: histogram equalization](#opencv-histogram-equalization)

## Interactors

Interactors are a part of **AI** and **OpenCV** tools.

Use interactors to label objects in images by
creating a polygon semi-automatically.

When creating a polygon, you can use positive points
or negative points (for some models):

- **Positive points** define the area in which the object is located.
- **Negative points** define the area in which the object is not located.



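To give a sense of how an interactor consumes these point prompts, below is a minimal sketch using the open-source `segment-anything` package, which backs the SAM interactor listed in [Interactors models](#interactors-models). It runs outside CVAT, and the checkpoint and image paths are placeholders.

```python
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

# Placeholder paths: substitute your own checkpoint and image.
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("frame.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# Positive points (label 1) mark the object; negative points (label 0) mark background.
point_coords = np.array([[420, 260], [450, 300], [600, 120]])
point_labels = np.array([1, 1, 0])

masks, scores, _ = predictor.predict(
    point_coords=point_coords,
    point_labels=point_labels,
    multimask_output=False,
)
print(masks.shape, scores)  # (1, H, W) boolean mask and its confidence score
```

In CVAT the same idea is driven entirely from the UI: each click adds a point prompt, and the resulting mask or polygon is drawn for you.
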
### AI tools: annotate with interactors

To annotate with interactors, do the following:

1. Click **Magic wand**, and go to the **Interactors** tab.
2. From the **Label** drop-down, select a label for the polygon.
3. From the **Interactor** drop-down, select a model (see [Interactors models](#interactors-models)).
   <br>Click the **Question mark** to see information about each model:
   <br>
4. (Optional) If the model returns masks, and you need to
   convert masks to polygons, use the **Convert masks to polygons** toggle.
5. Click **Interact**.
6. Use the left click to add positive points and the right click to add negative points.
   <br>The number of points you can add depends on the model.
7. On the top menu, click **Done** (or **Shift+N**, **N**).

### AI tools: add extra points

{{% alert title="Note" color="primary" %}}
More points improve outline accuracy, but make shape editing harder.
Fewer points make shape editing easier, but reduce outline accuracy.
{{% /alert %}}

Each model has a minimum required number of points for annotation.
Once the required number of points is reached, the request
is automatically sent to the server.
The server processes the request and adds a polygon to the frame.

For a more accurate outline, postpone the request
until you finish adding extra points:

1. Hold down the **Ctrl** key.
   <br>On the top panel, the **Block** button will turn blue.
2. Add points to the image.
3. Release the **Ctrl** key when you are ready.

If you used the **Convert masks to polygons** toggle, you can edit
the finished object like a polygon.

You can change the number of points in the
polygon with the slider:



### AI tools: delete points

To delete a point, do the following:

1. With the cursor, hover over the point you want to delete.
2. If the point can be deleted, it will enlarge and the cursor will turn into a cross.
3. Left-click on the point.

### OpenCV: intelligent scissors

To use **Intelligent scissors**, do the following:

1. On the menu toolbar, click **OpenCV** and wait for the library to load.
   <br>
2. Go to the **Drawing** tab, select the label, and click on the **Intelligent scissors** button.

   

3. Add the first point on the boundary of the object you want to annotate.
   <br>You will see a line repeating the outline of the object.
4. Add the second point, so that the previous point is within the restrictive threshold.
   <br>After that, a line repeating the object boundary will be automatically created between the points.

   

5. To finish placing points, on the top menu click **Done** (or **N** on the keyboard).

As a result, a polygon will be created.

You can change the number of points in the
polygon with the slider:



To increase or decrease the action threshold, hold **Ctrl** and scroll the mouse wheel.

During the drawing process, you can remove the last point by clicking on it with the left mouse button.

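If you are curious what this tool does under the hood, here is a rough sketch with `opencv-python` of the same `IntelligentScissorsMB` algorithm that CVAT runs in the browser through OpenCV.js. The parameter values, point coordinates, and image path are placeholders, and the exact Python binding name can differ between OpenCV versions.

```python
import cv2

# Binding name may vary across OpenCV 4.x releases;
# cv2.segmentation_IntelligentScissorsMB() works in recent opencv-python builds.
tool = cv2.segmentation_IntelligentScissorsMB()
tool.setEdgeFeatureCannyParameters(32, 100)   # Canny thresholds for edge features
tool.setGradientMagnitudeMaxLimit(200)

image = cv2.imread("frame.jpg")               # placeholder image path
tool.applyImage(image)

# The first click seeds the map; each later click asks for the cheapest
# edge-following path from the seed - the "line repeating the outline".
tool.buildMap((120, 80))                      # (x, y) of the first boundary point
contour = tool.getContour((200, 140))         # path towards the second point
print(contour.shape)                          # N x 1 x 2 array of polyline points
```
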
### Settings

- To learn how to adjust the polygon,
  see {{< ilink "/docs/manual/basics/CVAT-annotation-Interface/objects-sidebar#appearance" "Objects sidebar" >}}.

- For more information about polygons in general, see
  {{< ilink "/docs/manual/advanced/annotation-with-polygons" "Annotation with polygons" >}}.

### Interactors models

<!--lint disable maximum-line-length-->

| Model | Tool | Description | Example |
| ----- | ---- | ----------- | ------- |
| Segment Anything Model (SAM) | AI Tools | The Segment Anything Model (SAM) produces high <br> quality object masks, and it can be used to generate <br> masks for all objects in an image. It has been trained <br>on a dataset of 11 million images and <br>1.1 billion masks, and has strong zero-shot performance on a variety of segmentation tasks. <br><br>For more information, see: <li>[GitHub: Segment Anything](https://github.com/facebookresearch/segment-anything) <li>[Site: Segment Anything](https://segment-anything.com/)<li>[Paper: Segment Anything](https://ai.facebook.com/research/publications/segment-anything/) |  |
| Deep extreme <br>cut (DEXTR) | AI Tool | This is an optimized version of the original model, <br>introduced at the end of 2017. It uses the <br>information about extreme points of an object <br>to get its mask. The mask is then converted to a polygon. <br>For now this is the fastest interactor on the CPU. <br><br>For more information, see: <li>[GitHub: DEXTR-PyTorch](https://github.com/scaelles/DEXTR-PyTorch) <li>[Site: DEXTR-PyTorch](https://cvlsegmentation.github.io/dextr)<li>[Paper: DEXTR-PyTorch](https://arxiv.org/pdf/1711.09081.pdf) |  |
| Inside-Outside-Guidance<br>(IOG) | AI Tools | The model uses a bounding box and <br>inside/outside points to create a mask. <br>First, create a bounding box <br>wrapping the object. <br>Then use positive and negative <br>points to tell the model where the <br>foreground and the background are. <br>Negative points are optional. <br><br>For more information, see: <li>[GitHub: IOG](https://github.com/shiyinzhang/Inside-Outside-Guidance) <li>[Paper: IOG](https://openaccess.thecvf.com/content_CVPR_2020/papers/Zhang_Interactive_Object_Segmentation_With_Inside-Outside_Guidance_CVPR_2020_paper.pdf) |  |
| Intelligent scissors | OpenCV | Intelligent scissors is a CV method of creating <br>a polygon by placing points with the automatic <br>drawing of a line between them. The distance<br> between the adjacent points is limited by <br>the threshold of action, displayed as a <br>red square that is tied to the cursor. <br><br> For more information, see: <li>[Site: Intelligent Scissors Specification](https://docs.opencv.org/4.x/df/d6b/classcv_1_1segmentation_1_1IntelligentScissorsMB.html) |  |

<!--lint enable maximum-line-length-->

## Detectors

Detectors are a part of **AI** tools.

Use detectors to automatically
identify and locate objects in images or videos.

### Labels matching

Each model is trained on a dataset and supports only the dataset's labels.

For example:

- The DL model has the label `car`.
- Your task (or project) has the label `vehicle`.

To annotate, you need to match these two labels to give
the DL model a hint that, in this case, `car` = `vehicle`.

If you have a label that is not on the list
of DL labels, you will not be able to
match it.

For this reason, supported DL models are suitable only for certain labels.
<br>To check the list of labels for each model, see [Detectors models](#detectors-models).

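Conceptually, the matching step works like a dictionary lookup: detections whose model labels have a counterpart in your task are relabelled, and the rest are skipped. The snippet below is only a hypothetical illustration of that idea; the `label_mapping` and `detections` names are made up for the example and are not part of any CVAT API.

```python
# Hypothetical illustration of label matching: model labels on the left,
# task labels on the right. Detections without a match are skipped.
label_mapping = {"car": "vehicle", "person": "pedestrian"}

detections = [
    {"label": "car", "bbox": [10, 20, 110, 220], "score": 0.91},
    {"label": "dog", "bbox": [5, 5, 50, 60], "score": 0.88},  # no match, dropped
]

matched = [
    {**det, "label": label_mapping[det["label"]]}
    for det in detections
    if det["label"] in label_mapping
]
print(matched)  # only the "car" detection survives, relabelled as "vehicle"
```
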
### Annotate with detectors

To annotate with detectors, do the following:

1. Click **Magic wand**, and go to the **Detectors** tab.
2. From the **Model** drop-down, select a model (see [Detectors models](#detectors-models)).
3. From the left drop-down, select the DL model label; from the right drop-down,
   select the matching label of your task.

   

4. (Optional) If the model returns masks, and you
   need to convert masks to polygons, use the **Convert masks to polygons** toggle.
5. (Optional) You can specify a **Threshold** for the model. If not provided, the
   default value from the model settings will be used.
6. Click **Annotate**.

This action will automatically annotate one frame.
For automatic annotation of multiple frames,
see {{< ilink "/docs/manual/advanced/automatic-annotation" "Automatic annotation" >}}.

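The **Threshold** acts as a confidence cut-off: detections scoring below it are discarded before any shapes are created. A purely hypothetical illustration of that filtering step (the names here are invented for the example and are not a CVAT API):

```python
# Hypothetical illustration of the confidence threshold: only detections
# scoring at or above the threshold become shapes.
threshold = 0.5

detections = [
    {"label": "car", "score": 0.92},
    {"label": "car", "score": 0.31},  # below the threshold, discarded
]

kept = [det for det in detections if det["score"] >= threshold]
print(kept)  # only the 0.92 detection remains
```
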
### Detectors models

<!--lint disable maximum-line-length-->

| Model | Description |
| ----- | ----------- |
| Mask RCNN | The model generates polygons for each instance of an object in the image. <br><br> For more information, see: <li>[GitHub: Mask RCNN](https://github.com/matterport/Mask_RCNN) <li>[Paper: Mask RCNN](https://arxiv.org/pdf/1703.06870.pdf) |
| Faster RCNN | The model generates bounding boxes for each instance of an object in the image. <br>In this model, RPN and Fast R-CNN are combined into a single network. <br><br> For more information, see: <li>[GitHub: Faster RCNN](https://github.com/ShaoqingRen/faster_rcnn) <li>[Paper: Faster RCNN](https://arxiv.org/pdf/1506.01497.pdf) |
| YOLO v3 | YOLO v3 is a family of object detection architectures and models pre-trained on the COCO dataset. <br><br> For more information, see: <li>[GitHub: YOLO v3](https://github.com/ultralytics/yolov3) <li>[Site: YOLO v3](https://docs.ultralytics.com/#yolov3) <li>[Paper: YOLO v3](https://arxiv.org/pdf/1804.02767v1.pdf) |
| Semantic segmentation for ADAS | This is a segmentation network to classify each pixel into 20 classes. <br><br> For more information, see: <li>[Site: ADAS](https://docs.openvino.ai/2019_R1/_semantic_segmentation_adas_0001_description_semantic_segmentation_adas_0001.html) |
| Faster RCNN with Tensorflow | Faster RCNN version with Tensorflow. The model generates bounding boxes for each instance of an object in the image. <br>In this model, RPN and Fast R-CNN are combined into a single network. <br><br> For more information, see: <li>[Site: Faster RCNN with Tensorflow](https://docs.openvino.ai/2021.4/omz_models_model_faster_rcnn_inception_v2_coco.html) <li>[Paper: Faster RCNN](https://arxiv.org/pdf/1506.01497.pdf) |
| RetinaNet | Pytorch implementation of RetinaNet object detection. <br> <br><br> For more information, see: <li>[Specification: RetinaNet](https://paperswithcode.com/lib/detectron2/retinanet) <li>[Paper: RetinaNet](https://arxiv.org/pdf/1708.02002.pdf)<li>[Documentation: RetinaNet](https://detectron2.readthedocs.io/en/latest/tutorials/training.html) |
| Face Detection | Face detector based on MobileNetV2 as a backbone for indoor and outdoor scenes shot by a front-facing camera. <br> <br><br> For more information, see: <li>[Site: Face Detection 0205](https://docs.openvino.ai/latest/omz_models_model_face_detection_0205.html) |

<!--lint enable maximum-line-length-->

## Trackers

Trackers are part of **AI** and **OpenCV** tools.

Use trackers to identify and label
objects in a video or image sequence
that are moving or changing over time.

### AI tools: annotate with trackers

To annotate with trackers, do the following:

1. Click **Magic wand**, and go to the **Trackers** tab.
   <br>
2. From the **Label** drop-down, select the label for the object.
3. From the **Tracker** drop-down, select a tracker.
4. Click **Track**, and annotate the objects with the bounding box in the first frame.
5. Go to the top menu and click **Next** (or **F** on the keyboard)
   to move to the next frame.
   <br>All annotated objects will be automatically tracked.

### When tracking

- To enable/disable tracking, use the **Tracker switcher** on the sidebar.

  ![Tracker switcher](/images/tracker_switcher.jpg)

- Trackable objects are marked on the canvas with the model name.

  ![Tracker indication](/images/tracker_indication_detrac.jpg)

- You can follow the tracking progress in the messages that appear at the top.

  ![Tracker pop-up window](/images/tracker_pop-up_window.jpg)

### OpenCV: annotate with trackers

To annotate with trackers, do the following:

1. Create basic rectangle shapes or tracks for tracker initialization.
2. On the menu toolbar, click **OpenCV** and wait for the library to load.
   <br>
3. From the **Tracker** drop-down, select a tracker and click **Track**.
   <br>
4. The annotation actions window will pop up. Set up the `Target frame`
   and `Convert rectangle shapes to tracks` parameters, and click `Run`.

{{% alert title="Note" color="primary" %}}
Tracking will be applied to all filtered rectangle annotations.
{{% /alert %}}

All annotated objects will be automatically tracked up to the frame set in the `Target frame` parameter.

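For reference, the sketch below drives the same TrackerMIL algorithm (see [Trackers models](#trackers-models)) from `opencv-python`. CVAT runs it client-side through OpenCV.js, so this is only an offline approximation; the video path and the initial box are placeholders.

```python
import cv2

cap = cv2.VideoCapture("clip.mp4")        # placeholder video path
ok, frame = cap.read()

# Initialise the tracker from a rectangle on the first frame - the same role
# the initial rectangle shape plays in CVAT.
tracker = cv2.TrackerMIL_create()
tracker.init(frame, (150, 100, 80, 120))  # (x, y, width, height)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    found, bbox = tracker.update(frame)   # propagate the box to the next frame
    if found:
        x, y, w, h = map(int, bbox)
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
```
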
### Trackers models

<!--lint disable maximum-line-length-->

| Model | Tool | Description | Example |
| ----- | ---- | ----------- | ------- |
| TrackerMIL | OpenCV | TrackerMIL model is not bound to <br>labels and can be used for any <br>object. It is a fast client-side model <br>designed to track simple non-overlapping objects. <br><br>For more information, see: <li>[Article: Object Tracking using OpenCV](https://learnopencv.com/tag/mil/) |  |
| SiamMask | AI Tools | Fast online Object Tracking and Segmentation. The trackable object will <br>be tracked automatically if the previous frame <br>was the latest keyframe for the object. <br><br>For more information, see:<li> [GitHub: SiamMask](https://github.com/foolwood/SiamMask) <li> [Paper: SiamMask](https://arxiv.org/pdf/1812.05050.pdf) |  |
| Transformer Tracking (TransT) | AI Tools | Simple and efficient online tool for object tracking and segmentation. <br>If the previous frame was the latest keyframe <br>for the object, the trackable object will be tracked automatically.<br>This is a modified version of the PyTracking <br> Python framework based on Pytorch<br> <br><br>For more information, see: <li> [GitHub: TransT](https://github.com/chenxin-dlut/TransT)<li> [Paper: TransT](https://arxiv.org/pdf/2103.15436.pdf) |  |
| SAM2 Tracker | AI Agent | Advanced object tracking and segmentation using Meta's Segment Anything Model 2. <br>Available for CVAT Online and Enterprise via AI agents. <br>Supports polygons and masks with high precision tracking. <br>Requires user-side agent setup with Python 3.10+. <br><br>For more information, see: <li>{{< ilink "/docs/enterprise/segment-anything-2-tracker" "SAM2 Tracker Setup Guide" >}} <li>[SAM2 Blog: AI Agent Integration](https://www.cvat.ai/resources/blog/sam2-ai-agent-tracking) | _Example coming soon_ |

<!--lint enable maximum-line-length-->

## OpenCV: histogram equalization

**Histogram equalization** improves
the contrast by stretching the intensity range.

It increases the global contrast of images
when their usable data is represented by close contrast values.

It is useful in images with backgrounds
and foregrounds that are both bright or both dark.

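Under the hood this is the standard OpenCV operation. Below is a minimal sketch with `opencv-python` (CVAT applies the equivalent in the browser through OpenCV.js); the file name is a placeholder, and equalizing only the luminance channel is just one common way to handle colour images.

```python
import cv2

image = cv2.imread("frame.jpg")  # placeholder path

# Equalize only the luminance channel so colours are preserved.
ycrcb = cv2.cvtColor(image, cv2.COLOR_BGR2YCrCb)
ycrcb[:, :, 0] = cv2.equalizeHist(ycrcb[:, :, 0])
equalized = cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2BGR)

cv2.imwrite("frame_equalized.jpg", equalized)
```
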
To improve the contrast of the image, do the following:

1. In the **OpenCV** menu, go to the **Image** tab.
2. Click the **Histogram equalization** button.
   <br>

**Histogram equalization** will improve
contrast on the current and the following
frames.

Example of the result:



To disable **Histogram equalization**, click the button again.