cvat/ai-models/detector/yolo
陈增辉 aced3ee4f0 Inherited from version 2.44.3 2025-09-16 09:19:40 +08:00

Ultralytics YOLO

This is an implementation of a CVAT auto-annotation function that uses models from the YOLO family, as implemented in the Ultralytics library.

WARNING: While the function code is provided under the MIT license, the underlying Ultralytics library has a different licensing model. Make sure to familiarize yourself with the terms at https://www.ultralytics.com/license before using this function.

This auto-annotation (AA) function supports all numbered YOLO models implemented by the Ultralytics library, starting with YOLOv3. At the time of writing, the most recent such model was YOLO12; future models should also work, provided that the API remains the same. Zero-shot models, such as YOLO-World and YOLOE, are not supported.

The AA function supports models solving the following tasks:

  • classification
  • instance segmentation
  • object detection
  • oriented object detection
  • pose estimation

To use this function with the CVAT CLI, pass the following options:

--function-file func.py -p model=str:<model>

where <model> is the path to a pretrained model file, such as yolo12n.pt. If the file does not exist, but its name matches one of the pretrained models available in the library, that model will be automatically downloaded and used.

See the documentation at https://docs.ultralytics.com/models/ for information on available pretrained models.
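As an illustration of the option syntax, the sketch below assembles the argument list in Python. The helper name build_function_args is hypothetical and is not part of func.py or the CVAT CLI; it only covers string-typed parameters of the form -p <name>=str:<value> shown in this README, and it does not build the rest of the CVAT CLI command line.

```python
import shlex


def build_function_args(function_file, **params):
    """Build the function-related CLI options as a list of strings.

    Hypothetical helper for illustration: every parameter is passed
    as a string, using the `-p <name>=str:<value>` form from this README.
    """
    args = ["--function-file", function_file]
    for name, value in params.items():
        args += ["-p", f"{name}=str:{value}"]
    return args


# Example: select a pretrained model and run it on a CUDA device.
args = build_function_args("func.py", model="yolo12n.pt", device="cuda")
print(shlex.join(args))
# --function-file func.py -p model=str:yolo12n.pt -p device=str:cuda
```

Keeping the arguments in a list rather than a single string makes them safe to pass to subprocess.run without extra quoting, should you want to script the invocation.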

This function also supports the following options:

  • -p device=str:<device> - the PyTorch device, such as cuda, on which to run the model. By default, cpu is used.

  • -p keypoint_names_path=str:<path> - path to a file with names of keypoints. Only valid for pose estimation models. By default, the 17 keypoint names from the COCO dataset (nose, left_eye, right_eye, etc.) will be used.

    Ultralytics model files don't contain keypoint names, so you will likely need to set this option if your pose estimation model was trained on a custom dataset.

    The <path> must point to a text file, with one keypoint name per line. Leading and trailing whitespace will be ignored, and so will empty lines.
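The keypoint file format described above can be sketched as a small parser. This is a minimal illustration of the stated rules (one name per line, surrounding whitespace ignored, empty lines skipped), not the actual code in func.py; the function name read_keypoint_names and the UTF-8 encoding are assumptions.

```python
from pathlib import Path


def read_keypoint_names(path):
    """Return the keypoint names listed in a text file, one per line.

    Illustrative sketch only; UTF-8 encoding is an assumption.
    """
    names = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        name = line.strip()  # ignore leading and trailing whitespace
        if name:  # skip empty (or whitespace-only) lines
            names.append(name)
    return names
```

A helper like this can be used to check a custom keypoint file before passing it with -p keypoint_names_path=str:<path>, for example to confirm that the number of names matches the number of keypoints your pose estimation model predicts.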