Technology
YOLO-World
YOLO-World delivers real-time, open-vocabulary object detection: a YOLOv8-based model that detects objects described by a text prompt, with no retraining required.
YOLO-World is a zero-shot object detection framework released by Tencent AI Lab in January 2024. It builds on the speed of Ultralytics YOLOv8 and adds vision-language modeling (a CLIP text encoder) to identify objects that were not in its training set. The core innovation is its 'prompt-then-detect' strategy: user prompts are encoded once into an offline vocabulary, and a RepVL-PAN (Re-parameterizable Vision-Language Path Aggregation Network) fuses text and image features efficiently. Because the text encoder does not need to run for every image at inference time, this approach cuts computational overhead substantially compared with Transformer-based open-vocabulary detectors. On the LVIS benchmark, the L version reaches 35.4 AP at 52.0 FPS on a V100 GPU, making it a fast, versatile option for dynamic, real-world vision tasks.
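The 'prompt-then-detect' idea can be illustrated with a small, self-contained sketch. This is not YOLO-World's actual code: the function names, the toy three-dimensional vectors, and `toy_text_encoder` are hypothetical stand-ins chosen for illustration. In the real model, a CLIP text encoder produces the embeddings and RepVL-PAN fuses them with image features. The sketch only shows the general pattern of encoding the vocabulary once, offline, and then matching region embeddings against the cached text embeddings by cosine similarity.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def build_offline_vocabulary(prompts, text_encoder):
    """Encode every prompt once, before any image is processed."""
    return {prompt: normalize(text_encoder(prompt)) for prompt in prompts}


def classify_regions(region_embeddings, vocabulary, threshold=0.6):
    """Label each region with its best-matching prompt, if similar enough."""
    detections = []
    for index, region in enumerate(region_embeddings):
        region = normalize(region)
        best_label, best_score = None, threshold
        for label, text_vec in vocabulary.items():
            score = sum(a * b for a, b in zip(region, text_vec))
            if score > best_score:
                best_label, best_score = label, score
        if best_label is not None:
            detections.append((index, best_label, round(best_score, 3)))
    return detections


# Hypothetical stand-in for a real text encoder (assumption: toy 3-D vectors).
TOY_TEXT_EMBEDDINGS = {
    "dog": [1.0, 0.0, 0.0],
    "bicycle": [0.0, 1.0, 0.0],
    "traffic cone": [0.0, 0.0, 1.0],
}


def toy_text_encoder(prompt):
    return TOY_TEXT_EMBEDDINGS[prompt]


# Offline step: runs once per vocabulary, not once per image.
vocabulary = build_offline_vocabulary(
    ["dog", "bicycle", "traffic cone"], toy_text_encoder
)

# Online step: made-up region embeddings standing in for detector outputs.
regions = [[0.9, 0.1, 0.0], [0.1, 0.1, 0.95], [0.5, 0.5, 0.5]]
detections = classify_regions(regions, vocabulary)
print(detections)
```

In this toy run the third region is ambiguous across all three prompts, so it falls below the similarity threshold and is dropped. Changing the vocabulary only requires rerunning the offline step, which mirrors why the detector needs no retraining for new prompts.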