Segment Anything
| Published in: | Инноватика-2025 : proceedings of the XXI International school-conference of students, postgraduate students, and young scientists, April 28-30, 2025, Tomsk, Russia, pp. 255-262 |
|---|---|
| Main author: | |
| Format: | Article in a collection |
| Language: | English |
| Subjects: | |
| Online link: | https://vital.lib.tsu.ru/vital/access/manager/Repository/koha:001272904 Go to the TSU Research Library catalog |
| Summary: | We present the Segment Anything (SA) project: a promptable segmentation model, dataset, and task advancing foundation models in computer vision. The Segment Anything Model (SAM) processes diverse prompts (points, boxes, text) through a ViT-based encoder and a real-time mask decoder (~50 ms per prompt), resolving ambiguity via multi-mask outputs. Trained on SA-1B (1.1B masks from 11M licensed images collected via a scalable data engine), SAM achieves human-level mask quality (94% IoU vs. professional edits). Zero-shot evaluations across 23 benchmarks demonstrate strong performance: surpassing RITM in point-based segmentation (human-rated), 0.768 ODS edge detection on BSDS500, and 59.3 AR@1000 object proposals on LVIS. SAM enables flexible integration into systems for tasks like text-guided segmentation. We release models and data to catalyze vision foundation model research. |
| Bibliography: | 9 references. |
| ISBN: | 9785936297311 |
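The summary reports mask quality as IoU (intersection-over-union) and describes resolving prompt ambiguity by returning multiple candidate masks. A minimal pure-Python sketch of that metric and a selection step (illustrative only; `mask_iou` and `best_mask` are hypothetical helpers, not the authors' evaluation code):

```python
def mask_iou(mask_a, mask_b):
    """Intersection-over-union between two binary masks (nested lists of 0/1)."""
    inter = union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            inter += a & b  # pixel in both masks
            union += a | b  # pixel in either mask
    return inter / union if union else 1.0

def best_mask(candidates, reference):
    """Pick the candidate mask that best overlaps a reference mask,
    mimicking the idea of scoring multiple ambiguous outputs."""
    return max(candidates, key=lambda m: mask_iou(m, reference))

# Example: the second candidate matches the reference exactly (IoU = 1.0).
ref = [[1, 1], [0, 0]]
cands = [[[1, 0], [0, 0]], [[1, 1], [0, 0]]]
print(mask_iou(cands[0], ref))   # 0.5
print(best_mask(cands, ref) == cands[1])  # True
```

In SAM itself the per-mask score is predicted by the model rather than computed against a reference, but the selection logic is analogous.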
