Meta’s SAM 2 Model Is a Big Step for Object Segmentation in Videos

The Meta Segment Anything Model 2 (SAM 2) is a significant advance in object segmentation for both images and videos. Meta presents it as a single model that could reshape video segmentation and be applied across a wide range of image and video use cases.

Key features and capabilities: Meta describes SAM 2 as the first unified model for real-time, promptable object segmentation in images and videos, with better accuracy and performance than existing solutions:

  • SAM 2 achieves better video segmentation performance than current methods while requiring about one-third of the interaction time.
  • The model can segment any object in any video or image without the need for custom adaptation, thanks to its zero-shot generalization capabilities.

Innovative research approach: Meta’s research on enabling video segmentation capabilities involves designing a new task, a model, and a dataset:

  • Meta defined the promptable visual segmentation task, in which a user prompt on a frame selects the object to segment, and designed the SAM 2 model to perform it.
  • SAM 2 itself was used to help create SA-V, a video object segmentation dataset an order of magnitude larger than existing datasets. SA-V was then used to train SAM 2 to state-of-the-art performance, so the model and the dataset were developed together.
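
To make the idea of promptable segmentation in video concrete, here is a toy sketch in Python. It is not SAM 2's architecture or API: a simple flood fill stands in for the learned model, and a remembered pixel value stands in for whatever the real model uses to follow an object between frames. All names (`make_frame`, `segment_from_click`, `propagate`) are hypothetical and exist only for this illustration.

```python
from collections import deque

def make_frame(left_col, rows=4, cols=5):
    """Build a toy frame: background 0 with a 2x2 object of 1s on rows 1-2."""
    frame = [[0] * cols for _ in range(rows)]
    for r in (1, 2):
        for c in (left_col, left_col + 1):
            frame[r][c] = 1
    return frame

def segment_from_click(frame, click):
    """Toy 'prompt -> mask' step: flood-fill the region matching the clicked pixel."""
    rows, cols = len(frame), len(frame[0])
    target = frame[click[0]][click[1]]
    mask = {click}
    queue = deque([click])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in mask and frame[nr][nc] == target:
                mask.add((nr, nc))
                queue.append((nr, nc))
    return mask

def propagate(frames, first_click):
    """Segment the clicked object in frame 0, then follow it through later frames.

    Assumption for this toy: the object keeps its pixel value and overlaps
    its previous position from one frame to the next.
    """
    object_value = frames[0][first_click[0]][first_click[1]]
    masks = [segment_from_click(frames[0], first_click)]
    for frame in frames[1:]:
        # Reuse a pixel of the previous mask that still shows the object as the new prompt.
        candidates = sorted(p for p in masks[-1] if frame[p[0]][p[1]] == object_value)
        if not candidates:
            masks.append(set())  # object lost; a real system would ask for a new prompt
            continue
        masks.append(segment_from_click(frame, candidates[0]))
    return masks

# One click on the first frame yields one mask per frame as the object moves right.
demo_frames = [make_frame(left_col) for left_col in (0, 1, 2)]
demo_masks = propagate(demo_frames, (1, 0))
```

The point of the sketch is the interaction pattern: a single prompt on one frame produces a mask for every frame, which is why fewer user interactions are needed than when each frame is annotated separately.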

Open science and community engagement: In line with Meta’s open science approach, the company is sharing its research on SAM 2 with the community to encourage exploration of new capabilities and use cases:

  • The SAM 2 code and weights are being open-sourced under an Apache 2.0 license, while the evaluation code is shared under a BSD-3 license.
  • The SA-V dataset, containing roughly 51,000 real-world videos with more than 600,000 masklets (an object's mask tracked across the frames of a video), is being shared under a CC BY 4.0 license.
  • A web demo has been released, enabling real-time interactive segmentation of short videos and the application of video effects on model predictions.
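
As a rough illustration of what a masklet is and of the dataset's scale, the sketch below models a masklet as one object's per-frame masks and works out the average number of masklets per video from the figures quoted above. The `Masklet` class and its fields are hypothetical and do not reflect the actual SA-V file format; the average is only an estimate, since both input figures are approximate.

```python
from dataclasses import dataclass, field

@dataclass
class Masklet:
    """Hypothetical container: one object's mask on each frame where it is annotated."""
    object_id: int
    masks: dict = field(default_factory=dict)  # frame index -> set of (row, col) pixels

    def add_frame(self, frame_index, pixels):
        """Store the object's mask for a single frame."""
        self.masks[frame_index] = set(pixels)

    def frame_span(self):
        """Return the first and last annotated frame indices."""
        return min(self.masks), max(self.masks)

    def area(self, frame_index):
        """Return the mask size in pixels on a frame (0 if the frame has no mask)."""
        return len(self.masks.get(frame_index, set()))

# Figures as quoted in the article; both are approximate.
APPROX_VIDEOS = 51_000
MIN_MASKLETS = 600_000

# Works out to roughly a dozen masklets per video.
avg_masklets_per_video = MIN_MASKLETS / APPROX_VIDEOS
```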

Broader implications and future outlook: By openly sharing this research, Meta aims to contribute to accelerating progress in universal video and image segmentation and related perception tasks:

  • Releasing SAM 2 together with the SA-V dataset could drive innovation and new applications in object segmentation.
  • Meta expects that, as the AI community explores and builds on this research, new insights and useful experiences will follow across various domains.
  • Because the model and dataset are openly licensed, researchers and developers can refine and adapt the technology for specific use cases and industries.

Source: Introducing SAM 2: The next generation of Meta Segment Anything Model for videos and images
