Meta’s SAM 2 Model: A Big Step for Object Segmentation in Videos

Meta’s Segment Anything Model 2 (SAM 2) is a significant advance in object segmentation: a single model that works on both images and videos and could reshape how video segmentation is done across a wide range of use cases.

Key features and capabilities: Meta describes SAM 2 as the first unified model for real-time, promptable object segmentation in images and videos, with better accuracy and performance than existing solutions:

  • SAM 2 achieves better video segmentation performance than current methods while requiring about one-third of the interaction time.
  • The model can segment any object in any video or image without the need for custom adaptation, thanks to its zero-shot generalization capabilities.

Innovative research approach: Meta’s research on enabling video segmentation capabilities involves designing a new task, a model, and a dataset:

  • Meta defined a new task, promptable visual segmentation, and designed the SAM 2 model to perform it.
  • SAM 2 itself was used to help create SA-V, a video object segmentation dataset an order of magnitude larger than existing datasets, and SA-V was in turn used to train SAM 2 to state-of-the-art performance.
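
To make "promptable" segmentation and the idea of a masklet (one object's mask followed across frames) more concrete, here is a toy sketch. It is not SAM 2 and does not use Meta's code: a simple flood fill stands in for the model, and all names (`segment_from_click`, `track_masklet`, `make_frame`) are hypothetical, chosen only for illustration. The user clicks once on the first frame, and each later frame is prompted with the centroid of the previous mask.

```python
from collections import deque

def segment_from_click(frame, click, tol=10):
    """Toy stand-in for a promptable segmenter: flood-fill outward from a click,
    keeping 4-connected pixels whose intensity is within `tol` of the clicked pixel."""
    rows, cols = len(frame), len(frame[0])
    r0, c0 = click
    seed = frame[r0][c0]
    mask = [[False] * cols for _ in range(rows)]
    mask[r0][c0] = True
    queue = deque([(r0, c0)])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and not mask[nr][nc] and abs(frame[nr][nc] - seed) <= tol:
                mask[nr][nc] = True
                queue.append((nr, nc))
    return mask

def centroid(mask):
    """Rounded (row, col) centre of all True pixels in a mask."""
    pts = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    return (round(sum(r for r, _ in pts) / len(pts)),
            round(sum(c for _, c in pts) / len(pts)))

def track_masklet(frames, click, tol=10):
    """Prompt once on the first frame, then reuse each mask's centroid
    as the prompt for the next frame. Returns one mask per frame."""
    masklet, prompt = [], click
    for frame in frames:
        mask = segment_from_click(frame, prompt, tol)
        masklet.append(mask)
        prompt = centroid(mask)
    return masklet

def make_frame(left, rows=5, cols=7, size=3):
    """Dark frame with a bright size x size square whose left edge is at column `left`."""
    return [[200 if 1 <= r < 1 + size and left <= c < left + size else 0
             for c in range(cols)] for r in range(rows)]

# A three-frame "video" in which the square moves one pixel right per frame.
frames = [make_frame(left) for left in (1, 2, 3)]
masklet = track_masklet(frames, click=(2, 2))
areas = [sum(map(sum, m)) for m in masklet]  # number of masked pixels per frame
print(areas)
```

The sketch only works because the toy object is convex and moves slowly, so the previous centroid still lands on it in the next frame. Handling fast motion, occlusion, and arbitrary objects is exactly the hard part that a learned model such as SAM 2 is meant to address.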

Open science and community engagement: In line with Meta’s open science approach, the company is sharing its research on SAM 2 with the community to encourage exploration of new capabilities and use cases:

  • The SAM 2 code and weights are being open-sourced under an Apache 2.0 license, while the evaluation code is shared under a BSD-3 license.
  • The SA-V dataset, containing ~51k real-world videos with more than 600k masklets, is being shared under a CC BY 4.0 license.
  • A web demo has been released, enabling real-time interactive segmentation of short videos and the application of video effects on model predictions.

Broader implications and future outlook: By openly sharing this research, Meta aims to contribute to accelerating progress in universal video and image segmentation and related perception tasks:

  • The release of SAM 2 and the SA-V dataset has the potential to drive innovation and the development of new applications in the field of object segmentation.
  • As the AI community explores and builds on this research, the work is expected to yield new insights and useful experiences across various domains.
  • The open-source nature of the model and dataset will enable researchers and developers to further refine and adapt the technology to suit specific use cases and industries.
Introducing SAM 2: The next generation of Meta Segment Anything Model for videos and images
