SAM 2++: Tracking Anything at Any Granularity
☆56 · Dec 15, 2025 · Updated 2 months ago
Alternatives and similar repositories for SAM2-Plus
Users who are interested in SAM2-Plus are comparing it to the repositories listed below.
- [ICCV'25] Official PyTorch Implementation of "JointDiT: Enhancing RGB-Depth Joint Modeling with Diffusion Transformers" · ☆29 · Nov 27, 2025 · Updated 3 months ago
- ☆15 · May 25, 2024 · Updated last year
- ☆22 · Mar 7, 2025 · Updated 11 months ago
- WorldMM: Dynamic Multimodal Memory Agent for Long Video Reasoning (CVPR 2026) · ☆55 · Dec 30, 2025 · Updated 2 months ago
- An End-to-End Pipeline for Enhanced French Text-to-Speech with SSML Prosody Control · ☆31 · Jan 13, 2026 · Updated last month
- [ICCV2023] Isomer: Isomerous Transformer for Zero-Shot Video Object Segmentation · ☆30 · Nov 21, 2023 · Updated 2 years ago
- Robust Referring Video Object Segmentation with Cyclic Structural Consistency [ICCV 2023] · ☆30 · Mar 13, 2024 · Updated last year
- FlashVTG: Feature Layering and Adaptive Score Handling Network for Video Temporal Grounding. (WACV2025) · ☆34 · Apr 17, 2025 · Updated 10 months ago
- The official implementation of our work Hawkeye: Discovering and Grounding Implicit Anomalous Sentiment in Recon-videos via Scene-enhanc… · ☆12 · Oct 14, 2024 · Updated last year
- Official code of the paper "VideoMolmo: Spatio-Temporal Grounding meets Pointing" · ☆53 · Jul 5, 2025 · Updated 8 months ago
- [ECCV 2024] Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation · ☆35 · Jan 6, 2025 · Updated last year
- [ECCV 2022] Tackling Background Distraction in Video Object Segmentation · ☆39 · Jun 2, 2025 · Updated 9 months ago
- Finetuning & extending DiffusionDet to video & pedestrian multi-object-tracking · ☆13 · Apr 12, 2023 · Updated 2 years ago
- The repository of VG-Refiner paper · ☆17 · Dec 9, 2025 · Updated 2 months ago
- Continual Resilient (CoRe) Optimizer for PyTorch · ☆11 · Jun 10, 2024 · Updated last year
- [ICLR 2025] IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model · ☆37 · Nov 27, 2024 · Updated last year
- ☆15 · Sep 16, 2024 · Updated last year
- DisTime: Distribution-based Time Representation for Video Large Language Models. · ☆19 · Jul 10, 2025 · Updated 7 months ago
- A Flow Matching-based Text-to-Speech Model with Emoji-driven Style Control · ☆28 · Updated this week
- ☆10 · Apr 7, 2025 · Updated 10 months ago
- ☆11 · Jan 18, 2025 · Updated last year
- A news based stock scalper using LLM and quant approach · ☆15 · Jan 16, 2025 · Updated last year
- Expanded Adaptive Scaling Normalization for End to End Image Compression · ☆10 · Sep 4, 2025 · Updated 6 months ago
- [CVPR 2024] LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation · ☆13 · Jun 17, 2024 · Updated last year
- Code for the paper "IFFNeRF: 6D Pose Estimation from a Single Image and a 3D Gaussian Splatting Model" · ☆12 · May 26, 2024 · Updated last year
- [CVPR 2024] Dual Prototype Attention for Unsupervised Video Object Segmentation · ☆39 · Apr 21, 2024 · Updated last year
- Official repository of the "Shatter and Gather: Learning Referring Image Segmentation with Text Supervision (ICCV'23)" · ☆42 · Jan 29, 2024 · Updated 2 years ago
- ☆13 · Jan 21, 2025 · Updated last year
- [CVPR 2024] "Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition" · ☆12 · Feb 27, 2024 · Updated 2 years ago
- non-rigid registration in NIMBLE: A Non-rigid Hand Model with Bones and Muscles · ☆11 · Sep 2, 2022 · Updated 3 years ago
- Very simple Linux audio recording using Alsa · ☆11 · Aug 25, 2020 · Updated 5 years ago
- An unofficial implementation of Lite-RTSE, a cost-effective lite model for real-time speech enhancement · ☆14 · Nov 19, 2023 · Updated 2 years ago
- ☆10 · Jan 9, 2025 · Updated last year
- Aggregate and Discriminate: Pseudo Clips-Guided Boundary Perception for Video Moment Retrieval · ☆12 · Nov 25, 2024 · Updated last year
- My research on tc358743 · ☆11 · Sep 28, 2019 · Updated 6 years ago
- Torch7 Library - Convert NN Models To iOS Format · ☆11 · Aug 8, 2016 · Updated 9 years ago
- Joint magnitude estimation and phase recovery using Cycle-in-Cycle GAN for non-parallel speech enhancement · ☆10 · Jan 24, 2022 · Updated 4 years ago
- ☆10 · Dec 29, 2021 · Updated 4 years ago
- [3DV 2026] GIGA: Generalizable Sparse Image-driven Gaussian Humans · ☆18 · Jan 28, 2026 · Updated last month