MCG-NJU / SAM2-Plus
SAM 2++: Tracking Anything at Any Granularity
☆49 Updated this week
Alternatives and similar repositories for SAM2-Plus
Users who are interested in SAM2-Plus are comparing it to the repositories listed below.
- (ICCV 2025) ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations ☆124 Updated 3 weeks ago
- Implementation of Zero-Shot Video Semantic Segmentation [CVPR 2025] ☆55 Updated 9 months ago
- Official implementation of DepthLM ☆273 Updated 2 months ago
- 4D Panoptic Scene Graph Generation (NeurIPS'23 Spotlight) ☆116 Updated 9 months ago
- Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models ☆163 Updated 2 months ago
- ECCV 2024 STMA & CVPR 2024 1st MOSE & 1st VOT Challenge & 1st LSVOS v6 ☆11 Updated last year
- Official implementation of "Exploring Temporally-Aware Features for Point Tracking" (CVPR 2025) ☆102 Updated 8 months ago
- [ICLR 2025] MVTokenFlow: High-quality 4D Content Generation using Multiview Token Flow ☆24 Updated 8 months ago
- [EMNLP 2025 Findings] 3D-Aware Vision-Language Models Fine-Tuning with Geometric Distillation ☆27 Updated 6 months ago
- [CVPR'2025] EntitySAM: Segment Everything in Video ☆57 Updated 4 months ago
- Official implementation of “4D LangVGGT: 4D Language-Visual Geometry Grounded Transformer” ☆33 Updated last week
- [Arxiv'25] MGVQ: Could VQ-VAE Beat VAE? A Generalizable Tokenizer with Multi-group Quantization ☆52 Updated 2 months ago
- Official Code For Track Everything Everywhere Fast and Robustly ☆66 Updated 8 months ago
- [ICLR 2025] Where Am I and What Will I See: An Auto-Regressive Model for Spatial Localization and View Prediction ☆42 Updated 4 months ago
- [NeurIPS 2025] Official code for JAFAR: Jack up Any Feature at Any Resolution ☆206 Updated 2 weeks ago
- Pytorch implementation of GaussianToken: An Effective Image Tokenizer with 2D Gaussian Splatting ☆100 Updated 8 months ago
- [CVPR 2025] Open-World Amodal Appearance Completion ☆43 Updated last month
- [AAAI 2025] GFlow: Recovering 4D World from Monocular Video ☆59 Updated 7 months ago
- [ICCV 2025] Improving 3D Large Language Model via Robust Instruction Tuning ☆64 Updated last month
- Official implementation of "Seurat: From Moving Points to Depth", CVPR 2025 Highlight ☆67 Updated 8 months ago
- SpatialScore: Towards Unified Evaluation for Multimodal Spatial Understanding ☆59 Updated 5 months ago
- Scaling Properties of Diffusion Models For Perceptual Tasks (CVPR 2025) ☆44 Updated 7 months ago
- Open-source video dataset with annotations for dynamic scenes and camera movements ☆79 Updated 7 months ago
- OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models ☆76 Updated 2 months ago
- Official implementation of HiM2SAM (PRCV25) ☆23 Updated 3 months ago
- [NeurIPS 2025] LangSplatV2: High-dimensional 3D Language Gaussian Splatting with 450+ FPS ☆152 Updated last month