☆69 · Apr 1, 2026 · Updated 2 weeks ago
Alternatives and similar repositories for DSR_Suite
Users that are interested in DSR_Suite are comparing it to the libraries listed below.
- [ICCV 2023 Workshop] The Official Implementation of The First Prize Solution for RVOS Competition ☆14 · Jan 1, 2024 · Updated 2 years ago
- [CVPR 2026] TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs ☆123 · Mar 12, 2026 · Updated last month
- The official implementation of A Counting-Aware Hierarchical Decoding Framework for Generalized Referring Expression Segmentation ☆25 · Aug 17, 2025 · Updated 7 months ago
- Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning ☆20 · Dec 21, 2023 · Updated 2 years ago
- [AAAI 2026] Official implementation of the paper “SegDINO3D: 3D Instance Segmentation Empowered by Both Image-Level and Object-Level 2D F… ☆41 · Jan 8, 2026 · Updated 3 months ago
- Code release for 'Struct2D: A Perception-Guided Framework for Spatial Reasoning in MLLMs' (NeurIPS 2025) ☆30 · Oct 28, 2025 · Updated 5 months ago
- [NeurIPS 2023] The official implementation of SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation ☆33 · Mar 16, 2024 · Updated 2 years ago
- Official implementation of "Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence" ☆138 · Dec 18, 2025 · Updated 3 months ago
- [ACMMM 2024] Consistent123: One Image to Highly Consistent 3D Asset Using Case-Aware Diffusion Priors ☆25 · Oct 22, 2024 · Updated last year
- ☆39 · Dec 19, 2025 · Updated 3 months ago
- [CVPR 2026] Video-as-Answer: Predict and Generate Next Video Event with Joint-GRPO ☆107 · Feb 28, 2026 · Updated last month
- [CVPR 2025] LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding ☆84 · Jul 4, 2025 · Updated 9 months ago
- ☆29 · Feb 12, 2026 · Updated 2 months ago
- Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning? ☆91 · Jul 13, 2025 · Updated 9 months ago
- Official code release for "INPC: Implicit Neural Point Clouds for Radiance Field Rendering" and "A Bag of Tricks for Efficient Implicit N… ☆33 · Feb 23, 2026 · Updated last month
- [3DV 2024] Repository for "Multi-Body Neural Scene Flow", in International Conference on 3D Vision 2024. ☆14 · Mar 11, 2024 · Updated 2 years ago
- Structured Video Comprehension of Real-World Shorts ☆236 · Sep 21, 2025 · Updated 6 months ago
- Official implementation of paper VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interact… ☆44 · Feb 5, 2025 · Updated last year
- TEMPURA enables video-language models to reason about causal event relationships and generate fine-grained, timestamped descriptions of u… ☆25 · Jun 4, 2025 · Updated 10 months ago
- ☆27 · Apr 11, 2025 · Updated last year
- ☆13 · Mar 28, 2025 · Updated last year
- ☆60 · Nov 12, 2025 · Updated 5 months ago
- More reliable Video Understanding Evaluation ☆15 · Sep 23, 2025 · Updated 6 months ago
- Official code repository of Shuffle-R1 ☆25 · Feb 23, 2026 · Updated last month
- [CVPR2026 Highlight] Cubic Discrete Diffusion: Discrete Visual Generation on High-Dimensional Representation Tokens https://arxiv.org/abs… ☆53 · Updated this week
- TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation ☆236 · Aug 18, 2025 · Updated 7 months ago
- ☆130 · Mar 11, 2026 · Updated last month
- ICML2025 ☆64 · Aug 28, 2025 · Updated 7 months ago
- The official repo for "CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models" ☆31 · Mar 26, 2026 · Updated 2 weeks ago
- ✨First Open-Source R1-like Video-LLM [2025/02/18] ☆383 · Feb 23, 2025 · Updated last year
- ☆57 · Oct 3, 2024 · Updated last year
- ☆51 · Jun 4, 2025 · Updated 10 months ago
- We propose a novel modular framework that learns to dynamically mix low-rank adapters (LoRAs) to improve visual analogy learning, enablin … ☆73 · Updated this week
- This is the official implementation of RGNet: A Unified Retrieval and Grounding Network for Long Videos ☆19 · Mar 3, 2025 · Updated last year
- ☆20 · Jan 1, 2026 · Updated 3 months ago
- [ICCV 2023] Compositional Feature Augmentation for Unbiased Scene Graph Generation ☆15 · Dec 5, 2023 · Updated 2 years ago
- [CVPR 2026] Official release of "Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning" ☆126 · Apr 7, 2026 · Updated last week
- Official implementation of CVPR 2026 paper SpaceDrive: Infusing Spatial Awareness into VLM-based Autonomous Driving. ☆55 · Mar 30, 2026 · Updated 2 weeks ago
- [CVPR 2025] Official PyTorch code of "Enhancing Video-LLM Reasoning via Agent-of-Thoughts Distillation". ☆55 · May 25, 2025 · Updated 10 months ago