☆74Apr 21, 2026Updated 2 months ago
Alternatives and similar repositories for DSR_Suite
Users who are interested in DSR_Suite are comparing it to the repositories listed below.
- [NeurIPS2025] The official implementation of MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO☆141Oct 15, 2025Updated 8 months ago
- The official implementation of A Counting-Aware Hierarchical Decoding Framework for Generalized Referring Expression Segmentation☆26Aug 17, 2025Updated 10 months ago
- 3D BBox refinement interface used in LabelAny3D (NeurIPS 2025)☆22Jan 6, 2026Updated 5 months ago
- [AAAI 2026] Official implementation of the paper "SegDINO3D: 3D Instance Segmentation Empowered by Both Image-Level and Object-Level 2D F…☆63Jan 8, 2026Updated 5 months ago
- [CVPR 2026] LongVideo-R1: Smart Navigation for Low-cost Long Video Understanding☆49Feb 28, 2026Updated 4 months ago
- Code release for 'Struct2D: A Perception-Guided Framework for Spatial Reasoning in MLLMs' (NeurIPS 2025)☆31Oct 28, 2025Updated 8 months ago
- [ICML 2026] Official implementation of "Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence"☆155May 1, 2026Updated 2 months ago
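- Reference solutions for the online judge (OJ) of the Data and Algorithms course, Department of Electronic Engineering, Tsinghua University, Fall 2022 semester☆10Jun 18, 2023Updated 3 years ago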
- [ACMMM 2024] Consistent123: One Image to Highly Consistent 3D Asset Using Case-Aware Diffusion Priors☆25Oct 22, 2024Updated last year
- ☆38Dec 19, 2025Updated 6 months ago
- [CVPR 2025] LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding☆84Jul 4, 2025Updated last year
- ☆34Feb 12, 2026Updated 4 months ago
- [CVPR 2026] Video-as-Answer: Predict and Generate Next Video Event with Joint-GRPO☆120Feb 28, 2026Updated 4 months ago
- Official code release for "INPC: Implicit Neural Point Clouds for Radiance Field Rendering" and "A Bag of Tricks for Efficient Implicit N…☆37Feb 23, 2026Updated 4 months ago
- [ACL 2026 Findings, ICCV 2025 Workshop Outstanding Paper Award] VChain: Chain-of-Visual-Thought for Reasoning in Video Generation☆120Apr 8, 2026Updated 2 months ago
- [3DV 2024] Repository for "Multi-Body Neural Scene Flow", in International Conference on 3D Vision 2024.☆14Mar 11, 2024Updated 2 years ago
- Structured Video Comprehension of Real-World Shorts☆238Sep 21, 2025Updated 9 months ago
- Official implementation of paper VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interact…☆43Feb 5, 2025Updated last year
- TEMPURA enables video-language models to reason about causal event relationships and generate fine-grained, timestamped descriptions of u…☆27Jun 4, 2025Updated last year
- [ICCV 2025 Oral] Official implementation of Learning Streaming Video Representation via Multitask Training.☆92Dec 24, 2025Updated 6 months ago
- ☆31Apr 11, 2025Updated last year
- ☆13Mar 28, 2025Updated last year
- Official code repository of Shuffle-R1☆26Feb 23, 2026Updated 4 months ago
- The official implementation of StereoPilot☆115Dec 19, 2025Updated 6 months ago
- ☆75May 2, 2026Updated 2 months ago
- VideoEval-Pro: Robust and Realistic Long Video Understanding Evaluation [TMLR26]☆17Jun 1, 2026Updated last month
- [CVPR2026 Highlight] Cubic Discrete Diffusion: Discrete Visual Generation on High-Dimensional Representation Tokens https://arxiv.org/abs…☆61Apr 10, 2026Updated 2 months ago
- Schoenfeld’s Anatomy of Mathematical Reasoning by Language Models☆27Dec 21, 2025Updated 6 months ago
- ☆136Mar 11, 2026Updated 3 months ago
- TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation☆237Aug 18, 2025Updated 10 months ago
- FASTER: Rethinking Real-Time Flow VLAs☆134May 14, 2026Updated last month
- Finetune SAM3 with LoRA — optimized for images. A simple setup for training SAM3 on image datasets. Video finetuning is not yet supported…☆227Updated this week
- Official Implementation of ICCV 2023 Paper - SegPrompt: Boosting Open-World Segmentation via Category-level Prompt Learning☆112May 28, 2025Updated last year
- We propose a novel modular framework that learns to dynamically mix low-rank adapters (LoRAs) to improve visual analogy learning, enablin…☆75Jun 22, 2026Updated last week
- ☆55Oct 3, 2024Updated last year
- ☆55Jun 4, 2025Updated last year
- ☆20Jan 1, 2026Updated 6 months ago
- This is the official implementation of RGNet: A Unified Retrieval and Grounding Network for Long Videos☆20Mar 3, 2025Updated last year
- [ICCV 2023] Compositional Feature Augmentation for Unbiased Scene Graph Generation☆16Dec 5, 2023Updated 2 years ago