dochouyi / SUCC
☆11 · Updated last year
Alternatives and similar repositories for SUCC
Users who are interested in SUCC are comparing it to the libraries listed below.
- [CVPR 2024] The official implementation of MP5 ☆106 · Updated last year
- [IEEE TVCG 2025] Self-supervised Learning of Event-guided Video Frame Interpolation for Rolling Shutter Frames ☆12 · Updated 5 months ago
- [NeurIPS 2025] Official implementation of "RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics" ☆195 · Updated 3 weeks ago
- [Nature Machine Intelligence 2025] Emulating Human-like Adaptive Vision for Efficient and Flexible Machine Visual Perception ☆87 · Updated last week
- [NeurIPS 2024] The official implementation of "Instruction-Guided Visual Masking" ☆39 · Updated last year
- [ACMMM 2025] Benchmarking MLLM Codec Ability ☆32 · Updated last year
- [World-Model-Survey-2024] Paper list and projects for World Model ☆15 · Updated last year
- Heterogeneous Pre-trained Transformer (HPT) as a Scalable Policy Learner ☆519 · Updated 11 months ago
- ☆36 · Updated 4 months ago
- InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy ☆269 · Updated last week
- [NeurIPS 2025] DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge ☆224 · Updated 2 months ago
- A simple visual test-time scaling method for GUI agent grounding ☆18 · Updated 3 months ago
- [CVPR 2025] Official implementation of the paper "MoVE-KD: Knowledge Distillation for VLMs with Mixture of Visual Encoders" ☆46 · Updated 5 months ago
- Official implementation of Frequency-enhanced Data Augmentation for Vision-and-Language Navigation (NeurIPS 2023) ☆13 · Updated last year
- HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model ☆318 · Updated last month
- Latest Advances on Vision-Language-Action Models ☆119 · Updated 8 months ago
- Latest Advances on Embodied Multimodal LLMs (or Vision-Language-Action Models) ☆121 · Updated last year
- [ICML 2025 Oral] Official repo of EmbodiedBench, a comprehensive benchmark designed to evaluate MLLMs as embodied agents ☆214 · Updated last month
- Unified Vision-Language-Action Model ☆226 · Updated last month
- [CVPR 2025 (Oral)] Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key ☆86 · Updated last month
- 😎 A curated list of CVPR 2025 oral papers (96 in total) ☆55 · Updated 3 months ago
- Official repository for "iVideoGPT: Interactive VideoGPTs are Scalable World Models" (NeurIPS 2024), https://arxiv.org/abs/2405.15223 ☆157 · Updated last month
- Official repo of VLABench, a large-scale benchmark designed for fairly evaluating VLAs, embodied agents, and VLMs ☆332 · Updated last week
- ICCV 2025 ☆142 · Updated this week
- ☆138 · Updated last year
- [CVPR 2025] Code for the paper "Video-3D LLM: Learning Position-Aware Video Representation for 3D Scene Understanding" ☆177 · Updated 5 months ago
- WorldVLA: Towards Autoregressive Action World Model ☆539 · Updated last month
- An example reproduction checklist for AAAI-26 submissions ☆105 · Updated 3 months ago
- [ICLR 2025] Official code implementation of Video-UTR: Unhackable Temporal Rewarding for Scalable Video MLLMs ☆61 · Updated 8 months ago
- [NeurIPS 2025 Spotlight] SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation ☆203 · Updated 4 months ago