Dream-VL and Dream-VLA, a diffusion VLM and a diffusion VLA.
☆113 · Jan 14, 2026 · Updated 3 months ago
Alternatives and similar repositories for Dream-VLX
Users who are interested in Dream-VLX are comparing it to the libraries listed below.
- ☆15 · Jul 9, 2025 · Updated 9 months ago
- Extending context length of visual language models ☆12 · Dec 18, 2024 · Updated last year
- ☆21 · May 24, 2024 · Updated last year
- ☆28 · Jul 23, 2025 · Updated 9 months ago
- [ICCV 23] A Simple Vision Transformer for Weakly Semi-supervised 3D Object Detection ☆13 · Apr 12, 2024 · Updated 2 years ago
- [AAAI 2026] WorldRFT: Latent World Model Planning with Reinforcement Fine-Tuning for Autonomous Driving ☆36 · Dec 23, 2025 · Updated 4 months ago
- Implementation of the paper 'Spec-VLA: Speculative Decoding for Vision-Language-Action Models with Relaxed Acceptance' (EMNLP 2025) ☆28 · Dec 16, 2025 · Updated 4 months ago
- [ICRA 2026] UltraDexGrasp: Learning Universal Dexterous Grasping for Bimanual Robots with Synthetic Data ☆66 · Mar 6, 2026 · Updated 2 months ago
- [ACL 2026 Findings] CoV: Chain-of-View Prompting for Spatial Reasoning ☆58 · Apr 7, 2026 · Updated last month
- ☆18 · Nov 4, 2024 · Updated last year
- [MM2024] FusionOcc: Multi-Modal Fusion for 3D Occupancy Prediction ☆24 · Dec 6, 2024 · Updated last year
- ☆33 · Jun 24, 2024 · Updated last year
- This repository contains data, code and models for contextual noncompliance. ☆25 · Jul 18, 2024 · Updated last year
- CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning ☆36 · Aug 28, 2025 · Updated 8 months ago
- Official Implementation of SAGE-GRPO: Manifold-Aware Exploration for Reinforcement Learning in Video Generation ☆116 · Apr 2, 2026 · Updated last month
- [ICLR2026] Spatial Reasoning with Vision-Language Models ☆48 · Jan 26, 2026 · Updated 3 months ago
- OmniGAIA: Towards Native Omni-Modal AI Agents ☆126 · Apr 2, 2026 · Updated last month
- Efficient Finetuning for OpenAI GPT-OSS ☆24 · Oct 2, 2025 · Updated 7 months ago
- Towards Generalizable Robotic Manipulation in Dynamic Environments ☆189 · Apr 22, 2026 · Updated 2 weeks ago
- ☆10 · Mar 13, 2023 · Updated 3 years ago
- [ICML 2024] Self-Infilling Code Generation ☆18 · May 5, 2024 · Updated 2 years ago
- [SIGGRAPH 2026] OmniRoam: World Wandering via Long-Horizon Panoramic Video Generation ☆89 · Apr 8, 2026 · Updated 3 weeks ago
- [NeurIPS 2024] Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows? ☆151 · Aug 26, 2024 · Updated last year
- Official InfiniBench: A Benchmark for Large Multi-Modal Models in Long-Form Movies and TV Shows ☆20 · Nov 4, 2025 · Updated 6 months ago
- Schoenfeld’s Anatomy of Mathematical Reasoning by Language Models ☆22 · Dec 21, 2025 · Updated 4 months ago
- Dynamic config system based on python classes ☆12 · Jan 27, 2023 · Updated 3 years ago
- Official code repository of Shuffle-R1 ☆25 · Feb 23, 2026 · Updated 2 months ago
- Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders [Technical Report] ☆182 · Mar 30, 2026 · Updated last month
- Official github repo of G-LLaVA ☆148 · Feb 20, 2025 · Updated last year
- Official implementation of Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More ☆25 · Feb 25, 2025 · Updated last year
- ☆27 · Feb 26, 2023 · Updated 3 years ago
- Discrete Diffusion VLA: Bringing Discrete Diffusion to Action Decoding in Vision-Language-Action Policies ☆61 · Dec 3, 2025 · Updated 5 months ago
- PyTorch Implementation of "Divide, Conquer and Combine: A Training-Free Framework for High-Resolution Image Perception in Multimodal Larg… ☆48 · Mar 2, 2026 · Updated 2 months ago
- ReaRAG: Knowledge-guided Reasoning Enhances Factuality of Large Reasoning Models with Iterative Retrieval Augmented Generation ☆26 · Aug 24, 2025 · Updated 8 months ago
- ☆44 · Mar 31, 2026 · Updated last month
- We propose a novel modular framework that learns to dynamically mix low-rank adapters (LoRAs) to improve visual analogy learning, enablin… ☆74 · Apr 12, 2026 · Updated 3 weeks ago
- A simple video streaming baseline that outperforms SOTAs. ☆116 · Updated this week
- LLaVA-VLA: A Simple Yet Powerful Vision-Language-Action Model [ICRA 2026] ☆187 · Mar 12, 2026 · Updated last month
- [EMNLP-2022 Findings] Code for paper “ProGen: Progressive Zero-shot Dataset Generation via In-context Feedback”. ☆27 · Feb 4, 2023 · Updated 3 years ago