Dream-VL and Dream-VLA, a diffusion VLM and a diffusion VLA.
☆113Jan 14, 2026Updated 3 months ago
Alternatives and similar repositories for Dream-VLX
Users that are interested in Dream-VLX are comparing it to the libraries listed below.
- ☆15Jul 9, 2025Updated 9 months ago
- Extending context length of visual language models☆12Dec 18, 2024Updated last year
- Evaluating and reproducing real-world robot manipulation policies (e.g., RT-1, RT-1-X, Octo) in simulation under common setups (e.g., Goo…☆11Dec 30, 2024Updated last year
- ☆28Jul 23, 2025Updated 8 months ago
- ☆25Aug 23, 2024Updated last year
- [ICCV 23] A Simple Vision Transformer for Weakly Semi-supervised 3D Object Detection☆13Apr 12, 2024Updated 2 years ago
- [AAAI 2026] WorldRFT: Latent World Model Planning with Reinforcement Fine-Tuning for Autonomous Driving☆35Dec 23, 2025Updated 3 months ago
- [ICRA 2026] UltraDexGrasp: Learning Universal Dexterous Grasping for Bimanual Robots with Synthetic Data☆64Mar 6, 2026Updated last month
- [ACL 2026 Findings] CoV: Chain-of-View Prompting for Spatial Reasoning☆54Apr 7, 2026Updated last week
- OmniGAIA: Towards Native Omni-Modal AI Agents☆89Apr 2, 2026Updated 2 weeks ago
- ☆18Nov 4, 2024Updated last year
- [MM2024] FusionOcc: Multi-Modal Fusion for 3D Occupancy Prediction☆24Dec 6, 2024Updated last year
- Official Implementation of SAGE-GRPO: Manifold-Aware Exploration for Reinforcement Learning in Video Generation☆106Apr 2, 2026Updated 2 weeks ago
- ☆33Jun 24, 2024Updated last year
- This repository contains data, code and models for contextual noncompliance.☆25Jul 18, 2024Updated last year
- Minimal Decision Transformer Implementation written in Jax (Flax).☆17Aug 8, 2022Updated 3 years ago
- Towards Generalizable Robotic Manipulation in Dynamic Environments☆136Apr 1, 2026Updated 2 weeks ago
- Efficient Finetuning for OpenAI GPT-OSS☆23Oct 2, 2025Updated 6 months ago
- Official implementation of the paper: "A deeper look at depth pruning of LLMs"☆15Jul 24, 2024Updated last year
- ☆10Mar 13, 2023Updated 3 years ago
- nav2gpt: navigation based on llm and ros2☆17Jul 18, 2024Updated last year
- [ICML 2024] Self-Infilling Code Generation☆18May 5, 2024Updated last year
- [NeurIPS 2024] Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?☆151Aug 26, 2024Updated last year
- Official InfiniBench: A Benchmark for Large Multi-Modal Models in Long-Form Movies and TV Shows☆19Nov 4, 2025Updated 5 months ago
- Dynamic config system based on python classes☆12Jan 27, 2023Updated 3 years ago
- GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems.☆64Jul 8, 2024Updated last year
- Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders [Technical Report]☆174Mar 30, 2026Updated 2 weeks ago
- Official code repository of Shuffle-R1☆25Feb 23, 2026Updated last month
- Official repo for paper "EMMA: Efficient Multimodal Understanding, Generation, and Editing with a Unified Architecture."☆62Dec 16, 2025Updated 4 months ago
- ☆50Jan 28, 2025Updated last year
- ☆18Mar 20, 2022Updated 4 years ago
- Official github repo of G-LLaVA☆148Feb 20, 2025Updated last year
- ☆53Jan 31, 2026Updated 2 months ago
- Official implementation of Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More☆25Feb 25, 2025Updated last year
- ☆27Feb 26, 2023Updated 3 years ago
- The official implementation of Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight☆89Jan 16, 2026Updated 3 months ago
- PyTorch Implementation of "Divide, Conquer and Combine: A Training-Free Framework for High-Resolution Image Perception in Multimodal Larg…☆48Mar 2, 2026Updated last month
- (NeurIPS '22) LISA: Learning Interpretable Skill Abstractions - A framework for unsupervised skill learning using Imitation☆29Feb 22, 2023Updated 3 years ago
- ☆44Mar 31, 2026Updated 2 weeks ago