4DVLab / Freqpolicy
[NeurIPS 2025] FreqPolicy: Frequency Autoregressive Visuomotor Policy with Continuous Tokens
☆19Updated 3 months ago
Alternatives and similar repositories for Freqpolicy
Users who are interested in Freqpolicy are comparing it to the libraries listed below.
- [ICCV 2025] RAGNet: Large-scale Reasoning-based Affordance Segmentation Benchmark towards General Grasping☆33Updated 2 months ago
- Geometry-Consistent Video Diffusion for Robotic Visual Policy Transfer☆28Updated 2 months ago
- [ICLR 2025] SPA: 3D Spatial-Awareness Enables Effective Embodied Representation☆172Updated 7 months ago
- [NeurIPS 2025] Source codes for the paper "MindJourney: Test-Time Scaling with World Models for Spatial Reasoning"☆124Updated 2 months ago
- The official repo for paper "VQ-VLA: Improving Vision-Language-Action Models via Scaling Vector-Quantized Action Tokenizers" (ICCV 2025)☆108Updated 2 months ago
- Official implementation of Spatial-Forcing: Implicit Spatial Representation Alignment for Vision-language-action Model☆162Updated 2 weeks ago
- Official Implementation of Paper: WMPO: World Model-based Policy Optimization for Vision-Language-Action Models☆124Updated 3 weeks ago
- [NeurIPS 2025] DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge☆282Updated 3 weeks ago
- VLA-RFT: Vision-Language-Action Models with Reinforcement Fine-Tuning☆119Updated 3 months ago
- [CVPR 2025] Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning☆54Updated 9 months ago
- Official implementation of “4D LangVGGT: 4D Language-Visual Geometry Grounded Transformer”☆75Updated last month
- Codebase for paper "Geometry-aware 4D Video Generation for Robot Manipulation"☆71Updated 2 weeks ago
- ☆165Updated 2 weeks ago
- ☆63Updated last month
- WoW (World-Omniscient World Model) is a generative world model trained on 2 million robotic interaction trajectories, designed to imagine…☆135Updated 3 weeks ago
- [CVPR 2025] Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation☆173Updated 7 months ago
- ☆54Updated last year
- 3DAffordSplat: Efficient Affordance Reasoning with 3D Gaussians (ACM MM 25)☆67Updated 6 months ago
- Code implementation of the paper "World-in-World: World Models in a Closed-Loop World"☆122Updated last month
- [CVPR 2024] Situational Awareness Matters in 3D Vision Language Reasoning☆43Updated last year
- [NeurIPS 2025] OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding☆70Updated 3 months ago
- List of papers on video-centric robot learning☆22Updated last year
- Official repository for "Vid2World: Crafting Video Diffusion Models to Interactive World Models", https://arxiv.org/abs/2505.14357☆28Updated last month
- [NeurIPS 24] The implementation and dataset of LiveScene: Language Embedding Interactive Radiance Fields for Physical Scene Rendering and…☆60Updated 9 months ago
- Unifying 2D and 3D Vision-Language Understanding☆119Updated 6 months ago
- InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation☆91Updated 3 months ago
- Official implementation of CEED-VLA: Consistency Vision-Language-Action Model with Early-Exit Decoding.☆46Updated 4 months ago
- ☆47Updated 6 months ago
- [Official] AstraNav-Memory: Contexts Compression for Long Memory. An image-centric memory framework for lifelong embodied navigation via …☆19Updated last week
- Official implementation of "RoboTracer: Mastering Spatial Trace with Reasoning in Vision-Language Models for Robotics"☆48Updated last week