nrasajski / BehAVE
☆10 Updated 8 months ago
Alternatives and similar repositories for BehAVE
Users who are interested in BehAVE are comparing it to the repositories listed below
- ☆150 Updated 10 months ago
- ☆21 Updated last year
- [ICCV 2023] Code for "Multi-task View Synthesis with Neural Radiance Fields" ☆11 Updated 2 years ago
- Code release for NeurIPS 2023 paper SlotDiffusion: Object-centric Learning with Diffusion Models ☆92 Updated last year
- Official implementation of "Self-Improving Video Generation" ☆75 Updated 6 months ago
- [ICLR 2025] Official implementation and benchmark evaluation repository of <PhysBench: Benchmarking and Enhancing Vision-Language Models … ☆75 Updated 5 months ago
- [ECCV 2024] STEVE in Minecraft is for See and Think: Embodied Agent in Virtual Environment ☆39 Updated last year
- Thinking with Videos from Open-Source Priors. We reproduce chain-of-frames visual reasoning by fine-tuning open-source video models. Give… ☆181 Updated last month
- [ICLR 2025] Source code for paper "A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegr… ☆78 Updated 11 months ago
- 👆Pytorch implementation of "Ctrl-V: Higher Fidelity Video Generation with Bounding-Box Controlled Object Motion" ☆30 Updated 3 months ago
- [ICML 2024] Compositional Image Decomposition with Diffusion Models ☆51 Updated last year
- ElasticTok: Adaptive Tokenization for Image and Video ☆83 Updated last year
- [ECCV2024, Oral, Best Paper Finalist] This is the official implementation of the paper "LEGO: Learning EGOcentric Action Frame Generation… ☆39 Updated 8 months ago
- [CVPR 2025] 3D-GRAND: Towards Better Grounding and Less Hallucination for 3D-LLMs ☆51 Updated last year
- SimWorld: An Open-ended Realistic Simulator for Autonomous Agents in Physical and Social Worlds ☆74 Updated last week
- Official PyTorch Implementation for Dual-Process Image Generation, ICCV 2025 ☆109 Updated 2 months ago
- Official implementation of EPiC: Efficient Video Camera Control Learning with Precise Anchor-Video Guidance ☆45 Updated 5 months ago
- LVAS-Agent Code Base ☆21 Updated 7 months ago
- Code for Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense? [COLM 2024] ☆25 Updated last year
- ☆49 Updated 2 years ago
- ☆30 Updated 5 months ago
- FQGAN: Factorized Visual Tokenization and Generation ☆54 Updated 7 months ago
- ☆37 Updated 9 months ago
- A paper list that includes world models or generative video models for embodied agents. ☆25 Updated 10 months ago
- [ICCV 2025] Official code for Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation ☆47 Updated 2 months ago
- Code for our paper: Learning Camera Movement Control from Real-World Drone Videos ☆32 Updated 7 months ago
- Code for the paper "GenHowTo: Learning to Generate Actions and State Transformations from Instructional Videos" published at CVPR 2024 ☆51 Updated last year
- Official repo of the ICLR 2025 paper "MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos" ☆29 Updated 4 months ago
- Code for Stable Control Representations ☆26 Updated 7 months ago
- Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization ☆24 Updated 7 months ago