A Large-scale Video Action Dataset
☆410 · Jan 16, 2026 · Updated last month
Alternatives and similar repositories for Action100M
Users who are interested in Action100M are comparing it to the repositories listed below.
- [CVPR 2026] MotionCrafter: Dense Geometry and Motion Reconstruction with a 4D VAE ☆113 · Feb 27, 2026 · Updated last week
- [NeurIPS 2025] InternScenes: A Large-scale Interactive Indoor Scene Dataset with Realistic Layouts. ☆228 · Oct 17, 2025 · Updated 4 months ago
- Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward ☆60 · Nov 27, 2025 · Updated 3 months ago
- [ICLR 2026] UniVideo: Unified Understanding, Generation, and Editing for Videos ☆438 · Feb 11, 2026 · Updated 3 weeks ago
- Code for ICCV'2025 (Best student paper honorable mention) "RayZer: A Self-supervised Large View Synthesis Model" ☆415 · Nov 24, 2025 · Updated 3 months ago
- [ICCV2025 Oral] Latent Motion Token as the Bridging Language for Learning Robot Manipulation from Videos ☆164 · Oct 1, 2025 · Updated 5 months ago
- [ICLR 2026] RoboInter: A Holistic Intermediate Representation Suite Towards Robotic Manipulation ☆102 · Feb 14, 2026 · Updated 2 weeks ago
- [ICLR 2026] OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling ☆433 · Feb 25, 2026 · Updated last week
- Official repository of paper "Mojito: LLM-Aided Motion Instructor with Jitter-Reduced Inertial Tokens". ☆20 · May 12, 2025 · Updated 9 months ago
- CVPR2025: Benchmarking Large Vision-Language Models via Directed Scene Graph for Comprehensive Image Captioning ☆38 · Mar 21, 2025 · Updated 11 months ago
- Tools for the Embody 3D Dataset ☆218 · Oct 30, 2025 · Updated 4 months ago
- State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More! ☆2,177 · Feb 11, 2026 · Updated 3 weeks ago
- [ICLR 2026] Trace Anything: Representing Any Video in 4D via Trajectory Fields ☆511 · Oct 31, 2025 · Updated 4 months ago
- UniLat3D: Geometry-Appearance Unified Latents for Single-Stage 3D Generation ☆135 · Nov 19, 2025 · Updated 3 months ago
- ☆27 · Mar 3, 2025 · Updated last year
- Krea Realtime 14B. An open-source realtime AI video model. ☆497 · Nov 13, 2025 · Updated 3 months ago
- Official implementation of the paper "Koala-36M: A Large-scale Video Dataset Improving Consistency between Fine-grained Conditions and Vi… ☆237 · Mar 19, 2025 · Updated 11 months ago
- Atom3d, atomising geometry, is a mesh processing toolbox specifically designed for 3D learning. ☆136 · Jan 17, 2026 · Updated last month
- Cambrian-S: Towards Spatial Supersensing in Video ☆500 · Dec 27, 2025 · Updated 2 months ago
- Scaling Spatial Intelligence with Multimodal Foundation Models ☆177 · Feb 6, 2026 · Updated last month
- [CVPR'25] How Do I Do That? Synthesizing 3D Hand Motion and Contacts for Everyday Interactions ☆32 · Oct 5, 2025 · Updated 5 months ago
- Orient Anything, ICML 2025 ☆374 · Feb 6, 2026 · Updated last month
- [ICCV2025] TokenBridge: Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation. https://yuqingwang1029.github.io/To… ☆152 · Jul 24, 2025 · Updated 7 months ago
- [ICCV 2025] A Simple yet Effective Pathway to Empowering LLaVA to Understand and Interact with 3D World ☆373 · Oct 21, 2025 · Updated 4 months ago
- ☆437 · Dec 8, 2025 · Updated 2 months ago
- Awesome latest models, datasets and benchmarks on streaming/online video understanding. ☆23 · Oct 19, 2025 · Updated 4 months ago
- Energy-based Dropout and Pruning of Deep Neural Networks ☆10 · Oct 9, 2020 · Updated 5 years ago
- ☆15 · May 14, 2025 · Updated 9 months ago
- ReSemAct: Advancing Fine-Grained Robotic Manipulation via Semantic Structuring and Affordance Refinement ☆17 · Jan 5, 2026 · Updated 2 months ago
- Official implementation of "NoiseAR: AutoRegressing Initial Noise Prior for Diffusion Models" ☆18 · Jun 3, 2025 · Updated 9 months ago
- ☆78 · Nov 4, 2025 · Updated 4 months ago
- ☆13 · Sep 28, 2024 · Updated last year
- ☆20 · Nov 21, 2025 · Updated 3 months ago
- [ICLR 2025] This repo is the official implementation of "The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs". ☆13 · Jan 25, 2025 · Updated last year
- [SIGGRAPH 2025] Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control ☆808 · Jun 9, 2025 · Updated 8 months ago
- Causal video-action world model for generalist robot control ☆708 · Feb 27, 2026 · Updated last week
- [CVPR 2026] SpatialVID: A Large-Scale Video Dataset with Spatial Annotations ☆504 · Updated this week
- [CVPR 2026] The official implementation of the paper "Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation" ☆105 · Updated this week
- ICML2025 ☆63 · Aug 28, 2025 · Updated 6 months ago