A Large-scale Video Action Dataset
★444 · Jan 16, 2026 · Updated 2 months ago
Alternatives and similar repositories for Action100M
Users that are interested in Action100M are comparing it to the libraries listed below.
- [CVPR 2026 Highlight🔥] MotionCrafter: Dense Geometry and Motion Reconstruction with a 4D VAE ★152 · Apr 9, 2026 · Updated last week
- Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward ★60 · Nov 27, 2025 · Updated 4 months ago
- [NeurIPS 2025] InternScenes: A Large-scale Interactive Indoor Scene Dataset with Realistic Layouts. ★238 · Oct 17, 2025 · Updated 5 months ago
- [CVPR2026 Highlight] Cubic Discrete Diffusion: Discrete Visual Generation on High-Dimensional Representation Tokens https://arxiv.org/abs… ★53 · Updated this week
- Official implementation of "Repurposing Geometric Foundation Models for Multi-view Diffusion" ★171 · Apr 1, 2026 · Updated 2 weeks ago
- Code for ICCV'2025 (Best student paper honorable mention) "RayZer: A Self-supervised Large View Synthesis Model" ★424 · Nov 24, 2025 · Updated 4 months ago
- [ICLR 2026] Trace Anything: Representing Any Video in 4D via Trajectory Fields ★525 · Oct 31, 2025 · Updated 5 months ago
- The official repo for "OpenMoE 2: Sparse Diffusion Language Models". ★54 · Dec 28, 2025 · Updated 3 months ago
- ★12 · Jul 22, 2025 · Updated 8 months ago
- [ICLR 2026] UniVideo: Unified Understanding, Generation, and Editing for Videos ★492 · Feb 11, 2026 · Updated 2 months ago
- [ICLR 2026] OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling ★456 · Mar 25, 2026 · Updated 3 weeks ago
- ★27 · Jun 12, 2025 · Updated 10 months ago
- [ICCV2025 Oral] Latent Motion Token as the Bridging Language for Learning Robot Manipulation from Videos ★171 · Oct 1, 2025 · Updated 6 months ago
- PyTorch implementation of "HERO: Human Reaction Generation from Videos (ICCV 2025)" ★32 · Mar 27, 2026 · Updated 2 weeks ago
- [CVPR 2026] SpatialVID: A Large-Scale Video Dataset with Spatial Annotations ★536 · Apr 9, 2026 · Updated last week
- Awesome latest models, datasets and benchmarks on streaming/online video understanding. ★25 · Oct 19, 2025 · Updated 5 months ago
- ReSemAct: Advancing Fine-Grained Robotic Manipulation via Semantic Structuring and Affordance Refinement ★18 · Jan 5, 2026 · Updated 3 months ago
- A list of works on video generation towards world model ★454 · Mar 21, 2026 · Updated 3 weeks ago
- Krea Realtime 14B. An open-source realtime AI video model. ★521 · Nov 13, 2025 · Updated 5 months ago
- Tools for the Embody 3D Dataset ★234 · Oct 30, 2025 · Updated 5 months ago
- A unified model for 4D human-scene reconstruction ★469 · Dec 30, 2025 · Updated 3 months ago
- [CVPR2026] Scaling Spatial Intelligence with Multimodal Foundation Models ★198 · Updated this week
- State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More! ★2,242 · Mar 12, 2026 · Updated last month
- Official repository of paper "Mojito: LLM-Aided Motion Instructor with Jitter-Reduced Inertial Tokens". ★21 · May 12, 2025 · Updated 11 months ago
- [ICLR 2026] RoboInter: A Holistic Intermediate Representation Suite Towards Robotic Manipulation ★120 · Feb 14, 2026 · Updated 2 months ago
- [SIGGRAPH 2025] Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control ★815 · Jun 9, 2025 · Updated 10 months ago
- [NeurIPS 2024] DN-4DGS: Denoised Deformable Network with Temporal-Spatial Aggregation for Dynamic Scene Rendering ★12 · Oct 22, 2024 · Updated last year
- ★17 · Jul 24, 2025 · Updated 8 months ago
- Official repository for "Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models", https://arxiv.org/abs/2601.1983… ★87 · Mar 9, 2026 · Updated last month
- CVPR2025: Benchmarking Large Vision-Language Models via Directed Scene Graph for Comprehensive Image Captioning ★38 · Mar 21, 2025 · Updated last year
- [ICCV2025] TokenBridge: Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation. https://yuqingwang1029.github.io/To… ★155 · Jul 24, 2025 · Updated 8 months ago
- DeepVerse: 4D Autoregressive Video Generation as a World Model ★222 · Aug 11, 2025 · Updated 8 months ago
- [NeurIPS 2025] Frame In-N-Out: Unbounded Controllable Image-to-Video Generation ★31 · Jan 5, 2026 · Updated 3 months ago
- ★13 · Mar 5, 2025 · Updated last year
- Official implementation of the paper "Koala-36M: A Large-scale Video Dataset Improving Consistency between Fine-grained Conditions and Vi… ★246 · Mar 19, 2025 · Updated last year
- ★82 · Nov 4, 2025 · Updated 5 months ago
- Use Blender for figures. ★15 · Feb 11, 2026 · Updated 2 months ago
- Official implementation of AMPLIFY: Actionless Motion Priors for Robot Learning from Videos ★47 · Apr 8, 2026 · Updated last week
- ★27 · Mar 3, 2025 · Updated last year