A Large-scale Video Action Dataset
★466 · Jan 16, 2026 · Updated 4 months ago
Alternatives and similar repositories for Action100M
Users who are interested in Action100M are comparing it to the repositories listed below.
- [CVPR 2026 Highlight 🔥] MotionCrafter: Dense Geometry and Motion Reconstruction with a 4D VAE ★164 · May 18, 2026 · Updated last week
- Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward ★60 · Nov 27, 2025 · Updated 5 months ago
- [NeurIPS 2025] InternScenes: A Large-scale Interactive Indoor Scene Dataset with Realistic Layouts. ★247 · May 15, 2026 · Updated last week
- [CVPR2026 Highlight] Cubic Discrete Diffusion: Discrete Visual Generation on High-Dimensional Representation Tokens https://arxiv.org/abs… ★54 · Apr 10, 2026 · Updated last month
- Code for ICCV'2025 (Best student paper honorable mention) "RayZer: A Self-supervised Large View Synthesis Model" ★433 · Nov 24, 2025 · Updated 6 months ago
- [ICLR 2026] Trace Anything: Representing Any Video in 4D via Trajectory Fields ★536 · Oct 31, 2025 · Updated 6 months ago
- Official implementation of "Repurposing Geometric Foundation Models for Multi-view Diffusion" ★196 · Apr 1, 2026 · Updated last month
- ★12 · Jul 22, 2025 · Updated 10 months ago
- [ICLR 2026] UniVideo: Unified Understanding, Generation, and Editing for Videos ★521 · Feb 11, 2026 · Updated 3 months ago
- ★78 · Apr 29, 2026 · Updated 3 weeks ago
- The official repo for "OpenMoE 2: Sparse Diffusion Language Models". ★56 · Dec 28, 2025 · Updated 4 months ago
- [ICLR 2026] OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling ★473 · Apr 16, 2026 · Updated last month
- ★28 · Jun 12, 2025 · Updated 11 months ago
- [ICCV2025 Oral] Latent Motion Token as the Bridging Language for Learning Robot Manipulation from Videos ★175 · Oct 1, 2025 · Updated 7 months ago
- PyTorch implementation of "HERO: Human Reaction Generation from Videos (ICCV 2025)" ★33 · Mar 27, 2026 · Updated last month
- Awesome latest models, datasets and benchmarks on streaming/online video understanding. ★28 · Oct 19, 2025 · Updated 7 months ago
- ReSemAct: Advancing Fine-Grained Robotic Manipulation via Semantic Structuring and Affordance Refinement ★17 · Jan 5, 2026 · Updated 4 months ago
- [CVPR 2026] SpatialVID: A Large-Scale Video Dataset with Spatial Annotations ★558 · Apr 22, 2026 · Updated last month
- A list of works on video generation towards world model ★480 · Mar 21, 2026 · Updated 2 months ago
- Krea Realtime 14B. An open-source realtime AI video model. ★549 · Nov 13, 2025 · Updated 6 months ago
- [ICLR 2026] A unified model for 4D human-scene reconstruction ★494 · Dec 30, 2025 · Updated 4 months ago
- [SIGGRAPH Asia 2025] PhySIC: Physically Plausible 3D Human-Scene Interaction and Contact from a Single Image ★56 · Nov 8, 2025 · Updated 6 months ago
- State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More! ★2,284 · Apr 13, 2026 · Updated last month
- Official repository of paper "Mojito: LLM-Aided Motion Instructor with Jitter-Reduced Inertial Tokens". ★24 · May 12, 2025 · Updated last year
- Tools for the Embody 3D Dataset ★251 · Oct 30, 2025 · Updated 6 months ago
- [SIGGRAPH 2025] Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control ★817 · Jun 9, 2025 · Updated 11 months ago
- [NeurIPS 2024] DN-4DGS: Denoised Deformable Network with Temporal-Spatial Aggregation for Dynamic Scene Rendering ★12 · Oct 22, 2024 · Updated last year
- [CVPR 2026] Scaling Spatial Intelligence with Multimodal Foundation Models ★262 · May 14, 2026 · Updated last week
- ★17 · Jul 24, 2025 · Updated 10 months ago
- CVPR2025: Benchmarking Large Vision-Language Models via Directed Scene Graph for Comprehensive Image Captioning ★38 · Mar 21, 2025 · Updated last year
- [ICCV2025] TokenBridge: Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation. https://yuqingwang1029.github.io/To… ★158 · Jul 24, 2025 · Updated 10 months ago
- [NeurIPS 2025] Frame In-N-Out: Unbounded Controllable Image-to-Video Generation ★33 · May 1, 2026 · Updated 3 weeks ago
- DeepVerse: 4D Autoregressive Video Generation as a World Model ★230 · Aug 11, 2025 · Updated 9 months ago
- Official repository for "Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models", https://arxiv.org/abs/2601.1983… ★92 · Mar 9, 2026 · Updated 2 months ago
- Official implementation of the paper "Koala-36M: A Large-scale Video Dataset Improving Consistency between Fine-grained Conditions and Vi… ★248 · Mar 19, 2025 · Updated last year
- ★13 · Mar 5, 2025 · Updated last year
- [ICLR 2026] RoboInter: A Holistic Intermediate Representation Suite Towards Robotic Manipulation ★138 · Feb 14, 2026 · Updated 3 months ago
- The first multiplayer video world model in Minecraft ★200 · Mar 3, 2026 · Updated 2 months ago
- ★85 · Nov 4, 2025 · Updated 6 months ago