worv-ai / D2E
D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI
☆59 Updated 2 weeks ago
Alternatives and similar repositories for D2E
Users that are interested in D2E are comparing it to the libraries listed below
- ☆52 Updated last week
- ☆227 Updated 5 months ago
- Dataset and Benchmark code for EgoEdit ☆81 Updated last week
- One-shot and Few-shot 3D Editing without Per-Scene Optimization ☆160 Updated 3 months ago
- [AAAI 2026] UltraGen ☆74 Updated last month
- iMontage: Unified, Versatile, Highly Dynamic Many-to-many Image Generation ☆177 Updated 2 weeks ago
- ☆343 Updated 4 months ago
- [ICCV 2025] Enhancing spatial understanding in text-to-Image diffusion models ☆90 Updated 3 months ago
- OmniInsert: Mask-Free Video Insertion of Any Reference via Diffusion Transformer Models ☆144 Updated 2 months ago
- Official implementation for "Story2Board: A Training‑Free Approach for Expressive Storyboard Generation" ☆211 Updated 3 months ago
- VFXMaster: Unlocking Dynamic Visual Effect Generation via In-Context Learning ☆60 Updated last month
- Official repo for paper "Video-As-Prompt: Unified Semantic Control for Video Generation" ☆322 Updated last month
- End2End Virtual Try-on with Visual Reference ☆56 Updated 3 weeks ago
- The implementation of Extreme Viewpoint 4D Video Generation ☆248 Updated 3 months ago
- A real-time streaming conversational video system that transforms text interactions into continuous, high-fidelity video responses using … ☆84 Updated last week
- 🎨 A Style is Worth One Code: Unlocking Code-to-Style Image Generation with Discrete Style Space ☆147 Updated 2 weeks ago
- Pose Extraction & Rendering for SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representat… ☆48 Updated this week
- PlayerOne: Egocentric World Simulator ☆178 Updated 6 months ago
- The official implementation of "RepVideo: Rethinking Cross-Layer Representation for Video Generation" ☆123 Updated 10 months ago
- OmniVCus: Feedforward Subject-driven Video Customization with Multimodal Control Conditions (NeurIPS 2025) ☆84 Updated 2 weeks ago
- Make self forcing endless. Add cache purging. Add prompt controllability. ☆67 Updated 3 months ago
- Official implementation of "DiT360: High-Fidelity Panoramic Image Generation via Hybrid Training". ☆159 Updated last month
- Unified Video Editing with Temporal Reasoner ☆86 Updated this week
- Code for CineScale, higher-resolution video generation based on Wan ☆179 Updated 3 months ago
- [ICCV 2025] Official implementation of the paper "DreamCube: 3D Panorama Generation via Multi-plane Synchronization". ☆159 Updated this week
- An official implementation of SwapAnyone. ☆71 Updated 9 months ago
- https://little-misfit.github.io/GRAG-Image-Editing/ ☆111 Updated 3 weeks ago
- FIBO is a SOTA, first open-source, JSON-native text-to-image model built for controllable, predictable, and legally safe image generation… ☆287 Updated 2 weeks ago
- [ICCV 2025] Video-T1: Test-Time Scaling for Video Generation ☆301 Updated 5 months ago
- ☆106 Updated 3 months ago