Kiteretsu77/This_and_That_VDM

This is the official implementation of the video generation part of This&That: Language-Gesture Controlled Video Generation for Robot Planning (ICRA 2025).

☆49

Alternatives and similar repositories for This_and_That_VDM

Users who are interested in This_and_That_VDM are comparing it to the repositories listed below.

cfeng16 / this-and-that
☆18 · Jul 9, 2024 · Updated 2 years ago
LARG / SocialNavSUB
[CoRL 2025] VLM Benchmark for Social Navigation Scene Understanding
☆22 · Sep 3, 2025 · Updated 10 months ago
clear-nus / ltldog
☆13 · Dec 17, 2025 · Updated 7 months ago
sky-lzy / Structured-4D-Model
☆23 · Jul 2, 2026 · Updated 3 weeks ago
flow-diffusion / AVDC
Official repository of Learning to Act from Actionless Videos through Dense Correspondences.
☆262 · Apr 25, 2024 · Updated 2 years ago
UVA-Computer-Vision-Lab / FrameINO
[NeurIPS 2025] Frame In-N-Out: Unbounded Controllable Image-to-Video Generation
☆33 · May 1, 2026 · Updated 2 months ago
video-language-planning / vlp_code
☆82 · May 23, 2025 · Updated last year
shim0114 / T2V-Diffusion-Search
[NeurIPS 2025] Inference-Time Text-to-Video Alignment with Diffusion Latent Beam Search
☆18 · Feb 24, 2026 · Updated 5 months ago
homangab / Track-2-Act
code for the paper Predicting Point Tracks from Internet Videos enables Diverse Zero-Shot Manipulation
☆105 · Jul 31, 2024 · Updated last year
world-action-verifier / wav_minigrid
☆24 · Jul 11, 2026 · Updated 2 weeks ago
HumanSupportRobot / QandA
Questions and Answer site
☆12 · Apr 12, 2022 · Updated 4 years ago
UT-HCRL / LEGATO
Official codebase for LEGATO (Learning with a Handheld Grasping Tool)
☆73 · Aug 19, 2025 · Updated 11 months ago
Max-Fu / otter
[ICML 2025] OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction
☆118 · Apr 14, 2025 · Updated last year
sled-group / RACER
[ICRA 2025] RACER: Rich Language-Guided Failure Recovery Policies for Imitation Learning
☆47 · Oct 10, 2024 · Updated last year
XuweiyiChen / UniCtrl
[TMLR] Official implementation of UniCtrl: Improving the Spatiotemporal Consistency of Text-to-Video Diffusion Models via Training-Free U…
☆75 · Nov 29, 2024 · Updated last year
SIGVerse / sigverse_ros_package
☆11 · Jan 14, 2026 · Updated 6 months ago
jmwang0117 / Video4Robot
List of papers on video-centric robot learning
☆23 · Nov 16, 2024 · Updated last year
zhourui9813 / TwinRL
Official Repository of "TwinRL-VLA: Digital Twin-Driven Reinforcement Learning for Real-World Robotic Manipulation"
☆22 · Jul 11, 2026 · Updated 2 weeks ago
KAIST-Visual-AI-Group / StochSync
Official implementation of StochSync: a zero-shot approach for image generation in arbitrary spaces via stochastic diffusion synchronizat…
☆21 · Jun 24, 2025 · Updated last year
boschresearch / mj-grasp-sim
MuJoCo Grasping Simulator
☆18 · Feb 27, 2025 · Updated last year
LegendLeoChen / LEO-RobotAgent
A general-purpose robotic agent framework based on LLMs. The LLM can independently reason, plan, and execute actions to operate diverse r…
☆22 · Dec 14, 2025 · Updated 7 months ago
genaug / genaug
main augmentation script for real world robot dataset.
☆40 · May 18, 2023 · Updated 3 years ago
nature21 / magic
Official code for "One-Shot Manipulation Strategy Learning by Making Contact Analogies".
☆28 · Feb 7, 2025 · Updated last year
Hibikino-Musashi-Home / hma_wrs_sim_ws
This is the workspace of the WRS simulator of Hibikino-Musashi@Home (HMA).
☆15 · Mar 29, 2022 · Updated 4 years ago
avlmaps / AVLMaps
[ISER 2023] The official implementation of Audio Visual Language Maps for Robot Navigation
☆69 · May 11, 2024 · Updated 2 years ago
UMass-Embodied-AGI / COMBO
Source codes for the paper "COMBO: Compositional World Models for Embodied Multi-Agent Cooperation"
☆51 · Mar 13, 2025 · Updated last year
matsuolab / isaac_hsr
☆16 · Aug 12, 2022 · Updated 3 years ago
sled-group / 3D-GRAND
[CVPR 2025] 3D-GRAND: Towards Better Grounding and Less Hallucination for 3D-LLMs
☆54 · Jun 13, 2024 · Updated 2 years ago
changchencc / Simple-Hierarchical-Planning-with-Diffusion
☆36 · Jun 7, 2024 · Updated 2 years ago
UMass-Embodied-AGI / 3D-VLA
[ICML 2024] 3D-VLA: A 3D Vision-Language-Action Generative World Model
☆629 · Oct 29, 2024 · Updated last year
codeluosiyu / toa-ui
React, Vue, Miniprogram UI tools
☆10 · Aug 16, 2022 · Updated 3 years ago
amberxie88 / lapp
Implementation of Language-Conditioned Path Planning (Amber Xie, Youngwoon Lee, Pieter Abbeel, Stephen James)
☆27 · Sep 1, 2023 · Updated 2 years ago
NVlabs / AHA
A Vision-Language-Model for Detecting and Reasoning Over Failures in Robotic Manipulation
☆71 · Apr 1, 2025 · Updated last year
OpenDriveLab / CLOVER
[NeurIPS 2024] CLOVER: Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation
☆135 · Sep 8, 2025 · Updated 10 months ago
FCSC / wrs_gazebo_worlds
The 'wrs_gazebo_worlds' ROS package provides a collection of Gazebo worlds and models created using the official CAD data of the World Ro…
☆15 · Mar 11, 2021 · Updated 5 years ago
rainbow979 / robodreamer
☆102 · Sep 4, 2024 · Updated last year
ShuangLI59 / unified_video_action
Official PyTorch Implementation of Unified Video Action Model (RSS 2025)
☆400 · Jul 23, 2025 · Updated last year
devinluo27 / comp_diffuser_release
[NeurIPS 2025 Spotlight] Generative Trajectory Stitching through Diffusion Composition
☆76 · Sep 6, 2025 · Updated 10 months ago
stevenlsw / hoi-forecast
[CVPR 2022] Joint hand motion and interaction hotspots prediction from egocentric videos
☆71 · Jan 29, 2024 · Updated 2 years ago