[WIP] Code for LangToMo
★20 · Mar 19, 2026 · Updated this week
Alternatives and similar repositories for LangToMo
Users that are interested in LangToMo are comparing it to the libraries listed below.
- [Main Conference @ EACL'26] [Workshop @ NeurIPS'24] LVNet. ★42 · Feb 10, 2026 · Updated last month
- [ICLR'25] Multimodal Video Understanding Framework (MVU) ★57 · Jan 31, 2025 · Updated last year
- [ECCV 2024] Official Implementation of CoPT: Unsupervised Domain Adaptive Segmentation using Domain-Agnostic Text Embeddings ★11 · Feb 24, 2025 · Updated last year
- An unofficial pytorch dataloader for Open X-Embodiment Datasets https://github.com/google-deepmind/open_x_embodiment ★25 · Jan 9, 2025 · Updated last year
- Perceptual Grouping in Contrastive Vision-Language Models (ICCV'23) ★37 · Jan 1, 2024 · Updated 2 years ago
- Code for our ACL 2025 paper "Language Repository for Long Video Understanding" ★36 · Jun 17, 2024 · Updated last year
- ★18 · Dec 17, 2022 · Updated 3 years ago
- This is a python library. Install with "python3 -m pip install rp" then run with "python3 -m rp" or just "rp". Requires python≥3.5 ★13 · Mar 17, 2026 · Updated last week
- Code for the paper Seeing the Pose in the Pixels: Learning Pose-Aware Representations in Vision Transformers ★21 · Aug 2, 2024 · Updated last year
- Code for NeurIPS 2022 paper "Learning Viewpoint-Agnostic Visual Representations by Recovering Tokens in 3D Space" ★20 · Apr 20, 2023 · Updated 2 years ago
- This repository contains the implementation for our work "TopoDiffusionNet: A Topology-aware Diffusion Model", accepted to ICLR 2025. ★22 · Apr 17, 2025 · Updated 11 months ago
- (CVPR 2026) Long-RVOS: A Comprehensive Benchmark for Long-term Referring Video Object Segmentation ★30 · Feb 28, 2026 · Updated 3 weeks ago
- Code for LifelongMemory: Leveraging LLMs for Answering Queries in Long-form Egocentric Videos ★28 · Oct 27, 2025 · Updated 4 months ago
- Environments for Active Vision Reinforcement Learning ★29 · Oct 10, 2024 · Updated last year
- AC-DiT: Adaptive Coordination Diffusion Transformer for Mobile Manipulation ★37 · Feb 23, 2026 · Updated last month
- Video + CLIP Baseline for Ego4D Long Term Action Anticipation Challenge (CVPR 2022) ★15 · Jul 4, 2022 · Updated 3 years ago
- Official repo for paper "HiMoE-VLA: Hierarchical Mixture-of-Experts for Generalist Vision-Language-Action Policies" ★26 · Dec 12, 2025 · Updated 3 months ago
- Official PyTorch implementation for NeurIPS 2024 paper: Prediction with Action. ★49 · Jan 4, 2025 · Updated last year
- Public implementation of Video2Act: A Dual-System Video Diffusion Policy with Robotic Spatio-Motional Modeling ★30 · Dec 3, 2025 · Updated 3 months ago
- ★31 · Dec 18, 2025 · Updated 3 months ago
- Official repository for "Self-Supervised Video Transformer" (CVPR'22) ★108 · Jun 26, 2024 · Updated last year
- [AAAI 2026 Oral] SemanticVLA: Semantic-Aligned Sparsification and Enhancement for Efficient Robotic Manipulation ★37 · Nov 24, 2025 · Updated 4 months ago
- [ICRA'24] Crossway Diffusion: Improving Diffusion-based Visuomotor Policy via Self-supervised Learning ★70 · Aug 4, 2024 · Updated last year
- [ECCV'24] MaxFusion: Plug & Play multimodal generation in text to image diffusion models ★27 · Nov 2, 2024 · Updated last year
- HD-EPIC Python script to download the entire dataset or parts of it ★18 · Oct 7, 2025 · Updated 5 months ago
- Pi0-VLA Repository of "MotionTrans: Human VR Data Enable Motion-Level Learning for Robotic Manipulation Policies" ★27 · Mar 9, 2026 · Updated 2 weeks ago
- Code for our CVPR 2021 paper "Coarse-Fine Networks for Temporal Activity Detection in Videos" ★57 · Oct 10, 2021 · Updated 4 years ago
- Converts MimicGen dataset into LeRobot format, to train and evaluate the ACT, BC, and diffusion policies ★23 · Nov 19, 2024 · Updated last year
- Official PyTorch implementation for ICML 2025 paper: UP-VLA. ★57 · Jan 20, 2026 · Updated 2 months ago
- Code for NeurIPS 2023 paper "Active Vision Reinforcement Learning with Limited Visual Observability" ★54 · Oct 10, 2024 · Updated last year
- [ICLR 2026] Official implementation of the paper "Policy Contrastive Decoding for Robotic Foundation Models" ★26 · Mar 5, 2026 · Updated 2 weeks ago
- Official Repository of "Fibottention: Inceptive Visual Representation Learning with Diverse Attention Across Heads" ★17 · Oct 6, 2025 · Updated 5 months ago
- ★21 · Feb 23, 2025 · Updated last year
- CoRL25-"AimBot: A Simple Auxiliary Visual Cue to Enhance Spatial Awareness of Visuomotor Policies" ★44 · Aug 15, 2025 · Updated 7 months ago
- [ECCV2022] [T-PAMI] StARformer: Transformer with State-Action-Reward Representations. ★96 · May 21, 2023 · Updated 2 years ago
- code for the paper Predicting Point Tracks from Internet Videos enables Diverse Zero-Shot Manipulation ★101 · Jul 31, 2024 · Updated last year
- Official Code for SGRv2 and SGR. ★33 · May 20, 2025 · Updated 10 months ago
- DTact: A Vision-Based Tactile Sensor that Measures High-Resolution 3D Geometry Directly from Darkness (ICRA'23) ★20 · Aug 29, 2023 · Updated 2 years ago
- ★272 · Mar 17, 2024 · Updated 2 years ago