☆57Apr 4, 2024Updated 2 years ago
Alternatives and similar repositories for EgoCOT_Dataset
Users that are interested in EgoCOT_Dataset are comparing it to the libraries listed below.
- ☆347Apr 26, 2024Updated 2 years ago
- [ACL 2024] PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain☆107Mar 14, 2024Updated 2 years ago
- [IROS24 Oral] ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models☆102Aug 22, 2024Updated last year
- [NeurIPS 2024] MSR3D: Multimodal Situated Reasoning in 3D Scenes☆74Dec 2, 2025Updated 7 months ago
- HiCRISP Full Code, containing VirtualHome, pybullet simulator and Real AGV platform.☆15Apr 8, 2024Updated 2 years ago
- [CoRL2024] Official repo of `A3VLM: Actionable Articulation-Aware Vision Language Model`☆122Oct 7, 2024Updated last year
- ☆282Mar 17, 2024Updated 2 years ago
- Code of the ICCV 2023 paper "March in Chat: Interactive Prompting for Remote Embodied Referring Expression"☆26May 22, 2024Updated 2 years ago
- ☆31Nov 6, 2024Updated last year
- OpenEQA Embodied Question Answering in the Era of Foundation Models☆367Sep 20, 2024Updated last year
- ☆37Dec 13, 2023Updated 2 years ago
- Python library to control GX11(Dexterous Hand) and EX12(Exoskeleton Glove)☆17Aug 30, 2025Updated 10 months ago
- Codebase for ICLR 2023 paper, "SMART: Self-supervised Multi-task pretrAining with contRol Transformers"☆54Jan 26, 2024Updated 2 years ago
- GQA-OOD is a new dataset and benchmark for the evaluation of VQA models in OOD (out of distribution) settings.☆33Mar 1, 2021Updated 5 years ago
- Generative Bias for Robust Visual Question Answering ( CVPR 2023 )☆29Jul 4, 2023Updated 3 years ago
- ☆33Sep 22, 2024Updated last year
- Repository for DialFRED.☆45Sep 14, 2023Updated 2 years ago
- ☆21Oct 10, 2023Updated 2 years ago
- LLaVA combined with the Magvit image tokenizer, training an MLLM without a Vision Encoder. Unifying image understanding and generation.☆38Jun 20, 2024Updated 2 years ago
- [ICCV2025 Oral] Latent Motion Token as the Bridging Language for Learning Robot Manipulation from Videos☆180Oct 1, 2025Updated 9 months ago
- Code used by the paper "What is the Role of Recurrent Neural Networks (RNNs) in an Image Caption Generator?".☆14Sep 25, 2017Updated 8 years ago
- Pytorch implementation for Egoinstructor at CVPR 2024☆28Dec 1, 2024Updated last year
- Prompter for Embodied Instruction Following☆18Nov 30, 2023Updated 2 years ago
- HandsOnVLM: Vision-Language Models for Hand-Object Interaction Prediction☆41Sep 15, 2025Updated 9 months ago
- Repository of paper: Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models☆37Sep 19, 2023Updated 2 years ago
- CamReasoner: Reinforcing Camera Movement Understanding via Structured Spatial Reasoning☆30May 23, 2026Updated last month
- Emergent Visual Grounding in Large Multimodal Models Without Grounding Supervision☆46Oct 19, 2025Updated 8 months ago
- [ICRA2023] Grounding Language with Visual Affordances over Unstructured Data☆48Oct 29, 2023Updated 2 years ago
- [ICML 2024] LEO: An Embodied Generalist Agent in 3D World☆485Apr 20, 2025Updated last year
- Code release for the paper "Goal Representations for Instruction Following: A Semi-Supervised Language Interface to Control"☆17Apr 9, 2024Updated 2 years ago
- Suite of human-collected datasets and a multi-task continuous control benchmark for open vocabulary visuolinguomotor learning.☆361Jun 23, 2026Updated last week
- ☆13Nov 1, 2023Updated 2 years ago
- Harnessing 1.4M GPT4V-synthesized Data for A Lite Vision-Language Model☆281Jun 25, 2024Updated 2 years ago
- RobotVQA is a project that develops a Deep Learning-based Cognitive Vision System to support household robots' perception while they perf…☆18Jul 26, 2024Updated last year
- Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models☆45Jun 14, 2024Updated 2 years ago
- ☆17Oct 21, 2024Updated last year
- A PyTorch re-implementation of the RT-1 (Robotics Transformer)☆52Oct 18, 2023Updated 2 years ago
- Implementation of RT1 (Robotic Transformer) in Pytorch☆453Oct 6, 2024Updated last year
- PyTorch implementation of the Hiveformer research paper☆48Jun 27, 2023Updated 3 years ago