worv-ai / canvas
CANVAS: Commonsense-Aware Navigation System for Intuitive Human-Robot Interaction
☆14 Updated last month
Alternatives and similar repositories for canvas
Users who are interested in canvas are comparing it to the repositories listed below.
- Code for “Pretrained Language Models as Visual Planners for Human Assistance” ☆61 Updated 2 years ago
- ☆23 Updated 2 years ago
- [BMVC'21] Official PyTorch Implementation of "Grounded Situation Recognition with Transformers" ☆27 Updated 3 years ago
- Chrome/Safari/Firefox extension for clipping arXiv articles to Notion. ☆55 Updated last year
- [ACL 2024 Findings] Official PyTorch Implementation code for realizing the technical part of CoLLaVO: Crayon Large Language and Vision mO… ☆98 Updated last year
- Subtask-Aware Visual Reward Learning from Segmented Demonstrations (ICLR 2025 accepted) ☆17 Updated 7 months ago
- VQVAE for video prediction ☆29 Updated 3 years ago
- Code for the paper "Multi-scale Diffusion Denoised Smoothing" (NeurIPS 2023) ☆14 Updated last year
- [ICCV 2025] Official code for Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation ☆47 Updated 2 months ago
- Official Implementation of ISR-DPO: Aligning Large Multimodal Models for Videos by Iterative Self-Retrospective DPO (AAAI'25) ☆23 Updated 9 months ago
- [ECCV2024, Oral, Best Paper Finalist] This is the official implementation of the paper "LEGO: Learning EGOcentric Action Frame Generation… ☆39 Updated 8 months ago
- On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning, … ☆18 Updated 11 months ago
- A real-time, high-frequency, real-world desktop environment that is suitable for desktop-based ML development (agents, world models, etc.… ☆14 Updated 9 months ago
- [ICCV2023] EgoObjects: A Large-Scale Egocentric Dataset for Fine-Grained Object Understanding ☆77 Updated 2 years ago
- ☆47 Updated last year
- [ICLR-2023] Rarity Score : A New Metric to Evaluate the Uncommonness of Synthesized Images ☆67 Updated 3 years ago
- Egocentric Video Understanding Dataset (EVUD) ☆32 Updated last year
- ☆37 Updated 9 months ago
- Official repo of the ICLR 2025 paper "MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos" ☆29 Updated 4 months ago
- ☆126 Updated 3 years ago
- Code release for the paper "Egocentric Video Task Translation" (CVPR 2023 Highlight) ☆33 Updated 2 years ago
- Official repository for the General Robust Image Task (GRIT) Benchmark ☆54 Updated 2 years ago
- [ICML 2024] A Touch, Vision, and Language Dataset for Multimodal Alignment ☆87 Updated 5 months ago
- This repository is a collection of research papers on World Models. ☆41 Updated 2 years ago
- [CVPR 2023] HierVL: Learning Hierarchical Video-Language Embeddings ☆46 Updated 2 years ago
- Code for Stable Control Representations ☆26 Updated 7 months ago
- HD-EPIC Python script to download the entire dataset or parts of it ☆14 Updated last month
- SMILE: A Multimodal Dataset for Understanding Laughter ☆12 Updated 2 years ago
- [NeurIPS 2024] Official PyTorch implementation code for realizing the technical part of Mamba-based traversal of rationale (Meteor) to im… ☆116 Updated last year
- Collection of PhD Advice Links ☆17 Updated 3 years ago