ByZ0e / AI2Thor_keyboard_player
AI2-THOR Data Collection Tool Based On Keyboard Interaction
☆51Updated last year
Alternatives and similar repositories for AI2Thor_keyboard_player
Users who are interested in AI2Thor_keyboard_player are comparing it to the repositories listed below
- A comprehensive collection of resources focused on addressing and understanding hallucination phenomena in MLLMs.☆34Updated last year
- ☆22Updated 9 months ago
- [ECCV 2022] GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval☆49Updated 4 months ago
- When Learning Is Out of Reach, Reset: Generalization in Autonomous Visuomotor Reinforcement Learning☆12Updated last year
- MAPLE: Masked Pseudo-Labeling autoEncoder for Semi-supervised Point Cloud Action Recognition.☆34Updated 2 years ago
- Rethinking Video-Text Understanding: Retrieval from Counterfactually Augmented Data☆39Updated 11 months ago
- Weakly supervised individual counting☆29Updated 11 months ago
- Official implementation of "Generating images with 3D annotations using diffusion models".☆49Updated 10 months ago
- Official Code of "GeReA: Question-Aware Prompt Captions for Knowledge-based Visual Question Answering"☆111Updated 9 months ago
- ☆29Updated 2 years ago
- Please visit our demonstration website for interactive demonstrations☆30Updated 9 months ago
- A collection of URDF models used in PyBullet☆36Updated 8 months ago
- An open-source library with a powerful Contrastive Language-and-Motion (CLaM) pre-training evaluator☆97Updated 3 months ago
- TransRefer3D: Entity-and-Relation Aware Transformer for Fine-Grained 3D Visual Grounding [ACM MM'21]☆23Updated 3 years ago
- ☆89Updated last year
- [ICLR 2025] Official implementation of paper "Improving Data Efficiency via Curating LLM-Driven Rating Systems"☆97Updated 3 months ago
- ☆58Updated last year
- ☆40Updated 9 months ago
- A PyTorch implementation for Temporal Textual Localization in Video via Adversarial Bi-Directional Interaction Networks☆38Updated 4 years ago
- ☆36Updated last year
- CheX-Phi3.5V is a vision-language model (VLM) for chest X-ray interpretation.☆21Updated 3 months ago
- ☆46Updated 4 months ago
- Enabling robotic manipulators to learn to imitate human arm motions from given videos.☆48Updated last year
- Domain Prompt Learning with Quaternion Networks (CVPR2024 Highlight)☆79Updated 6 months ago
- ☆51Updated last year
- my work☆27Updated 6 months ago
- NWPU soccer robot base ATOM_LINKER, hardware group, maintained by Tang Tianyang☆40Updated 3 years ago
- License plate detection and recognition using RPN with FPN and CRNN☆26Updated 6 months ago
- ☆32Updated 2 years ago
- [EMNLP 2024 Findings] Official PyTorch Implementation of "Adaptive Contrastive Search: Uncertainty-Guided Decoding for Open-Ended Text Ge…☆40Updated 5 months ago