A Holistic Embodied Cognition Benchmark
☆19Apr 3, 2025Updated 11 months ago
Alternatives and similar repositories for ECBench
Users who are interested in ECBench are comparing it to the repositories listed below.
- ☆19Oct 28, 2025Updated 4 months ago
- ☆23Jun 5, 2025Updated 9 months ago
- Neural network methods for multimodal map reconstruction and their usage for robot navigation and control☆16Jun 11, 2024Updated last year
- [AAAI 2025] Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos☆33May 27, 2025Updated 9 months ago
- Dual Adaptive Thinking (DAT) for object navigation☆14Sep 10, 2022Updated 3 years ago
- Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning☆43Mar 2, 2026Updated 2 weeks ago
- The code for PixelRefer & VideoRefer☆345Nov 16, 2025Updated 4 months ago
- ☆11Dec 6, 2024Updated last year
- OmniStream: Mastering Perception, Reconstruction and Action in Continuous Streams☆47Mar 15, 2026Updated last week
- RGB-D fusion for two-hand reconstruction☆29Aug 6, 2024Updated last year
- Code for A Dual Semantic-Aware Recurrent Global-Adaptive Network For Vision-and-Language Navigation☆17Apr 25, 2024Updated last year
- [CVPR 2025] VISCO: Benchmarking Fine-Grained Critique and Correction Towards Self-Improvement in Visual Reasoning☆13Jun 7, 2025Updated 9 months ago
- [ECCV 2024 (Oral)] Towards Scene Graph Anticipation☆19Mar 10, 2026Updated last week
- A simple and flexible PyTorch implementation of Video StableDiffusion (ZeroScope_v2) based on diffusers.☆20Feb 15, 2024Updated 2 years ago
- [ICCV 2025 Highlight] The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining"☆196Mar 17, 2025Updated last year
- A chatbot webui that supports various multi-modal large language models☆11May 8, 2023Updated 2 years ago
- ☆37Nov 8, 2024Updated last year
- [CVPR 2025] Official PyTorch code of "Enhancing Video-LLM Reasoning via Agent-of-Thoughts Distillation".☆55May 25, 2025Updated 9 months ago
- Egocentric Video Understanding Dataset (EVUD)☆33Jul 4, 2024Updated last year
- The code for "TokenPacker: Efficient Visual Projector for Multimodal LLM", IJCV2025☆278May 26, 2025Updated 9 months ago
- Open-Retrieval Conversational Machine Reading: A new setting & OR-ShARC dataset☆13Nov 19, 2022Updated 3 years ago
- Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition☆18Jul 16, 2024Updated last year
- Dimple, the first Discrete Diffusion Multimodal Large Language Model☆115Jul 9, 2025Updated 8 months ago
- ☆44Oct 7, 2024Updated last year
- Official PyTorch code of GroundVQA (CVPR'24)☆64Sep 13, 2024Updated last year
- Official Implementation for paper "Pretraining A Large Language Model using Distributed GPUs: A Memory-Efficient Decentralized Paradigm"☆21Mar 10, 2026Updated last week
- [ECCV2024] Official code implementation of Merlin: Empowering Multimodal LLMs with Foresight Minds☆96Jul 4, 2024Updated last year
- [Main Conference @ EACL'26] [Workshop @ NeurIPS'24] 🎞️ LVNet.☆42Feb 10, 2026Updated last month
- Recent Advances on MLLM's Reasoning Ability☆26Apr 11, 2025Updated 11 months ago
- The code for "Label-efficient Segmentation via Affinity Propagation". [NeurIPS2023]☆67Mar 4, 2024Updated 2 years ago
- Interpreting Chest X-rays Like a Radiologist: A Benchmark with Clinical Reasoning, release the dataset and the model weight☆13May 26, 2025Updated 9 months ago
- This is the official repository of LLAVIDAL☆23Oct 4, 2025Updated 5 months ago
- A paper list of panoptic segmentation using deep learning☆12Sep 5, 2021Updated 4 years ago
- This is the official repo of MLLM-CL.☆63Oct 10, 2025Updated 5 months ago
- The official implementation of "LiDAR2Map: In Defense of LiDAR-Based Semantic Map Construction Using Online Camera Distillation" (CVPR 20…☆91Apr 6, 2024Updated last year
- [ICCV 2025] MRGen: Segmentation Data Engine for Underrepresented MRI Modalities☆39Sep 26, 2025Updated 5 months ago
- Code for the ICML 2021 paper "Sharing Less is More: Lifelong Learning in Deep Networks with Selective Layer Transfer"☆12Aug 17, 2021Updated 4 years ago
- [ICML 2025 Spotlight] RAPID: Long-Context Inference with Retrieval-Augmented Speculative Decoding☆19Mar 2, 2025Updated last year
- ☆38Jul 16, 2025Updated 8 months ago