jhCOR / EgoOrientBench
The Official Code Repo for EgoOrientBench [CVPR25]
☆13 Updated 3 months ago
Alternatives and similar repositories for EgoOrientBench
Users who are interested in EgoOrientBench are comparing it to the repositories listed below.
- Official PyTorch Implementation for the "What if...?: Thinking Counterfactual Keywords Helps to Mitigate Hallucination in Large Multi-mod… ☆18 Updated 10 months ago
- On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning, … ☆16 Updated 7 months ago
- Question-Aware Gaussian Experts for Audio-Visual Question Answering -- Official Pytorch Implementation (CVPR'25, Highlight) ☆18 Updated 2 months ago
- ☆13 Updated 4 months ago
- ☆80 Updated last month
- [ACL 2024 Findings] Official PyTorch Implementation code for realizing the technical part of CoLLaVO: Crayon Large Language and Vision mO… ☆97 Updated last year
- ☆23 Updated last month
- ☆8 Updated 8 months ago
- Welcome to AudioCIL, the toolbox for audio class-incremental learning with the most implemented methods. ☆32 Updated 7 months ago
- [NeurIPS 2024] Official PyTorch implementation code for realizing the technical part of Mamba-based traversal of rationale (Meteor) to im… ☆115 Updated last year
- SMILE: A Multimodal Dataset for Understanding Laughter ☆13 Updated 2 years ago
- ☆20 Updated last month
- Benchmarking for Audio-Text and Audio-Visual Generation; Supports FAD, FD_VGG, FD_PANNs, FD_PaSST, IS_PaSST, IS_PANNs, KL_PaSST, KL_PANNs… ☆24 Updated 5 months ago
- SAVEn-Vid: Synergistic Audio-Visual Integration for Enhanced Understanding in Long Video Context ☆5 Updated 7 months ago
- UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model ☆22 Updated last year
- ☆33 Updated 2 months ago
- WorldSense: Evaluating Real-world Omnimodal Understanding for Multimodal LLMs ☆27 Updated 3 months ago
- Official repository for "Boosting Audio Visual Question Answering via Key Semantic-Aware Cues" in ACM MM 2024. ☆17 Updated 9 months ago
- [NAACL 2024] Vision language model that reduces hallucinations through self-feedback guided revision. Visualizes attentions on image feat… ☆46 Updated 11 months ago
- LLaVA-MR: Large Language-and-Vision Assistant for Video Moment Retrieval ☆8 Updated 8 months ago
- This is an official implementation for "Block Selection Method for Using Feature Norm in Out-of-distribution Detection", CVPR 2023. ☆22 Updated last year
- Official PyTorch implementation of Extract Free Dense Misalignment from CLIP (AAAI'25) ☆23 Updated 3 months ago
- ☆24 Updated last year
- ☆18 Updated last year
- [ICLR 2025] Causal Graphical Models for Vision-Language Compositional Understanding ☆9 Updated 3 months ago
- ☆11 Updated last year
- [EMNLP 2024] Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality ☆17 Updated 10 months ago
- KV cache compression via sparse coding ☆12 Updated 3 months ago
- A project for tri-modal LLM benchmarking and instruction tuning. ☆42 Updated 4 months ago
- ☆34 Updated 2 months ago