jhCOR / EgoOrientBench
The Official Code Repo for EgoOrientBench [CVPR25]
☆13 Updated 2 weeks ago
Alternatives and similar repositories for EgoOrientBench
Users who are interested in EgoOrientBench are comparing it to the repositories listed below
- ☆82 Updated 2 months ago
- ☆25 Updated 2 months ago
- Official PyTorch implementation of "Paralinguistics-Aware Speech-Empowered LLMs for Natural Conversation" (NeurIPS 2024) ☆92 Updated 9 months ago
- Benchmarking for Audio-Text and Audio-Visual Generation; Supports FAD, FD_VGG, FD_PANNs, FD_PaSST, IS_PaSST, IS_PANNs, KL_PaSST, KL_PANNs… ☆30 Updated 2 weeks ago
- [ACL 2024 Findings] Official PyTorch Implementation code for realizing the technical part of CoLLaVO: Crayon Large Language and Vision mO… ☆99 Updated last year
- Rare-to-Frequent (R2F), ICLR'25, Spotlight ☆47 Updated 4 months ago
- Welcome to AudioCIL, the toolbox for audio class-incremental learning with the most implemented methods. ☆32 Updated 8 months ago
- KV cache compression via sparse coding ☆12 Updated 3 months ago
- ☆34 Updated 3 months ago
- The official implementation of MAGVLT: Masked Generative Vision-and-Language Transformer (CVPR'23) ☆27 Updated last year
- Korean Streaming ASR (with Denoiser and Conformer CTC) ☆25 Updated last year
- ☆35 Updated last week
- ☆31 Updated last year
- [Interspeech 2024] SyncVSR: Data-Efficient Visual Speech Recognition with End-to-End Crossmodal Audio Token Synchronization ☆57 Updated 5 months ago
- ☆60 Updated 6 months ago
- [⭐️ WACV 2025 Oral ⭐️] PETALface: Parameter Efficient Transfer Learning for Low-resolution Face Recognition ☆18 Updated 2 months ago
- Official PyTorch implementation of ReWaS (AAAI'25) "Read, Watch and Scream! Sound Generation from Text and Video" ☆42 Updated 8 months ago
- SMILE: A Multimodal Dataset for Understanding Laughter ☆13 Updated 2 years ago
- Distributed Optimization Infra for learning CLIP models ☆27 Updated 11 months ago
- On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning, … ☆16 Updated 8 months ago
- A list of current Audio-Vision Multimodal with awesome resources (paper, application, data, review, survey, etc.). ☆24 Updated last year
- ☆11 Updated last month
- (ICLR 2025) Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation ☆12 Updated 4 months ago
- an implementation of FAdam (Fisher Adam) in PyTorch ☆48 Updated 2 months ago
- ☆15 Updated 5 months ago
- ☆26 Updated 7 months ago
- [Official Implementation] Acoustic Autoregressive Modeling 🔥 ☆71 Updated last year
- ☆32 Updated 4 months ago
- ☆28 Updated 2 weeks ago
- Official Implementation of "Prefix tuning for Automated Audio Captioning(ICASSP 2023)" ☆30 Updated last year