JongSuk1 / EquiAV
☆15 · Updated 3 months ago
Related projects
Alternatives and complementary repositories for EquiAV
- This repository contains the code for our CVPR 2022 paper on "Audio-visual Generalised Zero-shot Learning with Cross-modal Attention and … ☆34 · Updated last year
- Vision Transformers are Parameter-Efficient Audio-Visual Learners ☆85 · Updated last year
- ☆15 · Updated 2 weeks ago
- ☆45 · Updated 6 months ago
- [ECCV’24] Official Implementation for CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenario… ☆39 · Updated 2 months ago
- This repo contains the official PyTorch implementation of AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image … ☆76 · Updated 4 months ago
- ☆18 · Updated last month
- [ECCV2024][ICCV2023] Official PyTorch implementation of SeiT++ and SeiT ☆51 · Updated 2 months ago
- Official codebase for "Unveiling the Power of Audio-Visual Early Fusion Transformers with Dense Interactions through Masked Modeling". ☆19 · Updated 3 months ago
- Official This-Is-My Dataset published in CVPR 2023 ☆15 · Updated 3 months ago
- [Arxiv 2024] Official code for MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions ☆24 · Updated this week
- Locally Hierarchical Auto-Regressive Modeling for Image Generation (HQ-Transformer) ☆26 · Updated 8 months ago
- ☆14 · Updated 6 months ago
- Official Codebase of "Localizing Visual Sounds the Easy Way" (ECCV 2022) ☆29 · Updated 2 years ago
- Official Code of ECCV 2022 paper MS-CLIP ☆86 · Updated 2 years ago
- [ICCV 2023] Online Clustered Codebook ☆145 · Updated last month
- [ECCV’24] Official repository for "BEAF: Observing Before-AFter Changes to Evaluate Hallucination in Vision-language Models" ☆18 · Updated 3 weeks ago
- ☆28 · Updated 3 weeks ago
- A Pytorch Implementation of Finite Scalar Quantization ☆80 · Updated 11 months ago
- Code for the paper "Hyperbolic Image-Text Representations", Desai et al, ICML 2023 ☆134 · Updated last year
- This repository contains the code for our ECCV 2022 paper "Temporal and cross-modal attention for audio-visual zero-shot learning" ☆24 · Updated last year
- ☆19 · Updated 7 months ago
- Official Pytorch implementation of "Improved Probabilistic Image-Text Representations" (ICLR 2024) ☆51 · Updated 5 months ago
- Minimal multi-gpu implementation of EDM2: "Analyzing and Improving the Training Dynamics of Diffusion Models" ☆26 · Updated 8 months ago
- [ICLR'23] New Insights for the Stability-Plasticity Dilemma in Online Continual Learning ☆16 · Updated last year
- ☆101 · Updated 4 months ago
- Official Pytorch Implementation of Our CVPR2023 Paper: "Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dyna… ☆154 · Updated last year
- The official implementation of MAGVLT: Masked Generative Vision-and-Language Transformer (CVPR'23) ☆26 · Updated 9 months ago
- Codebase for the paper: "TIM: A Time Interval Machine for Audio-Visual Action Recognition" ☆37 · Updated this week