A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models (ICASSP 2024)
☆58Apr 17, 2024Updated last year
Alternatives and similar repositories for av-superb
Users that are interested in av-superb are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Super Flappy Bird in p5.js☆10Mar 8, 2021Updated 5 years ago
- ☆15Sep 9, 2021Updated 4 years ago
- Audio-Visual Corruption Modeling of our paper "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling an…☆35Jun 20, 2023Updated 2 years ago
- The official repository of Dynamic-SUPERB.☆199Jun 24, 2025Updated 9 months ago
- Code for T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5☆19Nov 29, 2022Updated 3 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click and start building anything your business needs.
- 《SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts》☆77Jun 9, 2023Updated 2 years ago
- **Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》Speec…☆101Apr 10, 2025Updated 11 months ago
- ☆13Sep 25, 2024Updated last year
- Word Discovery in Visually Grounded, Self-Supervised Speech Models☆26Dec 4, 2023Updated 2 years ago
- Collection of works for evaluating (and analyzing) large audio-language models (LALMs)☆40Aug 11, 2025Updated 7 months ago
- ☆24Mar 30, 2024Updated last year
- The repo host the code and model of MAViL.☆45Jul 24, 2023Updated 2 years ago
- Vision Transformers are Parameter-Efficient Audio-Visual Learners☆107Aug 11, 2023Updated 2 years ago
- 《SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks》Speech processing with prompting paradigm☆81Oct 19, 2023Updated 2 years ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting with the flexibility to host WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Cloudways by DigitalOcean.
- DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning☆54Jan 18, 2024Updated 2 years ago
- A self-supervised learning framework for audio-visual speech☆976Dec 7, 2023Updated 2 years ago
- Official Codebase of "A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition" (ICML 2023)☆12Jun 1, 2023Updated 2 years ago
- Yeast, a lite and light beamer theme☆18Dec 6, 2020Updated 5 years ago
- Code for the IEEE Signal Processing Letters 2022 paper "UAVM: Towards Unifying Audio and Visual Models".☆57Apr 20, 2023Updated 2 years ago
- The open source code of ALMTokenizer2: Towards Low bit-rate and Semantic-rich Audio Tokenizer with Flow-based Scalar Diffusion Transforme…☆45Sep 5, 2025Updated 6 months ago
- FG2021: Cross Attentional AV Fusion for Dimensional Emotion Recognition☆33Nov 29, 2024Updated last year
- AudioCodec-Hub is a Python library for encoding and decoding audio data, supporting various neural audio codec models☆25Sep 26, 2023Updated 2 years ago
- Taiwanese Translation with BERT based model and RNN. Collection of Taiwanese text corpus☆13Oct 15, 2022Updated 3 years ago
- Proton VPN Special Offer - Get 70% off • AdSpecial partner offer. Trusted by over 100 million users worldwide. Tested, Approved and Recommended by Experts.
- Audio Codec Speech processing Universal PERformance Benchmark☆300Jan 8, 2026Updated 2 months ago
- Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder".