HJ-Ok / AudioBERT
AudioBERT 📢: Audio Knowledge Augmented Language Model (ICASSP 2025)
☆41 Updated last year
Alternatives and similar repositories for AudioBERT
Users who are interested in AudioBERT are comparing it to the libraries listed below:
- ☆38 Updated last year
- Collection of scripts from mHuBERT-147. ☆32 Updated last year
- The official code for the SALMon 🍣 benchmark (ICASSP 2025 - Oral) ☆48 Updated 5 months ago
- A spoken version of the textual story cloze benchmark ☆20 Updated 2 years ago
- [ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer ☆67 Updated last year
- An official implementation of Style-Talker for Spoken Dialogue Generation ☆23 Updated last year
- Codebase and project page for EDMSound ☆35 Updated 2 years ago
- GPT for FACodec ☆13 Updated last year
- [InterSpeech'2024] FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency ☆59 Updated last year
- ☆52 Updated 6 months ago
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer. ☆113 Updated 2 weeks ago
- Official repository of the IEEE SLT 2024 paper "Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT" ☆45 Updated this week
- Pushing the Limits of Zero-shot End-to-End Speech Translation ☆26 Updated last year
- Small audio language model for reasoning ☆86 Updated 2 months ago
- ☆16 Updated 2 years ago
- We introduce the LLAMA1 Test Set, a comprehensive open-domain world knowledge QA dataset for evaluating question-answering systems. We pr… ☆23 Updated last year
- Interface Design for Self-Supervised Speech Models, accepted to Interspeech 2024 ☆16 Updated last year
- Accompanying repository for the paper "DiffVox: A Differentiable Model for Capturing and Analysing Professional Effects Distributions" ☆38 Updated 3 months ago
- ☆19 Updated last year
- ☆44 Updated last year
- Contains the code associated with the ICLR submission for our text-to-speech diffusion model ☆57 Updated 2 years ago
- Official repository of Wavehax vocoder ☆66 Updated last month
- This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture. ☆51 Updated 10 months ago
- ZIQI-Eval: A Music Evaluation Benchmark for Large Language Models ☆16 Updated last year
- Text-To-Speech for NotebookLM ☆37 Updated 6 months ago
- My vocoder experiments ☆31 Updated 6 months ago
- Voice conversion with just linear regression. ☆32 Updated 4 months ago
- ☆24 Updated 9 months ago
- ☆32 Updated last month
- Official Implementation of EnCLAP (ICASSP 2024) ☆94 Updated last year