HJ-Ok / AudioBERT
AudioBERT: Audio Knowledge Augmented Language Model (ICASSP 2025)
☆41 · Updated 10 months ago
Alternatives and similar repositories for AudioBERT
Users who are interested in AudioBERT compare it to the repositories listed below.
- ☆38 · Updated last year
- Collection of scripts from mHuBERT-147. ☆32 · Updated last year
- [ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer ☆66 · Updated last year
- The official code for the SALMon benchmark (ICASSP 2025 - Oral) ☆47 · Updated 4 months ago
- An official implementation of Style-Talker for Spoken Dialogue Generation ☆23 · Updated 11 months ago
- A spoken version of the textual story cloze benchmark ☆20 · Updated 2 years ago
- ☆48 · Updated 5 months ago
- Text-To-Speech for NotebookLM ☆35 · Updated 5 months ago
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer. ☆114 · Updated 11 months ago
- Codebase and project page for EDMSound ☆35 · Updated 2 years ago
- [InterSpeech'2024] FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency ☆59 · Updated last year
- Official PyTorch implementation of "Paralinguistics-Aware Speech-Empowered LLMs for Natural Conversation" (NeurIPS 2024) ☆93 · Updated last year
- GPT for FACodec ☆13 · Updated last year
- Official repository of the IEEE SLT 2024 paper "Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT" ☆44 · Updated 2 months ago
- Small audio language model for reasoning ☆81 · Updated 2 weeks ago
- Contains the code associated with the ICLR submission for our text-to-speech diffusion model ☆54 · Updated 2 years ago
- Official repository for NAST: Noise Aware Speech Tokenization for Speech Language Models (Interspeech 2024) https://arxiv.org/abs/2406.11… ☆46 · Updated last year
- ☆43 · Updated last year
- ☆16 · Updated 2 years ago
- ☆19 · Updated last year
- Official Demo Page for DiTTo-TTS: Efficient and Scalable Zero-Shot Text-to-Speech with Diffusion Transformer ☆38 · Updated 10 months ago
- Official repository of Wavehax vocoder ☆62 · Updated last week
- We introduce the LLAMA1 Test Set, a comprehensive open-domain world knowledge QA dataset for evaluating question-answering systems. We pr… ☆23 · Updated last year
- ☆52 · Updated last year
- Interface Design for Self-Supervised Speech Models, accepted to Interspeech 2024 ☆16 · Updated last year
- This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture. ☆50 · Updated 9 months ago
- ZIQI-Eval: A Music Evaluation Benchmark for Large Language Models ☆15 · Updated last year
- Implementation of the model "AudioFlamingo" from the paper: "Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dial… ☆40 · Updated 10 months ago
- ☆16 · Updated 2 years ago
- (Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT ☆39 · Updated last year