slp-rl / salmon
The official code for the SALMon🍣 benchmark (ICASSP 2025)
⭐43 Updated this week
Alternatives and similar repositories for salmon:
Users who are interested in salmon are comparing it to the repositories listed below:
- This repo contains the official PyTorch implementation of "Analyzing Discrete Self Supervised Speech Representation For Spoken Language M… ⭐17 Updated 2 years ago
- Official repository for NAST: Noise Aware Speech Tokenization for Speech Language Models (Interspeech 2024) https://arxiv.org/abs/2406.11… ⭐44 Updated 7 months ago
- A spoken version of the textual story cloze benchmark ⭐14 Updated last year
- Official Code for SyllableLM: Learning Coarse Semantic Units for Speech Language Models ⭐42 Updated 4 months ago
- PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Superv… ⭐36 Updated last year
- ⭐36 Updated 5 months ago
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech. ⭐13 Updated last year
- Temporary anonymous version ⭐22 Updated 11 months ago
- Interface Design for Self-Supervised Speech Models, accepted to Interspeech 2024 ⭐15 Updated 3 months ago
- ⭐18 Updated 9 months ago
- Official repository for "Speaking Style Conversion With Discrete Self-Supervised Units" (EMNLP 2023). https://arxiv.org/abs/2212.09730 ⭐128 Updated last year
- ⭐27 Updated 7 months ago
- This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture. ⭐46 Updated 5 months ago
- Contains the code associated with the ICLR submission for our text-to-speech diffusion model ⭐51 Updated last year
- Collection of scripts from mHuBERT-147. ⭐24 Updated 3 months ago
- [ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer ⭐47 Updated 3 months ago
- Streaming Vocos ⭐20 Updated last month
- ⭐46 Updated 3 months ago
- SLMTokBench for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" ⭐33 Updated last year
- [InterSpeech'2024] FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency ⭐50 Updated 3 months ago
- ⭐14 Updated last month
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer. ⭐76 Updated last month
- Code for the paper "FLowHigh: Towards efficient and high-quality audio super-resolution with single-step flow matching" ⭐54 Updated last month
- LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation (INTERSPEECH 2024) ⭐36 Updated 8 months ago
- Implementation of Acoustic BPE (Shen et al., 2024), extended for RVQ-based Neural Audio Codecs ⭐50 Updated 4 months ago
- (ICASSP 2025) Learning Source Disentanglement in Neural Audio Codec ⭐25 Updated 2 months ago
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech ⭐10 Updated last year
- ARCH: Audio Representations benCHmark ⭐40 Updated 5 months ago
- Official repository of the IEEE SLT 2024 paper "Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT" ⭐34 Updated this week