dianwen-ng / MUFFIN
Multi-band Frequency Reconstruction for Neural Psychoacoustic Coding
☆18 Updated 3 months ago
Alternatives and similar repositories for MUFFIN
Users who are interested in MUFFIN are comparing it to the repositories listed below
- ☆88 Updated 10 months ago
- Source for the Interspeech 2024 paper "Scaling up masked audio encoder learning for general audio classification" ☆70 Updated 3 months ago
- The project is associated with the recently-launched INTERSPEECH 2025 Workshop on Multilingual Conversational Speech Language Model (MLC-… ☆40 Updated 2 months ago
- Model configurations for scaling SE models in the paper "Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enha… ☆33 Updated last year
- Unofficial PyTorch implementation of SoundStream with training code and a 16 kHz pretrained checkpoint ☆70 Updated 2 years ago
- PyTorch implementation of "CleanMel: Mel-Spectrogram Enhancement for Improving Both Speech Quality and ASR". ☆69 Updated this week
- Official repository for Mamba-based Segmentation Model for Speaker Diarization ☆37 Updated 3 months ago
- ☆55 Updated 8 months ago
- A benchmark for evaluating audio encoders on various audio tasks. ☆25 Updated last week
- ☆68 Updated last month
- Official PyTorch implementation of "RVAE-EM: Generative speech dereverberation based on recurrent variational auto-encoder and convolutiv… ☆46 Updated 5 months ago
- ARCH: Audio Representations benCHmark ☆46 Updated 11 months ago
- COG-MHEAR Audio-Visual Speech Enhancement Challenge ☆41 Updated 3 months ago
- Generation scripts for EARS-WHAM and EARS-Reverb ☆36 Updated last month
- Official data preparation scripts for the URGENT 2024 Challenge ☆82 Updated 2 months ago
- ☆85 Updated 3 months ago
- A low-bitrate single-codebook 16 kHz speech codec based on focal modulation ☆93 Updated 6 months ago
- ☆56 Updated last year
- ☆65 Updated 2 years ago
- Unofficial PyTorch implementation of "Autoregressive Speech Synthesis without Vector Quantization (MELLE)" ☆35 Updated last month
- Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE) ☆42 Updated last year
- ☆49 Updated 11 months ago
- Official data preparation and metric evaluation scripts for the Interspeech 2025 URGENT challenge. ☆62 Updated 2 months ago
- Official implementation of DNSMOS Pro (accepted at INTERSPEECH 2024). ☆47 Updated 2 months ago
- Official repo of ICASSP 2024 paper - Generative De-Quantization for Neural Speech Codec via Latent Diffusion. ☆55 Updated last month
- ☆54 Updated 2 years ago
- ☆32 Updated 8 months ago
- BAE-NET: A LOW COMPLEXITY AND HIGH FIDELITY BANDWIDTH-ADAPTIVE NEURAL NETWORK FOR SPEECH SUPER-RESOLUTION ☆69 Updated 11 months ago
- PAM is a no-reference audio quality metric for audio generation tasks ☆70 Updated last year
- Official Implementation of TSELM: Target speaker extraction using discrete tokens and language models ☆47 Updated 3 months ago