dianwen-ng / MUFFIN
Multi-band Frequency Reconstruction for Neural Psychoacoustic Coding
☆18 Updated 4 months ago
Alternatives and similar repositories for MUFFIN
Users who are interested in MUFFIN are comparing it to the repositories listed below
- ARCH: Audio Representations benCHmark ☆46 Updated last year
- Source for the Interspeech 2024 Paper "Scaling up masked audio encoder learning for general audio classification" ☆72 Updated 4 months ago
- PyTorch implementation of "CleanMel: Mel-Spectrogram Enhancement for Improving Both Speech Quality and ASR". ☆68 Updated 3 weeks ago
- Unofficial PyTorch implementation of SoundStream with training code and a 16 kHz pretrained checkpoint ☆73 Updated 2 years ago
- The project is associated with the recently-launched INTERSPEECH 2025 Workshop on Multilingual Conversational Speech Language Model (MLC-… ☆42 Updated 3 months ago
- A benchmark for evaluating audio encoders on various audio tasks. ☆25 Updated 2 weeks ago
- Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE) ☆43 Updated last year
- ☆43 Updated 2 years ago
- Model configurations for scaling SE models in the paper "Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enha… ☆35 Updated last year
- ☆90 Updated 11 months ago
- SCOREQ: Speech COntrastive REgression for Quality Assessment (NeurIPS 2024) ☆88 Updated last month
- Exploring Binary Classification Loss for Speaker Verification ☆17 Updated 2 years ago
- ☆67 Updated 2 months ago
- Prediction of sound event bounding boxes (SEBBs) ☆29 Updated last year
- ☆55 Updated 10 months ago
- A 6-million Audio-Caption Paired Dataset Built with an LLM- and ALM-based Automatic Pipeline ☆183 Updated 8 months ago
- WavReward: Spoken Dialogue Models With Generalist Reward Evaluators ☆50 Updated 3 months ago
- Official PyTorch implementation of "RVAE-EM: Generative speech dereverberation based on recurrent variational auto-encoder and convolutiv… ☆46 Updated 5 months ago
- Official repository for Mamba-based Segmentation Model for Speaker Diarization ☆39 Updated 3 months ago
- ☆55 Updated 9 months ago
- We propose C2SER, a novel audio-language model designed to enhance the stability and accuracy of speech emotion recognition through conte… ☆37 Updated 6 months ago
- ☆72 Updated 3 weeks ago
- A low-bitrate single-codebook 16 kHz speech codec based on focal modulation ☆94 Updated 6 months ago
- wav2vec2 audio classification for prosodic boundary detection and other tasks ☆42 Updated 2 years ago
- A repo containing download guidance and corresponding scripts of the VoxBlink dataset.