A toolkit for researchers in the multimodal sound separation.
☆16Oct 20, 2023Updated 2 years ago
Alternatives and similar repositories for Look2hear
Users that are interested in Look2hear are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Power-Guided Grouped SRU for Real-Time Causal Audio-Visual Speech Separation☆26Nov 4, 2025Updated 6 months ago
- Source code and demo for INTERSPEECH 2024 paper: Noise-robust Speech Separation with Fast Generative Correction☆50Nov 19, 2024Updated last year
- Official implementation of A cappella: Audio-visual Singing VoiceSeparation, from BMVC21☆17May 14, 2022Updated 3 years ago
- Spherical residual vector quantization (SRVQ)☆31Aug 25, 2024Updated last year
- 记录关于AEC的论文和代码、博客以及相关资料☆15Jul 26, 2022Updated 3 years ago
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- A solution to denoising and separating for two-speaker-mixed noisy speech, using a BSRNN inspired network.☆15Aug 22, 2023Updated 2 years ago
- Implementation of "Look, Listen and Recognise:character-aware audio-visual subtitling"☆20Nov 3, 2025Updated 6 months ago
- ☆23Jul 16, 2025Updated 9 months ago
- Variations of L1 SNR Loss function for training audio source separation machine learning models☆43Updated this week
- ☆21Jul 15, 2024Updated last year
- (ICASSP 2025) Learning Source Disentanglement in Neural Audio Codec☆47May 16, 2025Updated 11 months ago
- Dataset simulation for DPCCN.☆16Dec 25, 2022Updated 3 years ago
- Apply Score diffusion to improve speech signals recorded under various adverse conditions and distortions, including noise, reverberation…☆76Jul 29, 2024Updated last year
- Official data preparation and metric evaluation scripts for the Interspeech 2025 URGENT challenge.☆84May 21, 2025Updated 11 months ago
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- Official Implementation of LauraTSE: Target Speaker Extraction using Auto-Regressive Decoder-Only Language Models.☆34Nov 9, 2025Updated 5 months ago
- Sound Separation, Omni modal☆28Sep 15, 2025Updated 7 months ago
- iSeparate library for the SDX2023 challenge☆15Dec 15, 2023Updated 2 years ago
- This is the official repository of ``Scalable Neural Vocoder from Range-Null Space Decomposition'', which is submitted to TPAMI.☆52Oct 11, 2025Updated 6 months ago
- The source code for the paper CrossSinger (asru2023)☆18Oct 12, 2023Updated 2 years ago
- ☆25Aug 29, 2025Updated 8 months ago
- Python scripts to create noisy and reverberant 2-speaker mixture audio with Libri-Light and WHAM☆17Nov 7, 2024Updated last year
- Code for paper Learning Audio-Visual Dereverberation☆31Aug 10, 2022Updated 3 years ago
- Keep track of good articles on speech processing, mainly on speech enhancement, include speech denoise, speech dereverberation and aec、ag…☆49Jul 17, 2024Updated last year
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- This repo hosts the code and model of "Separate What You Describe: Language-Queried Audio Source Separation", Interspeech 2022☆145Oct 11, 2023Updated 2 years ago
- Feed-forward compressor experiments source code for "Differentiable All-pole Filters for Time-varying Audio Systems".☆22Jun 10, 2024Updated last year
- Offline RL experiments☆15Oct 1, 2022Updated 3 years ago
- ☆16Jul 14, 2020Updated 5 years ago
- Real-Time ASR with CNN-BiLSTM: End-to-End Live Streaming Using PyTorch Lightning⚡☆11Jan 23, 2025Updated last year
- MTalk-Bench: Evaluating Speech-to-Speech Models in Multi-Turn Dialogues via Arena-style and Rubrics Protocols☆18Nov 19, 2025Updated 5 months ago
- Official data preparation scripts for the URGENT 2024 Challenge☆88May 21, 2025Updated 11 months ago
- ☆216Dec 5, 2024Updated last year
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer.☆118Jan 28, 2026Updated 3 months ago
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech☆11May 14, 2025Updated 11 months ago
- An efficient speech separation method☆275Apr 11, 2024Updated 2 years ago
- Evaluation script for VoxMovies dataset in PyTorch☆23Jan 12, 2024Updated 2 years ago
- Conformer block with Rotary Position Embedding, modified from lucidrains' implement☆18Sep 13, 2024Updated last year
- Implementation of "Audio xLSTMs: Learning Self-supervised audio representations with xLSTMs" in PyTorch☆20Apr 20, 2026Updated 2 weeks ago
- A description of "RealMAN: A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization" [NeurI…☆162Apr 29, 2025Updated last year
- ☆34Oct 23, 2025Updated 6 months ago