A toolkit for researchers in the multimodal sound separation.
☆16Oct 20, 2023Updated 2 years ago
Alternatives and similar repositories for Look2hear
Users that are interested in Look2hear are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Power-Guided Grouped SRU for Real-Time Causal Audio-Visual Speech Separation☆24Nov 4, 2025Updated 4 months ago
- Source code and demo for INTERSPEECH 2024 paper: Noise-robust Speech Separation with Fast Generative Correction☆47Nov 19, 2024Updated last year
- Official implementation of A cappella: Audio-visual Singing VoiceSeparation, from BMVC21☆17May 14, 2022Updated 3 years ago
- 记录关于AEC的论文和代码、博客以及相关资料☆15Jul 26, 2022Updated 3 years ago
- Spherical residual vector quantization (SRVQ)☆31Aug 25, 2024Updated last year
- DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- A solution to denoising and separating for two-speaker-mixed noisy speech, using a BSRNN inspired network.☆15Aug 22, 2023Updated 2 years ago
- Implementation of "Look, Listen and Recognise:character-aware audio-visual subtitling"☆20Nov 3, 2025Updated 4 months ago
- ☆22Jul 16, 2025Updated 8 months ago
- Variations of L1 SNR Loss function for training audio source separation machine learning models☆43Feb 24, 2026Updated last month
- ☆21Jul 15, 2024Updated last year
- (ICASSP 2025) Learning Source Disentanglement in Neural Audio Codec☆47May 16, 2025Updated 10 months ago
- Dataset simulation for DPCCN.☆16Dec 25, 2022Updated 3 years ago
- Apply Score diffusion to improve speech signals recorded under various adverse conditions and distortions, including noise, reverberation…☆76Jul 29, 2024Updated last year
- Official data preparation and metric evaluation scripts for the Interspeech 2025 URGENT challenge.☆82May 21, 2025Updated 10 months ago
- Bare Metal GPUs on DigitalOcean Gradient AI • AdPurpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
- This is the official repository of ``Scalable Neural Vocoder from Range-Null Space Decomposition'', which is submitted to TPAMI.☆47Oct 11, 2025Updated 5 months ago
- Official Implementation of LauraTSE: Target Speaker Extraction using Auto-Regressive Decoder-Only Language Models.☆33Nov 9, 2025Updated 4 months ago
- Sound Separation, Omni modal☆28Sep 15, 2025Updated 6 months ago
- iSeparate library for the SDX2023 challenge☆15Dec 15, 2023Updated 2 years ago
- The source code for the paper CrossSinger (asru2023)☆18Oct 12, 2023Updated 2 years ago
- ☆25Aug 29, 2025Updated 6 months ago
- Python scripts to create noisy and reverberant 2-speaker mixture audio with Libri-Light and WHAM☆17Nov 7, 2024Updated last year
- Keep track of good articles on speech processing, mainly on speech enhancement, include speech denoise, speech dereverberation and aec、ag…☆48Jul 17, 2024Updated last year
- Code for paper Learning Audio-Visual Dereverberation☆31Aug 10, 2022Updated 3 years ago
- Open source password manager - Proton Pass • AdSecurely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
- This repo hosts the code and model of "Separate What You Describe: Language-Queried Audio Source Separation", Interspeech 2022☆145Oct 11, 2023Updated 2 years ago
- Feed-forward compressor experiments source code for "Differentiable All-pole Filters for Time-varying Audio Systems".☆22Jun 10, 2024Updated last year
- ☆16Jan 11, 2026Updated 2 months ago
- Offline RL experiments☆15Oct 1, 2022Updated 3 years ago
- Real-Time ASR with CNN-BiLSTM: End-to-End Live Streaming Using PyTorch Lightning⚡☆11Jan 23, 2025Updated last year
- Official data preparation scripts for the URGENT 2024 Challenge☆87May 21, 2025Updated 10 months ago
- MTalk-Bench: Evaluating Speech-to-Speech Models in Multi-Turn Dialogues via Arena-style and Rubrics Protocols☆18Nov 19, 2025Updated 4 months ago
- ☆208Dec 5, 2024Updated last year
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer.☆114Jan 28, 2026Updated last month
- DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech☆11May 14, 2025Updated 10 months ago
- An efficient speech separation method☆294Apr 11, 2024Updated last year
- Evaluation script for VoxMovies dataset in PyTorch☆23Jan 12, 2024Updated 2 years ago
- Conformer block with Rotary Position Embedding, modified from lucidrains' implement☆18Sep 13, 2024Updated last year
- A description of "RealMAN: A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization" [NeurI…☆155Apr 29, 2025Updated 10 months ago
- Implementation of "Audio xLSTMs: Learning Self-supervised audio representations with xLSTMs" in PyTorch☆19Updated this week
- ☆32Oct 23, 2025Updated 5 months ago