A toolkit for researchers in multimodal sound separation.
☆16Oct 20, 2023Updated 2 years ago
Alternatives and similar repositories for Look2hear
Users that are interested in Look2hear are comparing it to the libraries listed below.
- Power-Guided Grouped SRU for Real-Time Causal Audio-Visual Speech Separation☆26Nov 4, 2025Updated 7 months ago
- Source code and demo for INTERSPEECH 2024 paper: Noise-robust Speech Separation with Fast Generative Correction☆50Nov 19, 2024Updated last year
- Official implementation of A cappella: Audio-visual Singing Voice Separation, from BMVC21☆18May 14, 2022Updated 4 years ago
- Spherical residual vector quantization (SRVQ)☆31Aug 25, 2024Updated last year
- A collection of papers, code, blog posts, and related materials on AEC☆15Jul 26, 2022Updated 3 years ago
- A solution for denoising and separating noisy two-speaker mixed speech, using a BSRNN-inspired network.☆15Aug 22, 2023Updated 2 years ago
- Implementation of "Look, Listen and Recognise: character-aware audio-visual subtitling"☆21Nov 3, 2025Updated 7 months ago
- Variations of L1 SNR Loss function for training audio source separation machine learning models☆44May 1, 2026Updated last month
- ☆23Jul 16, 2025Updated 10 months ago
- ☆21Jul 15, 2024Updated last year
- (ICASSP 2025) Learning Source Disentanglement in Neural Audio Codec☆48May 16, 2025Updated last year
- Dataset simulation for DPCCN.☆16Dec 25, 2022Updated 3 years ago
- Apply Score diffusion to improve speech signals recorded under various adverse conditions and distortions, including noise, reverberation…☆78Jul 29, 2024Updated last year
- Official data preparation and metric evaluation scripts for the Interspeech 2025 URGENT challenge.☆84May 21, 2025Updated last year
- Official Implementation of LauraTSE: Target Speaker Extraction using Auto-Regressive Decoder-Only Language Models.☆34Nov 9, 2025Updated 7 months ago
- iSeparate library for the SDX2023 challenge☆15Dec 15, 2023Updated 2 years ago
- Sound Separation, Omni modal☆28Sep 15, 2025Updated 8 months ago
- This is the official repository of "Scalable Neural Vocoder from Range-Null Space Decomposition", submitted to TPAMI.☆53Oct 11, 2025Updated 8 months ago
- The source code for the paper CrossSinger (ASRU 2023)☆18Oct 12, 2023Updated 2 years ago
- ☆25Aug 29, 2025Updated 9 months ago
- Python scripts to create noisy and reverberant 2-speaker mixture audio with Libri-Light and WHAM☆17Nov 7, 2024Updated last year
- Code for paper Learning Audio-Visual Dereverberation☆32Aug 10, 2022Updated 3 years ago
- Keep track of good articles on speech processing, mainly on speech enhancement, including speech denoising, speech dereverberation and AEC, ag…☆49Jul 17, 2024Updated last year
- This repo hosts the code and model of "Separate What You Describe: Language-Queried Audio Source Separation", Interspeech 2022☆146Oct 11, 2023Updated 2 years ago
- Feed-forward compressor experiments source code for "Differentiable All-pole Filters for Time-varying Audio Systems".☆23Jun 10, 2024Updated 2 years ago
- Offline RL experiments☆15Oct 1, 2022Updated 3 years ago
- ☆16Jul 14, 2020Updated 5 years ago
- Real-Time ASR with CNN-BiLSTM: End-to-End Live Streaming Using PyTorch Lightning⚡☆11Jan 23, 2025Updated last year
- MTalk-Bench: Evaluating Speech-to-Speech Models in Multi-Turn Dialogues via Arena-style and Rubrics Protocols☆20Nov 19, 2025Updated 6 months ago
- Official data preparation scripts for the URGENT 2024 Challenge☆90May 21, 2025Updated last year
- MultiModal Audio Generation in Raw Waveform Space.☆151May 26, 2026Updated 2 weeks ago
- ☆221Dec 5, 2024Updated last year
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer.☆118Jan 28, 2026Updated 4 months ago
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech☆11May 14, 2025Updated last year
- An efficient speech separation method☆275Apr 11, 2024Updated 2 years ago
- Evaluation script for VoxMovies dataset in PyTorch☆23Jan 12, 2024Updated 2 years ago
- Conformer block with Rotary Position Embedding, modified from lucidrains' implementation☆19Sep 13, 2024Updated last year
- Implementation of "Audio xLSTMs: Learning Self-supervised audio representations with xLSTMs" in PyTorch☆20Updated this week
- A description of "RealMAN: A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization" [NeurI…☆166Apr 29, 2025Updated last year