Mashiro009 / slidespeech_dl
☆24 · Sep 20, 2024 · Updated last year
Alternatives and similar repositories for slidespeech_dl
Users who are interested in slidespeech_dl are comparing it to the repositories listed below
- Llama-Mimi is a speech language model that uses a unified tokenizer (Mimi) and a single Transformer decoder (Llama) to jointly model sequ… ☆28 · Sep 20, 2025 · Updated 4 months ago
- Official code for the paper "Speaking Clearly: A Simplified Whisper-Based Codec for Low-Bitrate Speech Coding" ☆28 · Jan 28, 2026 · Updated 2 weeks ago
- Repo for the FB AI Speech team. ☆25 · Aug 24, 2021 · Updated 4 years ago
- ☆14 · Jun 17, 2024 · Updated last year
- Speech-To-Text forced-alignment Speech processing Universal PERformance Benchmark ☆35 · May 7, 2025 · Updated 9 months ago
- Reimplementation of Miipher ☆29 · Aug 16, 2023 · Updated 2 years ago
- ☆10 · Dec 22, 2023 · Updated 2 years ago
- Testing sets for semanticVAD ☆20 · Feb 18, 2025 · Updated 11 months ago
- ☆26 · Updated this week
- Data manipulation and transformation for audio signal processing, powered by PyTorch ☆10 · Sep 30, 2024 · Updated last year
- ☆10 · Oct 16, 2025 · Updated 3 months ago
- ☆68 · Dec 30, 2025 · Updated last month
- ☆23 · Dec 6, 2025 · Updated 2 months ago
- Room impulse response simulation for various array architectures using Monte-Carlo simulation and quaternions (Python) ☆17 · May 25, 2025 · Updated 8 months ago
- LLaSE: Maximizing Acoustic Preservation for LLaMA based Speech Enhancement ☆16 · Jul 11, 2025 · Updated 7 months ago
- ☆11 · Aug 10, 2022 · Updated 3 years ago
- Implementation of CGMM-MVDR beamforming used for Clarity challenge ☆13 · Jan 14, 2022 · Updated 4 years ago
- NAR-BERT-ASR ☆10 · Sep 27, 2021 · Updated 4 years ago
- Cross-Speaker Encoding Network for Multi-talker Speech Recognition ☆11 · Mar 14, 2025 · Updated 10 months ago
- [ICASSP2023] Source code, model links and open test sets for paper SeACo-Paraformer. ☆39 · Mar 15, 2024 · Updated last year
- Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning ☆16 · Jun 23, 2024 · Updated last year
- Official implementation and dataset of the paper "DFADD: The Diffusion and Flow-matching based Audio Deepfake Dataset" ☆15 · Apr 7, 2025 · Updated 10 months ago
- This is the unofficial implementation of MFNet, from the paper "A Mask Free Neural Network for Monaural Speech Enhancement" ☆13 · Dec 20, 2024 · Updated last year
- ☆38 · Apr 3, 2025 · Updated 10 months ago
- Unofficial implementation of "CPTNN: CROSS-PARALLEL TRANSFORMER NEURAL NETWORK FOR TIME-DOMAIN SPEECH ENHANCEMENT" ☆15 · Nov 14, 2023 · Updated 2 years ago
- Implementation of the paper "Confidence estimation for attention based sequence to sequence models for speech recognition" ☆16 · May 9, 2021 · Updated 4 years ago
- Voice activity detection and speaker gender segmentation audiovisual corpus ☆16 · Jan 20, 2025 · Updated last year
- DWFormer: Dynamic Window Transformer for Speech Emotion Recognition (ICASSP 2023 Oral) ☆69 · Jul 8, 2024 · Updated last year
- End-to-end MOdeling of ASR (Automatic Speech Recognition) ☆33 · Feb 16, 2023 · Updated 2 years ago
- ☆18 · Aug 23, 2024 · Updated last year
- ☆15 · Aug 25, 2022 · Updated 3 years ago
- Descript Audio Codec - VAE Variant (.dac-vae): High-Fidelity Audio Compression with Variational Autoencoder ☆31 · Aug 30, 2025 · Updated 5 months ago
- ☆18 · May 27, 2025 · Updated 8 months ago
- Code for ACL-IJCNLP 2021 paper "N-Best-ASR-Transformer: Enhancing SLU Performance using Multiple ASR Hypotheses." ☆17 · Nov 30, 2021 · Updated 4 years ago
- This repository contains the training code from paper "SpidR Learning Fast and Stable Linguistic Units for Spoken Language Models Without… ☆46 · Feb 4, 2026 · Updated last week
- ☆20 · Apr 27, 2024 · Updated last year
- Official implementation of RAVEn (ICLR 2023) and BRAVEn (ICASSP 2024) ☆78 · Feb 27, 2025 · Updated 11 months ago
- ☆20 · Mar 7, 2025 · Updated 11 months ago
- Multimodal Home Activity Dataset with Multi-Angle Videos and Synchronized Physiological Signals ☆21 · Dec 21, 2024 · Updated last year