Eps-Acoustic-Revolution-Lab / EAR_VAEView external linksLinks
This is the official implementation for εar-VAE model including inference and evaluation parts, more details coming soon...
☆56Updated this week
Alternatives and similar repositories for EAR_VAE
Users that are interested in EAR_VAE are comparing it to the libraries listed below
Sorting:
- Speech Resynthesis and Language Modeling☆27Jun 11, 2025Updated 8 months ago
- Official code for paper:"Speaking Clearly: A Simplified Whisper-Based Codec for Low-Bitrate Speech Coding"☆33Jan 28, 2026Updated 2 weeks ago
- ☆32Oct 23, 2025Updated 3 months ago
- Glow-TTS with Stochastic Duration Predictor and Stochastic Pitch Predictor☆18Jun 5, 2023Updated 2 years ago
- Variable Bitrate Residual Vector Quantization for Audio Coding☆51May 1, 2025Updated 9 months ago
- ☆36Sep 6, 2025Updated 5 months ago
- DiFlow-TTS delivers low-latency zero-shot TTS via discrete flow matching and factorized speech tokens. A compact, open framework for fast…☆52Updated this week
- PyTorch implementation of Miipher-2 [2025] which is a speech restoration model by Google DeepMind☆64Sep 22, 2025Updated 4 months ago
- Trainging, inference, and testing of the SAC speech codec model.☆96Nov 1, 2025Updated 3 months ago
- [ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer☆67Nov 1, 2024Updated last year
- ProsodyLM: Uncovering the Emerging Prosody Processing Capabilities in Speech Language Models☆34Nov 18, 2025Updated 2 months ago
- [ICML 2025 Tokenization Workshop] HH-Codec: High Compression High-fidelity Discrete Neural Codec for Spoken Language Modeling☆77Sep 28, 2025Updated 4 months ago
- semantic tokenizer for speech and music☆21Jul 6, 2025Updated 7 months ago
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"☆11Apr 10, 2025Updated 10 months ago
- ☆13Oct 3, 2025Updated 4 months ago
- T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech …☆28Nov 7, 2025Updated 3 months ago
- ☆11Nov 7, 2024Updated last year
- [ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder☆12Mar 11, 2025Updated 11 months ago
- FlexiCodec: A Dynamic Neural Audio Codec for Low Frame Rates☆40Nov 4, 2025Updated 3 months ago
- ☆68Dec 30, 2025Updated last month
- High fidelity, lightweight, end-to-end, streaming, convolution-based neural audio codec☆115Jun 23, 2025Updated 7 months ago
- ☆18Mar 17, 2025Updated 10 months ago
- Room impulse response simulation for various array architectures using Monte-Carlo simulation and quaternions (Python)☆17May 25, 2025Updated 8 months ago
- ☆22Jul 30, 2025Updated 6 months ago
- LLaSE: Maximizing Acoustic Preservation for LLaMA based Speech Enhancement☆16Jul 11, 2025Updated 7 months ago
- Llama-Mimi is a speech language model that uses a unified tokenizer (Mimi) and a single Transformer decoder (Llama) to jointly model sequ…☆28Sep 20, 2025Updated 4 months ago
- Convert a mono channel recording into binaural playback with headphones and loudspeakers☆12Dec 6, 2023Updated 2 years ago
- A repository for code used to produce the results the ICASSP 2024 paper: "SELF-SUPERVISED PRETRAINING FOR ROBUST PERSONALIZED VOICE ACTIV…☆21Nov 25, 2024Updated last year
- ☆13Mar 11, 2025Updated 11 months ago
- This is the unofficial implementation of MFNet, from paper''a Mask Free Neural Network for Monaural Speech Enhancement''☆13Dec 20, 2024Updated last year
- This is the repository for the work "BridgeVoC: Revitalizing Neural Vocoder from a Restoration Perspective".☆63Nov 5, 2025Updated 3 months ago
- Implementation of CGMM-MVDR beamforming used for Clarity challenge☆13Jan 14, 2022Updated 4 years ago
- ☆155Nov 22, 2024Updated last year
- unofficial implementation of "CPTNN: CROSS-PARALLEL TRANSFORMER NEURAL NETWORK FOR TIME-DOMAIN SPEECH ENHANCEMENT"☆15Nov 14, 2023Updated 2 years ago
- Spectral Mapping of Singing Voices: U-Net-Assisted Vocal Segmentation☆13Dec 12, 2024Updated last year
- ☆24Mar 29, 2025Updated 10 months ago
- Voice activity detection and speaker gender segmentation audiovisual corpus☆16Jan 20, 2025Updated last year
- Official PyTorch implementation of 'Rec-RIR: Monaural Blind Room Impulse Response Identification via DNN-based Reverberant Speech Reconst…☆29Dec 25, 2025Updated last month
- [INTERSPEECH 2025 Oral]Official code for "Accelerating Diffusion-based Text-to-Speech Model Training with Dual Modality Alignment"☆64Jun 16, 2025Updated 8 months ago