Evaluate your agent memory on real-world dialogues, not LLM-simulated dialogues.
☆37Jul 3, 2025Updated 8 months ago
Alternatives and similar repositories for REALTALK
Users that are interested in REALTALK are comparing it to the libraries listed below
- ☆11Aug 29, 2025Updated 6 months ago
- Transcribe guitar solo audio to MIDI-like tab.☆12May 18, 2022Updated 3 years ago
- DOSE: Diffusion Dropout with Adaptive Prior for Speech Enhancement, Conference on Neural Information Processing Systems (NeurIPS), 2023☆59May 16, 2025Updated 9 months ago
- ☆36Sep 6, 2025Updated 6 months ago
- Forced alignment decoder for Whisper.☆14Mar 13, 2024Updated last year
- Code from blog 'Searching by Music: Leveraging Vector Search for Music Information Retrieval'☆16Nov 16, 2023Updated 2 years ago
- Descript Audio Codec - VAE Variant (.dac-vae): High-Fidelity Audio Compression with Variational Autoencoder☆31Aug 30, 2025Updated 6 months ago
- PyTorch implementation of Miipher-2 [2025] which is a speech restoration model by Google DeepMind☆64Sep 22, 2025Updated 5 months ago
- Official implementation of the paper: "NeoBabel: A Multilingual Open Tower for Visual Generation"☆23Aug 4, 2025Updated 7 months ago
- [ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching☆45Feb 9, 2025Updated last year
- Estimating musical surprisal/information content in Audio☆23Jan 19, 2026Updated last month
- PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Superv…☆38Jan 6, 2024Updated 2 years ago
- ☆49Apr 1, 2025Updated 11 months ago
- ☆47Aug 31, 2024Updated last year
- The official repo of "WhiStress: Enriching Transcriptions with Sentence Stress Detection" (Interspeech 2025)☆36Jul 24, 2025Updated 7 months ago
- ASR text preprocessing utility☆21Aug 5, 2024Updated last year
- Speech Resynthesis and Language Modeling☆27Jun 11, 2025Updated 8 months ago
- ☆43Jan 13, 2025Updated last year
- Generative Modeling with Bayesian Sample Inference☆24May 17, 2025Updated 9 months ago
- Source code for the EMNLP 2025 paper “DM-Codec: Distilling Multimodal Representations for Speech Tokenization”☆56Jun 1, 2025Updated 9 months ago
- ☆50Aug 27, 2024Updated last year
- MuChin: A Chinese Colloquial Description Benchmark for Evaluating Language Models in the Field of Music☆26Jan 7, 2026Updated last month
- An End-to-End Pipeline for Enhanced French Text-to-Speech with SSML Prosody Control☆31Jan 13, 2026Updated last month
- Audio-FLAN☆160Sep 23, 2025Updated 5 months ago
- A trainer for SNAC (Multi-Scale Neural Audio Codec) in which the decoder has been replaced with Vocos.☆66Oct 28, 2024Updated last year
- A TTS Trained on Universal Audio.☆41Jun 6, 2025Updated 9 months ago
- ☆24Apr 25, 2023Updated 2 years ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)☆62Nov 1, 2024Updated last year
- ☆33Dec 23, 2025Updated 2 months ago
- ☆130Feb 9, 2026Updated 3 weeks ago
- ☆32Jan 9, 2024Updated 2 years ago
- Implementation of Acoustic BPE (Shen et al., 2024), extended for RVQ-based Neural Audio Codecs☆77Dec 3, 2025Updated 3 months ago
- Source code for ICLR 2021 paper : Pre-training Text-to-Text Transformers for Concept-Centric Common Sense☆27Sep 16, 2021Updated 4 years ago
- faster inference☆28Jan 20, 2025Updated last year
- ☆24Sep 10, 2025Updated 5 months ago
- Explore how to get VQ-VAE models efficiently!☆68Jul 24, 2025Updated 7 months ago
- [NeurIPS' 25] Benchmark for evaluating TTS models on complex prosodic, expressiveness, and linguistic challenges.☆189Dec 9, 2025Updated 2 months ago
- MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows☆124Sep 2, 2025Updated 6 months ago
- Single Channel Speech Enhancement Methods and Toolbox☆39Feb 26, 2026Updated last week