avishaiElmakies / unsupervised_speech_segmentation_using_slm
☆19 · Jan 8, 2025 · Updated last year
Alternatives and similar repositories for unsupervised_speech_segmentation_using_slm
Users that are interested in unsupervised_speech_segmentation_using_slm are comparing it to the libraries listed below
- T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech … ☆28 · Nov 7, 2025 · Updated 3 months ago
- A spoken version of the textual story cloze benchmark ☆20 · Aug 6, 2023 · Updated 2 years ago
- A lightweight audio codec based on a single quantizer ☆69 · Aug 15, 2025 · Updated 6 months ago
- ☆32 · Oct 23, 2025 · Updated 3 months ago
- The official repo of the paper "StressTest: Can YOUR Speech LM Handle the Stress?" ☆20 · Jul 9, 2025 · Updated 7 months ago
- [ICASSP 2024] Official code for FreGrad ☆35 · May 13, 2024 · Updated last year
- Unofficial PyTorch implementation of VISinger: Variational Inference with Adversarial Learning for End-to-end Singing Voice Synthesis (IC… ☆19 · May 12, 2023 · Updated 2 years ago
- Russian phonetic transcription ☆11 · Nov 19, 2025 · Updated 2 months ago
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one ☆26 · Aug 5, 2024 · Updated last year
- Official repository for "Speaking Style Conversion With Discrete Self-Supervised Units" (EMNLP 2023). https://arxiv.org/abs/2212.09730 ☆131 · Dec 8, 2023 · Updated 2 years ago
- CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval ☆13 · Jun 27, 2025 · Updated 7 months ago
- 🎵 muse: Music Separation ☆11 · Feb 14, 2024 · Updated 2 years ago
- [ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder ☆12 · Mar 11, 2025 · Updated 11 months ago
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding" ☆11 · Apr 10, 2025 · Updated 10 months ago
- Neural model for prediction of stress position in Russian words ☆12 · Jun 22, 2025 · Updated 7 months ago
- Pybind11 bindings for Kaldi ☆15 · Feb 1, 2026 · Updated 2 weeks ago
- ☆18 · Mar 17, 2025 · Updated 10 months ago
- Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment ☆12 · Feb 5, 2025 · Updated last year
- Official PyTorch implementation of (ICME2025 oral) "AutoStyle-TTS: Retrieval-Augmented Generation based Automatic Style Matching Text-to-… ☆17 · Feb 1, 2026 · Updated 2 weeks ago
- ☆18 · Feb 4, 2026 · Updated last week
- Llama-Mimi is a speech language model that uses a unified tokenizer (Mimi) and a single Transformer decoder (Llama) to jointly model sequ… ☆28 · Sep 20, 2025 · Updated 4 months ago
- An upgraded framework for training and validation, compared with icefall, using Lightning. ☆15 · Mar 26, 2025 · Updated 10 months ago
- Official code for the paper "Speaking Clearly: A Simplified Whisper-Based Codec for Low-Bitrate Speech Coding" ☆33 · Jan 28, 2026 · Updated 2 weeks ago
- WebRTC-based real-time audio streaming with Faster Whisper ASR integration for live speech-to-text transcription. ☆13 · Sep 27, 2024 · Updated last year
- ☆33 · Jan 14, 2023 · Updated 3 years ago
- DiTTo-TTS: Diffusion Transformers for Scalable Text-to-Speech without Domain-Specific Factors ☆35 · Feb 11, 2025 · Updated last year
- Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model" ☆14 · Jun 28, 2024 · Updated last year
- Sing any popular song with your voice ☆11 · Jul 10, 2022 · Updated 3 years ago
- ☆11 · Sep 5, 2025 · Updated 5 months ago
- (WIP) Long-form speech generation ☆31 · Apr 2, 2025 · Updated 10 months ago
- ☆13 · Mar 11, 2025 · Updated 11 months ago
- Official implementation and dataset of the paper "DFADD: The Diffusion and Flow-matching based Audio Deepfake Dataset" ☆15 · Apr 7, 2025 · Updated 10 months ago
- Forced alignment decoder for Whisper. ☆14 · Mar 13, 2024 · Updated last year
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition. ☆15 · May 16, 2025 · Updated 8 months ago
- [InterSpeech'2024] FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency ☆59 · Oct 23, 2024 · Updated last year
- Code release for "TinySpeech: Attention Condensers for Deep Speech Recognition Neural Networks on Edge Devices" ☆20 · Jun 7, 2025 · Updated 8 months ago
- ☆24 · Mar 29, 2025 · Updated 10 months ago
- C++ version of pyannote audio overlapped speech detection pipeline ☆13 · Feb 14, 2024 · Updated 2 years ago
- ☆13 · Dec 7, 2022 · Updated 3 years ago