skrbnv / javad
☆61 Updated 8 months ago
Alternatives and similar repositories for javad
Users who are interested in javad are comparing it to the libraries listed below.
- [EMNLP Main '25] LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation ☆128 Updated 4 months ago
- Very fast, accurate speaker diarization ☆129 Updated this week
- ☆378 Updated last year
- SlamKit is an open source tool kit for efficient training of SpeechLMs. It was used for "Slamming: Training a Speech Language Model on On… ☆217 Updated 4 months ago
- ☆304 Updated 2 months ago
- Collection of Open Source Speech Data ☆161 Updated last week
- Automatic Speech Recognition in Python using ONNX models ☆126 Updated last month
- Official implementation of the TTS model Lina-Speech ☆170 Updated 8 months ago
- ☆310 Updated last year
- ☆131 Updated last week
- Real-time Speech-Text Foundation Model Toolkit (wip) ☆247 Updated 6 months ago
- Speaker Diarization with Transformers ☆69 Updated 3 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆68 Updated 3 weeks ago
- 🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨ ☆125 Updated last month
- A TTS model capable of generating ultra-realistic dialogue in one pass. ☆124 Updated 2 months ago
- LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM ☆280 Updated 4 months ago
- Efficient approach to speaker diarization using voice characteristics extraction ☆101 Updated 3 months ago
- Automatically cleaning, enhancing, segmenting, filtering, and formatting a dataset to fine tune or train a voice model. ☆42 Updated 2 weeks ago
- VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration ☆185 Updated 5 months ago
- Open TTS models, built for streaming on the edge ☆43 Updated 6 months ago
- 🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. 🎧👥📊 Advanced audio processing. ☆252 Updated last year
- Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event … ☆404 Updated last year
- A simple, hackable text-to-speech system in PyTorch and MLX ☆174 Updated 2 months ago
- VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency ☆139 Updated this week
- Open-source reproducible benchmarks from Argmax ☆59 Updated last week
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆103 Updated 11 months ago
- An implementation for training the HiFi-GAN part of the XTTSv2 model using Coqui/TTS. ☆85 Updated 10 months ago
- A TTS model capable of generating ultra-realistic dialogue in one pass. ☆212 Updated 5 months ago
- Whisper-Flamingo [Interspeech 2024] and mWhisper-Flamingo [IEEE SPL 2025] for Audio-Visual Speech Recognition and Translation ☆181 Updated 2 months ago
- ONNX Inference of Pyannote Segmentation ☆93 Updated 9 months ago