skrbnv / javad
☆62 Updated 8 months ago
Alternatives and similar repositories for javad
Users who are interested in javad compare it to the libraries listed below
- Official implementation of the TTS model Lina-Speech ☆170 Updated 9 months ago
- Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ☆127 Updated 2 months ago
- [EMNLP Main '25] LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation ☆131 Updated 5 months ago
- Collection of Open Source Speech Data ☆161 Updated 3 weeks ago
- Very fast, accurate speaker diarization ☆155 Updated last week
- ☆314 Updated 3 weeks ago
- ☆310 Updated last year
- Automatic Speech Recognition in Python using ONNX models ☆137 Updated last week
- ☆378 Updated last year
- Speaker Diarization with Transformers ☆69 Updated 4 months ago
- A TTS model capable of generating ultra-realistic dialogue in one pass. ☆126 Updated 3 months ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆102 Updated last year
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆68 Updated this week
- SlamKit is an open source tool kit for efficient training of SpeechLMs. It was used for "Slamming: Training a Speech Language Model on On… ☆219 Updated 5 months ago
- A simple, hackable text-to-speech system in PyTorch and MLX ☆176 Updated 2 months ago
- VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration ☆186 Updated 6 months ago
- Audio tokenization, in the fastest way possible! ☆53 Updated last year
- Automatically cleaning, enhancing, segmenting, filtering, and formatting a dataset to fine tune or train a voice model. ☆43 Updated last month
- Open TTS models, built for streaming on the edge ☆43 Updated 7 months ago
- Open-source reproducible benchmarks from Argmax ☆64 Updated last week
- A package for NeuCodec: a 50hz, 0.8kbps, 24kHz audio codec. ☆102 Updated 2 weeks ago
- ☆145 Updated last week
- ☆49 Updated last week
- Efficient approach to speaker diarization using voice characteristics extraction ☆102 Updated 4 months ago
- Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event … ☆404 Updated last year
- Real-time Speech-Text Foundation Model Toolkit (wip) ☆247 Updated 7 months ago
- VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency ☆159 Updated 2 weeks ago
- ☆262 Updated last year
- An unofficial PyTorch implementation of VALL-E ☆88 Updated 2 months ago
- Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. Advanced audio processing. ☆252 Updated last year