Microservice that generates subtitles for TUM-Live
☆18 Feb 19, 2026 Updated 2 weeks ago
Alternatives and similar repositories for gocast-voice-service
Users who are interested in gocast-voice-service are comparing it to the repositories listed below
- Soniox Compare. Compare real-time voice AI side by side. No glossy charts, just results. ☆21 Jul 15, 2025 Updated 7 months ago
- Robust Speech Recognition via Large-Scale Weak Supervision ☆19 Dec 1, 2022 Updated 3 years ago
- [SLT'24] Mamba-based Decoder-Only Approach for Speech Recognition ☆18 Dec 1, 2024 Updated last year
- Simple Python library, distributed via binary wheels with few direct dependencies, for easily using wav2vec 2.0 models for speech recogni… ☆23 Aug 16, 2021 Updated 4 years ago
- ☆29 Feb 4, 2025 Updated last year
- Accelerate Whisper tasks such as transcription by multiprocessing through parallelization ☆25 Oct 29, 2022 Updated 3 years ago
- Zero-shot Domain-sensitive Speech Recognition with Prompt-conditioning Fine-tuning (ASRU2023) ☆27 Oct 10, 2023 Updated 2 years ago
- ☆32 May 17, 2024 Updated last year
- [ICASSP2023] Source code, model links and open test sets for paper SeACo-Paraformer. ☆44 Mar 15, 2024 Updated last year
- Cantonese Text to Speech with VITS implementation ☆37 Apr 8, 2023 Updated 2 years ago
- End-to-end MOdeling of ASR (Automatic Speech Recognition) ☆33 Feb 16, 2023 Updated 3 years ago
- ☆32 Dec 4, 2022 Updated 3 years ago
- ☆37 Mar 26, 2024 Updated last year
- KittenTTS is an ultra-lightweight, CPU-friendly text-to-speech model with 15M params for real-time, high-quality voices. Open source, fas… ☆23 Updated this week
- Official repository of the work "Low-complexity Unsupervised Audio Anomaly Detection exploiting Separable Convolutions and Angular Loss" … ☆10 Nov 6, 2024 Updated last year
- eCMU: An Efficient Phase-aware Framework for Music Source Separation with Conformer (IEEE RIVF23) ☆10 Oct 30, 2024 Updated last year
- ☆11 Aug 11, 2023 Updated 2 years ago
- SChunk-Encoder (Transformer or Conformer) for streaming E2E ASR ☆11 Oct 21, 2022 Updated 3 years ago
- Grapheme-to-phoneme tool for corpus conversion, where phonemes match Phoible inventories ☆19 Apr 10, 2025 Updated 10 months ago
- ☆13 Oct 9, 2025 Updated 5 months ago
- A tool to collect/validate audio recordings from workers on Amazon Mechanical Turk. Written in Python/Flask. (originally hosted on github… ☆14 Dec 19, 2022 Updated 3 years ago
- Learning an Interpretable End-to-End Network for Real-Time Acoustic Beamforming ☆15 Aug 20, 2024 Updated last year
- Whisper finetuning ☆16 Apr 9, 2025 Updated 11 months ago
- A video/image converter to audio waveform for oscilloscopes ☆14 Feb 26, 2026 Updated last week
- Russian phonetical transcription ☆11 Nov 19, 2025 Updated 3 months ago
- Grapheme to phoneme model for PyTorch ☆43 Jul 21, 2022 Updated 3 years ago
- A python script COMMAND LINE utility to AUTO GENERATE SUBTITLE FILE (using free Vosk Speech Recognition API) and TRANSLATED SUBTITLE FILE… ☆11 May 5, 2024 Updated last year
- Code for the paper "RIR-in-a-Box : Estimating Room Acoustics from 3D Mesh Data through Shoebox Approximation" presented at Interspeech 20… ☆16 Sep 1, 2024 Updated last year
- The implementation for "Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions" ☆50 Apr 7, 2025 Updated 11 months ago
- Onset-and-Offset-Aware Sound Event Detection ☆21 Feb 10, 2025 Updated last year
- Getting confidences from any end-to-end systems ☆11 May 24, 2023 Updated 2 years ago
- ☆13 Oct 3, 2025 Updated 5 months ago
- Code for the paper: MACE: Leveraging Audio for Evaluating Audio Captioning Systems ☆13 Jan 16, 2025 Updated last year
- The project for speech translation ☆12 Sep 28, 2023 Updated 2 years ago
- ☆10 Apr 17, 2024 Updated last year
- ☆11 Nov 3, 2023 Updated 2 years ago
- Arabic Grapheme-to-Phoneme (G2P) Conversion ☆13 Mar 15, 2025 Updated 11 months ago
- ☆13 Apr 14, 2024 Updated last year
- 🎵 muse: Music Separation ☆11 Feb 14, 2024 Updated 2 years ago