A desktop application that transcribes audio from files, microphone input or YouTube videos with the option to translate the content and create subtitles.
☆343Oct 15, 2024Updated last year
Alternatives and similar repositories for audiotext
Users that are interested in audiotext are comparing it to the libraries listed below
- Modern Desktop Application offering a suite of tools for audio/video text recognition and a variety of other useful utilities.☆58Aug 12, 2024Updated last year
- A real-time transcription and translation tool implemented in Python based on the fast-whisper library.☆14Feb 11, 2026Updated 3 weeks ago
- Native UI for the Whispering Tiger project - https://github.com/Sharrnah/whispering (live transcription / translation)☆315Jan 25, 2026Updated last month
- Transform a YouTube URL into text 100x faster with WhisperX☆20May 8, 2023Updated 2 years ago
- Speaker diarization service☆28Feb 24, 2026Updated last week
- 100% in-browser, hands-free AI voice chat using Whisper, WebLLM, and Supertonic TTS☆145Dec 11, 2025Updated 2 months ago
- Using YouTube to prepare a speech recognition dataset for any language☆10Mar 30, 2021Updated 4 years ago
- Repository for reproducing the results in the journal article "Self-supervised learning for Speech Emotion Recognition"☆10Mar 15, 2023Updated 2 years ago
- A showcase of my own method of deploying LibreChat; I hope it helps anyone who is checking this out☆12Jan 28, 2025Updated last year
- Docker for building an environment for Dutch online and offline ASR.☆12Feb 2, 2021Updated 5 years ago
- All-in-one Speech Transcription☆10Jan 25, 2026Updated last month
- ☆11Nov 5, 2021Updated 4 years ago
- YouTube GPT is an Android app that allows users to generate video summaries using A.I models. Also capable of answering cross-questions r…☆23Oct 7, 2023Updated 2 years ago
- Steps to perform text-based speaker diarization with the Kaldi toolkit☆12Nov 2, 2018Updated 7 years ago
- Simple Kaldi recipe for forced alignment☆11Jul 16, 2023Updated 2 years ago
- Thai Grapheme to Phoneme (G2P) Wiktionary Corpus☆13Jul 25, 2022Updated 3 years ago
- ☆10Sep 19, 2022Updated 3 years ago
- Robust Speech Recognition via Large-Scale Weak Supervision☆29Dec 16, 2023Updated 2 years ago
- An open-source, browser-based transcript viewer and manager. Upload, transcribe, and chat with meeting recordings using AI. Features meet…☆66May 2, 2025Updated 10 months ago
- A simple command line tool to calculate WER for ASR.☆14Oct 14, 2024Updated last year
- An automatic speech recognition environment for Icelandic based on Kaldi☆14Oct 12, 2017Updated 8 years ago
- ☆13Oct 27, 2021Updated 4 years ago
- Transfer learning approach to pronunciation scoring☆12Jan 17, 2024Updated 2 years ago
- Train a fiwGAN or ciwGAN model using your own training data☆14Oct 13, 2022Updated 3 years ago
- Middleware for AI Agents that verifies grounding and prevents hallucinations. Returns structured retry suggestions for self-correction.☆50Dec 11, 2025Updated 2 months ago
- CD4AutoML: Continuous Delivery for AutoML with Amazon SageMaker Autopilot and Amazon Step Functions☆13Dec 12, 2020Updated 5 years ago
- S3PRL for Speech Emotion Recognition (see s3prl > downstream)☆15Feb 28, 2026Updated last week
- WebRTC-based real-time audio streaming with Faster Whisper ASR integration for live speech-to-text transcription.☆13Sep 27, 2024Updated last year
- Open source framework for voice and multimodal conversational AI☆32Jan 13, 2025Updated last year
- SoundTranscriber can be used to generate automatic transcription / automatic subtitles for audio/video files through a friendly graphical…☆32Sep 4, 2025Updated 6 months ago
- Official implementation of the paper "Distilling a Pretrained Language Model to a Multilingual ASR Model" (Interspeech 2022)☆12Mar 12, 2024Updated last year
- An open-source self-hosted purple team management web application.☆27Dec 10, 2025Updated 2 months ago
- PolEval 2021 Task 1☆15Jun 28, 2022Updated 3 years ago
- Code release for "TinySpeech: Attention Condensers for Deep Speech Recognition Neural Networks on Edge Devices"☆21Jun 7, 2025Updated 9 months ago
- Produce intelligence by means of natural selection without objective/reward optimization☆15Sep 29, 2021Updated 4 years ago
- OCTRA is a web-application for the orthographic transcription of audio files.☆39Feb 17, 2026Updated 2 weeks ago
- This is a mirror of https://gitlab.com/tiro-is/tiro-speech-core☆15Jun 19, 2023Updated 2 years ago
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks☆17Aug 18, 2023Updated 2 years ago
- A cross-platform, customizable VLC video player that can generate subtitles using the WhisperX model☆14Oct 6, 2023Updated 2 years ago