sidhantls / lexpod-speaker-prediction
Speaker prediction for captions on the Lex Fridman podcast
☆27Updated last year
Alternatives and similar repositories for lexpod-speaker-prediction
Users who are interested in lexpod-speaker-prediction are comparing it to the repositories listed below
- Zero-shot multimodal punctuation insertion and truecasing using Whisper☆119Updated 2 years ago
- Accelerate Whisper tasks, such as transcription, through multiprocessing☆25Updated 3 years ago
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation☆151Updated last year
- Live speech recognition using Facebook's wav2vec 2.0 model.☆375Updated last year
- Batch Support for OpenAI Whisper☆96Updated last year
- On-device voice activity detection (VAD) powered by deep learning☆241Updated last week
- A minimalistic automatic speech recognition web app, built with Streamlit and powered by OpenAI's Whisper "State of the Art" models☆67Updated 3 years ago
- Putting flows on top of neural transducers for better TTS☆64Updated last month
- Zero-shot Audio Classification using Whisper☆79Updated 3 years ago
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper.☆137Updated 2 years ago
- Experiments to test different speech recognition systems for SEPIA Framework☆62Updated 2 years ago
- Deep Learning - one shot learning for speaker recognition using Filter Banks☆170Updated last year
- 🐸STT integration examples☆130Updated 3 years ago
- openvino version of openai/whisper☆180Updated 2 years ago
- Speaker diarization service☆25Updated 6 months ago
- Open TTS models, built for streaming on the edge☆44Updated 9 months ago
- A curated list of awesome voice activity detection☆71Updated last year
- Zero-shot Domain-sensitive Speech Recognition with Prompt-conditioning Fine-tuning (ASRU2023)☆27Updated 2 years ago
- On-device noise suppression powered by deep learning☆78Updated last week
- OpenAI Whisper Prompt Examples☆53Updated 2 years ago
- EMNLP 23 - Integrating Whisper Encoder to LLaMA Decoder for Generative ASR Error Correction☆266Updated last year
- Open models for Coqui STT☆149Updated 2 years ago
- 🐸 - A general purpose model trainer, as flexible as it gets☆231Updated last year
- Efficient approach to speaker diarization using voice characteristics extraction☆105Updated 6 months ago
- one script for xls-r/xlsr/whisper fine-tuning☆42Updated 2 years ago
- Listen to any audio stream on your machine and print out the transcribed or translated audio.☆119Updated 2 years ago
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling.☆27Updated last year
- Stable timestamps and confidence score for words of OpenAI's Whisper outputs down to word-level.☆25Updated 3 years ago
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines☆100Updated last year
- Reproducible experimental protocols for multimedia (audio, video, text) database☆112Updated last month