Picovoice / falcon
On-device speaker diarization powered by deep learning
☆59 · Updated this week
Alternatives and similar repositories for falcon
Users who are interested in falcon compare it to the libraries listed below.
- On-device voice activity detection (VAD) powered by deep learning ☆237 · Updated this week
- On-device noise suppression powered by deep learning ☆77 · Updated this week
- A curated list of awesome voice activity detection ☆69 · Updated last year
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ☆153 · Updated last year
- ONNX Inference of Pyannote Segmentation ☆97 · Updated 11 months ago
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines ☆99 · Updated last year
- ☆44 · Updated last year
- Reproducible experimental protocols for multimedia (audio, video, text) database ☆107 · Updated last week
- ☆72 · Updated 2 months ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆91 · Updated 2 years ago
- Very fast, accurate speaker diarization ☆186 · Updated last week
- Speaker diarization model ☆32 · Updated 2 years ago
- An automatic speech recognition API ☆76 · Updated 3 weeks ago
- Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings… ☆99 · Updated 11 months ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆119 · Updated 2 years ago
- C++ library for converting text to phonemes for Piper ☆137 · Updated 5 months ago
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper. ☆137 · Updated 2 years ago
- ☆93 · Updated last month
- Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event … ☆409 · Updated last year
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆69 · Updated last month
- Application for viewing Rich Transcription Time Marked (RTTM) files in an interactive way ☆47 · Updated 2 years ago
- On-device streaming text-to-speech engine powered by deep learning ☆121 · Updated this week
- A package for NeuCodec: a 50 Hz, 0.8 kbps, 24 kHz audio codec. ☆127 · Updated 2 months ago
- Code for our INTERSPEECH paper Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection ☆100 · Updated 8 months ago
- 🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨ ☆129 · Updated 4 months ago
- ☆65 · Updated last year
- This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at… ☆436 · Updated 4 months ago
- ☆375 · Updated last month
- Various speech datasets made available to the public ☆129 · Updated last year
- Google's SoundStorm: Efficient Parallel Audio Generation ☆131 · Updated 2 years ago