retkowsky / audio_embeddings
Audio search using Azure Cognitive Search
☆20 · Updated last year
Related projects
Alternatives and complementary repositories for audio_embeddings
- Speaker Diarization with Transformers ☆59 · Updated 5 months ago
- Towards Building Text-To-Speech Systems for the Next Billion Users - Microsoft Research Intern Work - Accepted at ICASSP 2023 ☆47 · Updated last year
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines ☆84 · Updated 6 months ago
- Repository contains code to fine-tune WhisperASR model ☆23 · Updated last year
- ☆64 · Updated last year
- ☆152 · Updated last year
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆45 · Updated last week
- ☆52 · Updated 2 weeks ago
- This package is the Python implementation of Deepgram's WebVTT and SRT formatting. Given a transcription, this package can return a valid… ☆18 · Updated last month
- Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor… ☆53 · Updated 7 months ago
- Audio tokenization, in the fastest way possible! ☆45 · Updated 2 months ago
- VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration ☆81 · Updated last month
- OpenAI Whisper Prompt Examples ☆39 · Updated last year
- A Hugging Face pipeline to train a GPT model based on the transcript obtained by the OpenAI Whisper model ☆15 · Updated last year
- Joint speech-language model - respond directly to audio! ☆30 · Updated 6 months ago
- Repository for fine-tuning Transformers 🤗 based seq2seq speech models in JAX/Flax. ☆34 · Updated last year
- This is a fork of the original fairseq repository (version 0.12.2) with added classes for training mHuBERT-147. ☆12 · Updated 5 months ago
- ☆105 · Updated last month
- ☆20 · Updated 9 months ago
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling. ☆27 · Updated 8 months ago
- ☆14 · Updated 3 weeks ago
- A minimalistic automatic speech recognition streamlit based webapp powered by OpenAI's Whisper "State of the Art" models ☆65 · Updated 2 years ago
- Teach ChatGPT the Alda music programming language, show it some superb code, and consult with it to compose a melody. ☆47 · Updated last year
- The YouTube Text-To-Speech dataset is comprised of waveform audio extracted from YouTube videos alongside their English transcriptions ☆50 · Updated 3 years ago
- Identifying individual speakers in an audio stream based on the unique characteristics found in individual voices using Python ☆16 · Updated last year
- Using short models to classify long texts ☆20 · Updated last year
- Transcription with speaker diarization pipeline ☆85 · Updated last year
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper. ☆133 · Updated last year
- ☆84 · Updated 7 months ago
- Sing an idea ➡️ AI music sample🔥🎶 ☆90 · Updated 6 months ago