CorentinJ / transcription-diff
A Python library to find differences between audio and transcriptions
☆19 · Updated 2 years ago
Alternatives and similar repositories for transcription-diff
Users who are interested in transcription-diff are comparing it to the libraries listed below.
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆69 · Updated 2 months ago
- A lightweight Python library for running TTS models with a unified API. ☆21 · Updated 10 months ago
- Sing an idea ➡️ AI music sample🔥🎶 ☆119 · Updated last year
- Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor… ☆60 · Updated last year
- Bringing large-language models and chat to web browsers. Everything runs inside the browser with no server support. ☆15 · Updated last year
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre… ☆22 · Updated last year
- Site for sharing MusicGen + AudioGen Prompts and Creations ☆48 · Updated 9 months ago
- Cog wrapper for collabora/WhisperSpeech ☆25 · Updated last year
- Examples of apps built with Nendo, the AI Audio Tool Suite ☆55 · Updated last year
- Seamless Voice Interactions with LLMs ☆12 · Updated 2 years ago
- ☆17 · Updated last year
- Open TTS models, built for streaming on the edge ☆44 · Updated 10 months ago
- 🐜🔧 A minimalistic tool to fine-tune your LLMs ☆18 · Updated 2 years ago
- Auto-Video maker handling many AIs ☆11 · Updated last year
- ☆12 · Updated last year
- Voxtral: Convert Mistral into an end2end SpeechLM. No information bottleneck, preserves prosody, learns interruptions from data. Unlike GP… ☆40 · Updated 10 months ago
- AgentParse is a high-performance parsing library designed to map various structured data formats (such as Pydantic models, JSON, YAML, an… ☆17 · Updated 3 months ago
- Fork of AudioLDM as a TuneFlow plugin ☆43 · Updated 2 years ago
- ☆62 · Updated last year
- OmniByteFormer is a generalized Transformer model that can process any type of data by converting it into byte sequences, bypassing tradi… ☆15 · Updated this week
- Transform unstructured documents into actionable, structured data with enterprise-grade precision and reliability, ready for large-scale … ☆19 · Updated 3 months ago
- Make Kanye sing any song ya want 🎤🔥 ☆25 · Updated 2 years ago
- An open source community implementation of the model MELLE from the paper: "Autoregressive Speech Synthesis without Vector Quantization" ☆14 · Updated last month
- Tokenizer for Text to Speech (TTS) models ☆13 · Updated last year
- ☆106 · Updated 2 years ago
- Extract information, summarize, ask questions, and search videos using OpenAI's Vision API 🚀🎦 ☆62 · Updated 2 years ago
- NewsAgent is an enterprise-grade news aggregation agent designed to fetch, query, and summarize news from multiple sources at scale. ☆23 · Updated 3 months ago
- Generate visual podcasts about novels using open source models ☆25 · Updated 2 years ago
- Speech to text to speech using Elevenlabs ☆28 · Updated 2 years ago
- HuggingChat like UI in Gradio ☆70 · Updated 2 years ago