dreji18 / Fine-tune-Speech-Recognition
Tutorial on how to train a custom voice recognition model using Hugging Face models.
☆10 Updated last year
Related projects
Alternatives and complementary repositories for Fine-tune-Speech-Recognition
- ☆18 Updated 3 months ago
- ☆35 Updated last year
- A JS web client for connecting to Pipecat bots with voice and vision ☆37 Updated 4 months ago
- This project includes a Python script for fine-tuning a text-to-speech (TTS) model. The script utilizes custom datasets and uses CUDA for … ☆13 Updated last month
- AI-powered YouTube Notes Generator: Create detailed notes from YouTube videos. Streamlit UI for easy use. ☆40 Updated 4 months ago
- Video Translation with LipSync with OpenAI's Whisper for ASR, YourTTS for TTS, and Wav2Lip for lip sync. ☆15 Updated last year
- Transcription and Diarization based on OpenAI's Whisper ☆19 Updated last year
- Record audio and save a transcription to your system's clipboard with ctranslate2 and faster-whisper. ☆69 Updated last month
- Multilingual RAG ☆12 Updated 9 months ago
- Exploring what's possible with the ChatGPT Code Interpreter and how to use it effectively ☆11 Updated last year
- A video editing and recording JavaScript tool for creating voiceovers ☆25 Updated 3 years ago
- Real-Time Whisper Voice Recognition with vosk model feedback. ☆105 Updated last year
- 🍳 AyaMCooking is a Voice-to-Voice Multi-lingual RAG Agent that makes a perfect sous chef for your kitchen, in up to 10 Languages 🤌🧑🍳 ☆19 Updated 3 weeks ago
- AI Lip Syncing application, deployed on Streamlit ☆29 Updated 8 months ago
- Real time audio to audio translation over sockets. With virtual microphones, you can use this in any video conferencing software you'd li… ☆18 Updated 3 months ago
- The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained mode… ☆10 Updated 3 months ago
- MinimalGPT is a concise, adaptable, and streamlined code framework that encompasses the essential components necessary for the constructi… ☆21 Updated 7 months ago
- Using the LangChain module to generate RAG prompts for OpenAI ☆12 Updated last year
- A library for real-time Speech to Text (STT), and Text to Speech (TTS) capability ☆30 Updated 11 months ago
- On-device speaker recognition engine powered by deep learning ☆27 Updated this week
- Modern Desktop Application offering a suite of tools for audio/video text recognition and a variety of other useful utilities. ☆43 Updated 3 months ago
- Whisper2Summarize is an application that uses Whisper for audio processing and GPT for summarization. It generates summaries of audio tra… ☆49 Updated last year
- Input a YouTube video link or upload a video file and get a video with subtitles. ☆103 Updated 2 months ago
- Caption, translate, and optionally record in real time "what you hear" from speakers and microphone. Never miss part of the conversation … ☆14 Updated 8 months ago
- Voice data <= 10 mins can also be used to train a good VC model! ☆11 Updated 11 months ago
- Extract structured data from any unstructured web page ☆40 Updated 7 months ago
- Model: Give me a silent video... and I'm going to tell you what's happening in the video. Will also add a new relevant background audio … ☆21 Updated 2 years ago
- WhisperAnywhere: Effortless speech-to-text everywhere on your Mac. Use a hotkey to dictate in any app, powered by Whisper AI and Groq API… ☆19 Updated 2 months ago
- All the serverless code necessary to convert the audio of a YouTube video in one language to a different language using AWS ☆37 Updated last year
- A UI to view your chromaDB quickly. ☆22 Updated 8 months ago