winstxnhdw / CapGen
A fast CPU-first video/audio transcriber for generating caption files with Whisper and CTranslate2, hosted on Hugging Face Spaces.
☆10Updated this week
Alternatives and similar repositories for CapGen
Users who are interested in CapGen are comparing it to the libraries listed below.
- The Full-stack web framework to meet the developer's expectation.☆16Updated 2 years ago
- Automatically generate a lip-synced avatar based off of a transcript and audio☆13Updated 2 years ago
- Code for the paper "Free-View Expressive Talking Head Video Editing" (ICASSP 2023)☆11Updated last year
- App for editing images like a mini Photoshop, using Python, PyQt5, and deep learning☆12Updated 2 years ago
- Simple, Unified Repository for Retrieval-based Voice Conversion☆17Updated last year
- Modify-Anything is based on yolov5,yolov8 for video and image detection. Segment-anything,lama_cleaner is applied to segment, modify, era…☆17Updated 2 years ago
- python GET raw or rendered HTML (for humans)☆13Updated 5 years ago
- Python text-to-speech library with built-in voice effects and support for multiple TTS engines☆24Updated 8 months ago
- Reflex select component which allows the user to search for options and create new ones.☆14Updated last year
- An open source NLP as a service project focused on providing state of the art systems with ease. Training and inference by simple docker …☆20Updated last year
- ☆17Updated 2 years ago
- Koel Labs innovates open-source speech research, inclusive speech technologies, and real-time pronunciation feedback for language learner…☆16Updated this week
- Engage in conversation with your virtual self using AI techniques like NLP, voice cloning, and computer vision. Get accurate answers with…☆84Updated 2 years ago
- Talking Face Generation system☆19Updated 2 years ago
- 🔊😊 A fastapi voice-assistant framework to quickly prototype LLM-powered voice assistants in <5 minutes.☆30Updated last year
- Transcription and diarization (speaker identification)☆34Updated 2 years ago
- Scripts, data and research related to cow weight and breed prediction☆13Updated 3 months ago
- This repository is the project page for "Point Anywhere: Directed Object Estimation from Omnidirectional Images", including source code …☆12Updated 2 years ago
- Implementation of SoundStream from the paper: "SoundStream: An End-to-End Neural Audio Codec"☆13Updated 10 months ago
- Sample and Computation Redistribution for Efficient Face Detection☆15Updated last year
- This is not remotely close to a finished product, and does not intend to nor does this claim to be working fine-tuning code for MaskGCT. …☆12Updated 11 months ago
- Translate any text using GPT.☆17Updated 2 years ago
- A simple Python wrapper for the TikTok API.☆22Updated 6 months ago
- An offline CPU-first low-resource chat application to perform RAG on your corpus of data. Powered by OpenChat and CTranslate2.☆14Updated 6 months ago
- Multivoice: Enhance your foreign-language movie and TV show experience with personalized dubbed versions. Our project uses voice cloning …☆26Updated 2 years ago
- ☆12Updated last year
- Bringing large-language models and chat to web browsers. Everything runs inside the browser with no server support.☆14Updated last year
- Demo FastAPI WebSocket Audio☆41Updated 5 years ago
- DoyenTalker uses deep learning techniques to generate personalized avatar videos that speak user-provided text in a specified voice. The …☆13Updated last year
- WebRTC-based real-time audio streaming with Faster Whisper ASR integration for live speech-to-text transcription.☆13Updated last year