winstxnhdw / CapGen
A fast CPU-first video/audio transcriber for generating caption files with Whisper and CTranslate2, hosted on Hugging Face Spaces.
☆10Updated this week
Alternatives and similar repositories for CapGen
Users who are interested in CapGen are comparing it to the libraries listed below.
- Simple, Unified Repository for Retrieval-based Voice Conversion☆17Updated last year
- ☆13Updated last year
- ☆16Updated last year
- Talking Face Generation; Time–Spatial Consistency☆9Updated 11 months ago
- Automatically generate a lip-synced avatar based off of a transcript and audio☆13Updated 2 years ago
- Bringing large-language models and chat to web browsers. Everything runs inside the browser with no server support.☆14Updated last year
- Implementation of SoundStream from the paper: "SoundStream: An End-to-End Neural Audio Codec"☆12Updated 6 months ago
- Code for the paper "Free-View Expressive Talking Head Video Editing" (ICASSP 2023)☆10Updated last year
- Translate any text using GPT.☆16Updated 2 years ago
- A python library to find differences between audio and transcriptions☆20Updated last year
- Browser automation for creating new pages in WordPress☆13Updated 2 months ago
- An image-editing app, like a mini Photoshop, built with Python, PyQt5, and deep learning☆11Updated 2 years ago
- This is not remotely close to a finished product, and does not intend to nor does this claim to be working fine-tuning code for MaskGCT. …☆12Updated 8 months ago
- FastAPI backend to upload files to S3☆27Updated 5 years ago
- Engage in conversation with your virtual self using AI techniques like NLP, voice cloning, and computer vision. Get accurate answers with…☆84Updated 2 years ago
- ☆16Updated 3 months ago
- Convert any image into a Region Adjacency Graph (RAG)☆12Updated 5 years ago
- [NCMMSC'2024] Emotion-Aware Prosodic Phrasing for Expressive Text-to-Speech☆22Updated 11 months ago
- Code for paper: "Privately generating tabular data using language models".☆15Updated 2 years ago
- ☆15Updated last year
- DoyenTalker uses deep learning techniques to generate personalized avatar videos that speak user-provided text in a specified voice. The …☆12Updated 10 months ago
- AgentParse is a high-performance parsing library designed to map various structured data formats (such as Pydantic models, JSON, YAML, an…☆14Updated 3 weeks ago
- Django Dynamic API - Open-Source Library | AppSeed☆16Updated 9 months ago
- A Python neural network made with TensorFlow that converts one person's voice into another.☆10Updated 4 years ago
- ☆17Updated last year
- A composition of offline tools to achieve high quality multilingual speech to text transcription☆19Updated 2 months ago
- An offline CPU-first low-resource chat application to perform RAG on your corpus of data. Powered by OpenChat and CTranslate2.☆14Updated 2 months ago
- An open source NLP as a service project focused on providing state of the art systems with ease. Training and inference by simple docker …☆20Updated 10 months ago
- ☆16Updated last year
- GeventMP - Gevent Multiprocessing Extension☆20Updated last week