winstxnhdw / CapGen
A fast CPU-first video/audio transcriber for generating caption files with Whisper and CTranslate2, hosted on Hugging Face Spaces.
☆10 Updated last week
Alternatives and similar repositories for CapGen
Users that are interested in CapGen are comparing it to the libraries listed below
- Automatically generate a lip-synced avatar based off of a transcript and audio ☆13 Updated 2 years ago
- Sample and Computation Redistribution for Efficient Face Detection ☆15 Updated last year
- FastAPI backend to upload files to S3 ☆27 Updated 5 years ago
- Simple, Unified Repository for Retrieval-based Voice Conversion ☆17 Updated last year
- 🤖 Quantum-powered excuse generator for developers. Blame bugs on cosmic rays, AI sentience, or Schrödinger’s intern. ☆26 Updated 2 months ago
- DocQues answers queries on longer and multiple documents, built on GPT-Index and GPT-3 ☆13 Updated 2 years ago
- key/value store for Python based on Cloudflare workers ☆33 Updated 4 months ago
- A python library to find differences between audio and transcriptions ☆19 Updated last year
- ☆12 Updated last year
- Blazingly fast Markdown parser for Python written in Rust. ☆36 Updated this week
- ☆60 Updated last month
- Website to compare Python package downloads ☆40 Updated last month
- A lightweight Python package for managing multi-agent orchestration. Easily define agents with custom instructions, tools, containers, an… ☆51 Updated 2 months ago
- Code for the paper "Free-View Expressive Talking Head Video Editing" (ICASSP 2023) ☆10 Updated last year
- Seamless Voice Interactions with LLMs ☆12 Updated 2 years ago
- Monetize.ai is a web-based chatbot that provides personalized investment advice using GPT-3.5 and Yahoo Finance API. It's built using Fla… ☆16 Updated 2 years ago
- GeventMP - Gevent Multiprocessing Extension ☆20 Updated 2 months ago
- Code for paper: "Privately generating tabular data using language models". ☆15 Updated 2 years ago
- DoyenTalker uses deep learning techniques to generate personalized avatar videos that speak user-provided text in a specified voice. The … ☆13 Updated last year
- A simple uv workspace ☆17 Updated 6 months ago
- This repository is the project page for "Point Anywhere: Directed Object Estimation from Omnidirectional Images", including source code … ☆11 Updated 2 years ago
- Redis Queue Dashboard based on FastAPI ☆115 Updated last month
- A Python neural network made with TensorFlow that converts one person's voice into another. ☆10 Updated 4 years ago
- Reflex select component which allows the user to search for options and create new ones. ☆14 Updated 11 months ago
- Speaker diarization service ☆24 Updated 4 months ago
- A library to convert Pydantic models to TypedDict ☆36 Updated last year
- ☆16 Updated last year
- Brainwave is a state-of-the-art neural decoder that transforms electroencephalogram (EEG) and brain signals into multimodal outputs inclu… ☆12 Updated 3 weeks ago
- Effective frame sampling for ML applications. ☆24 Updated 2 months ago
- XTTS: Multilingual Voice Cloning TTS Model by Coqui Deployed to Replicate ☆25 Updated 2 years ago