winstxnhdw / CapGen
A fast CPU-first video/audio transcriber for generating caption files with Whisper and CTranslate2, hosted on Hugging Face Spaces.
☆10Updated this week
Alternatives and similar repositories for CapGen
Users that are interested in CapGen are comparing it to the libraries listed below
- Automatically generate a lip-synced avatar based off of a transcript and audio☆13Updated 2 years ago
- Engage in conversation with your virtual self using AI techniques like NLP, voice cloning, and computer vision. Get accurate answers with…☆83Updated last year
- Video chat apps with computer vision filters built on top of Streamlit☆50Updated 2 years ago
- Fast and accurate natural language detection. Detector written in Python. Nito-ELD, ELD.☆17Updated last year
- An awesome list that curates the best Flet tools, tutorials, blogs and more.☆11Updated 2 years ago
- Simple, Unified Repository for Retrieval-based Voice Conversion☆17Updated last year
- Talking Face Generation; Time–Spatial Consistency☆9Updated 10 months ago
- Voice cloning using coqui-TTS☆11Updated last year
- A simple Python wrapper around the TikTok API.☆22Updated last month
- WebRTC-based real-time audio streaming with Faster Whisper ASR integration for live speech-to-text transcription.☆12Updated 9 months ago
- Browser automation for creating new pages in WordPress☆13Updated last month
- Website to compare Python package downloads☆34Updated this week
- This repository is the project page for "Point Anywhere: Directed Object Estimation from Omnidirectional Images", including source code …☆11Updated last year
- senselab is a Python package that simplifies building pipelines for biometric (e.g. speech, voice, video, etc) analysis.☆23Updated 2 weeks ago
- Chakra Implementation in Reflex☆39Updated 3 weeks ago
- StimulerVoiceX is a denoising and speech enhancement system. It uses deep learning techniques to remove noise from speech signals and imp…☆12Updated last year
- http-streaming-playground☆10Updated last year
- Implementation of SoundStream from the paper: "SoundStream: An End-to-End Neural Audio Codec"☆12Updated 5 months ago
- ☆15Updated last year
- Use XML tags for long context prompting using Claude's multi-document structure.☆26Updated 2 months ago
- ☆13Updated last year
- Image-editing app, like a mini Photoshop, built with Python, PyQt5, and deep learning☆11Updated 2 years ago
- Code for "Weakly-supervised Fingerspelling Recognition in British Sign Language Videos", BMVC 2022.☆12Updated 2 years ago
- python GET raw or rendered HTML (for humans)☆13Updated 5 years ago
- A simple voice conversion tool☆17Updated 3 years ago
- Talking Face Generation system☆19Updated last year
- Speaker diarization service☆23Updated 3 weeks ago
- [NCMMSC'2024] Emotion-Aware Prosodic Phrasing for Expressive Text-to-Speech☆22Updated 10 months ago
- Hyperlinks for pydantic models☆16Updated 5 months ago
- 🧪 Data Science | ⚒️ MLOps | ⚙️ DataOps : Talks about 🦄☆19Updated last month