thetobysiu / deepstory
Deepstory turns written or generated text into a video in which a character is animated to speak your story in his or her own voice.
☆102 · Nov 23, 2025 · Updated 2 months ago
Alternatives and similar repositories for deepstory
Users who are interested in deepstory are comparing it to the libraries listed below.
- DoyenTalker uses deep learning techniques to generate personalized avatar videos that speak user-provided text in a specified voice. The … ☆14 · Sep 20, 2024 · Updated last year
- Speech to Facial Animation using GANs ☆40 · Nov 3, 2021 · Updated 4 years ago
- SadTalker gradio_demo.py file with code section that allows you to set the eye blink and pose reference videos for the software to use wh… ☆11 · Jun 20, 2023 · Updated 2 years ago
- An implementation of the Wav2Letter Speech-to-Text model using PyTorch. ☆14 · Mar 8, 2023 · Updated 2 years ago
- Self-supervised neural network for music recommendations. ☆18 · Jul 6, 2023 · Updated 2 years ago
- Fast and differentiable hidden Markov model in C++ ☆19 · Jan 20, 2023 · Updated 3 years ago
- Live real-time avatars from your webcam in the browser. No dedicated hardware or software installation needed. A pure Google Colab wrappe… ☆369 · Mar 18, 2025 · Updated 10 months ago
- ☆17 · Mar 23, 2025 · Updated 10 months ago
- ☆18 · Jan 18, 2024 · Updated 2 years ago
- Using a single image and just 10 seconds of sample audio, our project enables you to create a video where it appears as if you're speakin… ☆40 · Sep 13, 2023 · Updated 2 years ago
- Google Street View with Stable Diffusion + ControlNet ☆21 · Jun 27, 2025 · Updated 7 months ago
- ☆43 · Jan 5, 2024 · Updated 2 years ago
- Avatar Generation For Characters and Game Assets Using Deep Fakes ☆232 · Aug 18, 2024 · Updated last year
- Word2Wave: a framework for generating short audio samples from a text prompt using WaveGAN and COALA. ☆119 · Dec 13, 2021 · Updated 4 years ago
- A simple voice conversion tool ☆19 · Mar 10, 2022 · Updated 3 years ago
- Toolbox for easy and qualitative one-shot voice conversion ☆46 · Dec 5, 2021 · Updated 4 years ago
- Faster Talking Face Animation on Xeon CPU ☆130 · Nov 14, 2023 · Updated 2 years ago
- Neural style transfer ☆21 · Jul 29, 2021 · Updated 4 years ago
- Pytorch implementation of the TecoGan video super resolution model. ☆17 · Mar 1, 2022 · Updated 3 years ago
- A simple application of DTW Algorithm in isolate word speech recognition. ☆17 · Mar 9, 2020 · Updated 5 years ago
- Code for the Interspeech 2023 paper "A Joint Model for Pronunciation Assessment and Mispronunciation Detection and Diagnosis with Multi-t… ☆25 · Nov 9, 2023 · Updated 2 years ago
- This repository is the implementation of the paper, "Score-balanced Loss for Multi-aspect Pronunciation Assessment" (Interspeech 2023). ☆22 · Apr 29, 2024 · Updated last year
- All the hands-on code examples for my workshop "Building AI Agents with Autogen" ☆12 · Sep 12, 2025 · Updated 5 months ago
- Docker image and scripts for training finetuned or completely personal Kaldi speech models. Particularly for use with kaldi-active-gramma… ☆21 · Jan 24, 2022 · Updated 4 years ago
- Mispronunciation Detection using a pretrained and finetuned wav2vec2 model for phoneme recognition and diagnosis and feedback using large… ☆48 · May 6, 2024 · Updated last year
- This repository aims to collect the articles and codes for the Visual Storytelling (VIST) task. VIST is a vision-and-language task. It ai… ☆25 · Mar 3, 2021 · Updated 4 years ago
- Recognize speech from an audio file and convert it into animation FBX ☆24 · Mar 7, 2022 · Updated 3 years ago
- PyTorch implementation for NED (CVPR 2022). It can be used to manipulate the facial emotions of actors in videos based on emotion labels … ☆160 · Oct 6, 2022 · Updated 3 years ago
- ☆10 · Feb 23, 2023 · Updated 2 years ago
- A collection of utilities for handling IPA phones. ☆26 · Sep 24, 2023 · Updated 2 years ago
- The model implementations for T5 encoder decoder soft prompt tuning for text generation. ☆25 · Dec 5, 2022 · Updated 3 years ago
- Vecna is a Python chatbot which recommends songs and movies depending upon your feelings ☆11 · Jun 28, 2022 · Updated 3 years ago
- A neural network based file sorter. Trains an autoencoder to sort images or audio based on the similarity of their encodings, or uses the… ☆31 · Jun 24, 2023 · Updated 2 years ago
- SMASHED is a toolkit designed to apply transformations to samples in datasets, such as fields extraction, tokenization, prompting, batchi… ☆35 · May 24, 2024 · Updated last year
- Official repo of Text-Free Learning of a Natural Language Interface for Pretrained Face Generators ☆66 · Dec 13, 2023 · Updated 2 years ago
- PyTorch Implementation of STGAN for Cloud Removal in Satellite Images. ☆31 · Feb 10, 2020 · Updated 6 years ago
- Facial Sketch Render, ICASSP 2021 ☆28 · Jun 11, 2021 · Updated 4 years ago
- Orchestrating AI for stunning lip-synced videos. Effortless workflow, exceptional results, all in one place. ☆75 · Jun 19, 2025 · Updated 7 months ago
- ☆15 · Jul 23, 2025 · Updated 6 months ago