XJTLUSURF20240123 / EmoMA-Net
☆10Updated 9 months ago
Alternatives and similar repositories for EmoMA-Net
Users who are interested in EmoMA-Net are comparing it to the libraries listed below.
- Bringing large language models and chat to web browsers. Everything runs inside the browser with no server support.☆14Updated last year
- A text-to-audio pipeline using Riffusion (a fine-tuned Stable Diffusion model) and RAVE, an audio-to-audio autoencoder.☆16Updated last year
- This is a demo application showing how a dynamic video can be previewed in the browser using the Creatomate Preview SDK.☆12Updated last year
- Attempt at a Cog wrapper for an SDXL CLIP Interrogator☆10Updated last year
- ☆19Updated 8 months ago
- Cog wrapper for FalconsAi / nsfw_image_detection☆16Updated last year
- Voice data <= 10 mins can also be used to train a good VC model!☆12Updated last year
- Text-to-Music Generation with Rectified Flow Transformer☆8Updated 8 months ago
- Auto-Video maker handling many AIs☆10Updated last year
- ☆8Updated 8 months ago
- Cog template for Stable Diffusion 3 (ComfyUI implementation)☆17Updated 10 months ago
- This project includes a Python script for fine-tuning a text-to-speech (TTS) model. The script utilizes custom datasets and uses CUDA for …☆13Updated 7 months ago
- Easily create video datasets with auto-captioning for Hunyuan-Video LoRA finetuning☆12Updated last month
- ☆12Updated last year
- An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available at https://plachtaa.github.io☆16Updated last year
- ☆14Updated 2 months ago
- Cloud Video Renderer SDK using layerhub components☆16Updated 2 years ago
- A mono-repo to house the various supported Transport options to be used with Pipecat's client-js package☆21Updated this week
- ☆19Updated last year
- A service which wraps and chains video and audio Hugging Face Spaces together☆14Updated 8 months ago
- [IEEE/CVF CVPR'2022] "ST-MFNet: A Spatio-Temporal Multi-Flow Network for Frame Interpolation", Duolikun Danier, Fan Zhang, David Bull☆13Updated last year
- Prepare spectrograms from audio for training a Riffusion model☆15Updated 2 years ago
- Site for sharing MusicGen + AudioGen Prompts and Creations☆42Updated last month
- Automatically generate a lip-synced avatar based off of a transcript and audio☆13Updated 2 years ago
- Guide: from fragile multi-agent app to prod ready with orra - code and resources.☆12Updated last month
- High-performance ASR tool using Faster Whisper, supporting custom models, multi-language transcription, and real-time processing feedback…☆10Updated 6 months ago
- ☆15Updated last month
- ☆11Updated last year
- XTTS: Multilingual Voice Cloning TTS Model by Coqui Deployed to Replicate☆24Updated last year
- ☆26Updated 8 months ago