ivcylc / qa-mdt

OpenMusic: SOTA Text-to-music (TTM) Generation

☆479

Related projects

Alternatives and complementary repositories for qa-mdt

haidog-yaqub / EzAudio
High-quality Text-to-Audio Generation with Efficient Diffusion Transformer
☆238 Updated last week
HelloVision / HelloMeme
The official HelloMeme GitHub site
☆171 Updated last week
edwko / OuteTTS
Interface for OuteTTS models.
☆406 Updated 2 weeks ago
Text-to-Audio / Make-An-Audio
PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model
☆752 Updated 5 months ago
open-mmlab / FoleyCrafter
FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. An AI Foley artist that adds vivid, synchronized sound effects to your silent videos 😝
☆466 Updated 3 months ago
gpt-omni / mini-omni2
Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.
☆1,565 Updated 2 weeks ago
shaopengw / Awesome-Music-Generation
Awesome music generation model: MG²
☆112 Updated this week
Vchitect / Vchitect-2.0
Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models
☆647 Updated 2 months ago
yerfor / MimicTalk
MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes; NeurIPS 2024; Official code
☆424 Updated last month
GTSinger / GTSinger
Dataset and code of GTSinger (NeurIPS 2024 Spotlight): A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing…
☆225 Updated 3 weeks ago
KdaiP / StableTTS
Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3
☆365 Updated 2 months ago
EmilianPostolache / stable-audio-controlnet
Fine-tune Stable Audio Open with DiT ControlNet.
☆177 Updated 2 months ago
facebookresearch / spiritlm
Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model".
☆781 Updated 3 weeks ago
lucidrains / e2-tts-pytorch
Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in Pytorch
☆355 Updated 2 weeks ago
NJU-PCALab / RAG-Diffusion
Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement 🔥
☆307 Updated this week
Eddycrack864 / UVR5-UI
Ultimate Vocal Remover 5 with Gradio UI. Separate an audio file into various stems, using multiple models
☆214 Updated last week
revdotcom / reverb
Open source inference code for Rev's model
☆333 Updated this week
okio-ai / nendo-platform
Nendo is an open source platform for AI-driven audio management, intelligence, and generation.
☆117 Updated 8 months ago
RoyalCities / RC-stable-audio-tools
Generative models for conditional audio generation
☆117 Updated this week
OpenT2S / LlamaVoice
LlamaVoice is a llama-based large voice generation model, providing inference and training ability.
☆222 Updated 2 months ago
jishengpeng / ControlSpeech
ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec
☆196 Updated 2 months ago
misya11p / amt-apc
AMT-APC: Automatic Piano Cover by Fine-Tuning an Automatic Music Transcription Model
☆54 Updated 2 weeks ago
AMAAI-Lab / mustango
Mustango: Toward Controllable Text-to-Music Generation
☆342 Updated 3 months ago
Jeff-LiangF / streamv2v
Official Pytorch implementation of StreamV2V.
☆450 Updated 2 months ago
HilaManor / AudioEditingCode
☆139 Updated last month
PolyAI-LDN / pheme
☆253 Updated 8 months ago
JackAILab / ConsistentID
Customized ID Consistent for human
☆845 Updated 3 months ago
kepengxu / PGTFormer
[IJCAI'24] Beyond Alignment: Blind Video Face Restoration via Parsing-Guided Temporal-Coherent Transformer
☆185 Updated 2 months ago
happylittlecat2333 / Auffusion
Official codes and models of the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generati…
☆156 Updated 7 months ago