iva-mzsun/MM-LDM

☆11

Alternatives and similar repositories for MM-LDM

Users who are interested in MM-LDM are comparing it to the repositories listed below.

SonyResearch / SVG_baseline
to release the source code for reproducing the results reported in our paper: https://arxiv.org/abs/2409.17550
☆14 · Nov 15, 2024 · Updated last year
yataoz / face_reenact_GDPW
Code repository for the BMVC 2022 paper: Geometry Driven Progressive Warping for One Shot Face Animation
☆12 · Jan 6, 2023 · Updated 3 years ago
lzhangbj / ASVA
[ECCV 2024 Oral] Audio-Synchronized Visual Animation
☆61 · Mar 15, 2026 · Updated 2 months ago
yuPeiyu98 / Diffusion-Amortized-MCMC
[NeurIPS 2023] Learning Energy-Based Prior Model with Diffusion-Amortized MCMC
☆13 · Mar 1, 2026 · Updated 2 months ago
OpenNLPLab / TAVGBench
Demo page of TAVGBench: Benchmarking Text to Audible-Video Generation
☆15 · Apr 7, 2025 · Updated last year
leezisheng / SEMG_Assistant
The sEMG acquisition device is equipped with the upper computer software and written in Python language to complete the data receiving of…
☆16 · Feb 13, 2023 · Updated 3 years ago
guyyariv / TempoTokens
[AAAI 2024] The official PyTorch implementation of "Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation"
☆128 · May 18, 2026 · Updated last week
dkurzend / ClipClap-GZSL
Audio-Visual Generalized Zero-Shot Learning using Large Pre-Trained Models
☆23 · Apr 15, 2024 · Updated 2 years ago
Ava4Everr / CodeHS-Java-APCSA
Just a copy of https://github.com/RobynE23/CodeHS-Java-APCSA, but I added folders and some extra files that didn't exist. Another option …
☆27 · Jan 23, 2024 · Updated 2 years ago
naver-ai / rewas
Official PyTorch implementation of ReWaS (AAAI'25) "Read, Watch and Scream! Sound Generation from Text and Video"
☆44 · Dec 13, 2024 · Updated last year
razzu / cmc-cmkm
☆11 · May 30, 2022 · Updated 3 years ago
LTT-O / Awesome-Talking-Head-Generation
Something about Talking Head Generation
☆32 · Sep 5, 2023 · Updated 2 years ago
biolab-put / putemg_examples
Example scripts for putEMG electromyographic database processing
☆11 · Dec 28, 2020 · Updated 5 years ago
jdh-algo / MHAD-Dataset
Multimodal Home Activity Dataset with Multi-Angle Videos and Synchronized Physiological Signals
☆22 · Dec 21, 2024 · Updated last year
amazon-science / avgen-eval-toolkit
☆19 · Feb 5, 2026 · Updated 3 months ago
hanna-xu / CRetinex
☆14 · Oct 19, 2025 · Updated 7 months ago
YoonjinXD / T-FOLEY
Implementation of the paper, T-FOLEY: A Controllable Waveform-Domain Diffusion Model for Temporal-Event-Guided Foley Sound Synthesis, ac…
☆34 · May 25, 2024 · Updated 2 years ago
light1726 / SpeechTripleNet
The implementation of the paper "SpeechTripleNet: End-to-End Disentangled Speech Representation Learning for Content, Timbre and Prosody"
☆34 · Nov 23, 2023 · Updated 2 years ago
FlowAlign / FlowAlign
Official repository of FlowAlign
☆36 · May 18, 2026 · Updated last week
J-Seo / K-HALU
K-HALU: Multiple Answer Korean Hallucination Benchmark for Large Language Models
☆38 · Dec 30, 2025 · Updated 4 months ago
pingponglabs / FaceAnime
☆10 · Apr 22, 2021 · Updated 5 years ago
microsoft / experiential_rl
The official codebase for "Experiential Reinforcement Learning" - https://arxiv.org/pdf/2602.13949v1
☆68 · May 8, 2026 · Updated 3 weeks ago
CodeDance-VL / CodeDance
☆22 · Mar 17, 2026 · Updated 2 months ago
on1262 / facialanimation
Source code for: Expressive Speech-driven Facial Animation with controllable emotions
☆40 · Jan 4, 2024 · Updated 2 years ago
researchmm / MM-Diffusion
[CVPR'23] MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation
☆454 · Jun 5, 2024 · Updated last year
moonsliu / Pro-Motion
Plan, Posture and Go: Towards Open-World Text-to-Motion Generation
☆42 · Nov 19, 2024 · Updated last year
rd20karim / M2T-Segmentation
[NCA] Official implementation of the paper Motion2Language, Unsupervised learning of synchronized semantic motion segmentation
☆13 · Sep 9, 2024 · Updated last year
yangzhao1230 / newPCMDM
☆13 · Nov 20, 2023 · Updated 2 years ago
hyc2026 / StoryTeller
☆83 · Mar 10, 2025 · Updated last year
dzh19990407 / PPMN
ACM MM 2022 - PPMN: Pixel-Phrase Matching Network for One-Stage Panoptic Narrative Grounding
☆11 · Aug 12, 2022 · Updated 3 years ago
Siwensun / CoMA
CoMA: Compositional Human Motion Generation with Multi-modal Agents
☆16 · Jul 31, 2025 · Updated 9 months ago
xsy27 / Mospa-Acoustic-driven-Motion-Generation
☆31 · Dec 3, 2025 · Updated 5 months ago
will-s-h / trust-sampling
Constrained Diffusion with Trust Sampling
☆12 · Dec 18, 2024 · Updated last year
Zehui127 / 1d-swin
The implementation of 1d-swin, an efficient transformer for capturing hierarchical 1-dimensional long-range sequences
☆23 · Jul 3, 2023 · Updated 2 years ago
yujmo / CZU_MHAD
CZU-MHAD: A multimodal dataset for human action recognition utilizing a depth camera and 10 wearable inertial sensors
☆26 · Jun 2, 2022 · Updated 3 years ago
RyannDaGreat / peekaboo
Peekaboo: Text to Image Diffusion Models are Zero-Shot Segmentors
☆31 · Jun 2, 2024 · Updated last year
arslanpi / Deep-learning-based-human-action-recognition-using-human-skeletons
☆19 · Aug 28, 2019 · Updated 6 years ago
crystal-zq-wang / VATT
Video Audio Translation Tool - automatically subtitles and dubs videos
☆13 · Mar 16, 2020 · Updated 6 years ago
pqpq17 / Awesome-LLM-Reasoning-on-Medicine
The Official Repo for Paper: Aligning Clinical Needs and AI Capabilities: A Survey on LLMs for Medical Reasoning
☆23 · Apr 7, 2026 · Updated last month