Ceaglex / LoVA

The code and weights for LoVA, a novel model for long-form video-to-audio generation. Built on the Diffusion Transformer (DiT) architecture, LoVA generates long-form audio more effectively than existing autoregressive models and UNet-based diffusion models.

Alternatives and similar repositories for LoVA:

Users who are interested in LoVA are comparing it to the libraries listed below.