The code and weights for LoVA. LoVA is a novel model for Long-form Video-to-Audio generation. Based on the Diffusion Transformer (DiT) architecture, LoVA proves more effective at generating long-form audio than existing autoregressive models and UNet-based diffusion models.
☆14Feb 27, 2025Updated last year
Alternatives and similar repositories for LoVA
Users that are interested in LoVA are comparing it to the libraries listed below.
- 🌟 SwarmAgent: A framework for simulating social group dynamics using multi-agent collaboration, aiding insights into collective behavior…☆13Dec 5, 2023Updated 2 years ago
- ☆21Aug 18, 2024Updated last year
- Implementation of Frieren: Efficient Video-to-Audio Generation Network with Rectified Flow Matching (NeurIPS'24)☆59Apr 3, 2025Updated last year
- Benchmarking for Audio-Text and Audio-Visual Generation; Supports FAD, FD_VGG, FD_PANNs, FD_PaSST, IS_PaSST, IS_PANNs, KL_PaSST, KL_PANNs…☆65Feb 14, 2026Updated 2 months ago
- Tools for the evaluation of audio captioning.☆19May 23, 2020Updated 5 years ago
- ☆130Updated this week
- UMETTS: A Unified Framework for Emotional Text-to-Speech Synthesis with Multimodal Prompts☆41Jun 12, 2025Updated 10 months ago
- source code for NAACL2022 main conference "Dynamic Programming in Rank Space: Scaling Structured Inference with Low-Rank HMMs and PCFGs"☆10Sep 26, 2022Updated 3 years ago
- PyTorch Implementation of Stepwise Monotonic Multihead Attention similar to Enhancing Monotonicity for Robust Autoregressive Transformer …☆39May 16, 2021Updated 4 years ago
- Evaluation of generated videos on the FETV benchmark☆10Apr 6, 2025Updated last year
- [EMNLP'2024 Findings] Explore generated documents for enhanced IR with LLMs. We enhance BM25 to surpass strong dense retriever on many da…☆15Mar 28, 2025Updated last year
- Tongji University data mining course final project: stock trend prediction☆10Jan 11, 2021Updated 5 years ago
- Code for Research Project TLDR☆25Jul 28, 2025Updated 8 months ago
- Official Repository of "Taming Masked Diffusion Language Models via Consistency Trajectory Reinforcement Learning with Fewer Decoding Ste…☆27Mar 9, 2026Updated last month
- Official Code For EMNLP2025 Findings: {DLPO : Towards a Robust, Efficient, and Generalizable Prompt Optimization Framework from a Deep-Le…☆10Dec 25, 2025Updated 3 months ago
- A CUDA-based library for computed tomography (CT) reconstruction with differentiable operators.☆18Mar 25, 2026Updated 2 weeks ago
- ☆17Dec 12, 2023Updated 2 years ago
- ☆30Jun 30, 2020Updated 5 years ago
- Code for "From Ideal to Real: Unified and Data-Efficient Dense Prediction for Real-World Scenarios"☆28Jul 7, 2025Updated 9 months ago
- LVAS-Agent Code Base☆22Apr 15, 2025Updated 11 months ago
- An open source community implementation of the model from the paper: "Movie Gen: A Cast of Media Foundation Models". Join our community …☆58Apr 6, 2026Updated last week
- [AAAI 2025] Empowering LLMs with Pseudo-Untrimmed Videos for Audio-Visual Temporal Understanding☆34Mar 21, 2025Updated last year
- ☆12Sep 27, 2024Updated last year
- We introduce Reasoning via Video, a new paradigm that uses maze-solving video generation to probe multimodal reasoning; our VR-Bench show…☆58Feb 4, 2026Updated 2 months ago
- a text-conditional diffusion probabilistic model capable of generating high fidelity audio.☆193May 29, 2024Updated last year
- ☆14Apr 25, 2025Updated 11 months ago
- ☆55Feb 28, 2026Updated last month
- Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline (CVPR 2023)☆72Jan 4, 2026Updated 3 months ago
- DataMosaic: Explainable and Verifiable Document-Based Data Analytics☆20Jun 30, 2025Updated 9 months ago
- Parallel waveform generation with DiffusionGAN☆17Mar 26, 2022Updated 4 years ago
- Official implementation of our paper at ACL 2023: Pre-training Multi-party Dialogue Models with Latent Discourse Inference☆10Jul 10, 2023Updated 2 years ago
- "Stochasticity in Neural ODEs: An Empirical Study". Experiments from the paper☆13Apr 27, 2020Updated 5 years ago
- T2VScore: Towards A Better Metric for Text-to-Video Generation☆81Apr 10, 2024Updated 2 years ago
- [EMNLP 2024] Official implementation of "Hierarchical Deconstruction of LLM Reasoning: A Graph-Based Framework for Analyzing Knowledge Ut…☆23Dec 4, 2024Updated last year
- Python implementation of STATIS for analysis of several data tables☆12Aug 14, 2017Updated 8 years ago
- This is the official implementation of the ICML 2023 paper "Can Forward Gradient Match Backpropagation?"☆13May 31, 2023Updated 2 years ago
- ☆10Nov 18, 2022Updated 3 years ago
- ☆10Mar 30, 2022Updated 4 years ago
- Measuring General Intelligence With Generated Games (Preprint)☆25Jul 30, 2025Updated 8 months ago