OpenNLPLab/TAVGBench

Demo page of TAVGBench: Benchmarking Text to Audible-Video Generation

☆15

Alternatives and similar repositories for TAVGBench

Users that are interested in TAVGBench are comparing it to the libraries listed below.

OpenNLPLab / ETSC-Exact-Toeplitz-to-SSM-Conversion
[EMNLP 2023] Official implementation of the algorithm ETSC: Exact Toeplitz-to-SSM Conversion, from our EMNLP 2023 paper - Accelerating Toeplitz…
☆14 • Oct 17, 2023 • Updated 2 years ago
lzhangbj / ASVA
[ECCV 2024 Oral] Audio-Synchronized Visual Animation
☆60 • Mar 15, 2026 • Updated 4 months ago
mzsun01 / MM-LDM
☆11 • Apr 12, 2024 • Updated 2 years ago
RobinGueguen / Just4RIR
Library for room acoustic analysis (linked with Blender)
☆12 • Dec 6, 2018 • Updated 7 years ago
YufeiWang777 / CU-Net
Official implementation of "CU-Net: LiDAR Depth-only Completion with Coupled U-Net", RAL 2022.
☆17 • Oct 13, 2022 • Updated 3 years ago
ariesssxu / vta-ldm
☆61 • Jun 15, 2025 • Updated last year
QC-LY / UniBind
The source code for "UniBind: LLM-Augmented Unified and Balanced Representation Space to Bind Them All"
☆49 • Apr 4, 2024 • Updated 2 years ago
SonyResearch / SVG_baseline
to release the source code for reproducing the results reported in our paper: https://arxiv.org/abs/2409.17550
☆14 • Nov 15, 2024 • Updated last year
facebookresearch / worldsense
WorldSense benchmark for grounded reasoning in language models
☆25 • Nov 28, 2023 • Updated 2 years ago
HUIZ-A / SVA
☆20 • Apr 26, 2024 • Updated 2 years ago
matthias-wright / jax-fid
FID computation in Jax/Flax.
☆29 • Jul 17, 2024 • Updated 2 years ago
zhengxuJosh / SAM4SS
SAM4SS: Tailoring SAM and SAM2 for Semantic Segmentation
☆11 • Jul 31, 2024 • Updated last year
yzxing87 / Seeing-and-Hearing
[CVPR 2024] Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners
☆155 • Jul 6, 2024 • Updated 2 years ago
lcn-kul / xls-r-analysis-sqa
Analysis of XLS-R for Speech Quality Assessment
☆15 • Feb 10, 2025 • Updated last year
IIGROUP / HVISNet
Code and Data for Real-time Human-Centric Segmentation for Complex Video Scenes
☆17 • Feb 8, 2024 • Updated 2 years ago
cambridgeltl / ECNMT
Emergent Communication Pretraining for Few-Shot Machine Translation
☆13 • Dec 3, 2020 • Updated 5 years ago
dkurzend / ClipClap-GZSL
Audio-Visual Generalized Zero-Shot Learning using Large Pre-Trained Models
☆23 • Apr 15, 2024 • Updated 2 years ago
WikiChao / DAVIS
[🏆 IJCV 2025 & ACCV 2024 Best Paper Honorable Mention] Official pytorch implementation of the paper "High-Quality Visually-Guided Sound …
☆33 • Mar 30, 2026 • Updated 3 months ago
Yuanshi9815 / LiteFocus
[Interspeech 2024] LiteFocus is a tool designed to accelerate diffusion-based TTA model, now implemented with the base model AudioLDM2.
☆34 • Mar 11, 2025 • Updated last year
suimuc / MTV_Framework
☆23 • Oct 15, 2025 • Updated 9 months ago
polarispw / 22-fall-CPUSample
☆10 • Dec 21, 2022 • Updated 3 years ago
SeanGuo063 / NDVG
An implementation of the paper NDVG
☆37 • Nov 26, 2022 • Updated 3 years ago
EleutherAI / pilev2
☆13 • Jan 20, 2023 • Updated 3 years ago
zhengxuJosh / DPPASS
☆12 • Jul 24, 2023 • Updated 3 years ago
naver-ai / rewas
Official PyTorch implementation of ReWaS (AAAI'25) "Read, Watch and Scream! Sound Generation from Text and Video"
☆44 • Dec 13, 2024 • Updated last year
danqu130 / RPEFlow
RPEFlow: Multimodal Fusion of RGB-PointCloud-Event for Joint Optical Flow and Scene Flow Estimation (ICCV 2023)
☆30 • Sep 9, 2024 • Updated last year
sarthaxxxxx / Awesome-Continual-Test-Time-Adaptation
A curated list of papers, resources and future directions on Continual Test-Time Adaptation (CTTA).
☆15 • Jul 13, 2026 • Updated 2 weeks ago
Arvid-pku / ALCUNA
[EMNLP 2023] ALCUNA: Large Language Models Meet New Knowledge
☆30 • Oct 30, 2023 • Updated 2 years ago
jordipons / AudioSetOntologyTree
Tree visualization of the AudioSet Ontology - https://github.com/audioset/ontology
☆18 • Aug 8, 2024 • Updated last year
seung0h / web-3dgs
real-time web visualizer for 3D gaussian splatting
☆10 • Jan 31, 2025 • Updated last year
zhengxuJosh / 360SFUDA
Code for Panoramic Semantic Segmentation
☆16 • Apr 26, 2024 • Updated 2 years ago
mulander / ada.tmbundle
TextMate support for Ada
☆17 • Jul 5, 2012 • Updated 14 years ago
lxa9867 / QSD
[CVPR 2024] "Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition"
☆12 • Feb 27, 2024 • Updated 2 years ago
Reagan1311 / Mask2IV
Mask2IV: Interaction-Centric Video Generation via Mask Trajectories (AAAI 2026)
☆17 • Jun 8, 2026 • Updated last month
zhengxuJosh / AnySeg
Code & Weights for “Learning Robust Anymodal Segmentor with Unimodal and Cross-modal Distillation”
☆15 • Dec 6, 2024 • Updated last year
stoneMo / OneAVM
Official Codebase of "A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition" (ICML 2023)
☆12 • Jun 1, 2023 • Updated 3 years ago
NEUIR / P3Ranker
[SIGIR '22] Code for our SIGIR 2022 accepted paper: P3 Ranker: Mitigating the Gaps between Pre-training and Ranking Fine-tuning with Pr…
☆18 • Sep 24, 2023 • Updated 2 years ago
Bizilizi / VGGSounder
VGGSounder, a multi-label audio-visual classification dataset with modality annotations.
☆17 • Jun 30, 2026 • Updated 3 weeks ago
Lliar-liar / Daily-Omni
This is the official repository of Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities
☆42 • Apr 28, 2026 • Updated 2 months ago