Awesome Audio-Visual Intelligence, Survey of Audio-Visual Intelligence
☆80May 8, 2026Updated last month
Alternatives and similar repositories for Awesome-AVI
Users that are interested in Awesome-AVI are comparing it to the libraries listed below.
- The Source Code for OmniVideoBench @ICLR 2026☆73Feb 12, 2026Updated 4 months ago
- [ICLR 2026] MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement Learning☆38Jan 14, 2026Updated 5 months ago
- ☆109Jun 5, 2026Updated 3 weeks ago
- ☆17Feb 26, 2024Updated 2 years ago
- ☆20Apr 23, 2024Updated 2 years ago
- Official Code for Neural Systematic Binder☆34Mar 27, 2023Updated 3 years ago
- A non-official re-implementation of article "[ECCV 18] Image Inpainting for Irregular Holes Using Partial Convolutions"☆12Mar 1, 2025Updated last year
- ☆29Mar 10, 2026Updated 3 months ago
- SimX-OR: Extending Any Simulation Benchmark to Evaluate the Observational Robustness of VLA Models☆33Nov 4, 2025Updated 7 months ago
- ☆86May 2, 2026Updated 2 months ago
- ☆65Dec 10, 2025Updated 6 months ago
- ☆15Mar 6, 2024Updated 2 years ago
- [CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts☆18Apr 2, 2025Updated last year
- [𝐍𝐚𝐭𝐮𝐫𝐞 𝐂𝐨𝐦𝐩𝐮𝐭𝐚𝐭𝐢𝐨𝐧𝐚𝐥 𝐒𝐜𝐢𝐞𝐧𝐜𝐞] ⚡️ PSE/PSRN: Fast and efficient symbolic expression discovery through paralleliz…☆22May 17, 2026Updated last month
- The official implementation of "Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding"☆22Jun 26, 2025Updated last year
- [CVPR'2025] MonoInstance: Enhancing Monocular Priors via Multi-view Instance Alignment for Neural Rendering and Reconstruction☆18Dec 1, 2025Updated 7 months ago
- [MICCAI 2025] GL-LCM: Global-Local Latent Consistency Models for Fast High-Resolution Bone Suppression in Chest X-Ray Images☆17Mar 12, 2026Updated 3 months ago
- ☆20Jun 10, 2025Updated last year
- ☆28Mar 17, 2026Updated 3 months ago
- [ICLR 2026 🔥 ] Official implementation of "UniLiP: Adapting CLIP for Unified Multimodal Understanding, Generation and Editing"☆151Jan 26, 2026Updated 5 months ago
- daVinci-Agency: Unlocking Long-Horizon Agency Data-Efficiently☆38Feb 4, 2026Updated 4 months ago
- [𝐍𝐚𝐭𝐮𝐫𝐞 𝐂𝐨𝐦𝐦𝐮𝐧𝐢𝐜𝐚𝐭𝐢𝐨𝐧𝐬] 🤖💡 LiveIdeaBench: Evaluating LLMs' Scientific Creativity and Idea Generation with Minimal C…☆29Apr 21, 2026Updated 2 months ago
- ☆58May 7, 2026Updated last month
- A collection of awesome think with videos papers.☆99Dec 1, 2025Updated 7 months ago
- ☆70Apr 13, 2026Updated 2 months ago
- ☆47Dec 16, 2025Updated 6 months ago
- [ACL2026 oral] Uni-MMMU : A Massive Multi-discipline Multimodal Unified Benchmark☆25Apr 13, 2026Updated 2 months ago
- [ACL 2023] VSTAR is a multimodal dialogue dataset with scene and topic transition information☆16Oct 27, 2024Updated last year
- Code for the paper BiST: Bi-directional Spatio-Temporal Reasoning for Video-Grounded Dialogues (EMNLP20)☆11Jun 16, 2025Updated last year
- Official Repository of LatentSeek☆83Jun 6, 2025Updated last year
- OmniGAIA: Towards Native Omni-Modal AI Agents☆134Apr 2, 2026Updated 3 months ago
- BPfold: Deep generalizable prediction of RNA secondary structure via base pair motif energy.☆35May 27, 2026Updated last month
- Pathology Foundation Models Meet Semantic Segmentation☆43May 10, 2026Updated last month
- [ICML 2026] Reasoning in Parallelism via Self-Distilled RL☆114Updated this week
- The code for paper: Hierarchical Document Refinement for Long-context Retrieval-augmented Generation [ACL2025 Oral]☆46Aug 25, 2025Updated 10 months ago
- ☆15Nov 10, 2025Updated 7 months ago
- [CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts☆23Apr 10, 2026Updated 2 months ago
- (NIPS 2025) OpenOmni: Official implementation of Advancing Open-Source Omnimodal Large Language Models with Progressive Multimodal Align…☆142May 9, 2026Updated last month
- Cross-Self KV Cache Pruning for Efficient Vision-Language Inference☆10Dec 15, 2024Updated last year