AVoCaDO-Captioner / AVoCaDO
https://avocado-captioner.github.io/
☆29 Oct 16, 2025 Updated 3 months ago
Alternatives and similar repositories for AVoCaDO
Users that are interested in AVoCaDO are comparing it to the libraries listed below
- WorldSense: Evaluating Real-world Omnimodal Understanding for Multimodal LLMs ☆38 Jan 26, 2026 Updated 2 weeks ago
- ☆17 Oct 10, 2024 Updated last year
- The Source Code for OmniVideoBench ☆57 Updated this week
- Temporary fork of Foundry with Tempo support ☆59 Updated this week
- A collection of helpful contracts and libraries for use with Tempo for Foundry ☆57 Feb 2, 2026 Updated last week
- Go SDK for the Tempo blockchain ☆59 Updated this week
- video-SALMONN 2 is a powerful audio-visual large language model (LLM) that generates high-quality audio-visual video captions, which is d… ☆150 Jan 28, 2026 Updated 2 weeks ago
- Official Code for "ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning" ☆80 Dec 5, 2025 Updated 2 months ago
- ☆13 Jul 3, 2024 Updated last year
- Experimental implementation of regions in WebVTT building on Anne's WebVTT parser. ☆14 Oct 19, 2014 Updated 11 years ago
- TOD-Flow: Modeling the Structure of Task-Oriented Dialogues ☆13 Feb 7, 2024 Updated 2 years ago
- PainterVRAM lets you reserve a slice of GPU memory before ComfyUI starts processing, preventing out-of-memory crashes. Switch between man… ☆28 Jan 2, 2026 Updated last month
- ☆20 Nov 21, 2025 Updated 2 months ago
- NCS-like audio visualizer for AviUtl ☆19 May 22, 2025 Updated 8 months ago
- ☆15 Sep 11, 2025 Updated 5 months ago
- Sound Separation, Omni modal ☆28 Sep 15, 2025 Updated 4 months ago
- Public code release for the paper "Reawakening knowledge: Anticipatory recovery from catastrophic interference via structured training" ☆11 Oct 27, 2025 Updated 3 months ago
- A Python tool to help interact with ChatGPT. ☆10 Dec 11, 2022 Updated 3 years ago
- Reinforcing Text-Rich Video Reasoning with Visual Rumination ☆27 Nov 24, 2025 Updated 2 months ago
- Collection of useful scripts for RunPod pods ☆15 Jan 26, 2024 Updated 2 years ago
- A Video Quality Analysis Toolkit ☆13 May 16, 2025 Updated 8 months ago
- FNIN: A Fourier Neural Operator-based Numerical Integration Network for Surface-form-gradients ☆13 Jan 22, 2025 Updated last year
- FamilyTool benchmark ☆12 Sep 10, 2025 Updated 5 months ago
- Information Extraction related tools and models ☆10 Mar 16, 2023 Updated 2 years ago
- ☆13 May 15, 2025 Updated 8 months ago
- Scripts for KGIRNet model for ESWC ☆10 Jul 6, 2023 Updated 2 years ago
- ☆12 Jan 2, 2024 Updated 2 years ago
- Mixture of Global and Local Experts with Diffusion Transformer for Controllable Face Generation ☆28 Dec 10, 2025 Updated 2 months ago
- This is the official repository of Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities ☆36 Jul 4, 2025 Updated 7 months ago
- Official codebase for the NeurIPS 2023 paper: Towards Last-layer Retraining for Group Robustness with Fewer Annotations. https://arxiv.or… ☆11 May 15, 2024 Updated last year
- ☆16 Mar 22, 2025 Updated 10 months ago
- This is the implementation of the paper "Learning to Ask Conversational Questions by Optimizing Levenshtein Distance". ☆10 Jul 5, 2021 Updated 4 years ago
- ☆12 Jan 25, 2024 Updated 2 years ago
- LLaVA-Next for STVG ☆18 Dec 5, 2025 Updated 2 months ago
- ☆17 Aug 21, 2025 Updated 5 months ago
- Collection of papers about video-audio understanding ☆22 Dec 26, 2025 Updated last month
- [NAACL 2025🔥] MEDA: Dynamic KV Cache Allocation for Efficient Multimodal Long-Context Inference ☆17 Jun 19, 2025 Updated 7 months ago
- ☆29 Feb 4, 2026 Updated last week
- Support for Apache Cordova Apps in Codename One ☆13 May 31, 2023 Updated 2 years ago