MCG-NJU / AWT
[NeurIPS 2024] AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation
★109 · Updated last year
Alternatives and similar repositories for AWT
Users who are interested in AWT are comparing it to the repositories listed below.
- 🔥 🔥 🔥 [NeurIPS 2024] Official Implementation of Hawk: Learning to Understand Open-World Video Anomalies ★223 · Updated 7 months ago
- [Pattern Recognition 2025] Cross-Modal Adapter for Vision-Language Retrieval ★136 · Updated 3 months ago
- Multi-granularity Correspondence Learning from Long-term Noisy Videos [ICLR 2024, Oral] ★117 · Updated last year
- [CVPR 2024] Official implementation of "Universal Segmentation at Arbitrary Granularity with Language Instruction" ★284 · Updated last year
- CoS: Chain-of-Shot Prompting for Long Video Understanding ★52 · Updated 9 months ago
- [ICLR'24] Democratizing Fine-grained Visual Recognition with Large Language Models ★186 · Updated last year
- [SIGIR'2024 Best Paper Honorable Mention] Official repository for "LDRE: LLM-based Divergent Reasoning and Ensemble for Zero-Shot Compose… ★60 · Updated 8 months ago
- (ICCV 2025) Enhance CLIP and MLLM's fine-grained visual representations with generative models. ★73 · Updated 4 months ago
- [AAAI 2026] ✨ TSPO: Temporal Sampling Policy Optimization for Long-form Video Language Understanding ★94 · Updated this week
- [ICCV 2023] Spectrum-guided Multi-granularity Referring Video Object Segmentation. ★110 · Updated 7 months ago
- LoR-VP: Low-Rank Visual Prompting for Efficient Vision Model Adaptation (ICLR 2025) ★36 · Updated 9 months ago
- [NeurIPS 2024] Matryoshka Query Transformer for Large Vision-Language Models ★118 · Updated last year
- Official Repository of OmniCaptioner ★165 · Updated 6 months ago
- [ACM CSUR 2025] Out-of-Distribution Detection: A Task-Oriented Survey of Recent Advances ★150 · Updated 2 months ago
- High Quality Video Reasoning Segmentation ★103 · Updated 2 months ago
- [MM'24 Oral] Prior Knowledge Integration via LLM Encoding and Pseudo Event Regulation for Video Moment Retrieval ★130 · Updated last year
- [ICML 2025] Official repository for paper "Scaling Video-Language Models to 10K Frames via Hierarchical Differential Distillation" ★186 · Updated last month
- [ACM MM'2024] Official repository for "Semantic Editing Increment Benefits Zero-Shot Composed Image Retrieval" ★41 · Updated 10 months ago
- [AAAI 2026 Oral🔥] Official code for Decomposing the Neurons: Activation Sparsity via Mixture of Experts for Continual Test Time Adaptati… ★70 · Updated last year
- ★71 · Updated 11 months ago
- The official repository of SEED-GRPO: Semantic Entropy Enhanced GRPO for Uncertainty-Aware Policy Optimization ★92 · Updated last month
- (NeurIPS 2024) Official PyTorch implementation of LOVA3 ★90 · Updated 7 months ago
- [ECCV 2022, Oral] Identifying Hard Noise in Long-Tailed Sample Distribution ★73 · Updated 3 years ago
- ★125 · Updated last month
- Evaluation of Text-to-Video Generation Models: A Dynamics Perspective [NeurIPS 2024]. ★276 · Updated 11 months ago
- ★87 · Updated last year
- **Deep Video Discovery (DVD)** is a deep-research style question answering agent designed for understanding extra-long videos. ★303 · Updated 2 weeks ago
- [Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics]: VisuoThink: Empowering LVLM Reasoning with Mul… ★97 · Updated 3 months ago
- [NeurIPS 2025] Efficient Reasoning Vision Language Models ★415 · Updated 2 months ago
- (ECCV 2024) Empowering Multimodal Large Language Model as a Powerful Data Generator ★114 · Updated 7 months ago