[ICML 2024] CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers
☆34Dec 30, 2024Updated last year
Alternatives and similar repositories for CrossGET
Users that are interested in CrossGET are comparing it to the libraries listed below
- [ICML 2023] UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers☆105Dec 30, 2024Updated last year
- Belief Revision based Caption Re-ranker with Visual Semantic Information. COLING 2022☆11Apr 13, 2025Updated 10 months ago
- [ICML 2025] LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models☆17Nov 4, 2025Updated 3 months ago
- ☆11Sep 7, 2020Updated 5 years ago
- PyTorch code for the Findings of EMNLP 2021 paper "Does Vision-and-Language Pretraining Improve Lexical Grounding?"☆11Sep 26, 2021Updated 4 years ago
- 🏠🔍 Auto check for new apartments in Hamburg from various real estate providers☆16Jun 2, 2024Updated last year
- CaptionQA: Is Your Caption as Useful as the Image Itself?☆32Jan 19, 2026Updated last month
- [NIPS2023] Implementation of "Foundation Model is Efficient Multimodal Multitask Model Selector"☆37Mar 7, 2024Updated last year
- The official implementation of the paper DBQ-SSD: Dynamic Ball Query for Efficient 3D Object Detection (ICLR 2023)☆18Sep 17, 2023Updated 2 years ago
- Official implementation for the paper "Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation", publish…☆20Jun 3, 2024Updated last year
- [Arxiv2022] Revitalize Region Feature for Democratizing Video-Language Pre-training☆22Mar 19, 2022Updated 3 years ago
- ☆20May 3, 2025Updated 9 months ago
- [ICCV 2023] With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning.☆19Jun 7, 2024Updated last year
- Using image captions with LLM for zero-shot VQA☆18Mar 14, 2024Updated last year
- Membrane Potential Batch Normalization for Spiking Neural Networks☆20Aug 18, 2023Updated 2 years ago
- ☆25Jul 10, 2023Updated 2 years ago
- 【IEEE T-IV】A systematic survey of multi-modal and multi-task visual understanding foundation models for driving scenarios☆51May 26, 2024Updated last year
- [AAAI 2022] Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding☆91Nov 16, 2022Updated 3 years ago
- Official Implementation of "Read-only Prompt Optimization for Vision-Language Few-shot Learning", ICCV 2023☆55Aug 19, 2023Updated 2 years ago
- Public repository for the ECCV 2024 paper "Train Till You Drop: Towards Stable and Robust Source-free Unsupervised 3D Domain Adaptation".☆26Aug 5, 2025Updated 6 months ago
- [AAAI 2025] More Text, Less Point: Towards 3D Data-Efficient Point-Language Understanding☆26May 27, 2025Updated 9 months ago
- ☆25Oct 5, 2023Updated 2 years ago
- [ICCV 23] An approach to enhance the efficiency of Vision Transformer (ViT) by concurrently employing token pruning and token merging tech…☆104Jul 14, 2023Updated 2 years ago
- ☆27Oct 7, 2021Updated 4 years ago
- Vecna is a Python chatbot which recommends songs and movies depending upon your feelings☆12Jun 28, 2022Updated 3 years ago
- Official implementation of the paper "ALADIN: Distilling Fine-grained Alignment Scores for Efficient Image-Text Matching and Retrieval"☆27Dec 6, 2023Updated 2 years ago
- Implementation of our paper, 'Unifying Two-Stream Encoders with Transformers for Cross-Modal Retrieval.'☆28Dec 3, 2023Updated 2 years ago
- [BMVC 2023] Semantic Adversarial Attacks via Diffusion Models☆24Nov 30, 2023Updated 2 years ago
- Research code for CVPR 2022 paper: "EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching"☆26Oct 20, 2022Updated 3 years ago
- Official implementation of "Inherent Redundancy in Spiking Neural Networks" (ICCV2023)☆29Jan 7, 2024Updated 2 years ago
- [TIP2023] The code of “Plug-and-Play Regulators for Image-Text Matching”☆34Apr 11, 2024Updated last year
- Fine-Grained Evaluation of Large Vision-Language Models in Autonomous Driving (ICCV 2025)☆36May 29, 2025Updated 9 months ago
- Repository related to Cranfield's AAI MSCs GDP☆11Apr 8, 2023Updated 2 years ago
- [ICLR 2025] Official PyTorch Implementation for CPE: Concept Pinpoint Eraser for Text-to-image Diffusion Models via Residual Attention Ga…☆12Apr 7, 2025Updated 10 months ago
- Evaluation benchmark for the task of Semantic Image Translation. Contains code to run FlexIT (CVPR 2022)☆34Mar 25, 2022Updated 3 years ago
- [CVPR 2024] LaMPilot: An Open Benchmark Dataset for Autonomous Driving with Language Model Programs☆36Jan 21, 2026Updated last month
- Official repository for NuScenes-MQA. The paper was accepted by the LLVA-AD Workshop at WACV 2024.☆35Dec 21, 2023Updated 2 years ago
- Let ChatGPT (Large Language Models) Serve As Data Annotator and Zero-shot/few-shot Information Extractor.☆32Mar 18, 2023Updated 2 years ago
- [ECCV 2024] TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes☆129Mar 1, 2025Updated last year