alwynpan / uom-comp90024
Demo Code for Subject COMP90024
☆12 Updated 6 months ago
Alternatives and similar repositories for uom-comp90024
Users who are interested in uom-comp90024 are comparing it to the libraries listed below.
- Project Description ☆23 Updated last year
- [ICML 2024 Spotlight] "Sample-specific Masks for Visual Reprogramming-based Prompting" ☆12 Updated 9 months ago
- Stanford Cars dataset by classes folder ☆16 Updated 11 months ago
- Multi-Stage Vision Token Dropping: Towards Efficient Multimodal Large Language Model ☆35 Updated 9 months ago
- official repo for paper "[CLS] Token Tells Everything Needed for Training-free Efficient MLLMs" ☆23 Updated 5 months ago
- ☆58 Updated 5 months ago
- ☆11 Updated 9 months ago
- [ICML'25] Official implementation of paper "SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference". ☆171 Updated 4 months ago
- [ECCV 2024] API: Attention Prompting on Image for Large Vision-Language Models ☆103 Updated last year
- [ICLR 2025] See What You Are Told: Visual Attention Sink in Large Multimodal Models ☆54 Updated 8 months ago
- [NeurIPS 2024] Official Repository of Multi-Object Hallucination in Vision-Language Models ☆31 Updated 11 months ago
- [ICLR 2025] "Noisy Test-Time Adaptation in Vision-Language Models" ☆12 Updated 7 months ago
- Code release for VTW (AAAI 2025 Oral) ☆50 Updated 3 months ago
- ☆12 Updated 10 months ago
- [CVPR2025] The implementation of the paper "OODD: Test-time Out-of-Distribution Detection with Dynamic Dictionary". ☆18 Updated 5 months ago
- Everything to the Synthetic: Diffusion-driven Test-time Adaptation via Synthetic-Domain Alignment, arXiv 2024 / CVPR 2025 ☆34 Updated 7 months ago
- [CVPR 2025] Devils in Middle Layers of Large Vision-Language Models: Interpreting, Detecting and Mitigating Object Hallucinations via Att… ☆43 Updated last week
- SmartCLIP: A training method to improve CLIP with both short and long texts ☆24 Updated 4 months ago
- 🚀 Global Compression Commander: Plug-and-Play Inference Acceleration for High-Resolution Large Vision-Language Models ☆33 Updated 2 months ago
- [CVPR' 25] Interleaved-Modal Chain-of-Thought ☆89 Updated last week
- ☆103 Updated 6 months ago
- Code for our ICML'24 on multimodal dataset distillation ☆40 Updated last year
- [CVPR 2025] Official implementation of paper "MoVE-KD: Knowledge Distillation for VLMs with Mixture of Visual Encoders". ☆41 Updated 4 months ago
- MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer ☆47 Updated last year
- Code for paper: Reinforced Vision Perception with Tools ☆53 Updated 2 weeks ago
- Pytorch implementation for "Erasing the Bias: Fine-Tuning Foundation Models for Semi-Supervised Learning" (ICML 2024) ☆23 Updated 5 months ago
- [CVPR 2025] An Implementation of the paper "Pre-Instruction Data Selection for Visual Instruction Tuning" ☆15 Updated 4 months ago
- Official code for paper: [CLS] Attention is All You Need for Training-Free Visual Token Pruning: Make VLM Inference Faster. ☆93 Updated 3 months ago
- Collected the world's best computer vision labs and lecture materials. ☆14 Updated 7 months ago
- [ICML 2024] "Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models" ☆56 Updated last year