LeapLabTHU / Attention-Mediators
[ECCV 2024] Efficient Diffusion Transformer with Step-wise Dynamic Attention Mediators
☆33 Updated 2 months ago
Related projects
Alternatives and complementary repositories for Attention-Mediators
- Official implementation of Dynamic Perceiver ☆41 Updated last year
- [ECCV 2024] AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation ☆31 Updated 2 months ago
- [IEEE TIP] Fine-grained Recognition with Learnable Semantic Data Augmentation ☆27 Updated 11 months ago
- [NeurIPS 2022] Latency-aware Spatial-wise Dynamic Networks ☆24 Updated last year
- [ICML 2024] SimPro: A Simple Probabilistic Framework Towards Realistic Long-Tailed Semi-Supervised Learning ☆25 Updated last month
- [IEEE TPAMI] Latency-aware Unified Dynamic Networks for Efficient Image Recognition ☆42 Updated 6 months ago
- Code release for Deep Incubation (https://arxiv.org/abs/2212.04129) ☆91 Updated last year
- A PyTorch implementation of the paper "Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis" ☆34 Updated 5 months ago
- [arXiv] Cross-Modal Adapter for Text-Video Retrieval ☆55 Updated 2 years ago
- Jittor implementation of Vision Transformer with Deformable Attention ☆30 Updated 2 years ago
- [ECCV 2022] Learning to Weight Samples for Dynamic Early-exiting Networks ☆32 Updated last year
- A repo tracking the latest autoregressive visual generation papers. ☆50 Updated this week
- Official code of the paper "Understanding, Predicting and Better Resolving Q-Value Divergence in Offline-RL" ☆21 Updated last year
- [NeurIPS 2023] Rank-DETR for High Quality Object Detection ☆87 Updated last year
- [TPAMI 2024] Probabilistic Contrastive Learning for Long-Tailed Visual Recognition ☆64 Updated last month
- [ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions ☆106 Updated 3 weeks ago
- [CVPR 2024] GSVA: Generalized Segmentation via Multimodal Large Language Models ☆94 Updated 2 months ago
- [CVPR 2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory" ☆64 Updated last month
- The official code of the paper "PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction". ☆45 Updated 3 weeks ago
- LLaVA-NeXT-Image-Llama3-Lora, modified from https://github.com/arielnlee/LLaVA-1.6-ft ☆39 Updated 4 months ago
- Implementation of "VL-Mamba: Exploring State Space Models for Multimodal Learning" ☆78 Updated 8 months ago
- Official implementation of paper "SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference" proposed by Pekin… ☆56 Updated last month
- Adapting LLaMA Decoder to Vision Transformer ☆27 Updated 6 months ago
- The official implementation of "MLLMs-Augmented Visual-Language Representation Learning" ☆31 Updated 8 months ago