hchautran / PiToMe
Speed up Transformers With Spectrum-Preserving Token Merging
☆43 · Updated 6 months ago
Alternatives and similar repositories for PiToMe
Users who are interested in PiToMe are comparing it to the libraries listed below.
- Official PyTorch implementation of "Vision Transformers Don't Need Trained Registers" ☆93 · Updated 2 months ago
- ☆103 · Updated 4 months ago
- More dimensions = More fun ☆25 · Updated last year
- The official GitHub repo for "Diffusion Language Models are Super Data Learners". ☆107 · Updated 3 weeks ago
- Code for NOLA, an implementation of "NOLA: Compressing LoRA using Linear Combination of Random Basis" ☆56 · Updated last year
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025 ☆30 · Updated 4 months ago
- Official Code for Paper: Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation ☆121 · Updated 2 months ago
- An open source implementation of CLIP (With TULIP Support) ☆162 · Updated 3 months ago
- Awesome list of papers that extend Mamba to various applications. ☆136 · Updated 2 months ago
- Official implementation of RMoE (Layerwise Recurrent Router for Mixture-of-Experts) ☆22 · Updated last year
- Official PyTorch Implementation of "The Hidden Attention of Mamba Models" ☆226 · Updated last year
- Code for "Scaling Language-Free Visual Representation Learning" paper (Web-SSL). ☆179 · Updated 4 months ago
- ☆49 · Updated 7 months ago
- [ECCV'24 Workshops Oral] DALDA: Data Augmentation Leveraging Diffusion Model and LLM with Adaptive Guidance Scaling ☆31 · Updated 9 months ago
- Model Merging with SVD to Tie the KnOTS [ICLR 2025] ☆63 · Updated 5 months ago
- PyTorch code for "ADEM-VL: Adaptive and Embedded Fusion for Efficient Vision-Language Tuning" ☆20 · Updated 10 months ago
- Official PyTorch implementation and models for paper "Diffusion Beats Autoregressive in Data-Constrained Settings". We find diffusion mod… ☆86 · Updated last week
- FlexTok: Resampling Images into 1D Token Sequences of Flexible Length ☆248 · Updated 3 months ago
- Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens (arXiv 2025) ☆151 · Updated last month
- This is the repository for "SELECT: A Large-Scale Benchmark of Data Curation Strategies for Image Recognition" ☆16 · Updated 10 months ago
- Geometric-Mean Policy Optimization ☆68 · Updated last month
- (CVPR 2025) Official implementation of DELT: A Simple Diversity-driven EarlyLate Training for Dataset Distillation which outperforms SOTA… ☆24 · Updated last week
- Code for ICML 2025 Paper "Highly Compressed Tokenizer Can Generate Without Training" ☆166 · Updated 2 months ago
- This repo contains evaluation code for the paper "BLINK: Multimodal Large Language Models Can See but Not Perceive". https://arxiv.or… ☆137 · Updated last year
- [CVPR] MergeVQ: A Unified Framework for Visual Generation and Representation with Token Merging and Quantization ☆42 · Updated last month
- ConceptAttention: A method for interpreting multi-modal diffusion transformers. ☆322 · Updated 4 months ago
- [ICLR 2025] Large (Vision) Language Models are Unsupervised In-Context Learners ☆19 · Updated 2 months ago
- 🔥 Official implementation of "Generate, but Verify: Reducing Visual Hallucination in Vision-Language Models with Retrospective Resamplin… ☆42 · Updated last month
- Open source implementation of "Vision Transformers Need Registers" ☆188 · Updated 3 weeks ago
- [ICCV 2025] Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis. ☆149 · Updated last month