hchautran / PiToMe
Speed up Transformers With Spectrum-Preserving Token Merging
☆43 Updated 6 months ago
Alternatives and similar repositories for PiToMe
Users who are interested in PiToMe are comparing it to the libraries listed below.
- Code for "Scaling Language-Free Visual Representation Learning" paper (Web-SSL). ☆173 Updated 3 months ago
- Official PyTorch implementation of "Vision Transformers Don't Need Trained Registers" ☆83 Updated last month
- An open source implementation of CLIP (With TULIP Support) ☆162 Updated 2 months ago
- ConceptAttention: A method for interpreting multi-modal diffusion transformers. ☆317 Updated 3 months ago
- More dimensions = More fun ☆23 Updated last year
- ☆39 Updated last year
- ☆101 Updated 4 months ago
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025 ☆30 Updated 3 months ago
- Code for NOLA, an implementation of "NOLA: Compressing LoRA using Linear Combination of Random Basis" ☆55 Updated 11 months ago
- Official Code for Paper: Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation ☆117 Updated last month
- FlexTok: Resampling Images into 1D Token Sequences of Flexible Length ☆238 Updated 2 months ago
- Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens (arXiv 2025) ☆120 Updated last week
- ☆194 Updated this week
- Data distillation benchmark ☆67 Updated last month
- [CVPR] MergeVQ: A Unified Framework for Visual Generation and Representation with Token Merging and Quantization ☆42 Updated 3 weeks ago
- Code for ICML 2025 Paper "Highly Compressed Tokenizer Can Generate Without Training" ☆156 Updated 2 months ago
- Official PyTorch Implementation of "The Hidden Attention of Mamba Models" ☆226 Updated last year
- This is the repository for "SELECT: A Large-Scale Benchmark of Data Curation Strategies for Image Recognition" ☆16 Updated 10 months ago
- [ICCV 2025] Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis. ☆146 Updated last month
- [ECCV 2024] Official Release of SILC: Improving vision language pretraining with self-distillation ☆45 Updated 10 months ago
- The official implementation of Recurrent Diffusion for Large-Scale Parameter Generation. ☆60 Updated 5 months ago
- [NeurIPS 2024] Official Repository of Multi-Object Hallucination in Vision-Language Models ☆29 Updated 8 months ago
- ☆43 Updated 9 months ago
- Model Merging with SVD to Tie the KnOTS [ICLR 2025] ☆62 Updated 4 months ago
- [ICLR 2025] Large (Vision) Language Models are Unsupervised In-Context Learners ☆19 Updated 2 months ago
- [ICLR 2025] Official PyTorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia… ☆25 Updated 2 weeks ago
- Dimple, the first Discrete Diffusion Multimodal Large Language Model ☆89 Updated last month
- [ICLR 2025] Source code for paper "A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegr… ☆76 Updated 8 months ago
- PyTorch code for "ADEM-VL: Adaptive and Embedded Fusion for Efficient Vision-Language Tuning" ☆20 Updated 9 months ago
- ☆89 Updated 2 months ago