hchautran / PiToMe
Speed up Transformers With Spectrum-Preserving Token Merging
☆51 Updated 10 months ago
Alternatives and similar repositories for PiToMe
Users that are interested in PiToMe are comparing it to the libraries listed below
- [NeurIPS '25 Spotlight] Official PyTorch implementation of "Vision Transformers Don't Need Trained Registers" ☆149 Updated 2 months ago
- ☆107 Updated 8 months ago
- An open source implementation of CLIP (With TULIP Support) ☆163 Updated 7 months ago
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025 ☆31 Updated 7 months ago
- Code for "Scaling Language-Free Visual Representation Learning" paper (Web-SSL). ☆190 Updated 7 months ago
- Model Merging with SVD to Tie the KnOTS [ICLR 2025] ☆80 Updated 8 months ago
- Official Code for Paper: Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation ☆130 Updated 5 months ago
- The official github repo for "Diffusion Language Models are Super Data Learners". ☆208 Updated last month
- [ICCV 2025] Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis. ☆166 Updated 2 months ago
- Data distillation benchmark ☆71 Updated 6 months ago
- FlexTok: Resampling Images into 1D Token Sequences of Flexible Length ☆276 Updated 6 months ago
- Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens (arXiv 2025) ☆205 Updated 4 months ago
- Dimple, the first Discrete Diffusion Multimodal Large Language Model ☆112 Updated 5 months ago
- [TMLR 2025 J2C] TextRegion: Text-Aligned Region Tokens from Frozen Image-Text Models ☆49 Updated last month
- [NeurIPS 2024] Official Repository of Multi-Object Hallucination in Vision-Language Models ☆33 Updated last year
- [ICLR 2025] Source code for paper "A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegr… ☆79 Updated last year
- Official Implementation of LaViDa: A Large Diffusion Language Model for Multimodal Understanding ☆178 Updated last week
- Official PyTorch implementation and models for paper "Diffusion Beats Autoregressive in Data-Constrained Settings". We find diffusion mod… ☆113 Updated last month
- Code for ICML 2025 Paper "Highly Compressed Tokenizer Can Generate Without Training" ☆193 Updated 6 months ago
- Geometric-Mean Policy Optimization ☆95 Updated 3 weeks ago
- [NeurIPS'25] dKV-Cache: The Cache for Diffusion Language Models ☆123 Updated 6 months ago
- ConceptAttention: A method for interpreting multi-modal diffusion transformers. ☆354 Updated last month
- [ICCV 25] SpectralAR: Spectral Autoregressive Visual Generation ☆35 Updated 6 months ago
- Code for NOLA, an implementation of "nola: Compressing LoRA using Linear Combination of Random Basis" ☆57 Updated last year
- More dimensions = More fun ☆26 Updated last year
- ☆39 Updated 6 months ago
- Sparse autoencoders for vision ☆52 Updated last week
- [Fully open] [Encoder-free MLLM] Vision as LoRA ☆356 Updated 6 months ago
- ☆40 Updated last year
- The official implementation of Recurrent Diffusion for Large-Scale Parameter Generation. ☆75 Updated 2 months ago