hchautran / PiToMe
Speed up Transformers With Spectrum-Preserving Token Merging
☆33Updated last month
Alternatives and similar repositories for PiToMe:
Users who are interested in PiToMe are comparing it to the libraries listed below
- Official PyTorch Implementation of "The Hidden Attention of Mamba Models"☆216Updated 10 months ago
- Model Merging with SVD to Tie the KnOTS [ICLR 2025]☆46Updated 2 months ago
- LibMoE: A Library for Comprehensive Benchmarking Mixture of Experts in Large Language Models☆35Updated 2 months ago
- Code for NOLA, an implementation of "NOLA: Compressing LoRA using Linear Combination of Random Basis"☆53Updated 7 months ago
- ☆40Updated 2 months ago
- [ECCV'24 Workshops Oral] DALDA: Data Augmentation Leveraging Diffusion Model and LLM with Adaptive Guidance Scaling☆29Updated 4 months ago
- Code accompanying the paper "Massive Activations in Large Language Models"☆150Updated last year
- [EMNLP 2023 Main] Sparse Low-rank Adaptation of Pre-trained Language Models☆72Updated last year
- Implementation of the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"☆86Updated 3 weeks ago
- Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis.☆124Updated 2 months ago
- [ACL 2024] Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models☆86Updated 10 months ago
- This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation"☆37Updated 5 months ago
- Initialization using Update Approximation is a Silver Bullet for Extremely Efficient Low-Rank Fine-Tuning☆44Updated last month
- Code and benchmark for the paper: "A Practitioner's Guide to Continual Multimodal Pretraining" [NeurIPS'24]☆54Updated 3 months ago
- [NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models☆209Updated 3 weeks ago
- Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models☆75Updated 6 months ago
- [Under Review] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with enla…☆56Updated 5 months ago
- Code for the paper "HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts via HyperNetwork"☆31Updated last year
- AdaMerging: Adaptive Model Merging for Multi-Task Learning. ICLR, 2024.☆72Updated 5 months ago
- ☆20Updated 10 months ago
- ☆14Updated last month
- LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture☆200Updated 2 months ago
- ☆142Updated 6 months ago
- Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode…☆102Updated 6 months ago
- Code & Dataset for Paper: "Distill Visual Chart Reasoning Ability from LLMs to MLLMs"☆51Updated 5 months ago
- Conference schedule, top papers, and analysis of the data for NeurIPS 2023!☆119Updated last year
- Official implementation of Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More☆17Updated last month
- [CVPR2024] ModaVerse: Efficiently Transforming Modalities with LLMs☆29Updated 8 months ago
- Awesome list of papers that extend Mamba to various applications.☆132Updated 3 months ago
- Continual Forgetting for Pre-trained Vision Models (CVPR 2024)☆62Updated 2 months ago