BryceZhuo / PolyCom
The official implementation of ICLR 2025 paper "Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models".
☆13 Updated this week
Alternatives and similar repositories for PolyCom:
Users who are interested in PolyCom are comparing it to the repositories listed below.
- [CVPR] MergeVQ: A Unified Framework for Visual Generation and Representation with Token Merging and Quantization ☆21 Updated 3 weeks ago
- [CVPR2025] Breaking the Low-Rank Dilemma of Linear Attention ☆16 Updated last month
- Official implementation of RMoE (Layerwise Recurrent Router for Mixture-of-Experts) ☆20 Updated 8 months ago
- State Space Models ☆69 Updated 11 months ago
- Official implementation of Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More ☆17 Updated 2 months ago
- ☆48 Updated last year
- ☆41 Updated 5 months ago
- This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation" ☆37 Updated 6 months ago
- HGRN2: Gated Linear RNNs with State Expansion ☆54 Updated 8 months ago
- More dimensions = More fun ☆22 Updated 8 months ago
- [NeurIPS 2023] Make Your Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning ☆31 Updated last year
- Official PyTorch Implementation for Task Vectors are Cross-Modal ☆22 Updated 4 months ago
- ☆36 Updated 9 months ago
- ☆25 Updated 6 months ago
- PyTorch implementation of StableMask (ICML'24) ☆12 Updated 9 months ago
- Official implementation of ECCV24 paper: POA ☆24 Updated 8 months ago
- Official Implementation of DiffCLIP: Differential Attention Meets CLIP ☆26 Updated last month
- Control LLM ☆14 Updated 2 weeks ago
- Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types ☆17 Updated last week
- Adapting LLaMA Decoder to Vision Transformer ☆28 Updated 11 months ago
- Code for the paper "Cottention: Linear Transformers With Cosine Attention" ☆17 Updated 6 months ago
- Code for "C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing" ☆14 Updated 2 weeks ago
- [EMNLP 2024] Official code for "Beyond Embeddings: The Promise of Visual Table in Multi-Modal Models" ☆17 Updated 6 months ago
- ☆17 Updated 3 months ago
- [ACL 2023] PuMer: Pruning and Merging Tokens for Efficient Vision Language Models ☆29 Updated 6 months ago
- ☆18 Updated this week
- Project for SNARE benchmark ☆11 Updated 10 months ago
- Collect papers about Mamba (a selective state space model). ☆14 Updated 8 months ago
- Evaluation and dataset construction code for the CVPR 2025 paper "Vision-Language Models Do Not Understand Negation" ☆20 Updated this week
- ☆57 Updated 2 months ago