alvin-zyl / CoLA
Implementation of CoLA: Compute-Efficient Pre-Training of LLMs via Low-Rank Activation
☆17 Updated 2 months ago
Alternatives and similar repositories for CoLA:
Users who are interested in CoLA are comparing it to the repositories listed below.
- ☆50 Updated last year
- Bayesian low-rank adaptation for large language models ☆23 Updated 11 months ago
- Code for ICLR 2025 Paper "What is Wrong with Perplexity for Long-context Language Modeling?" ☆51 Updated 3 weeks ago
- Source code of "Task arithmetic in the tangent space: Improved editing of pre-trained models". ☆101 Updated last year
- A curated list of awesome resources dedicated to Scaling Laws for LLMs ☆71 Updated 2 years ago
- Official repository of "Localizing Task Information for Improved Model Merging and Compression" [ICML 2024] ☆42 Updated 5 months ago
- [NeurIPS 2023] Github repository for "Composing Parameter-Efficient Modules with Arithmetic Operations" ☆60 Updated last year
- Code for paper: Aligning Large Language Models with Representation Editing: A Control Perspective ☆28 Updated 2 months ago
- Code associated with Tuning Language Models by Proxy (Liu et al., 2024) ☆107 Updated last year
- ☆93 Updated last year
- ☆39 Updated last year
- ☆13 Updated 11 months ago
- ☆16 Updated 10 months ago
- Representation Surgery for Multi-Task Model Merging. ICML, 2024. ☆44 Updated 6 months ago
- Preprint: Asymmetry in Low-Rank Adapters of Foundation Models ☆35 Updated last year
- Model merging is a highly efficient approach for long-to-short reasoning. ☆40 Updated 3 weeks ago
- TokenSkip: Controllable Chain-of-Thought Compression in LLMs ☆120 Updated last month
- Codebase for ICML submission "DOGE: Domain Reweighting with Generalization Estimation" ☆17 Updated last year
- Test-time-training on nearest neighbors for large language models ☆39 Updated last year
- [ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So… ☆16 Updated 10 months ago
- The code of “Improving Weak-to-Strong Generalization with Scalable Oversight and Ensemble Learning” ☆16 Updated last year
- Official code for SEAL: Steerable Reasoning Calibration of Large Language Models for Free ☆14 Updated last week
- official code for paper Probing the Decision Boundaries of In-context Learning in Large Language Models. https://arxiv.org/abs/2406.11233… ☆17 Updated 7 months ago
- ☆23 Updated last month
- LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation ☆23 Updated 2 weeks ago
- Source code of ACL 2023 Main Conference Paper "PAD-Net: An Efficient Framework for Dynamic Networks". ☆9 Updated last year
- ☆66 Updated 3 years ago
- Learning adapter weights from task descriptions ☆16 Updated last year
- A Kernel-Based View of Language Model Fine-Tuning https://arxiv.org/abs/2210.05643 ☆75 Updated last year
- Official repository for ICLR 2024 Spotlight paper "Large Language Models Are Not Robust Multiple Choice Selectors" ☆38 Updated 10 months ago