YefanZhou / TempBalance
[NeurIPS 2023 Spotlight] Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training
☆34 Updated 3 weeks ago
Alternatives and similar repositories for TempBalance:
Users who are interested in TempBalance are comparing it to the libraries listed below.
- ☆52 Updated 3 weeks ago
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight) ☆72 Updated 6 months ago
- Official repository of "Localizing Task Information for Improved Model Merging and Compression" [ICML 2024] ☆44 Updated 6 months ago
- Source code of "Task arithmetic in the tangent space: Improved editing of pre-trained models". ☆101 Updated last year
- Code for "A Sober Look at Progress in Language Model Reasoning" paper ☆41 Updated 2 weeks ago
- ☆49 Updated last year
- Test-time-training on nearest neighbors for large language models ☆40 Updated last year
- An effective weight-editing method for mitigating overly short reasoning in LLMs, and a mechanistic study uncovering how reasoning length… ☆10 Updated 2 weeks ago
- Preprint: Asymmetry in Low-Rank Adapters of Foundation Models ☆36 Updated last year
- ☆29 Updated last month
- Code for "Reasoning to Learn from Latent Thoughts" ☆93 Updated last month
- LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning ☆31 Updated last year
- Code for paper: Aligning Large Language Models with Representation Editing: A Control Perspective ☆29 Updated 3 months ago
- [NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging ☆58 Updated 2 months ago
- What Makes a Reward Model a Good Teacher? An Optimization Perspective ☆26 Updated 3 weeks ago
- PaCE: Parsimonious Concept Engineering for Large Language Models (NeurIPS 2024) ☆35 Updated 6 months ago
- Code accompanying the paper "Massive Activations in Large Language Models" ☆159 Updated last year
- Representation Surgery for Multi-Task Model Merging. ICML, 2024. ☆44 Updated 6 months ago
- [NeurIPS 2024 Spotlight] Code and data for the paper "Finding Transformer Circuits with Edge Pruning". ☆54 Updated last month
- Official code for the paper "Probing the Decision Boundaries of In-context Learning in Large Language Models". https://arxiv.org/abs/2406.11233… ☆18 Updated 8 months ago
- DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models (ICLR 2024) ☆64 Updated 7 months ago
- [ICLR 2025] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates (Oral) ☆77 Updated 6 months ago
- Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic ☆24 Updated 3 months ago
- AdaMerging: Adaptive Model Merging for Multi-Task Learning. ICLR, 2024. ☆78 Updated 6 months ago
- ☆14 Updated last year
- Code for paper "Parameter Efficient Multi-task Model Fusion with Partial Linearization" ☆21 Updated 7 months ago
- ☆33 Updated 4 months ago
- ☆18 Updated last month
- A curated list of Model Merging methods. ☆92 Updated 7 months ago
- ☆76 Updated last week