YefanZhou / TempBalance

[NeurIPS 2023 Spotlight] Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training

☆35

Alternatives and similar repositories for TempBalance

Users who are interested in TempBalance are comparing it to the repositories listed below.

bethgelab / sober-reasoning
A Sober Look at Language Model Reasoning
☆81 Updated last month
nik-dim / tall_masks
Official repository of "Localizing Task Information for Improved Model Merging and Compression" [ICML 2024]
☆47 Updated 9 months ago
ryoungj / BoLT
Code for "Reasoning to Learn from Latent Thoughts"
☆114 Updated 4 months ago
locuslab / massive-activations
Code accompanying the paper "Massive Activations in Large Language Models"
☆173 Updated last year
sail-sg / Attention-Sink
[ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight)
☆103 Updated 3 weeks ago
gortizji / tangent_task_arithmetic
Source code of "Task arithmetic in the tangent space: Improved editing of pre-trained models".
☆103 Updated 2 years ago
ycjing / Awesome-Model-Merging
A curated list of Model Merging methods.
☆92 Updated 10 months ago
probabilistic-inference-scaling / probabilistic-inference-scaling
☆51 Updated 4 months ago
Jiacheng-Zhu-AIML / AsymmetryLoRA
Preprint: Asymmetry in Low-Rank Adapters of Foundation Models
☆35 Updated last year
haotiansun14 / BBox-Adapter
Lightweight Adapting for Black-Box Large Language Models
☆23 Updated last year
Model-GLUE / Model-GLUE
☆15 Updated 11 months ago
peterljq / Parsimonious-Concept-Engineering
PaCE: Parsimonious Concept Engineering for Large Language Models (NeurIPS 2024)
☆39 Updated 9 months ago
tmlr-group / NoisyRationales
[NeurIPS 2024] "Can Language Models Perform Robust Reasoning in Chain-of-thought Prompting with Noisy Rationales?"
☆35 Updated 2 weeks ago
socialfoundations / tttlm
Test-time-training on nearest neighbors for large language models
☆45 Updated last year
abhishekpanigrahi1996 / Skill-Localization-by-grafting
☆51 Updated last year
osehmathias / lisa
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning
☆33 Updated last year
ericwtodd / function_vectors
Function Vectors in Large Language Models (ICLR 2024)
☆175 Updated 3 months ago
locuslab / acr-memorization
☆35 Updated 7 months ago
VITA-Group / SEAL
Official code for SEAL: Steerable Reasoning Calibration of Large Language Models for Free
☆39 Updated 4 months ago
sail-sg / VeriFree
Reinforcing General Reasoning without Verifiers
☆76 Updated last month
Lingkai-Kong / RE-Control
Code for paper: Aligning Large Language Models with Representation Editing: A Control Perspective
☆32 Updated 6 months ago
Joshua-Ren / Learning_dynamics_LLM
☆155 Updated 2 months ago
activatedgeek / calibration-tuning
☆51 Updated 3 months ago
Dereck0602 / Awesome_Test_Time_LLMs
☆117 Updated 4 months ago
siyan-zhao / ICL_decision_boundary
Official code for the paper "Probing the Decision Boundaries of In-context Learning in Large Language Models". https://arxiv.org/abs/2406.11233…
☆19 Updated last week
tmlr-group / landscape-of-thoughts
[ICLR 2025 Workshop] "Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models"
☆34 Updated last month
harveyhuang18 / EMR_Merging
[NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging
☆62 Updated 5 months ago
EnnengYang / AdaMerging
AdaMerging: Adaptive Model Merging for Multi-Task Learning. ICLR, 2024.
☆88 Updated 9 months ago
Luckfort / CD
[COLING'25] Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers?
☆79 Updated 6 months ago
MadryLab / DsDm
☆50 Updated last year