HanseulJo / position-coupling
Position Coupling: Improving Length Generalization of Arithmetic Transformers Using Task Structure (NeurIPS 2024) + Arithmetic Transformers Can Length-Generalize in Both Operand Length and Count (ICLR 2025)
☆11Updated 2 months ago
Alternatives and similar repositories for position-coupling
Users who are interested in position-coupling are comparing it to the repositories listed below.
- ☆30Updated 11 months ago
- Code for NeurIPS'23 paper "A Bayesian Approach To Analysing Training Data Attribution In Deep Learning"☆17Updated last year
- A modern look at the relationship between sharpness and generalization [ICML 2023]☆43Updated last year
- ☆20Updated 11 months ago
- [ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So…☆16Updated 2 months ago
- Towards Understanding Sharpness-Aware Minimization [ICML 2022]☆35Updated 3 years ago
- Official PyTorch implementation of "Query-Efficient Black-Box Red Teaming via Bayesian Optimization" (ACL'23)☆15Updated last year
- ☆69Updated 3 years ago
- ☆40Updated last year
- ☆13Updated 6 months ago
- Source code of "Task arithmetic in the tangent space: Improved editing of pre-trained models".☆102Updated 2 years ago
- A Kernel-Based View of Language Model Fine-Tuning https://arxiv.org/abs/2210.05643☆75Updated last year
- ☆35Updated 6 months ago
- Official repo of Progressive Data Expansion: data, code and evaluation☆29Updated last year
- Code for "Training Neural Networks with Fixed Sparse Masks" (NeurIPS 2021).☆58Updated 3 years ago
- Revisiting Efficient Training Algorithms For Transformer-based Language Models (NeurIPS 2023)☆80Updated last year
- ☆13Updated 4 months ago
- ☆60Updated 3 years ago
- Align your LM to express calibrated verbal statements of confidence in its long-form generations.☆26Updated last year
- ☆28Updated 4 months ago
- This is the repository for "Model Merging by Uncertainty-Based Gradient Matching", ICLR 2024.☆27Updated last year
- This is an official repository for "LAVA: Data Valuation without Pre-Specified Learning Algorithms" (ICLR2023).☆48Updated last year
- ☆28Updated last year
- Implementation of paper 'Reversing the Forget-Retain Objectives: An Efficient LLM Unlearning Framework from Logit Difference' [NeurIPS'24…☆20Updated last year
- Source code of "What can linearized neural networks actually say about generalization?"☆20Updated 3 years ago
- ☆34Updated last year
- ☆68Updated 6 months ago
- Bayesian Low-Rank Adaptation for Large Language Models☆34Updated last year
- ☆18Updated 2 years ago
- ☆23Updated 9 months ago