automl / is_mamba_capable_of_icl
☆13 Updated 6 months ago
Related projects
Alternatives and complementary repositories for is_mamba_capable_of_icl
- ☆25 Updated 4 months ago
- ☆50 Updated 5 months ago
- ☆44 Updated last year
- Code for NeurIPS'23 paper "A Bayesian Approach To Analysing Training Data Attribution In Deep Learning" ☆13 Updated 9 months ago
- Test-time-training on nearest neighbors for large language models ☆25 Updated 6 months ago
- Stick-breaking attention ☆32 Updated last week
- Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning [ICML 2024] ☆14 Updated 6 months ago
- ☆15 Updated 4 months ago
- Online Adaptation of Language Models with a Memory of Amortized Contexts (NeurIPS 2024) ☆53 Updated 3 months ago
- ☆35 Updated 9 months ago
- ☆34 Updated 3 months ago
- ☆75 Updated 9 months ago
- ☆61 Updated 2 years ago
- Lightweight Adapting for Black-Box Large Language Models ☆18 Updated 8 months ago
- ☆44 Updated 10 months ago
- A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity. ☆51 Updated this week
- Source codes for "Preference-grounded Token-level Guidance for Language Model Fine-tuning" (NeurIPS 2023). ☆14 Updated 11 months ago
- `dattri` is a PyTorch library for developing, benchmarking, and deploying efficient data attribution algorithms. ☆30 Updated last week
- A Kernel-Based View of Language Model Fine-Tuning https://arxiv.org/abs/2210.05643 ☆69 Updated last year
- ☆25 Updated 9 months ago
- Is In-Context Learning Sufficient for Instruction Following in LLMs? ☆23 Updated 5 months ago
- Long Context Extension and Generalization in LLMs ☆39 Updated last month
- ☆15 Updated last week
- Source code of "Task arithmetic in the tangent space: Improved editing of pre-trained models". ☆84 Updated last year
- Directional Preference Alignment ☆49 Updated last month
- Bayesian low-rank adaptation for large language models ☆23 Updated 6 months ago
- ☆59 Updated 2 years ago
- [NeurIPS'23] Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors ☆69 Updated 8 months ago
- ☆78 Updated last year
- [ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So… ☆15 Updated 5 months ago