automl / is_mamba_capable_of_icl
☆18 Updated last year
Alternatives and similar repositories for is_mamba_capable_of_icl
Users who are interested in is_mamba_capable_of_icl are comparing it to the repositories listed below
- Official repository for our paper, Transformers Learn Higher-Order Optimization Methods for In-Context Learning: A Study with Linear Mode… ☆17 Updated 8 months ago
- ☆32 Updated 2 years ago
- ☆33 Updated 9 months ago
- Rewarded soups official implementation ☆58 Updated last year
- Code for Paper (Policy Optimization in RLHF: The Impact of Out-of-preference Data) ☆28 Updated last year
- Preprint: Asymmetry in Low-Rank Adapters of Foundation Models ☆35 Updated last year
- Code and data for the paper "Understanding Hidden Context in Preference Learning: Consequences for RLHF" ☆30 Updated last year
- ☆31 Updated last year
- Official code for "Decoding-Time Language Model Alignment with Multiple Objectives". ☆25 Updated 9 months ago
- ☆43 Updated last year
- What Makes a Reward Model a Good Teacher? An Optimization Perspective ☆35 Updated last month
- ☆11 Updated 11 months ago
- ☆234 Updated last year
- Source code for Stable Hadamard Memory ☆18 Updated 3 months ago
- Test-time-training on nearest neighbors for large language models ☆45 Updated last year
- Code repository for "The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks" ☆17 Updated last year
- Experiments and code to generate the GINC small-scale in-context learning dataset from "An Explanation for In-context Learning as Implici… ☆108 Updated last year
- ☆13 Updated last year
- Revisiting Efficient Training Algorithms For Transformer-based Language Models (NeurIPS 2023) ☆80 Updated last year
- Benchmark for Natural Temporal Distribution Shift (NeurIPS 2022) ☆67 Updated 2 years ago
- Parallelizing non-linear sequential models over the sequence length ☆53 Updated last month
- A python package providing a benchmark with various specified distribution shift patterns. ☆58 Updated last year
- Bayesian low-rank adaptation for large language models ☆23 Updated last year
- ☆65 Updated 3 years ago
- Official implementation of ICLR'2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and… ☆66 Updated 4 months ago
- ☆23 Updated 6 months ago
- ☆45 Updated 2 years ago
- ☆50 Updated last year
- Deep Learning & Information Bottleneck ☆61 Updated 2 years ago
- Code for NeurIPS'23 paper "A Bayesian Approach To Analysing Training Data Attribution In Deep Learning" ☆17 Updated last year