allenbai01 / transformers-as-statisticians

☆34

Alternatives and similar repositories for transformers-as-statisticians

Users who are interested in transformers-as-statisticians compare it to the repositories listed below.

DeqingFu / transformers-icl-second-order
Official repository for our paper, Transformers Learn Higher-Order Optimization Methods for In-Context Learning: A Study with Linear Mode…
☆20 Updated last year
automl / is_mamba_capable_of_icl
☆18 Updated last year
locuslab / edge-of-stability
☆73 Updated last year
dtsip / in-context-learning
☆242 Updated last year
Jiacheng-Zhu-AIML / AsymmetryLoRA
Preprint: Asymmetry in Low-Rank Adapters of Foundation Models
☆37 Updated last year
p-lambda / incontext-learning
Experiments and code to generate the GINC small-scale in-context learning dataset from "An Explanation for In-context Learning as Implici…
☆106 Updated 2 years ago
reds-lab / LAVA
This is an official repository for "LAVA: Data Valuation without Pre-Specified Learning Algorithms" (ICLR2023).
☆51 Updated last year
mansheej / icl-task-diversity
Code for the paper "Pretraining task diversity and the emergence of non-Bayesian in-context learning for regression"
☆23 Updated 2 years ago
tml-epfl / sharpness-vs-generalization
A modern look at the relationship between sharpness and generalization [ICML 2023]
☆43 Updated 2 years ago
cassidylaidlaw / orpo
☆19 Updated last year
erosenfeld / disagree_discrep
Provably (and non-vacuously) bounding test error of deep neural networks under distribution shift with unlabeled test data.
☆10 Updated last year
cassidylaidlaw / hidden-context
Code and data for the paper "Understanding Hidden Context in Preference Learning: Consequences for RLHF"
☆32 Updated last year
MaximeRobeyns / bayesian_lora
Bayesian Low-Rank Adaptation for Large Language Models
☆36 Updated last year
liziniu / policy_optimization
Code for Paper (Policy Optimization in RLHF: The Impact of Out-of-preference Data)
☆28 Updated last year
aw31 / empirical-ntks
Efficient empirical NTKs in PyTorch
☆22 Updated 3 years ago
jongharyu / neural-svd
Official PyTorch implementation of NeuralSVD (ICML 2024)
☆20 Updated last year
tml-epfl / understanding-sam
Towards Understanding Sharpness-Aware Minimization [ICML 2022]
☆36 Updated 3 years ago
tding1 / Neural-Collapse
[NeurIPS 2021] A Geometric Analysis of Neural Collapse with Unconstrained Features
☆59 Updated 3 years ago
alexrame / rewardedsoups
Rewarded soups official implementation
☆62 Updated 2 years ago
machine-discovery / deer
Parallelizing non-linear sequential models over the sequence length
☆56 Updated 5 months ago
UW-Madison-Lee-Lab / Expressive_Power_of_LoRA
Code for "The Expressive Power of Low-Rank Adaptation".
☆20 Updated last year
adamxyang / laplace-lora
Bayesian low-rank adaptation for large language models
☆27 Updated last year
formll / resolving-scaling-law-discrepancies
☆20 Updated last month
srzer / MOD
Official code for "Decoding-Time Language Model Alignment with Multiple Objectives".
☆29 Updated last year
haotiansun14 / BBox-Adapter
Lightweight Adapting for Black-Box Large Language Models
☆24 Updated last year
noanabeshima / matryoshka-saes
☆25 Updated last year
tianjunz / TEMPERA
☆46 Updated 2 years ago
lasgroup / SafetyPolytope
Learning Safety Constraints for Large Language Models (ICML2025)
☆25 Updated 4 months ago
Silent-Zebra / twisted-smc-lm
☆31 Updated 8 months ago
pratyushmaini / localizing-memorization
Official Repository for ICML 2023 paper "Can Neural Network Memorization Be Localized?"
☆20 Updated 2 years ago