DeqingFu / transformers-icl-second-order
Official repository for our paper, Transformers Learn Higher-Order Optimization Methods for In-Context Learning: A Study with Linear Models.
☆18 Updated 9 months ago
Alternatives and similar repositories for transformers-icl-second-order
Users who are interested in transformers-icl-second-order compare it to the repositories listed below.
- ☆103 Updated 6 months ago
- ☆23 Updated 6 months ago
- ☆91 Updated last year
- ☆32 Updated 2 years ago
- ☆184 Updated last year
- ☆83 Updated last year
- ☆20 Updated last year
- ☆34 Updated 7 months ago
- Code for "Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining" ☆20 Updated 4 months ago
- A library for efficient patching and automatic circuit discovery. ☆76 Updated last month
- ☆44 Updated last year
- Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers" ☆114 Updated last year
- Sparse and discrete interpretability tool for neural networks ☆63 Updated last year
- Universal Neurons in GPT2 Language Models ☆30 Updated last year
- ☆53 Updated last year
- Sparse Autoencoder Training Library ☆54 Updated 3 months ago
- ☆50 Updated last year
- Code repo for the model organisms and convergent directions of EM papers. ☆21 Updated last month
- ☆33 Updated last year
- ☆52 Updated 4 months ago
- Stanford NLP Python library for benchmarking the utility of LLM interpretability methods ☆124 Updated 2 months ago
- Learning adapter weights from task descriptions ☆19 Updated last year
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations" ☆81 Updated 9 months ago
- ☆30 Updated 4 months ago
- Rewarded soups official implementation ☆60 Updated last year
- ☆238 Updated last year
- Efficient Scaling laws and collaborative pretraining. ☆17 Updated 6 months ago
- Code for Paper (Preserving Diversity in Supervised Fine-tuning of Large Language Models) ☆36 Updated 3 months ago
- Revisiting Efficient Training Algorithms For Transformer-based Language Models (NeurIPS 2023) ☆81 Updated last year
- ☆18 Updated last year