lee-ny / teaching_arithmetic
☆83 Updated last year
Alternatives and similar repositories for teaching_arithmetic
Users who are interested in teaching_arithmetic are comparing it to the libraries listed below.
- ☆183 Updated last year
- ☆95 Updated last year
- [NeurIPS 2023] Learning Transformer Programs ☆161 Updated last year
- ☆34 Updated last year
- ☆121 Updated 11 months ago
- Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers" ☆112 Updated last year
- The accompanying code for "Transformer Feed-Forward Layers Are Key-Value Memories". Mor Geva, Roei Schuster, Jonathan Berant, and Omer Le… ☆94 Updated 3 years ago
- ☆87 Updated 11 months ago
- A library for efficient patching and automatic circuit discovery. ☆70 Updated 2 months ago
- Code for the paper "The Impact of Positional Encoding on Length Generalization in Transformers", NeurIPS 2023 ☆136 Updated last year
- Language models scale reliably with over-training and on downstream tasks ☆97 Updated last year
- ☆87 Updated last year
- Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment" ☆166 Updated last month
- ☆40 Updated last month
- Inspecting and Editing Knowledge Representations in Language Models ☆116 Updated last year
- Can Language Models Solve Olympiad Programming? ☆119 Updated 6 months ago
- ☆83 Updated 5 months ago
- Official code from the paper "Offline RL for Natural Language Generation with Implicit Language Q Learning" ☆208 Updated last year
- Experiments and code to generate the GINC small-scale in-context learning dataset from "An Explanation for In-context Learning as Implici… ☆108 Updated last year
- Algebraic value editing in pretrained language models ☆65 Updated last year
- ☆99 Updated 5 months ago
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆123 Updated 10 months ago
- ☆53 Updated last year
- Official repository for our paper, Transformers Learn Higher-Order Optimization Methods for In-Context Learning: A Study with Linear Mode… ☆17 Updated 7 months ago
- ☆119 Updated 11 months ago
- ☆98 Updated last year
- ☆48 Updated 2 months ago
- ☆233 Updated last year
- Function Vectors in Large Language Models (ICLR 2024) ☆170 Updated 2 months ago
- Code and data used in the paper: "Training on Incorrect Synthetic Data via RL Scales LLM Math Reasoning Eight-Fold" ☆30 Updated last year