JackCai1206 / arithmetic-self-improve

☆34

Alternatives and similar repositories for arithmetic-self-improve

Users who are interested in arithmetic-self-improve are comparing it to the repositories listed below.


EleutherAI / nanoGPT-mup
The simplest, fastest repository for training/finetuning medium-sized GPTs.
☆147Updated 2 weeks ago
google-deepmind / mishax
☆134Updated 3 months ago
berlino / seq_icl
☆53Updated last year
epfml / llm-baselines
nanoGPT-like codebase for LLM training
☆100Updated 2 months ago
callummcdougall / sae_visualizer
☆28Updated last year
athms / mad-lab
A MAD laboratory to improve AI architecture designs 🧪
☆123Updated 7 months ago
mcleish7 / arithmetic
Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (NeurIPS 2024)
☆190Updated last year
HazyResearch / zoology
Understand and test language model architectures on synthetic tasks.
☆219Updated last month
OpenEvaByte / evabyte
EvaByte: Efficient Byte-level Language Models at Scale
☆103Updated 2 months ago
EleutherAI / delphi
Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models …
☆193Updated this week
ckkissane / crosscoder-model-diff-replication
Open source replication of Anthropic's Crosscoders for Model Diffing
☆57Updated 8 months ago
EleutherAI / improved-t5
Experiments for efforts to train a new and improved t5
☆76Updated last year
JacobPfau / fillerTokens
☆66Updated last year
PiotrNawrot / nano-sparse-attention
The simplest implementation of recent Sparse Attention patterns for efficient LLM inference.
☆78Updated last month
JoshEngels / MultiDimensionalFeatures
Code for reproducing our paper "Not All Language Model Features Are Linear"
☆77Updated 7 months ago
tilde-research / sieve
Applying SAEs for fine-grained control
☆22Updated 7 months ago
OSU-NLP-Group / GrokkedTransformer
Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization'
☆223Updated 7 months ago
Alex-Gurung / ReasoningNCP
Official repo for Learning to Reason for Long-Form Story Generation
☆65Updated 2 months ago
hughbzhang / o1_inference_scaling_laws
Replicating O1 inference-time scaling laws
☆89Updated 7 months ago
kanishkg / stream-of-search
Repository for the paper Stream of Search: Learning to Search in Language
☆149Updated 5 months ago
epfml / schedules-and-scaling
Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations"
☆75Updated 8 months ago
ConsequentAI / fneval
Functional Benchmarks and the Reasoning Gap
☆88Updated 9 months ago
justinchiu / openlogprobs
Extract full next-token probabilities via language model APIs
☆246Updated last year
wesg52 / universal-neurons
Universal Neurons in GPT2 Language Models
☆30Updated last year
lee-ny / teaching_arithmetic
☆83Updated last year
RobertCsordas / moeut
☆82Updated 10 months ago
mnoukhov / async_rlhf
Code and Configs for Asynchronous RLHF: Faster and More Efficient RL for Language Models
☆59Updated 2 months ago
callummcdougall / sae_vis
Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).
☆206Updated 7 months ago
goodfire-ai / r1-interpretability
Open source interpretability artefacts for R1.
☆154Updated 2 months ago
hamishivi / EasyLM
Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl…
☆75Updated 11 months ago