Amplify-Partners / annotation-reading-list

A reading list of relevant papers and projects on foundation model annotation

☆27

Alternatives and similar repositories for annotation-reading-list

Users who are interested in annotation-reading-list are comparing it to the repositories listed below

haizelabs / j1-micro
j1-micro (1.7B) & j1-nano (600M) are absurdly tiny but mighty reward models.
☆95 Updated 3 weeks ago
google-deepmind / mishax
☆136 Updated 4 months ago
tyler-romero / microR1
Simple repository for training small reasoning models
☆32 Updated 6 months ago
xjdr-alt / muzero_sketch
☆38 Updated last year
YuchenJin / llm.c
LLM training in simple, raw C/CUDA
☆15 Updated 8 months ago
facebookresearch / llm-speedrunner
The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag…
☆94 Updated last week
goodfire-ai / r1-interpretability
Open source interpretability artefacts for R1.
☆157 Updated 3 months ago
kevinwu23 / StanfordFineTuneBench
☆31 Updated 8 months ago
nano-R1 / resources
Compiling useful links, papers, benchmarks, ideas, etc.
☆45 Updated 4 months ago
Ziems / arbor
A framework for optimizing DSPy programs with RL
☆96 Updated this week
Alex-Gurung / ReasoningNCP
Official repo for Learning to Reason for Long-Form Story Generation
☆68 Updated 3 months ago
apoorvkh / academic-pretraining
$100K or 100 Days: Trade-offs when Pre-Training with Academic Resources
☆143 Updated 2 months ago
ahstat / episodic-memory-benchmark
Synthetic data generation and benchmark implementation for "Episodic Memories Generation and Evaluation Benchmark for Large Language Mode…
☆49 Updated 3 months ago
magicproduct / hash-hop
Long context evaluation for large language models
☆220 Updated 5 months ago
thomasnormal / fewshot
☆28 Updated last month
LeonGuertler / UnstableBaselines
☆96 Updated last week
srush / GPTWorld
A puzzle to learn about prompting
☆132 Updated 2 years ago
data-for-agents / insta
Official Repo for InSTA: Towards Internet-Scale Training For Agents
☆52 Updated 3 weeks ago
METR / RE-Bench
☆95 Updated 3 months ago
Pleias / Quest-Best-Tokens
An introduction to LLM Sampling
☆79 Updated 7 months ago
open-thought / reasoning-gym-eval
Collection of LLM completions for reasoning-gym task datasets
☆26 Updated last month
ScalingIntelligence / Archon
Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file.
☆175 Updated 5 months ago
haizelabs / verdict
Inference-time scaling for LLMs-as-a-judge.
☆267 Updated 3 weeks ago
commit-0 / commit0
Commit0: Library Generation from Scratch
☆161 Updated 3 months ago
minosvasilias / simple_grpo
Simple GRPO scripts and configurations.
☆59 Updated 6 months ago
joshuacnf / Ctrl-G
☆88 Updated 7 months ago
Danau5tin / calculator_agent_rl
Training an LLM to use a calculator with multi-turn reinforcement learning, achieving a 62% absolute increase in evaluation accuracy.
☆45 Updated 3 months ago
ZeroSumEval / ZeroSumEval
A framework for pitting LLMs against each other in an evolving library of games ⚔
☆32 Updated 3 months ago
JoshuaPurtell / SmallBench
Small, simple agent task environments for training and evaluation
☆18 Updated 9 months ago
HazyResearch / cartridges
Storing long contexts in tiny caches with self-study
☆121 Updated last week