open-thought / system-2-research
System 2 Reasoning Link Collection
☆818 Updated 2 weeks ago
Alternatives and similar repositories for system-2-research:
Users who are interested in system-2-research are comparing it to the repositories listed below:
- ☆1,011 Updated 3 months ago
- Verifiers for LLM Reinforcement Learning ☆727 Updated last week
- Recipes to scale inference-time compute of open models ☆1,048 Updated last month
- procedural reasoning datasets ☆541 Updated this week
- A bibliography and survey of the papers surrounding o1 ☆1,183 Updated 4 months ago
- A comprehensive repository of reasoning tasks for LLMs (and beyond) ☆426 Updated 6 months ago
- Training Large Language Models to Reason in a Continuous Latent Space ☆1,015 Updated 2 months ago
- MLGym: A New Framework and Benchmark for Advancing AI Research Agents ☆459 Updated this week
- Build your own visual reasoning model ☆320 Updated last week
- Synthetic data curation for post-training and structured data extraction ☆1,097 Updated last week
- Pretraining code for a large-scale depth-recurrent language model ☆709 Updated 2 weeks ago
- A reading list on LLM-based Synthetic Data Generation 🔥 ☆1,223 Updated last month
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning" ☆302 Updated 4 months ago
- Code for Quiet-STaR ☆728 Updated 7 months ago
- [ICLR 2025] Automated Design of Agentic Systems ☆1,241 Updated 2 months ago
- Automatic evals for LLMs ☆346 Updated this week
- Deep learning for dummies. All the practical details and useful utilities that go into working with real models. ☆782 Updated 3 weeks ago
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆311 Updated 3 months ago
- AIDE: AI-Driven Exploration in the Space of Code. A state-of-the-art machine learning engineering agent that automates AI R&D. ☆821 Updated this week
- Best practices & guides on how to write distributed PyTorch training code ☆383 Updated last month
- Minimalistic 4D-parallelism distributed training framework for educational purposes ☆970 Updated 3 weeks ago
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ☆1,358 Updated this week
- Code for the paper "Training Software Engineering Agents and Verifiers with SWE-Gym" ☆410 Updated 3 weeks ago
- Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard a… ☆1,105 Updated 2 months ago
- ☆504 Updated 4 months ago
- ☆493 Updated last week
- MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering ☆656 Updated 2 months ago
- Automatically evaluate your LLMs in Google Colab ☆613 Updated 10 months ago
- Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution" ☆477 Updated 2 weeks ago
- Textbook on reinforcement learning from human feedback ☆505 Updated this week