ekinakyurek / marc
Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning"
☆278 Updated last month
Alternatives and similar repositories for marc:
Users who are interested in marc are comparing it to the repositories listed below.
- Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization' ☆175 Updated last month
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆157 Updated this week
- ☆96 Updated 3 weeks ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. ☆154 Updated 2 months ago
- (ICML 2024) AlphaZero-like tree search can guide large language model decoding and training ☆251 Updated 7 months ago
- Repository for the paper Stream of Search: Learning to Search in Language ☆118 Updated 5 months ago
- Code for the paper: Training Software Engineering Agents and Verifiers with SWE-Gym ☆202 Updated this week
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al. (NeurIPS 2024) ☆182 Updated 7 months ago
- ☆89 Updated this week
- Code for the paper 🌳 Tree Search for Language Model Agents ☆163 Updated 5 months ago
- ☆264 Updated 6 months ago
- Bootstrapping ARC ☆90 Updated last month
- ☆168 Updated last year
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆277 Updated last month
- AWM: Agent Workflow Memory ☆231 Updated last month
- Training Large Language Models to Reason in a Continuous Latent Space ☆388 Updated this week
- Draw more samples ☆182 Updated 6 months ago
- Long context evaluation for large language models ☆195 Updated this week
- Implementation of 🥥 Coconut, Chain of Continuous Thought, in PyTorch ☆145 Updated 2 weeks ago
- Automatic Evals for Instruction-Tuned Models ☆100 Updated this week
- Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment" ☆111 Updated 2 months ago
- An Analytical Evaluation Board of Multi-turn LLM Agents ☆270 Updated 7 months ago
- A simple unified framework for evaluating LLMs ☆164 Updated 3 weeks ago
- System 2 Reasoning Link Collection ☆722 Updated this week
- Can Language Models Solve Olympiad Programming? ☆108 Updated this week
- Sparse autoencoders ☆407 Updated this week
- Implementation of the Quiet-STaR paper (https://arxiv.org/pdf/2403.09629.pdf) ☆48 Updated 5 months ago
- ☆135 Updated 3 months ago
- RewardBench: the first evaluation tool for reward models. ☆491 Updated last week
- A toolkit for describing model features and intervening on those features to steer behavior. ☆149 Updated 2 months ago