tanchongmin / ARC-Challenge
☆27Updated 9 months ago
Alternatives and similar repositories for ARC-Challenge
Users who are interested in ARC-Challenge are comparing it to the repositories listed below
- ☆26Updated last year
- Repository for the paper Stream of Search: Learning to Search in Language☆148Updated 4 months ago
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples☆95Updated 2 weeks ago
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models☆41Updated last year
- ☆22Updated 3 months ago
- Minimal implementation of the Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models paper (arXiv:2401.01335)☆29Updated last year
- Bootstrapping ARC☆127Updated 7 months ago
- A Gymnasium-based Environment of the Abstraction and Reasoning Corpus (ARC)☆65Updated 9 months ago
- ☆85Updated 5 months ago
- ☆123Updated 8 months ago
- Evaluation of neuro-symbolic engines☆35Updated 10 months ago
- ☆115Updated 4 months ago
- ☆51Updated 7 months ago
- Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers"☆109Updated last year
- ☆26Updated last year
- Automated Design of Agentic Systems☆10Updated 9 months ago
- Learning to Retrieve by Trying - Source code for Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval☆39Updated 7 months ago
- ☆42Updated 9 months ago
- ☆95Updated 11 months ago
- ☆22Updated last month
- Implementation of the Quiet-STaR paper (https://arxiv.org/pdf/2403.09629.pdf)☆54Updated 10 months ago
- Replicating O1 inference-time scaling laws☆87Updated 6 months ago
- ARLC, a probabilistic abductive reasoner for solving Raven's progressive matrices.☆18Updated last month
- ☆97Updated 11 months ago
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format☆27Updated last year
- ☆27Updated this week
- A benchmark for evaluating situated inductive reasoning☆16Updated 5 months ago
- Materials for ConceptARC paper☆95Updated 7 months ago
- 👻 Code and benchmark for our EMNLP 2023 paper - "FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions"☆55Updated last year
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆57Updated 9 months ago