arcprize / arc-agi-benchmarking
Testing baseline LLM performance across various models
☆336Updated this week
Alternatives and similar repositories for arc-agi-benchmarking
Users who are interested in arc-agi-benchmarking are comparing it to the repositories listed below
- ☆625May 22, 2025Updated 8 months ago
- My submission to the ARC-AGI-3 Developer Preview Agent Competition.☆34Jan 27, 2026Updated 2 weeks ago
- ☆15Jun 19, 2025Updated 7 months ago
- ☆141Updated this week
- Evaluating major LLMs on the Abstraction and Reasoning Corpus☆17Nov 9, 2023Updated 2 years ago
- The Abstraction and Reasoning Corpus☆4,717Apr 4, 2025Updated 10 months ago
- Video Diffusion Model. Autoregressive, long context, efficient training and inference. WIP☆34Sep 4, 2025Updated 5 months ago
- Bootstrapping ARC☆154Nov 20, 2024Updated last year
- ☆30Aug 7, 2025Updated 6 months ago
- ☆19Jul 31, 2025Updated 6 months ago
- Like ARC, but code to generate visual puzzles. 1D puzzles first.☆22Aug 17, 2024Updated last year
- Draw more samples☆198Jun 23, 2024Updated last year
- ☆27Aug 16, 2025Updated 5 months ago
- Domain Specific Language for the Abstraction and Reasoning Corpus☆318Oct 11, 2024Updated last year
- Reverse Engineering the Abstraction and Reasoning Corpus☆332Feb 24, 2025Updated 11 months ago
- Information and artifacts for "LoRA Learns Less and Forgets Less" (TMLR, 2024)☆20Sep 27, 2024Updated last year
- ☆39Feb 25, 2024Updated last year
- Unit Scaling demo and experimentation code☆16Mar 12, 2024Updated last year
- A Gymnasium-based Environment of the Abstraction and Reasoning Corpus (ARC)☆69Aug 30, 2024Updated last year
- Implementation of SOAR☆49Sep 17, 2025Updated 4 months ago
- An Open Source SLM Trained for MCP☆23May 18, 2025Updated 8 months ago
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning"☆343Nov 10, 2025Updated 3 months ago
- RAG Agent for the ARC AGI Challenge☆20Jul 1, 2024Updated last year
- Abstract Reasoning with Graph Abstractions (ARGA) implementation☆61Jul 5, 2024Updated last year
- ☆100Feb 1, 2026Updated 2 weeks ago
- Framework enabling modular interchange of language agents, environments, and optimizers☆121Updated this week
- A GPT with self-similar nested properties☆20Mar 19, 2024Updated last year
- ☆23Apr 4, 2024Updated last year
- Latent Program Network (from the "Searching Latent Program Spaces" paper)☆107Nov 25, 2025Updated 2 months ago
- Pretraining and inference code for a large-scale depth-recurrent language model☆863Dec 29, 2025Updated last month
- The rule-based evaluation subset and code implementation of Omni-MATH☆26Dec 23, 2024Updated last year
- My solution for the Abstraction and Reasoning Challenge on Kaggle☆10Jun 23, 2024Updated last year
- Some basic tools for interacting with `tcf-agent`☆11Jan 19, 2024Updated 2 years ago
- A PyTorch implementation of "Measuring abstract reasoning in neural networks" in ICML 2018 by DeepMind☆37Jul 8, 2023Updated 2 years ago
- A model-based API Fuzzer for SMT Solvers.☆14Oct 14, 2025Updated 4 months ago
- Run GEPA on your favorite non-python libraries.☆32Jan 22, 2026Updated 3 weeks ago
- Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025]☆627Jul 29, 2025Updated 6 months ago
- Materials for ConceptARC paper☆114Updated this week
- Understanding R1-Zero-Like Training: A Critical Perspective☆1,205Aug 27, 2025Updated 5 months ago