Moatless Testbeds allows you to create isolated testbed environments in a Kubernetes cluster where you can apply code changes through git patches and run tests or SWE-Bench evaluations.
☆14 · Apr 9, 2025 · Updated last year
Alternatives and similar repositories for moatless-testbeds
Users who are interested in moatless-testbeds are comparing it to the repositories listed below.
- ☆12 · Nov 5, 2024 · Updated last year
- ☆141 · Jun 6, 2025 · Updated last year
- ☆643 · Sep 1, 2025 · Updated 10 months ago
- Legacy Code of ZJU Campus App for iOS ☆11 · Jan 31, 2024 · Updated 2 years ago
- ☆13 · May 23, 2021 · Updated 5 years ago
- ☆13 · Dec 31, 2023 · Updated 2 years ago
- [ICLR 2025] "Training LMs on Synthetic Edit Sequences Improves Code Synthesis" (Piterbarg, Pinto, Fergus) ☆19 · Feb 11, 2025 · Updated last year
- Allows Aider to use CEDARScript as an edit format ☆30 · Dec 3, 2024 · Updated last year
- Collect papers related to personalized text generation ☆18 · Sep 6, 2021 · Updated 4 years ago
- ☆12 · Aug 26, 2022 · Updated 3 years ago
- Code to compute AnthroScore, a computational linguistic measure of anthropomorphism in text ☆19 · Mar 31, 2025 · Updated last year
- Utilities for efficient fine-tuning, inference and evaluation of code generation models ☆21 · Oct 3, 2023 · Updated 2 years ago
- ☆23 · Dec 8, 2022 · Updated 3 years ago
- Code and data for EMNLP 2023 paper "Grounding Visual Illusions in Language: Do Vision-Language Models Perceive Illusions Like Humans?" ☆15 · Jan 25, 2024 · Updated 2 years ago
- A full codebase for replicating the results of Nougat from downloading arXiv dataset to the final evaluation. It also contains a few fixe… ☆11 · Dec 11, 2023 · Updated 2 years ago
- A concurrent LRU cache. ☆23 · Feb 14, 2021 · Updated 5 years ago
- Landing page + leaderboard for SWE-Bench benchmark ☆15 · Mar 29, 2026 · Updated 3 months ago
- Code for Neural Execution Engines: Learning to Execute Subroutines ☆18 · Jan 11, 2021 · Updated 5 years ago
- Can It Edit? Evaluating the Ability of Large Language Models to Follow Code Editing Instructions ☆49 · Sep 13, 2025 · Updated 9 months ago
- ☆21 · Jun 13, 2024 · Updated 2 years ago
- Agentless Lite: RAG-based SWE-Bench software engineering scaffold ☆49 · Apr 15, 2025 · Updated last year
- [ICLR2024] Codes and Models for COSA: Concatenated Sample Pretrained Vision-Language Foundation Model ☆43 · Dec 25, 2024 · Updated last year
- Syntax Error-Free and Generalizable Tool Use for LLMs via Finite-State Decoding ☆31 · Jan 28, 2024 · Updated 2 years ago
- Code for EMNLP 2022 Paper DANLI: Deliberative Agent for Following Natural Language Instructions ☆18 · May 1, 2025 · Updated last year
- Socratic-Zero is a fully autonomous framework that generates high-quality training data for mathematical reasoning ☆37 · Oct 26, 2025 · Updated 8 months ago
- ProgQuery is a system to extract useful syntactic and semantic information from source code programs and store it in a graph database for… ☆17 · Jan 22, 2025 · Updated last year
- A minimal language for Isabelle/HOL, designed for easing machine learning. ☆29 · Updated this week
- (ACL2025 Findings) Official code for the paper "STeCa: Step-level Trajectory Calibration for LLM Agent Learning" ☆29 · Mar 2, 2026 · Updated 4 months ago
- ☆80 · Feb 5, 2026 · Updated 4 months ago
- [NeurIPS 2025 D&B] 🚀 SWE-bench Goes Live! ☆200 · Jun 11, 2026 · Updated 3 weeks ago
- ☆23 · Dec 7, 2023 · Updated 2 years ago
- ARC gym: a data generation framework for the Abstraction & Reasoning Corpus ☆25 · Mar 25, 2026 · Updated 3 months ago
- Official implementation of paper How to Understand Whole Repository? New SOTA on SWE-bench Lite (21.3%) ☆98 · Mar 26, 2025 · Updated last year
- A PyTorch implementation of LDAST ☆26 · Dec 17, 2023 · Updated 2 years ago
- Enhanced fork of SWE-bench, tailored for OpenDevin's ecosystem. ☆29 · May 26, 2024 · Updated 2 years ago
- Synthesis of loop-free programs ☆24 · Jun 16, 2026 · Updated 2 weeks ago
- Automated High-Performance GPU Kernel Generation ☆119 · Jun 1, 2026 · Updated last month
- [TMLR 2026] A Searching-based Agent Model for Open-Domain Open-Ended Question Answering ☆39 · Jun 20, 2025 · Updated last year
- ☆17 · Jan 7, 2024 · Updated 2 years ago