Moatless Testbeds allows you to create isolated testbed environments in a Kubernetes cluster where you can apply code changes through git patches and run tests or SWE-Bench evaluations.
☆14 · Apr 9, 2025 · Updated last year
Alternatives and similar repositories for moatless-testbeds
Users who are interested in moatless-testbeds are comparing it to the libraries listed below.
- ☆12 · Nov 5, 2024 · Updated last year
- ☆137 · Jun 6, 2025 · Updated last year
- ☆641 · Sep 1, 2025 · Updated 9 months ago
- Rust In-Memory Filesystem ☆17 · Nov 28, 2019 · Updated 6 years ago
- Soar with Haskell, Published by Packt ☆13 · Jan 10, 2024 · Updated 2 years ago
- ☆13 · Dec 31, 2023 · Updated 2 years ago
- Allows Aider to use CEDARScript as an edit format ☆30 · Dec 3, 2024 · Updated last year
- Collect papers related to personalized text generation ☆18 · Sep 6, 2021 · Updated 4 years ago
- Code to compute AnthroScore, a computational linguistic measure of anthropomorphism in text ☆19 · Mar 31, 2025 · Updated last year
- Utilities for efficient fine-tuning, inference and evaluation of code generation models ☆21 · Oct 3, 2023 · Updated 2 years ago
- ☆23 · Dec 8, 2022 · Updated 3 years ago
- ☆11 · Nov 10, 2023 · Updated 2 years ago
- Project page for the 'CLAWS: Clustering Assisted Weakly Supervised Learning with Normalcy Suppression for Anomalous Event Detection', ECC… ☆12 · May 29, 2021 · Updated 5 years ago
- Code for Neural Execution Engines: Learning to Execute Subroutines ☆18 · Jan 11, 2021 · Updated 5 years ago
- ☆21 · Jun 13, 2024 · Updated 2 years ago
- Agentless Lite: RAG-based SWE-Bench software engineering scaffold ☆48 · Apr 15, 2025 · Updated last year
- Syntax Error-Free and Generalizable Tool Use for LLMs via Finite-State Decoding ☆31 · Jan 28, 2024 · Updated 2 years ago
- Socratic-Zero is a fully autonomous framework that generates high-quality training data for mathematical reasoning ☆36 · Oct 26, 2025 · Updated 7 months ago
- ProgQuery is a system to extract useful syntactic and semantic information from source code programs and store it in a graph database for… ☆17 · Jan 22, 2025 · Updated last year
- A minimal language for Isabelle/HOL, designed for easing machine learning. ☆28 · Updated this week
- (ACL2025 Findings) Official code for the paper "STeCa: Step-level Trajectory Calibration for LLM Agent Learning" ☆28 · Mar 2, 2026 · Updated 3 months ago
- [NeurIPS 2025 D&B] 🚀 SWE-bench Goes Live! ☆194 · Jun 6, 2026 · Updated last week
- CRUXEval: Code Reasoning, Understanding, and Execution Evaluation ☆169 · Oct 11, 2024 · Updated last year
- ☆23 · Dec 7, 2023 · Updated 2 years ago
- Official implementation of paper How to Understand Whole Repository? New SOTA on SWE-bench Lite (21.3%) ☆97 · Mar 26, 2025 · Updated last year
- Enhanced fork of SWE-bench, tailored for OpenDevin's ecosystem. ☆30 · May 26, 2024 · Updated 2 years ago
- Synthesis of loop-free programs ☆24 · Jun 3, 2026 · Updated last week
- A Searching-based Agent Model for Open-Domain Open-Ended Question Answering ☆39 · Jun 20, 2025 · Updated 11 months ago
- ☆17 · Jan 7, 2024 · Updated 2 years ago
- Code for Aesop: Paraphrase Generation with Adaptive Syntactic Control (EMNLP 2021) ☆26 · Jan 17, 2022 · Updated 4 years ago
- This is the repo for an incremental pointer analysis for Java programs. This repo has been adopted by WALA ☆25 · Feb 13, 2023 · Updated 3 years ago
- The approach involves the usage of Multi-Criteria Decision Analyses, including Weighted Sum Model (WSM), Weighted Product Model (WPM) and… ☆11 · Oct 22, 2021 · Updated 4 years ago
- Harness used to benchmark aider against SWE Bench benchmarks ☆84 · Jun 27, 2024 · Updated last year
- Fast training of unitary deep network layers from low-rank updates ☆29 · Dec 11, 2022 · Updated 3 years ago
- OSWorld: A unified, real computer environment for multimodal agents to evaluate open-ended computer tasks involving arbitrary apps and in… ☆21 · Apr 28, 2024 · Updated 2 years ago
- LinearArbitrary-SeaHorn is a CHC solver for LLVM-based languages. ☆22 · Mar 13, 2023 · Updated 3 years ago
- AskIt (for JavaScript/TypeScript): Unified programming interface for large language models (GPT-4, GPT-3.5) ☆35 · Oct 1, 2023 · Updated 2 years ago
- ☆19 · Jul 9, 2023 · Updated 2 years ago
- ☆36 · May 25, 2023 · Updated 3 years ago