☆132 · May 8, 2025 · Updated 9 months ago
Alternatives and similar repositories for SWE-Fixer
Users who are interested in SWE-Fixer are comparing it to the repositories listed below.
- ☆12 · Mar 5, 2025 · Updated 11 months ago
- Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025] ☆632 · Jul 29, 2025 · Updated 7 months ago
- Run SWE-bench evaluations remotely ☆58 · Aug 14, 2025 · Updated 6 months ago
- ☆132 · Jun 6, 2025 · Updated 8 months ago
- [NeurIPS 2025 D&B Spotlight] Scaling Data for SWE-agents ☆577 · Updated this week
- Official repository for ACL 2025 paper "ProcessBench: Identifying Process Errors in Mathematical Reasoning" ☆184 · May 20, 2025 · Updated 9 months ago
- Modified Beam Search with periodical restart ☆12 · Sep 12, 2024 · Updated last year
- [NeurIPS'25] Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution" ☆678 · Mar 16, 2025 · Updated 11 months ago
- code for training and using chess embeddings models ☆13 · Jun 9, 2024 · Updated last year
- Sandboxed code execution for AI agents, locally or on the cloud. Massively parallel, easy to extend. Powering SWE-agent and more. ☆443 · Updated this week
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment ☆16 · Dec 19, 2024 · Updated last year
- Plancraft is a Minecraft environment and agent suite to test planning capabilities in LLMs ☆26 · Nov 7, 2025 · Updated 3 months ago
- Agentless Lite: RAG-based SWE-Bench software engineering scaffold ☆45 · Apr 15, 2025 · Updated 10 months ago
- [NeurIPS 2025 D&B] 🚀 SWE-bench Goes Live! ☆165 · Updated this week
- Set-Encoder: Permutation-Invariant Inter-Passage Attention for Listwise Passage Re-Ranking with Cross-Encoders ☆18 · May 23, 2025 · Updated 9 months ago
- A dataset of LLM-generated chain-of-thought steps annotated with mistake location. ☆86 · Aug 10, 2024 · Updated last year
- SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for High-Performance Issue Resolution ☆104 · Sep 24, 2025 · Updated 5 months ago
- Agentless🐱: an agentless approach to automatically solve software development problems ☆2,010 · Dec 22, 2024 · Updated last year
- [NeurIPS 2024] The official implementation of "Image Copy Detection for Diffusion Models" ☆18 · Oct 1, 2024 · Updated last year
- Inference code of Lingma SWE-GPT ☆253 · Dec 2, 2024 · Updated last year
- SWE-Flow: Synthesizing Software Engineering Data in a Test-Driven Manner ☆33 · Jun 29, 2025 · Updated 8 months ago
- The official repo of continuous speculative decoding ☆31 · Mar 28, 2025 · Updated 11 months ago
- [EMNLP 2024] Tree of Problems: Improving structured problem solving with compositionality ☆19 · Mar 4, 2025 · Updated 11 months ago
- ☆17 · Dec 16, 2024 · Updated last year
- ☆16 · Jul 23, 2024 · Updated last year
- ☆104 · Jul 17, 2024 · Updated last year
- ☆159 · Aug 27, 2024 · Updated last year
- [ICML 2025 Oral] CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction ☆568 · May 6, 2025 · Updated 9 months ago
- [ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems ☆125 · Jun 11, 2025 · Updated 8 months ago
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models ☆40 · Jun 10, 2024 · Updated last year
- [ACL 2025 Main] Official Repository for "Evaluating Language Models as Synthetic Data Generators" ☆41 · Dec 13, 2024 · Updated last year
- This project showcases engaging interactions between two AI chatbots. ☆10 · Jan 10, 2024 · Updated 2 years ago
- Nexusflow function call, tool use, and agent benchmarks. ☆30 · Dec 13, 2024 · Updated last year
- ☆20 · Nov 4, 2025 · Updated 3 months ago
- ☆19 · Jan 10, 2025 · Updated last year
- ✨ RepoBench: Benchmarking Repository-Level Code Auto-Completion Systems - ICLR 2024 ☆189 · Aug 16, 2024 · Updated last year
- 🚀 Scale your RAG pipeline using Ragswift: A scalable centralized embeddings management platform ☆38 · Jan 29, 2024 · Updated 2 years ago
- ☆50 · Aug 21, 2025 · Updated 6 months ago
- Systematic evaluation framework that automatically rates overthinking behavior in large language models. ☆96 · May 16, 2025 · Updated 9 months ago