yingweima2022 / SWE-Reasoner
☆22Updated last month
Alternatives and similar repositories for SWE-Reasoner
Users who are interested in SWE-Reasoner are comparing it to the libraries listed below.
- ☆19Updated 3 months ago
- Neural Code Intelligence Survey 2024; Reading lists and resources☆268Updated last month
- Code, benchmark and environment for "ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows"☆110Updated 3 weeks ago
- Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods on large reasoning models. It contains…☆245Updated last month
- ☆53Updated 3 months ago
- An Evolving Code Generation Benchmark Aligned with Real-world Code Repositories☆63Updated last year
- Reproducing R1 for Code with Reliable Rewards☆256Updated 4 months ago
- ☆267Updated 2 months ago
- The repository for the paper "DebugBench: Evaluating Debugging Capability of Large Language Models".☆81Updated last year
- 😎 A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, Agent, and Beyond☆294Updated last week
- [ICLR 2025] Code and Data Repo for Paper "Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation"☆79Updated 9 months ago
- A Lightweight Visual Reasoning Benchmark for Evaluating Large Multimodal Models through Complex Diagrams in Coding Tasks☆12Updated 6 months ago
- Repo-Level Code generation papers☆210Updated 2 months ago
- ☆67Updated 3 months ago
- A research repo for experiments about Reinforcement Finetuning☆52Updated 5 months ago
- ☆111Updated this week
- [EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"☆137Updated last year
- ☆12Updated last year
- ☆49Updated 10 months ago
- [EMNLP 2024] Multi-modal reasoning problems via code generation.☆25Updated 7 months ago
- A comprehensive collection of process reward models.☆108Updated 2 months ago
- [ACL 2025] A Neural-Symbolic Self-Training Framework☆112Updated 3 months ago
- Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".☆81Updated 3 months ago
- this is an implementation for the paper Improve Mathematical Reasoning in Language Models by Automated Process Supervision from google de…☆39Updated 2 months ago
- Code and data for "Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning?" (ACL 2024)☆32Updated last year
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference)☆60Updated 11 months ago
- ☆31Updated 3 months ago
- [ACL' 25] The official code repository for PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models.☆81Updated 7 months ago
- [ACL2024] Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios☆63Updated last month
- ☆287Updated 3 months ago