☆28Nov 10, 2025Updated 3 months ago
Alternatives and similar repositories for swecomm
Users interested in swecomm are comparing it to the repositories listed below
- ☆12Mar 5, 2025Updated 11 months ago
- ☆36May 25, 2023Updated 2 years ago
- ☆132Jun 6, 2025Updated 8 months ago
- ☆11Jan 3, 2024Updated 2 years ago
- ☆12Nov 5, 2024Updated last year
- Human Oversight for Autonomous AI Agents using Azure Logic Apps + Python☆20Feb 20, 2026Updated last week
- Repo for the paper: Towards Few-shot Entity Recognition in Document Images: A Label-aware Sequence-to-Sequence Framework☆14May 31, 2023Updated 2 years ago
- The Infibench variant of bigcode-evaluation-harness --- a framework for the evaluation of autoregressive code generation language models.☆14Oct 19, 2024Updated last year
- Run SWE-bench evaluations remotely☆58Aug 14, 2025Updated 6 months ago
- [ICLR 2025] "Training LMs on Synthetic Edit Sequences Improves Code Synthesis" (Piterbarg, Pinto, Fergus)☆19Feb 11, 2025Updated last year
- A Comprehensive Benchmark for Robust Multi-image Understanding☆19Sep 4, 2024Updated last year
- ☆629Sep 1, 2025Updated 6 months ago
- Official repo for "Imagination-Augmented Natural Language Understanding", NAACL 2022.☆17Aug 30, 2022Updated 3 years ago
- [NeurIPS'25] Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution"☆678Mar 16, 2025Updated 11 months ago
- The official repo for the paper Can ChatGPT replace StackOverflow? A Study on Robustness and Reliability of Large Language Model Code Gen…☆20Feb 27, 2024Updated 2 years ago
- (ACL 2025 Main) Code for MultiAgentBench : Evaluating the Collaboration and Competition of LLM agents https://www.arxiv.org/pdf/2503.019…☆34Jun 21, 2025Updated 8 months ago
- Code for the paper: CodeTree: Agent-guided Tree Search for Code Generation with Large Language Models☆31Apr 1, 2025Updated 11 months ago
- Source code for InBedder, an instruction-following text embedder☆30Oct 11, 2024Updated last year
- Official implementation of NeurIPS'23 paper, Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets☆27Jan 29, 2024Updated 2 years ago
- EMNLP2023 - InfoSeek: A New VQA Benchmark focused on Visual Info-Seeking Questions☆25May 30, 2024Updated last year
- ☆37Jan 25, 2024Updated 2 years ago
- ☆47Oct 28, 2025Updated 4 months ago
- [NeurIPS 2024] Evaluation harness for SWT-Bench, a benchmark for evaluating LLM repository-level test-generation☆71Jan 15, 2026Updated last month
- ☆37Oct 15, 2024Updated last year
- ☆123Feb 21, 2025Updated last year
- [ICLR 2025] Benchmarking Agentic Workflow Generation☆144Feb 19, 2025Updated last year
- [AAAI 2025 oral] Evaluating Mathematical Reasoning Beyond Accuracy☆77Oct 9, 2025Updated 4 months ago
- The official repo for the code and data of paper SMART☆38Feb 20, 2025Updated last year
- Bayes-Adaptive RL for LLM Reasoning☆45May 28, 2025Updated 9 months ago
- ☆14Updated this week
- A Manually-Annotated Code Generation Benchmark Aligned with Real-World Code Repositories☆36Sep 4, 2024Updated last year
- Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025]☆644Jul 29, 2025Updated 7 months ago
- A full-stack AI-powered business intelligence tool for non-experts, featuring serverless backend processing and a secure Streamlit fronte…☆27Feb 13, 2026Updated 2 weeks ago
- ☆17Jan 23, 2026Updated last month
- ☆43Apr 23, 2025Updated 10 months ago
- Official code for the paper "CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules"☆49Nov 10, 2025Updated 3 months ago
- Baselines for all tasks from Long Code Arena benchmarks 🏟️☆39Mar 30, 2025Updated 11 months ago
- Agentless Lite: RAG-based SWE-Bench software engineering scaffold☆45Apr 15, 2025Updated 10 months ago
- GitHub Copilot Adoption Plan - Workshops - Full Solution☆18Feb 18, 2026Updated last week