Aider-AI / refactor-benchmark
Aider's refactoring benchmark exercises, based on popular Python repos
☆61 Updated 4 months ago
Alternatives and similar repositories for refactor-benchmark:
Users who are interested in refactor-benchmark are comparing it to the repositories listed below.
- Coding problems used in aider's polyglot benchmark ☆62 Updated 2 months ago
- Harness used to benchmark aider against SWE Bench benchmarks ☆66 Updated 8 months ago
- ☆38 Updated last year
- Sandboxed code execution for AI agents, locally or on the cloud. Massively parallel, easy to extend. Powering SWE-agent and more. ☆101 Updated this week
- proof-of-concept of Cursor's Instant Apply feature ☆67 Updated 6 months ago
- A Python library to orchestrate LLMs in a neural network-inspired structure ☆46 Updated 5 months ago
- Leveraging DSPy for AI-driven task understanding and solution generation, the Self-Discover Framework automates problem-solving through r… ☆59 Updated 7 months ago
- Client Code Examples, Use Cases and Benchmarks for Enterprise h2oGPTe RAG-Based GenAI Platform ☆83 Updated 3 weeks ago
- DSPy program/pipeline inspector widget for Jupyter/VSCode Notebooks. ☆32 Updated last year
- Agent computer interface for AI software engineer. ☆38 Updated this week
- A better way of testing, inspecting, and analyzing AI Agent traces. ☆28 Updated last week
- Simple Graph Memory for AI applications ☆83 Updated 7 months ago
- GPT-4 Level Conversational QA Trained In a Few Hours ☆58 Updated 6 months ago
- auto fine tune of models with synthetic data ☆74 Updated last year
- A Ruby on Rails style framework for the DSPy (Demonstrate, Search, Predict) project for Language Models like GPT, BERT, and LLama. ☆121 Updated 4 months ago
- ☆68 Updated last month
- ☆74 Updated last year
- Anthropic Computer Use with Modal Sandboxes ☆28 Updated 4 months ago
- Reactive DDD with DSPy ☆22 Updated last year
- A framework for evaluating function calls made by LLMs ☆37 Updated 7 months ago
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆91 Updated last month
- ☆20 Updated last year
- Chat Markup Language conversation library ☆55 Updated last year
- Just a bunch of benchmark logs for different LLMs ☆119 Updated 7 months ago
- ☆14 Updated last month
- Embed anything. ☆29 Updated 9 months ago
- Synthetic data derived by templating, few shot prompting, transformations on public domain corpora, and monte carlo tree search. ☆31 Updated this week
- Routing on Random Forest (RoRF) ☆124 Updated 5 months ago
- Score LLM pretraining data with classifiers ☆54 Updated last year
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments ☆72 Updated 5 months ago